Sparse code multiple access on the generalized frequency division multiplexing

Recent advances in communication systems have culminated in a new class of multiple access schemes, named non-orthogonal multiple access (NOMA), whose main goal is to increase the spectrum efficiency by overlapping data from different users in a single time-frequency resource used by the physical layer. NOMA receivers can resolve the interference among data symbols from different users, increasing the overall system spectrum efficiency without introducing symbol error rate (SER) performance loss, which makes this class of multiple access techniques interesting for future mobile communication systems. This paper analyzes one promising NOMA technique, called sparse code multiple access (SCMA), where C users can share U < C time-frequency resources from the physical layer. Initially, the integration of SCMA and orthogonal frequency division multiplexing (OFDM) is considered, defining a benchmark for the overall SER performance of the multiple access technique. Furthermore, this paper proposes the integration of SCMA and generalized frequency division multiplexing (GFDM). Since GFDM is a highly flexible non-orthogonal waveform that can mimic several other waveforms as corner cases, it is an interesting candidate for future wireless communication systems. This paper proposes two approaches for combining SCMA and GFDM. The first one combines a soft equalizer, called block expectation propagation (BEP), and a multi-user detection (MUD) scheme based on the sum-product algorithm (SPA). This approach achieves the best SER performance, but at the cost of a significant increase in receiver complexity. In the second approach, BEP is integrated with a simplified MUD, which is an original contribution of this paper, aiming to reduce the receiver’s complexity at the cost of SER performance loss. The solutions proposed in this paper show that SCMA-GFDM can be an interesting solution for future mobile networks.

Among the several candidate waveforms available in the literature [9][10][11], GFDM is an interesting option due to its flexibility and overall performance in mobile applications [7]. GFDM is a multicarrier waveform that employs K individually filtered subcarriers to reduce the out-of-band emission (OOBE). Each subcarrier carries M data symbols, which means that each GFDM symbol transmits N = KM symbols. A critically sampled prototype filter is circularly shifted in time and frequency to carry all data symbols, resulting in an N-sample-long GFDM block. A single cyclic prefix (CP) is added to protect the entire GFDM block against the channel delay profile, reducing the waveform overhead [12]. GFDM is a flexible waveform that can be tailored for different applications, and it can cover several other waveforms as corner cases, such as single carrier frequency domain equalization (SCFDE), OFDM, and filterbank multicarrier (FBMC) [13]. This degree of flexibility is unseen in other waveforms. Besides the flexibility, the interaction among the GFDM subsymbols and subcarriers adds diversity that can help to overcome doubly dispersive channels when non-linear detectors are employed on the receiver side [14]. The benefits introduced by GFDM make this waveform an interesting candidate for future mobile networks, and therefore, it will be considered as the baseline time-frequency waveform for the physical layer in this paper.
Besides the waveform, the multiple access scheme also plays an important role in the overall spectrum efficiency of the mobile network. The conventional multiple access schemes used in 5G networks, called orthogonal multiple access (OMA), do not address the challenges foreseen for the 6G networks [15]. In all OMA schemes, such as frequency division multiple access (FDMA) and time division multiple access (TDMA), there is no mutual interference among the users, simplifying the receiver's structure. However, these schemes cannot reach the multi-user channel capacity of the radio interface [16]. Therefore, in scenarios with massive connectivity, the OMA schemes will demand wider bandwidth to accommodate the data of all users, contradicting the premise of spectral efficiency for beyond 5G (B5G) and 6G networks. NOMA schemes take a completely different approach compared to OMA by abandoning the orthogonality principle and allowing the interaction among the users' data. The most recent multiple access techniques allow for increasing the number of users sharing the available resources without increasing the bandwidth. This principle is called overloading, and it can increase the spectrum efficiency by allowing more users to share the same time-frequency resources [17]. In the overloaded RAN, the data symbols of different users can be superimposed in the time-frequency resources, enhancing the capacity and spectral efficiency of the system. NOMA schemes introduce a controllable mutual interference among the users. This interference must be processed by the receiver to avoid bit error rate (BER) performance losses. Therefore, the cost of the higher spectrum efficiency provided by the NOMA schemes is the receiver complexity required to deal with the interference [18].
Several NOMA techniques have been recently presented in the literature. The main ones are pattern division multiple access (PDMA) [19], interleaved division multiple access (IDMA) [20], multi-user shared access (MUSA) [21], lattice partition multiple access (LPMA) [22], and SCMA [23]. The PDMA scheme combines code, power, and spatial domains to create non-orthogonal patterns that are designed to maximize the diversity and minimize the collisions of multiple users in the same communication resource [19]. The PDMA receiver can use message passing algorithm (MPA) [24] or successive interference cancellation (SIC) algorithms to remove the interference and harvest the diversity [19]. IDMA uses a concept of chips interleaving. This procedure is equivalent to spreading the information across the available time-frequency resources, leading to a better BER performance even in an overloaded scenario [20]. MUSA can be viewed as an overloaded code division multiple access (CDMA) scheme. MUSA provides a grant-free multiple access for massive connections of low-complexity devices [25]. The MUSA multiplexer uses low-correlation spreading sequences, while the MUSA receiver employs a SIC algorithm to mitigate the overloading multi-user interference [21]. The LPMA scheme multiplexes the data from different users in the same resources by assigning different lattice codes [22]. Each user can mitigate the multi-user interference using a modulo lattice operation with hybrid parallel/SIC algorithm [22].
Among the relevant NOMA schemes, SCMA deserves special attention because it outperforms the aforementioned schemes in terms of BER performance [18,26]. SCMA is based on different sparse codewords that are part of different sets, named codebooks. Each codebook represents an access layer, which is assigned for a user in the radio interface. The code sparsity reduces the collision among the layers for a given time-frequency resource. This feature allows the design of a near-optimal MPA detector with acceptable complexity [26]. Codebooks can be designed using multi-dimensional constellations, leading to a shaping gain [27]. This gain is responsible for the excellent SCMA performance in terms of BER [28].
The SCMA features and performance are attracting the attention of the scientific community. Several contributions have been presented in the last years. In [29], the authors enable SCMA to be used in downlink wireless access. In [27,28], the authors deal with the SCMA codebook design. The performance comparison among different NOMA techniques is presented in [16,18], where SCMA is highlighted by its good performance. In [30], the authors introduce a low-complexity SCMA receiver based on MPA. In [31], the authors propose a new hybrid automatic repeat request (HARQ) combined with SCMA. Finally, in [32], the authors present a real-time SCMA transceiver prototype that can provide up to 300% overloading in field tests. All these contributions show that SCMA is a strong candidate for modern multi-user wireless systems. In all abovementioned references, it is assumed that the physical layer has orthogonal time-frequency resources, such as OFDM. Recently, [33] combined SCMA with GFDM and analyzed the SER performance of the resulting scheme under different channel models, showing the advantages of combining NOMA with non-orthogonal waveforms.
The main aim of this paper is to propose two new receiver approaches to further improve the SCMA-GFDM SER performance under time-varying and frequency-selective channels. The first approach employs a BEP channel equalizer [34][35][36] with an SPA MUD. This approach improves the system SER performance, but the receiver's complexity grows significantly. In order to reduce the complexity, this paper introduces a second approach where the BEP equalizer is combined with a simplified MUD (SMUD). The SMUD severely reduces the receiver's complexity at the cost of a small degradation in the SER performance. The results presented in this paper show that SCMA-GFDM presents high spectrum efficiency and good SER performance without significantly increasing the overall complexity of the receiver.
The remainder of this paper is organized as follows. Section 2 brings the main concepts of the SCMA-OFDM scheme. Section 3 presents the GFDM basic principles and its integration with SCMA and introduces the two receiver approaches that are the main contributions of this paper. Section 4 shows the SER performance analysis of the proposed approaches. Section 5 brings the final conclusions of this paper.

Background on SCMA-OFDM
OFDM [37] is a well-known waveform, extensively used in wireless systems due to its low complexity and robustness against frequency-selective channels. In conventional OFDM, each OFDM symbol carries K quadrature amplitude modulation (QAM) symbols using K subcarriers, leading to a unitary overload factor. By combining OFDM and SCMA, the overall system capacity can be significantly improved.
An SCMA encoder can be defined as a mapper where $Q = \log_2(J)$ bits, with J being the size of a J-QAM constellation [23], are represented by a predefined U-dimensional complex codeword. Each U-dimensional complex codeword of the codebook set is a sparse vector with Q < U nonzero entries. One SCMA layer is composed of a codebook employed by a user to encode its data to be allocated in a set of U OFDM subcarriers. Figure 1 shows one SCMA layer for Q = 2 bits, QAM with J = 4 and U = 4. The codebook is a U × J matrix where each column is a codeword that represents one sequence of Q bits. This procedure is equivalent to spreading the QAM symbol over U subcarriers, i.e., the information that would be transmitted over a single subcarrier is now spread over U different subcarriers.
The SCMA scheme contains C > U different layers [26], each one with a specific codebook designed according to [29]. In [23], the authors present a procedure to define the maximum overloading factor in an SCMA system. The number of SCMA layers is given by $C = \binom{U}{Q}$ (1). Because of the sparsity of the code, the number of nonzero elements of a given codeword that can collide with codewords from different codebooks is given by $g = \binom{U-1}{Q-1}$ (2). Therefore, the overloading factor is given by $C/U$ (3). Figure 2 shows a multi-user SCMA encoder for the downlink channel in mobile networks, assuming Q = 2 and U = 4. Here, there are C = 6 different codebooks, each one employed to send data to a specific user, leading to an overloading factor of 1.5. All the codebooks used in this paper are presented in the Appendix. The bits to be sent to the users are mapped into sequences $x_{u,c}$, which are added together, leading to $s_u = \sum_{c \in \xi(u)} x_{u,c}$ (4), which is the transmit sequence to be mapped onto U OFDM subcarriers. Notice that $\xi(u)$ is the set formed by all layers that collide in the uth physical layer time-frequency resource. It is worth mentioning that the set of resources that are used by the same layer is denoted by $\zeta(c)$.
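The counting argument above can be checked numerically. The sketch below assumes the standard combinatorial expressions for the number of layers and for the number of colliding layers per resource, and it enumerates every way of placing Q nonzero entries in U resources. The function names `scma_dimensions` and `factor_graph_matrix` are hypothetical, and the enumeration order is only one possible layer-to-resource assignment; the layer numbering of the codebooks in the Appendix may differ.

```python
from itertools import combinations
from math import comb

def scma_dimensions(U, Q):
    """Return (C, g, overloading factor) for U resources and Q nonzero entries."""
    C = comb(U, Q)          # number of distinct sparse patterns (layers)
    g = comb(U - 1, Q - 1)  # number of layers that collide in each resource
    return C, g, C / U

def factor_graph_matrix(U, Q):
    """U x C indicator matrix: entry [u][c] is 1 if layer c uses resource u."""
    patterns = list(combinations(range(U), Q))
    return [[1 if u in p else 0 for p in patterns] for u in range(U)]

C, g, overload = scma_dimensions(4, 2)
F = factor_graph_matrix(4, 2)
print(C, g, overload)                 # layers, collisions per resource, C/U
print([sum(row) for row in F])        # layers seen by each resource
print([sum(col) for col in zip(*F)])  # resources used by each layer
```

For U = 4 and Q = 2, every row of the indicator matrix sums to g and every column sums to Q, which matches the configuration with six layers and an overloading factor of 1.5 used in the text.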
Wireless communication systems that employ OFDM as the air interface usually adopt a high number of subcarriers, which means that $K \gg U$. In this paper, it is considered that K/U parallel SCMA encoders will be used, each one with C layers. All SCMA encoders use the same codebooks, and the orthogonality among the different SCMA transmit sequences is provided by the OFDM structure, where no intersymbol interference (ISI) is introduced if the cyclic prefix is larger than the channel delay profile and intercarrier interference (ICI) is avoided by the orthogonality among the subcarriers. Figure 3 depicts the block diagram of this SCMA-OFDM system.
The ith SCMA sequence, $\mathbf{s}_i = [s_{i,1}, s_{i,2}, s_{i,3}, s_{i,4}]^T$, will use U subcarriers of the OFDM symbol. All sequences are stacked into a $K \times 1$ vector $\mathbf{d} = [\mathbf{s}_1^T, \mathbf{s}_2^T, \cdots, \mathbf{s}_{K/U}^T]^T$.
This vector is applied to an inverse fast Fourier transform (iFFT) algorithm, resulting in the transmit SCMA-OFDM signal $\mathbf{x} = \mathbf{F}_K^H \mathbf{d}$ (5), where $\mathbf{F}_K$ is a $K \times K$ Fourier matrix and $(\cdot)^H$ is the Hermitian operator. The overall payload of this system is given by (6). A CP is added to the OFDM symbol on the transmit side and removed on the receive side. Assuming that the CP is larger than the channel delay profile, the received signal after the CP removal is given by $\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}$ (7), where $\mathbf{n}$ is the additive white Gaussian noise (AWGN) vector with variance $\sigma^2$ and $\mathbf{H}$ is a circulant channel matrix based on the channel impulse response given by (8). The OFDM subcarriers of the received signal can be decoupled by the fast Fourier transform (FFT), leading to $\mathbf{Y} = \mathbf{F}_K \mathbf{y} = \mathbf{F}_K \mathbf{H} \mathbf{F}_K^H \mathbf{d} + \tilde{\mathbf{n}}$ (9), where $\mathbf{F}_K \mathbf{H} \mathbf{F}_K^H$ is a diagonal matrix containing the channel frequency response, with kth diagonal entry $H_{k,k}$, and $\tilde{\mathbf{n}}$ is the AWGN vector in the frequency domain. This means that each sample of the received signal in the frequency domain is given by $Y_k = H_{k,k} d_k + \tilde{n}_k$ (10). The result presented in (10) shows that the SCMA decoder only needs to deal with the multi-user interference. Each set of U subcarriers must be applied to an SCMA MUD implemented using SPA. In this paper, the SPA has been chosen as the MUD algorithm to decouple the data from all SCMA layers because it is the most employed technique in the literature [23, 26-29, 31, 32]. The main reason for the success of the SPA as an SCMA MUD is its performance, which approaches the one achieved by the near-optimal detector [24].
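The decoupling argument can be illustrated numerically. The following sketch builds a circulant channel matrix from a hypothetical three-tap impulse response, applies the iFFT and FFT as unitary Fourier matrices, and checks that each received frequency-domain sample depends on a single data symbol. The tap values, the block size, and the placeholder data symbols are assumptions chosen only for illustration, and noise is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 8                                    # number of subcarriers (illustrative)
h = np.array([1.0, 0.5, 0.25])           # hypothetical channel impulse response

# Circulant channel matrix: column j is the zero-padded h cyclically shifted by j.
h_pad = np.concatenate([h, np.zeros(K - len(h))])
H = np.stack([np.roll(h_pad, j) for j in range(K)], axis=1)

# Unitary K x K Fourier matrix.
F = np.fft.fft(np.eye(K)) / np.sqrt(K)

# Placeholder data vector (stands in for the stacked SCMA sequences).
d = rng.choice([-1.0, 1.0], K) + 1j * rng.choice([-1.0, 1.0], K)

x = F.conj().T @ d                       # transmit signal after the iFFT
y = H @ x                                # noiseless received signal after CP removal
Y = F @ y                                # receiver FFT

H_freq = np.fft.fft(h_pad)               # channel frequency response
print(np.allclose(F @ H @ F.conj().T, np.diag(H_freq)))  # diagonalization holds
print(np.allclose(Y, H_freq * d))                        # one tap per subcarrier
```

Because the Fourier matrix diagonalizes any circulant matrix, the only interference left for the SCMA decoder is the superposition of layers inside each data symbol.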
It is worth mentioning that SPA is an MPA-based decoder designed using a factor graph representation [18,24]. The factor graph is a bipartite graph containing layer nodes and resource nodes. The interaction between layer and resource nodes is represented by a connection between them [24]. Figure 4 shows the SCMA-OFDM factor graph. Notice that each node $Y_k$ has only one direct connection with one $d_k$, showing the orthogonality among the subcarriers. Once the SCMA factor graph is known and assuming that the channel state information (CSI) is available at the receive side, the iterative SPA can be employed to extract the data from each user. For each cluster, the message passed from a resource node $s_u$ to a layer node $x_c$ at the tth iteration is given by (11), where $\xi(u) \setminus c$ represents the set of all elements in $\xi(u)$ except c. The message passed from layer node $x_c$ to the resource node $s_u$ in the tth iteration is given by (12), where $\zeta(c) \setminus u$ represents the set of all elements in $\zeta(c)$ except u. After $\tau$ iterations, the marginal probability distribution of each layer node is given by (13). If the hard decision approach is employed, the codeword $x_c$ with the highest probability of being transmitted is used to define the received J-QAM symbol for the cth user of each cluster.
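For reference, a common textbook form of the three SPA quantities described above is sketched below. This is an assumption based on the usual message passing formulation for SCMA with a complex Gaussian likelihood, not a transcription of the paper's expressions; the message symbol $I$ is a hypothetical notation, and the iteration indexing of the layer-to-resource message varies between references.

```latex
\begin{align}
% Resource node s_u to layer node x_c: marginalize over the other colliding layers
I^{(t)}_{s_u \to x_c}(x_c) &= \sum_{\{x_i \,:\, i \in \xi(u)\setminus c\}}
  \frac{1}{\pi\sigma^2}
  \exp\!\left(-\frac{1}{\sigma^2}\Bigl|Y_u - H_{u,u}\sum_{i \in \xi(u)} x_{u,i}\Bigr|^2\right)
  \prod_{i \in \xi(u)\setminus c} I^{(t-1)}_{x_i \to s_u}(x_i) \\
% Layer node x_c to resource node s_u: combine the other resources of layer c
I^{(t)}_{x_c \to s_u}(x_c) &\propto \prod_{v \in \zeta(c)\setminus u} I^{(t)}_{s_v \to x_c}(x_c) \\
% Marginal of layer node x_c after tau iterations
p(x_c) &\propto \prod_{u \in \zeta(c)} I^{(\tau)}_{s_u \to x_c}(x_c)
\end{align}
```

In this form, the sum in the first expression runs over $J^{g-1}$ codeword combinations for each of the J candidate values of $x_c$, which is where the $J^g$ term in the complexity of each resource node comes from.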
According to [23,38,39], the SPA complexity is mainly defined by the size of the constellation alphabet in a given resource node and by the number of connections linking this resource node to the layer nodes. In other words, the SPA complexity can be defined as $O(D^\theta)$, where $D = J^g$ is the size of the constellation alphabet in a given resource node and $\theta$ is the number of connections at the resource node. Therefore, the complexity associated with each resource node in the SCMA-OFDM scheme is $O(J^g)$. Considering all U resource nodes in a cluster, the number of iterations $\tau$ employed at the SPA, and K/U clusters, the overall complexity of the SPA-based SCMA-OFDM receiver is $O(\tau K J^g)$.

Proposed SCMA-GFDM Integration
GFDM is a modern waveform designed to overcome several limitations presented by OFDM. GFDM employs circular baseband filtering for each one of the K subcarriers, where each one carries M QAM symbols. This means that GFDM has a time-frequency frame structure composed of K subcarriers and M subsymbols that carries N = KM QAM data symbols. GFDM uses one prototype filter that is circularly shifted in time and in frequency. This approach adds an extra degree of freedom to GFDM when compared with OFDM, which is limited to K subcarriers. GFDM uses one CP to protect the entire block, reducing the overhead in the physical layer when compared with OFDM. The GFDM filtering reduces the OOBE, allowing this waveform to be used in cognitive radio networks [40]. All versions of the prototype filter can be organized in the transmit matrix $\mathbf{A}$, allowing the GFDM symbol to be defined as [41] $\mathbf{x} = \mathbf{A}\mathbf{d}$ (14), where $\mathbf{d}$ is the data vector containing N = KM QAM symbols.
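As an illustration of how the transmit matrix can be assembled, the sketch below builds an N × N matrix whose columns are circularly time-shifted and frequency-shifted copies of one prototype filter, and then modulates a data vector as a matrix-vector product. The function name `gfdm_matrix`, the column ordering, and the rectangular prototype filter are assumptions; the rectangular filter is chosen because it is the orthogonal corner case, which makes the result easy to verify, whereas other prototype filters generally yield a non-orthogonal matrix.

```python
import numpy as np

def gfdm_matrix(g, K, M):
    """Build the N x N GFDM transmit matrix (N = K*M) from prototype filter g.

    Column k*M + m holds g circularly shifted by m*K samples in time and
    modulated to subcarrier k. This column ordering is an assumption.
    """
    N = K * M
    n = np.arange(N)
    A = np.empty((N, N), dtype=complex)
    for k in range(K):
        for m in range(M):
            A[:, k * M + m] = np.roll(g, m * K) * np.exp(2j * np.pi * k * n / K)
    return A

K, M = 4, 3
N = K * M

# Rectangular prototype filter spanning one subsymbol (orthogonal corner case).
g_rect = np.concatenate([np.ones(K), np.zeros(N - K)]) / np.sqrt(K)
A = gfdm_matrix(g_rect, K, M)

# Placeholder unit-energy data symbols, then one GFDM block.
d = np.exp(1j * np.pi / 4 * (2 * np.arange(N) + 1))
x = A @ d

print(x.shape)
print(np.allclose(A.conj().T @ A, np.eye(N)))  # unitary for this prototype filter
```

With this rectangular filter the matrix is unitary, so a matched filter alone recovers the data; replacing `g_rect` with a smoother pulse is what introduces the self-interference discussed next.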
After the CP removal, the received GFDM signal is given by $\mathbf{y} = \mathbf{H}\mathbf{A}\mathbf{d} + \mathbf{n} = \tilde{\mathbf{H}}\mathbf{d} + \mathbf{n}$ (15), where $\tilde{\mathbf{H}} = \mathbf{H}\mathbf{A}$ is the effective channel matrix. In a GFDM system, the prototype filter can be freely chosen, allowing the system to be non-orthogonal. In this case, ISI, ICI, or both might arise. The GFDM receiver must resolve the self-introduced interference to avoid SER performance loss. Linear equalizers, such as matched filter (MF), zero-forcing (ZF), and minimum mean square error (MMSE), can be used to recover the data symbols with controlled SER performance loss. The MF, ZF, and MMSE detectors are respectively defined in (16), where E is the average energy of the GFDM signal. The received data vector can be estimated as $\hat{\mathbf{d}} = \mathbf{B}\mathbf{y}$ (17), where $\mathbf{B}$ can be any linear detector presented in (16). MF maximizes the signal-to-noise ratio (SNR), but it is unable to resolve the self-introduced interference. ZF removes the interference but enhances the noise. MMSE is a tradeoff between MF and ZF, acting as an MF at low SNR and as a ZF at high SNR. For the MF and ZF, prior channel equalization is necessary, while MMSE performs the channel equalization and the decoupling of subcarriers and subsymbols simultaneously. In this paper, only ZF and MMSE detectors will be considered as linear receivers. SCMA can be seamlessly integrated with GFDM using linear receivers, since the ZF and MMSE can successfully decouple the transmitted data on the receiver side. The SCMA-GFDM can use the same structure depicted in Fig. 3, where N/U SCMA encoders are employed and the buffer concatenates the $\mathbf{s}_i$ vectors to generate $\mathbf{d}$. Clearly, instead of the Fourier matrix, the modulation matrix $\mathbf{A}$ must be used in this case. Figure 6 shows the basic block diagram of the SCMA-GFDM receiver.
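A minimal sketch of the three linear detectors is given below, using their usual textbook forms applied directly to the effective channel matrix. These forms are an assumption and may differ in detail from the definitions in (16), for instance in how the channel equalization is split from the demodulation. The function name `linear_detectors` is hypothetical, and the small matrices used in the demonstration are stand-ins for a non-orthogonal modulation matrix and a circulant channel, not an actual GFDM configuration.

```python
import numpy as np

def linear_detectors(H_eff, sigma2, E=1.0):
    """Return (B_MF, B_ZF, B_MMSE) for the effective channel matrix H_eff.

    Assumed textbook forms:
      MF   = H_eff^H
      ZF   = H_eff^{-1}
      MMSE = (H_eff^H H_eff + (sigma2 / E) I)^{-1} H_eff^H
    """
    Hh = H_eff.conj().T
    N = H_eff.shape[1]
    B_mf = Hh
    B_zf = np.linalg.inv(H_eff)
    B_mmse = np.linalg.solve(Hh @ H_eff + (sigma2 / E) * np.eye(N), Hh)
    return B_mf, B_zf, B_mmse

N = 6
P = np.roll(np.eye(N), 1, axis=0)        # cyclic shift by one sample
A_toy = np.eye(N) + 0.2 * P              # stand-in for a non-orthogonal modulation matrix
H_toy = np.eye(N) + 0.5 * P              # stand-in circulant two-tap channel
H_eff = H_toy @ A_toy                    # effective channel matrix

rng = np.random.default_rng(0)
d = rng.choice([-1.0, 1.0], N) + 1j * rng.choice([-1.0, 1.0], N)
y = H_eff @ d                            # noiseless observation

B_mf, B_zf, B_mmse = linear_detectors(H_eff, sigma2=1e-9)
print(np.allclose(B_zf @ y, d))                  # ZF removes the interference
print(np.allclose(B_mmse @ y, d, atol=1e-6))     # MMSE approaches ZF at high SNR
```

The MMSE detector is computed with `np.linalg.solve` rather than an explicit inverse, which is the usual numerically safer choice when only the product with the Hermitian matrix is needed.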
The interference introduced by the GFDM can be exploited as diversity, since the information of one data symbol will be spread among the adjacent subcarriers and subsymbols [42]. Although linear equalizers can harvest just some of this diversity gain, non-linear algorithms, such as expectation propagation, can be used to improve the GFDM performance under doubly dispersive channels. The factor graph for GFDM combined with SCMA is presented in Fig. 5 for U = 4, Q = 2, and J = 4. The SPA is able to resolve the interference introduced by GFDM and also perform the MUD. Since the non-linear GFDM detector is able to decouple the subcarrier and subsymbols, the SPA can be applied to each cluster individually to receive the SCMA codewords.
Because the subcarriers and subsymbols of the GFDM signal interact with each other, the number of links connecting the $y_n$ nodes to the $d_n$ nodes, whose weights are defined by $\tilde{\mathbf{H}}$, is very large. Therefore, the complexity of the SCMA-GFDM receiver is high. The size of the alphabet $S_u$, corresponding to all possible values for $s_u$, is D. For the SCMA encoder presented in Fig. 5, $D = J^g = 64$. The number of links connecting a resource node is $\theta = N = 256$. The complexity for each resource node is $O(D^\theta) = O(64^{256})$, meaning that an SPA-based receiver for SCMA-GFDM is not practical.
The complexity of the SCMA-GFDM receiver can be substantially reduced if the GFDM symbol is equalized by a detector presented in (16) and parallel SCMA decoders are used for each cluster afterwards. The GFDM detector will decouple the subcarriers while the SCMA decoder will deal with the multi-user interference, as depicted in Fig. 6. This is a suboptimal approach, because the linear GFDM detector is unable to harvest the diversity introduced by the interaction among the subcarriers and subsymbols, but the significant complexity reduction achieved with this approach makes the SCMA-GFDM scheme feasible. This procedure is equivalent to the one described in Section 2, but due to the GFDM equalizer, the effective channel gain in (11) is $H_{u,u} = 1$. The complexity of the GFDM equalizer is $O(N^3)$, while the complexity of the SCMA decoders for N/U clusters is $O(\tau N J^g)$. The total complexity of the SCMA-GFDM is $O(N^3 + \tau N J^g)$.
In [34], the authors propose a new equalizer called BEP. This equalizer is based on the iterative expectation propagation (EP) algorithm, which approximates the true posterior distribution by exponential family distributions through the minimization of the Kullback-Leibler divergence [43]. The equalizer proposed in [34] achieves a trade-off between complexity and BER performance, and it can be employed in future wireless communication systems [34][35][36]. Next, this paper introduces two new approaches that improve the SCMA-GFDM SER performance by employing the BEP equalizer. The first one uses the BEP equalizer to resolve the GFDM self-interference, while SPA decoders are employed to resolve the codeword collisions. This scheme improves the SCMA-GFDM SER performance at the cost of a high complexity. The second approach proposed in this paper uses the BEP equalizer with an SMUD that reduces the complexity of the SCMA decoder, at the cost of an affordable SER performance degradation. These new schemes are presented in the following sections.

SCMA-GFDM with BEP equalizer and SPA-based MUD
The scheme proposed for this approach is presented in Fig. 6. The BEP equalizer is used to detect the payload of the GFDM symbol while the SPA acts as the SCMA decoder. In this paper, the BEP equalizer proposed in [36] has been modified to recover the data transmitted by the SCMA-GFDM system. In [36], the posteriori probability of a given transmitted symbol vector, $p(\mathbf{d}|\mathbf{y},\tilde{\mathbf{H}})$, was approximated by a non-normalized Gaussian $q(\mathbf{d})$ parameterized by $\gamma_n \in \mathbb{C}$ and $\Lambda_n \in \mathbb{R}^+$, which are scalar values that are iteratively updated to approximate the non-normalized Gaussian $q(\mathbf{d})$ to the posteriori $p(\mathbf{d}|\mathbf{y},\tilde{\mathbf{H}})$. The whole procedure of the iterative BEP equalizer is described in Algorithm 1. The iterative equalization process is repeated $\phi$ times. According to [34], when $\phi$ increases, the BEP equalizer performance approaches the optimum performance of the Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm [44], but with a tractable complexity. The whole BEP procedure can be divided into three stages. The first stage, indicated by line 3 in Algorithm 1, estimates the linear minimum mean square error (LMMSE) parameters, namely the mean vector $\boldsymbol{\mu}$ and the covariance matrix $\boldsymbol{\Sigma}$. The second stage, described by lines 4 to 14 in Algorithm 1, refines the estimates obtained in the first stage. The final stage, represented by lines 15 and 16, uses the outcome from the second stage to generate the complex Gaussian distributions that correspond to $q(\mathbf{d}) \propto \mathcal{CN}(\mathbf{d} : \boldsymbol{\mu}, \boldsymbol{\Sigma})$. The mean vector $\boldsymbol{\mu}$ and the covariance matrix $\boldsymbol{\Sigma}$, obtained in line 3, are used to compute the nth marginal of the distribution $q(\mathbf{d})$, given by $\mathcal{CN}(d_n : \mu_n, \Sigma_{n,n})$. In the next stage, the marginal cavity distribution $q^{(t)}_{\setminus n}(d_n)$ is found. This distribution is the baseline for the approximations computed in the following steps. In the next step, the distribution $p^{(t)}(d_n)$ is evaluated in line 6. This distribution will be used to perform the moment matching described in the EP technique [43]. Line 6 brings an indicator function $\mathbb{I}_{d_n \in S_u}$, which returns the value one if $d_n \in S_u$ and zero otherwise.
As already mentioned, S u is the alphabet of all possible values for s u , defined by (4). Assuming U = 4, there are four different alphabets, namely S 1 , S 2 , S 3 , and S 4 . Figure 7 depicts these alphabets, where one can see that the alphabets S 1 and S 4 are identical, while the alphabets S 2 and S 3 are the same, as well. After computing p (t) (d n ), line 10 performs the moment matching.

Fig. 7 Alphabets formed by all values of s u
It is important to notice that there are two parameters in Algorithm 1 that control the numerical instabilities and the convergence rate. The parameters $\beta \in [0, 1]$ and $\epsilon \to 0$ are real-valued numbers that are empirically chosen, as described in [36]. The parameter $\epsilon$ is used in line 8 of Algorithm 1, where it can be seen as a minimum allowed variance for $f_n$. The parameter $\beta$ is used in line 10, and it can be seen as the updating smoothing factor. If $\beta = 0$, there is no change in the parameters $\gamma_n$ and $\Lambda_n$ from one iteration to the next. If $\beta = 1$, the maximum updating rate is achieved for the parameters $\gamma_n$ and $\Lambda_n$. If the iterative updating process leads to a negative value of $\Lambda_n$, line 11 keeps the previous value for the next iteration, i.e., the parameter $\Lambda_n$ is not updated. Although these steps do not guarantee the convergence of the algorithm, they increase the probability of convergence to a level that allows it to be used in practical systems [36].
Regarding the complexity, for each iteration of Algorithm 1, there is an N × N matrix inversion in line 3, which dominates the complexity of the first stage. In the second stage, all N values of $\gamma_n$ and $\Lambda_n$ are simultaneously updated, leading to a complexity linearly proportional to NJ [34]. Hence, the overall complexity of the BEP equalizer is $O(\phi(N^3 + NJ))$.
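The three stages described above can be sketched as a generic EP soft equalizer for a linear Gaussian model. The update formulas below are the standard EP expressions for this kind of model and are assumptions; they are not recovered from Algorithm 1, whose exact expressions may differ. The function names `ep_equalizer` and `symbol_weights`, the default parameter values, and the toy channel in the demonstration are hypothetical. The demonstration uses a QPSK alphabet for brevity, while in SCMA-GFDM the alphabet would be the set of superimposed values of the corresponding resource.

```python
import numpy as np

def symbol_weights(mean, var, alphabet):
    """Normalized complex Gaussian CN(a : mean_n, var_n) at each alphabet point a."""
    logw = -np.abs(alphabet[None, :] - mean[:, None]) ** 2 / var[:, None]
    logw -= logw.max(axis=1, keepdims=True)       # avoid underflow of every entry
    w = np.exp(logw)
    return w / w.sum(axis=1, keepdims=True)

def ep_equalizer(y, H, sigma2, alphabet, n_iter=5, beta=0.2, eps=1e-8):
    """Generic EP equalizer for y = H d + n; returns (mean, symbol posteriors)."""
    N = H.shape[1]
    gamma = np.zeros(N, dtype=complex)            # initial gamma_n = 0
    lam = np.ones(N)                              # initial Lambda_n = 1
    G = H.conj().T @ H / sigma2
    b = H.conj().T @ y / sigma2

    def lmmse(gamma, lam):
        # Stage 1: LMMSE mean and marginal variances (N x N inversion).
        Sigma = np.linalg.inv(G + np.diag(lam))
        return Sigma @ (b + gamma), np.real(np.diag(Sigma))

    for _ in range(n_iter):
        mu, s = lmmse(gamma, lam)
        # Stage 2a: cavity marginal, obtained by removing the nth factor.
        v_cav = s / (1.0 - s * lam)
        t_cav = v_cav * (mu / s - gamma)
        # Stage 2b: moments of the cavity restricted to the discrete alphabet.
        w = symbol_weights(t_cav, v_cav, alphabet)
        mu_p = w @ alphabet
        var_p = np.maximum(w @ np.abs(alphabet) ** 2 - np.abs(mu_p) ** 2, eps)
        # Stage 2c: moment matching with damping; negative precisions are skipped.
        lam_new = 1.0 / var_p - 1.0 / v_cav
        gamma_new = mu_p / var_p - t_cav / v_cav
        ok = lam_new > 0
        lam = np.where(ok, beta * lam_new + (1 - beta) * lam, lam)
        gamma = np.where(ok, beta * gamma_new + (1 - beta) * gamma, gamma)

    # Stage 3: final Gaussian approximation and posterior symbol probabilities.
    mu, s = lmmse(gamma, lam)
    return mu, symbol_weights(mu, s, alphabet)

qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
rng = np.random.default_rng(1)
N = 8
true_idx = rng.integers(0, 4, N)
H_toy = np.eye(N) + 0.1 * np.roll(np.eye(N), 1, axis=0)   # mild interference
y = H_toy @ qpsk[true_idx]                                # noiseless observation
mu, post = ep_equalizer(y, H_toy, sigma2=0.01, alphabet=qpsk)
print(np.array_equal(post.argmax(axis=1), true_idx))
```

The variance floor `eps` and the damping factor `beta` play the roles described in the text: the floor prevents a zero variance when one symbol dominates, and the damping limits how far the parameters move in one iteration.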
The vector delivered by the BEP equalizer contains the estimates for $\mathbf{d}$. This vector is used by the SPA algorithm to recover the information from the layers of each cluster. All SPA instances used as MUD can run in parallel. The complexity of the SCMA-GFDM with SPA-based MUD is $O(\phi(N^3 + NJ) + \tau N J^g)$. The two iterative algorithms used to recover the users' information from the SCMA-GFDM signal result in high complexity and high latency, while achieving good SER performance. Reducing the overall SCMA-GFDM complexity with manageable SER performance loss is the main goal of the approach presented in the next section.

SCMA-GFDM with BEP and SMUD
BEP is a soft or probabilistic equalizer that provides the posteriori probabilities, q(d), of the received symbols. This additional information allows the implementation of the SMUD scheme proposed in this paper. Figure 6 shows the block diagram of the proposed SCMA-GFDM using SMUD. BEP is still used as the GFDM equalizer, providing the received samples of each cluster without the self-interference. These estimates and the posteriori distribution are used to recover the data sent at each SCMA layer.
SMUD relies on a look-up table (LUT) to recover the users' data, which is built according to the SCMA codebooks. Each value transmitted in a given time-frequency resource is a combination of nonzero values from the different codebooks, as defined in (4). One example is shown at the top of Fig. 8, where $s_1$ is a combination of the values coming from layers 2, 3, and 5. These values are defined by the 4-QAM symbols in accordance with the codebook for each layer. The Appendix brings the set of codebooks used in this paper. For each combination of 4-QAM symbols, there will be a unique output in $s_1$, and therefore, it is possible to build a LUT according to the alphabet $S_u$ shown in Fig. 7. As an example, assume that the sequence "00" is transmitted in layers 2, 3, and 5 at a given moment. The corresponding outputs of each codebook will be 0.7851, −0.6351 + j0.4615, and −0.0055 − j0.2242, as can be seen in the Appendix. Hence, in this case, $s_1$ = 0.1445 + j0.2373. If the sequence "11" is transmitted in these layers, the corresponding outputs of each codebook will be −0.7851, 0.6351 − j0.4615, and 0.0055 + j0.2242, leading to the output $s_1$ = −0.1445 − j0.2373. These two examples of $s_1$ are highlighted in Fig. 7.

Algorithm 1 BEP equalizer. The notation $(\cdot)^{(t)}$ indicates the value of a given variable $(\cdot)$ at the tth iteration.
Require: Define the number of iterations $\phi$
Require: Set $\beta \in [0, 1]$
Require: Set $\epsilon \leq \beta$
Require: Initialize $\gamma_n^{(0)} = 0$ and $\Lambda_n^{(0)} = 1$, $\forall n$
1: for $t = 0, \cdots, \phi - 1$ do
2: for $n = 1, \cdots, N$ do
3: Compute the mean vector $\boldsymbol{\mu}$ and the covariance matrix $\boldsymbol{\Sigma}$
Compute the cavity marginal $q^{(t)}_{\setminus n}(d_n)$ from its variance and mean
Obtain the distribution $p^{(t)}(d_n)$
15: Compute $q(\mathbf{d}) \propto \mathcal{CN}(\mathbf{d} : \boldsymbol{\mu}, \boldsymbol{\Sigma})$
16: Finally, the posteriori symbol probabilities are given by $q(d_n = S_u) \propto \mathcal{CN}(S_u : \mu_n, \Sigma_{n,n})$, $\forall u$
This procedure can be used to map all possible sequences transmitted in layers 2, 3, and 5 to all possible values that $s_1$ can assume. Table 3 in the Appendix shows the initial entries for the LUT developed for $s_1$. Each time-frequency resource of a given cluster, $s_u$, will have a specific LUT. Figure 8 shows an example where it is possible to visualize how the LUT can be used in this context. Consider that all users of a given cluster transmit the sequence "00". The resulting transmitted values in the U time-frequency resources are as follows: $s_1 = s_4$ = 0.1445 + j0.2373 and $s_2 = s_3$ = 0.7428 − j0.3077, as highlighted in Fig. 8. Assuming, for now, a noiseless channel and a perfect GFDM equalization, then $\hat{s}_u = s_u$. In this case, according to the LUT, the $\hat{s}_1$ value will indicate that the sequence "00" was transmitted in layers 2, 3, and 5, and the $\hat{s}_2$ value will indicate that the sequence "00" was transmitted in layers 1, 3, and 6. The same is true for $\hat{s}_3$ and $\hat{s}_4$, for the corresponding layers. In this example, the content received in layer 3 can be estimated from $\hat{s}_1$ or $\hat{s}_3$. However, since both lead to the same sequence "00", the information can be retrieved unequivocally. When a perfect equalization is performed, all duplicated sequences in each layer converge to the same value, leading to the correct decision.
Assume now that the distortions introduced by noise and channel equalization lead to an error in $\hat{s}_4$, i.e., $\hat{s}_4 \neq s_4$. For example, consider that $\hat{s}_4$ = 0.9611 − j0.3560. Based on the $\hat{s}_1$ value, the LUT will indicate that the received sequence in layers 2, 3, and 5 is "00". The same is true for the $\hat{s}_2$ and $\hat{s}_3$ values, for the corresponding layers. However, based on $\hat{s}_4$, the LUT will indicate that the sequence "00" is received in layers 1 and 4, while the sequence "01" is received in layer 5, since this is the combination that results in $\hat{s}_4$ = 0.9611 − j0.3560 at the output of the SCMA encoder. There are two different sequences recovered for layer 5, which are "00" obtained from $\hat{s}_1$ and "01" obtained from $\hat{s}_4$. The posteriori distribution $q(\mathbf{d})$ provided by the BEP soft equalizer can be used to define which sequence was transmitted with higher probability. In other words, the SMUD will use the sequence provided by the most reliable signal $\hat{s}_u$. In the example presented above, if $\max[q(d_1)] > \max[q(d_4)]$, the SMUD will decide for the sequence obtained from $\hat{s}_1$. Otherwise, the SMUD will favor the sequence obtained from $\hat{s}_4$. It is interesting to observe that the SMUD is able to correct the error introduced by the communication channel, meaning that it can behave as a forward error control (FEC).
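The LUT construction and the reliability-based decision can be sketched as follows. The example is restricted to the two codewords per layer that are quoted in the text for the first resource ("00" and "11"), since the full codebooks are given in the Appendix and are not reproduced here. The function names `build_lut`, `lut_decide`, and `resolve` are hypothetical, the nearest-value lookup is an assumption about how an equalized sample is matched to a LUT entry, and the reliability values in the usage line are made-up numbers standing in for the maximum of the posteriori distribution of each resource.

```python
import itertools

# Nonzero codeword values on resource 1 for layers 2, 3 and 5, limited to the
# bit sequences "00" and "11" quoted in the text.
codewords = {
    2: {"00": 0.7851 + 0j, "11": -0.7851 + 0j},
    3: {"00": -0.6351 + 0.4615j, "11": 0.6351 - 0.4615j},
    5: {"00": -0.0055 - 0.2242j, "11": 0.0055 + 0.2242j},
}

def build_lut(codewords):
    """Map each superimposed value s_u to the per-layer bit sequences producing it."""
    layers = sorted(codewords)
    lut = {}
    for bits in itertools.product(*(codewords[c] for c in layers)):
        s = sum(codewords[c][b] for c, b in zip(layers, bits))
        lut[s] = dict(zip(layers, bits))
    return lut

def lut_decide(s_hat, lut):
    """Return the LUT entry whose superimposed value is closest to s_hat."""
    nearest = min(lut, key=lambda s: abs(s - s_hat))
    return lut[nearest]

def resolve(candidates):
    """Pick the bits reported by the most reliable resource for one layer.

    `candidates` is a list of (bits, reliability) pairs, one per resource
    that carries the layer.
    """
    return max(candidates, key=lambda item: item[1])[0]

lut = build_lut(codewords)
print(lut_decide(0.1445 + 0.2373j, lut))         # all three layers sent "00"
print(lut_decide(-0.15 - 0.24j, lut))            # slightly distorted "11" case
print(resolve([("00", 0.9), ("01", 0.6)]))       # conflicting decisions for one layer
```

Each resource needs its own table, and with the full 4-QAM codebooks the table for one resource has 64 entries instead of the 8 built here.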
Regarding the complexity evaluation, the LUT can be applied in parallel for each cluster and its complexity is linear in UJ. The total complexity of the SCMA-GFDM with SMUD is $O(\phi(N^3 + NJ) + UJ)$, which is much smaller than that of the SCMA-GFDM with SPA-based MUD. Table 1 compares the complexity of each scheme discussed in this paper.

SER performance analysis
The SER performance has been evaluated using Monte Carlo simulations. The SNR is defined as E_s/N_0, where E_s is the average data symbol energy and N_0 is the noise power spectral density. A slow time-variant and frequency-selective channel with Rayleigh distribution is assumed, as described in (8). The SCMA-OFDM is used as a benchmark for the proposed SCMA-GFDM schemes.
The first step in the SER performance evaluation of the SCMA-GFDM with BEP equalizer and SPA-based MUD is to define the values of φ and τ. These parameters define the trade-off between complexity and SER performance. If they are too low, the SER performance loss is significant. If they are too large, the complexity increases without bringing any benefit in terms of SER performance. Figure 9 compares the SER performance of the SCMA-GFDM with BEP equalizer and SPA-based MUD for different values of φ and τ, assuming the parameters presented in Table 2. From Fig. 9, it is possible to conclude that the worst performance is achieved when no iteration is employed, i.e., φ = τ = 1. When iteration is used only for the BEP, i.e., τ = 1 and φ = 10, the performance improves mainly for high SNR. On the other hand, when iteration is used only for the SCMA MUD, i.e., τ = 10 and φ = 1, the performance improves for low SNR. Iterations are needed in both algorithms for good SER performance over the entire SNR range. The best performance among the simulated curves is achieved when τ = φ = 10, but the gain is marginal compared with the curve for τ = φ = 5. This allows one to conclude that values of τ and φ larger than 10 will increase the complexity without a substantial improvement in the SER performance.
Figure 10 shows the SER performance of the schemes proposed in this paper and compares them with the well-known SCMA-OFDM, using the parameters defined in Table 2. From Fig. 10, it is possible to observe that SCMA-OFDM presents the best performance for low SNR, since this system does not suffer from self-interference and noise enhancement as the SCMA-GFDM does. However, SCMA-OFDM cannot benefit from the rich channel environment, being unable to exploit the diversity introduced by the interactions among the subsymbols and subcarriers. The ZF equalizer used for GFDM reception can harvest some diversity from the received signal, but this equalizer also introduces noise enhancement [45], resulting in poor SER performance when compared with SCMA-OFDM. The MMSE equalizer can deal with the GFDM subcarrier and subsymbol decoupling without enhancing the noise, leading to acceptable performance for low SNR and outperforming SCMA-OFDM for high SNR. The two schemes proposed in this paper outperform all other analyzed schemes for high SNR because non-linear equalizers can better exploit the diversity introduced by GFDM. The SCMA-GFDM with BEP and SPA-based MUD outperforms the SCMA-GFDM with MMSE equalizer for SNR > 16 dB for τ and φ larger than 5. It is also possible to observe that the performance is slightly improved when τ = φ = 10, compared with the curve for τ = φ = 5. The price paid, in this case, is the high complexity of the receiver. The SCMA-GFDM with BEP and SMUD outperforms the SCMA-GFDM with MMSE for SNR > 24 dB, assuming φ = 10, or SNR > 25.5 dB, assuming φ = 5. The price for the considerable complexity reduction of this scheme is the poor SER performance at low SNR. If good performance at low SNR is mandatory and the complexity of the BEP equalizer with SPA-based MUD cannot be afforded, a hybrid solution using the MMSE equalizer combined with SPA MUD can be used for low SNR, while the BEP equalizer with SMUD can be employed for high SNR.
Step 3 of the BEP algorithm (Algorithm 1) can be used to evaluate the MMSE matrix at low SNR, simplifying the implementation of the hybrid scheme. In this case, the complexity is balanced between the two blocks of the receiver, equalizer and MUD, according to the SNR. For low SNR, the complexity of the equalizer is O(N^3 + NJ) and the complexity of the MUD is given by the SPA complexity, O(τNJ^g). As the SNR increases, the complexity of the equalizer also increases by a multiplicative factor φ, while the complexity of the MUD is drastically reduced to O(UJ). The complexity of the combined system is still lower than that of the SCMA-GFDM with BEP and SPA-based MUD, and its SER performance would be acceptable over the entire SNR range.
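The SNR-dependent balance of complexity in the hybrid scheme can be illustrated with the following Python sketch. It is not part of the simulated system: the switching threshold `threshold_db` is a hypothetical design parameter, the names `ReceiverPlan` and `hybrid_receiver_plan` are introduced here for illustration, and the returned cost is only the sum of the order-of-growth terms quoted above, with constants ignored and the SPA term read as τNJ^g.

```python
from typing import NamedTuple


class ReceiverPlan(NamedTuple):
    equalizer: str
    mud: str
    cost: int  # sum of the order-of-growth terms, constants ignored


def hybrid_receiver_plan(snr_db: float, threshold_db: float,
                         N: int, J: int, U: int, g: int,
                         phi: int, tau: int) -> ReceiverPlan:
    """Select the receiver blocks of the hybrid scheme for a given SNR.

    threshold_db is a hypothetical switching point; the text does not
    specify how it should be chosen.
    """
    equalizer_cost = N**3 + N * J  # single pass of the equalizer (MMSE)
    if snr_db < threshold_db:
        # Low SNR: MMSE equalizer followed by SPA-based MUD.
        return ReceiverPlan("MMSE", "SPA", equalizer_cost + tau * N * J**g)
    # High SNR: BEP with phi iterations followed by the LUT-based SMUD.
    return ReceiverPlan("BEP", "SMUD", phi * equalizer_cost + U * J)
```

For instance, with the purely illustrative values N = 8, J = 4, U = 4, g = 3, and φ = τ = 5, the terms add up to 3104 in the low-SNR mode and 2736 in the high-SNR mode. Both are below φ(N^3 + NJ) + τNJ^g = 5280, which is the corresponding sum if BEP and SPA are assumed to run together.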

Conclusions
The demand for higher spectrum efficiency in future mobile communication systems is pushing the development of NOMA techniques. Among the several techniques proposed in the literature, SCMA is attracting attention because of its SER performance over mobile communication channels. Also, GFDM is an interesting candidate for the next wireless communication standards because of its flexibility to address conflicting requirements in a scenario with multiple applications. This paper has proposed two SCMA-GFDM schemes with nonlinear equalizers that are able to exploit the spectrum efficiency of SCMA together with the flexibility of GFDM and its good performance over rich multipath and time-variant channels. The first approach consists of using the BEP equalizer to recover the information from the GFDM signal, while a SPA-based algorithm is used as MUD to recover the user data at each SCMA layer. The SCMA-GFDM with BEP and SPA-based MUD presents acceptable SER performance for low SNR and outperforms all other analyzed schemes for high SNR. However, this scheme is highly complex. In order to overcome this problem, the paper also presents the SCMA-GFDM with BEP and SMUD, where, instead of using the SPA as MUD, a LUT based on all possible results from the SCMA encoder is employed to recover the user data at each SCMA layer. The information received in each GFDM time-frequency resource defines the bit sequence for several layers, according to the SCMA encoder. Conflicting information about the received sequence in a given layer is resolved using the BEP a posteriori probability for each received sample of the GFDM signal. The sequence corresponding to the most reliable time-frequency resource is taken as the SMUD output. This approach significantly reduces the overall system complexity at the cost of SER performance at low SNR, when compared with the SCMA-GFDM with BEP and SPA-based MUD.
The proposed schemes show that it is possible to combine the benefits of NOMA with the flexibility of a modern non-orthogonal waveform, opening new possibilities for future wireless communication systems.

Appendix. SCMA codebooks and table of examples of the SMUD LUT
The following codebooks are used in this paper: Table 3 presents some numerical examples of the LUT for the alphabet S_1. Most of the LUT is omitted for the sake of brevity.