
Artificial intelligence for channel estimation in multicarrier systems for B5G/6G communications: a survey


Multicarrier modulation allows for deploying wideband systems that are more resilient to multipath fading channels, impulsive noise, and intersymbol interference than single-carrier systems. Despite this, multicarrier signals suffer from different types of distortion, including channel noise sources and long- and short-term fading. Consequently, the receiver must estimate the channel features and compensate for them to recover the data, relying on channel estimation techniques such as non-blind, blind, and semi-blind approaches. These techniques are model-based and designed with mathematical channel models that are assumed to accurately encompass the channel features. Nevertheless, complex environments challenge mathematical channel modeling, which might neither be accurate nor correspond to reality. This impairment decreases the system performance due to the loss of channel estimation accuracy. Fortunately, artificial intelligence (AI) algorithms can learn the relationship among different system variables using a model-driven or model-free approach. Thereby, AI algorithms are used for channel estimation by exploiting the channel complexity without unrealistic assumptions, achieving better performance than conventional techniques under the same channel. Hence, this paper comprehensively surveys AI-based channel estimation for multicarrier systems. First, we provide essential background on conventional channel estimation techniques in the context of multicarrier systems. Second, the AI-aided channel estimation strategies are investigated using the following approaches: classical learning, neural networks, and reinforcement learning. Lastly, we discuss current challenges and point out future research directions based on recent findings.

1 Introduction

Multicarrier systems rely on transmitting data over several subcarrier signals, offering significant advantages compared to single-carrier systems [1, 2]. For example, multicarrier modulation (MCM) splits a wideband channel into overlapping narrowband subcarriers, yielding high spectral efficiency and throughput. In addition, these systems are resilient to multipath fading channels, impulsive noise interference, and intersymbol interference (ISI) [2, 3]. Due to the development of digital signal processing, the MCM has been implemented in different wireless communication systems. For instance, the orthogonal frequency division multiplexing (OFDM) modulation has been applied to the long-term evolution (LTE) system air interface [4]. Likewise, the 3GPP fifth-generation (5G) network technical specifications adopted the OFDM modulation in the new radio (NR) air interface for early deployment [1]. Concurrently, other multicarrier systems are also proposed for the beyond 5G (B5G) and sixth-generation (6G) mobile networks, such as filter bank multicarrier (FBMC), generalized frequency division multiplexing (GFDM), and universal filtered multicarrier (UFMC) [2, 5, 6].

OFDM applies the inverse fast Fourier transform (IFFT) and the fast Fourier transform (FFT) to, respectively, modulate and demodulate a given signal with low complexity [1]. Conventional OFDM also adds a cyclic prefix (CP) to its symbol to mitigate ISI. The disadvantages of the OFDM waveform include a high peak-to-average power ratio (PAPR), sensitivity to frequency offset, and out-of-band leakage [1,2,3]. However, some techniques have been introduced to OFDM systems to mitigate those drawbacks, giving rise to OFDM waveform variations such as wavelet OFDM, discrete Fourier transform spread OFDM, windowed OFDM, and resource block filtered OFDM [3, 7, 8].

The FBMC waveform uses non-orthogonal subcarriers generated based on distinct filtered pulses [9,10,11]. According to the filter design, a given subcarrier suffers intercarrier interference (ICI) only related to its adjacent subcarriers. Therefore, the FBMC improves spectral efficiency by removing the frequency guard band and drastically reducing the out-of-band leakage [9]. However, FBMC still has some disadvantages, like suffering from high PAPR. Cosine modulated multitone, filtered multitone, and discrete Fourier transform spread approaches are some techniques that reduce the PAPR, introducing FBMC waveform variations [3, 10].

GFDM and UFMC are seen as variations of OFDM. GFDM generalizes conventional OFDM by mapping different services into flexible subcarriers and CP through the deployment of different filters [3, 12]. GFDM is also robust to frequency offset and has low PAPR, but it is highly sensitive to ICI. On the other hand, UFMC has been proposed to mitigate the ICI in OFDM systems by filtering a group of subcarriers to reduce the out-of-band leakage [13, 14]. It allows for relaxing the CP and carrier synchronization constraints.

The multicarrier signals are sensitive to carrier frequency offsets introduced by a mismatch between the local transmitter and receiver oscillators or due to high-mobility receivers in wireless communication systems [15]. The high-mobility receivers boost the Doppler effect phenomenon, which leads to ICI and system performance degradation. Multicarrier signals also suffer from other types of distortion, including channel noise sources and long- and short-term fading. Hence, the channel-imposed impairments must be evaluated and compensated at the receiver for data recovery. This process is accomplished through channel estimation and equalization techniques, involving a mathematical model that includes a channel matrix reflecting the relationship between the transmitted and the received signal [2, 16,17,18,19].

Traditional channel estimation techniques for multicarrier systems are classified into two main categories based on the receiver's knowledge of the sent signal: blind and non-blind approaches [2, 17,18,19]. Blind channel estimation extracts statistical properties from the received signals to avoid transmitting training sequences during communication. However, it requires a large amount of received data, resulting in performance degradation over fast-fading channels. Non-blind strategies rely on transmitting data known at the receiver, called pilot symbols, for channel estimation. They outperform blind techniques at the cost of reduced spectral efficiency due to the transmission of pilot symbols [19, 20]. A hybrid method between blind and non-blind procedures is called semi-blind channel estimation. It comprises sending training data to initialize the estimator, followed by blind detection techniques.

Radio channel estimation is challenging because of the time variation and frequency selectivity introduced by the high randomness and environment-dependent statistical features driven by multipath propagation, transmitter and receiver mobility, and local scattering [2, 19, 20]. Consequently, the conventional model-based channel estimation techniques have performance limitations under complex channel conditions, such as fast time-varying, multipath fading, and nonlinear deep fading conditions [21,22,23]. These environments challenge accurate mathematical channel modeling, which might not fully encompass the channel features. This impairment lowers a multicarrier system’s performance due to the loss of channel estimation accuracy. However, AI-based learning algorithms can overcome those conditions by learning the relationship among different system variables using either a model-driven or model-free approach.

AI enables devices to make decisions on their own based on past learning experiences. Instead of requiring hand-tuning, devices adapt their parameters to fluctuating environments to achieve the best operational state. Furthermore, the learning algorithms exploit the channel complexity without making unrealistic assumptions, which allows them to outperform the conventional techniques under similar channels. Consequently, AI algorithms remove the need for accurate mathematical models for channel estimation and allow for tracking parameter fluctuations over complex environments, including channels that are already well modeled. Thereby, AI-based channel estimation renews the existing channel estimation techniques and creates new ones. As a result, AI-based channel estimation approaches surpass the limitations of conventional methods, providing a high degree of estimation accuracy and improving communication system performance [24].

AI-aided channel estimation studies are relevant to B5G/6G communications since AI itself is considered one of the foundations of future 6G networks [25, 26]. Moreover, B5G/6G networks are expected to operate in millimeter and terahertz frequencies to overcome bandwidth limitations and provide higher throughput. Hence, future radio communication systems will face channels of increased complexity due to the higher attenuation (including rain attenuation) and the high atmospheric absorption rates [25,26,27,28,29]. Beyond that, other key technologies required for B5G/6G, such as massive MIMO (mMIMO) and wider channel bandwidths, will make the transceiver architecture more complex and introduce new challenges to channel estimation [24, 29,30,31].

Wideband channels can be frequency-selective compared to narrowband ones since the frequency components will face distinctive fading [29]. While multicarrier systems mitigate this effect, channel estimation techniques must be able to acquire the channel state information (CSI) under different system architectures and environments. For instance, the mMIMO architecture requires a large number of antennas while demanding a great number of pilot symbols [29, 30]. On the other hand, the worldwide spectrum availability in millimeter and terahertz frequencies can boost the adoption of frequency division duplex, dropping out the reciprocity between the downlink and uplink channel and raising the need for periodic CSI feedback [24, 30, 31].

Channel estimation techniques will undoubtedly face a renewed set of complex channel conditions in these new frequency bands, including some of those mentioned earlier. Therefore, studies addressing the extended application of AI capabilities to boost well-known channel estimation approaches and introduce new techniques are crucial to the physical layer of future communication systems. Moreover, they contribute directly to reshaping and building an intelligent physical layer that optimizes system decisions through virtualized tools [22, 24, 30,31,32,33]. In this regard, this work is devoted to comprehensively and thoroughly discussing how AI algorithms play a critical role in the field of channel estimation techniques.

1.1 Related works

Several surveys and reviews exist in the field of channel estimation for multicarrier systems [2, 5, 16,17,18,19,20, 34,35,36,37,38]. They mainly discuss the conventional channel estimation techniques without mentioning AI integration. Dedicated works about channel estimation for OFDM systems provide a comprehensive review of the state of the art at the time they were published [16,17,18,19,20, 35,36,37,38]. Other authors addressed the channel estimation techniques within a comprehensive review of OFDM systems [37]. An extensive review of channel estimation for waveforms of next-generation networks, including OFDM, FBMC, GFDM, and UFMC schemes, is found in [2]. Recently, channel estimation techniques have been discussed for 5G and millimeter-wave communication systems, including but not limited to OFDM systems [5, 29].

The AI-based channel estimation approach was considered for intelligent wireless communication systems for 5G/6G future networks [21, 24, 30, 32, 33, 39,40,41,42,43,44,45,46]. The channel estimation process was presented as a physical layer application employing AI algorithms to improve the CSI acquisition accuracy [21, 39, 45, 46]. There are already machine learning (ML) techniques overviews for solving different challenges in a wireless network, with a discussion concerning the ML categories and pointing out several applications [32, 33]. For instance, a regression-aided technique was indicated for channel estimation in high-mobility and nonlinear deep fading scenarios. A comprehensive survey about ML in the vehicular network context is found in [43], reviewing and discussing the AI-based channel estimation techniques in the context of high-mobility OFDM systems.

Massive MIMO channel techniques for modeling and estimation were the focus of [30, 40], which briefly exploited channel estimation in OFDM systems. In [30], the channel characteristics were handled as an image processing problem in the context of deep learning (DL) networks. Meanwhile, DL application in mobile and wireless networks was conducted concerning the channel estimation techniques in the role of signal-driven processing [41], similar to the discussion in [44]. However, the channel estimation subject was not comprehensively reviewed, nor were the multicarrier systems included. The authors in [42] presented the channel estimation subject as a general approach without focusing on multicarrier systems, investigating DL in terms of the model-based block architecture and algorithm design. A comparison between DL-based and conventional channel estimation methods has been provided in [24]. Moreover, performance analysis of ML-based channel estimation was carried out in [47], while recurrent neural network (RNN) channel estimation was studied in [48]. Finally, a short review of DL for channel estimation was provided in [49] without focusing on multicarrier systems.

Several works have addressed DL for physical layer applications [22, 23, 31, 50, 51]. A comprehensive overview of model-driven DL for physical layer communication was provided in [50]. It briefly covered the advantages of model-driven over data-driven approaches in leveraging low-complexity algorithms for channel estimation in OFDM and MIMO-OFDM systems. DL-based block-structured functions for the physical layer were approached in [31], which investigated joint channel estimation and signal detection in a data-driven context. In [23], the authors summarized the DL-based physical layer applications in 5G wireless, demonstrating how DL could assist the channel estimation process. DL use cases for physical layer applications in 6G communication systems are found in [22]. It discussed, in a general manner, the essential requirements and challenges of the physical layer in future 6G communication systems, highlighting the deployment strategies and key enabling technologies to employ DL. Some works in the channel estimation field were also cited, along with a discussion of their findings.

1.2 Motivation and contributions

The analysis of the surveys, magazines, and review papers regarding AI for channel estimation has shown a direct approach to demonstrate the ML and DL application in the 5G and 6G physical layer communication systems, as summarized in Table 1. However, a few papers have covered AI for channel estimation research, and they are limited to specific scenarios, like high-mobility systems [43] or partially enfolding the subject [51]. Other papers supplied a tutorial introduction to AI-aided channel estimation, addressing the performance of a specific technique or comparing the most recent ones.

Table 1 Summary of existing surveys, magazines, and review papers related to artificial intelligence for channel estimation in multicarrier systems

Motivated by the research growth of AI-based channel estimation in multicarrier systems, this work offers a comprehensive survey that covers the recent discoveries in the field, discusses them, and addresses future research directions. Therefore, this paper’s main contributions are as follows:

  • An overview of channel estimation techniques for multicarrier systems comprising non-blind-, blind-, semi-blind-, and AI-based approaches, with the latter as a new group.

  • A tutorial discussion on different approaches to implement channel estimation based on AI. It covers the concepts and implementation impairments of the AI-aided model-based block-type, AI-aided block-type, and AI-aided block-type channel estimation joining function methods.

  • A comprehensive survey and discussion about the recent findings in AI-based channel estimation and its complexity, considering the classical learning techniques such as regression, evolutionary algorithm, dimensionality reduction, and Bayesian learning.

  • A comprehensive survey and discussion about the relevant neural network algorithms for channel estimation and their complexity, including feed-forward neural network, extreme learning machine, recurrent neural network, deep neural network, and autoencoder.

  • A discussion about the recent applications of reinforcement learning in channel estimation and their complexity.

  • A collection of open issues and future research opportunities to unwind the channel estimation for MCM communications systems, with an extension to single-carrier systems.

1.3 Organization of the paper

Research on the subject of this paper shows extensive interest from the academic community in applying ML algorithms to channel estimation techniques for OFDM and mMIMO-OFDM. This interest is mainly driven by the natural adoption of MCM for 4G and 5G mobile networks and other wireless systems. Hence, the authors present the OFDM principles to provide the fundamentals that guide researchers after their first contact with the technology. Although the research field is less developed for other waveforms, a few works addressing AI-based channel estimation for FBMC, GFDM, and UFMC modulation techniques were found. These findings were also included in the paper discussion, presenting a complete state-of-the-art review.

Fig. 1 Organization of the paper

This survey is organized as shown in Fig. 1. A brief review of the OFDM principles is found in Sect. 2. Conventional channel estimation techniques for multicarrier systems, which do not use AI, are reviewed in Sect. 3, providing a background for further understanding of AI-aided techniques. The AI-aided channel estimation approach is discussed there as a new set of techniques, identifying their main aspects. Then, classical learning-aided channel estimation techniques are reviewed in Sect. 4. Regression, evolutionary algorithm, dimensionality reduction, and Bayesian learning are covered in the context of supporting conventional channel estimation techniques. The neural network (NN)-aided channel estimation techniques are discussed in Sect. 5, mainly including feed-forward neural network (FFNN), extreme learning machine (ELM), RNN, and deep neural network (DNN). The relevant networks are compared concerning the AI-aided channel estimation characteristics presented in Sect. 3. End-to-end communication is also included and discussed in Sect. 5 since channel estimation is an intrinsic process learned by the autoencoder network, with the channel as a hidden layer. Finally, reinforcement learning techniques are addressed in Sect. 6. This emerging branch of AI has been recently investigated in the channel estimation context. Practical issues and open research topics are discussed in Sect. 7, and a conclusion is provided in Sect. 8.

2 OFDM principles

Because OFDM channel estimation techniques prevailed in the surveyed literature, this section presents the OFDM fundamentals to provide a closer look at this modulation technique and insights into the following sections. Readers also interested in reviewing the fundamentals of the FBMC, GFDM, and UFMC modulations are referred to [2, 3, 9, 11, 12, 52] and the references therein.

Fig. 2 General OFDM system structure

Figure 2 depicts a general OFDM system structure, where the first and last blocks are similar, being only arranged inversely [20, 35, 37]. The first block comprises the serial-to-parallel converter (S/P) and the mapping functions, while the last one includes the demapping and parallel-to-serial converter (P/S) blocks. The S/P and P/S blocks are responsible for converting the bits into parallel groups or serial streams, respectively. The mapping and demapping blocks convert the bits into quadrature and in-phase components and the opposite, respectively, according to the modulation scheme adopted by each subcarrier.

OFDM divides the available channel bandwidth into N different overlapping narrowband sub-channels. Instead of having one modulated single-carrier, N subcarriers are modulated to be the data bearers. In the time domain, single-carrier symbols of duration \(T_\text {s}\) are converted into symbols of duration \(T = N T_\text {s}\). In the frequency domain, each sub-channel is utilized by a different subcarrier so that all subcarriers are orthogonal. The following subcarrier frequency spacing equation achieves the orthogonality,

$$\begin{aligned} |f_i - f_k| = \frac{n}{T}, \end{aligned}$$

in which \(f_i\) and \(f_k\) are the ith and kth subcarrier frequencies, \(1 \le i, k \le N\), with \(i\ne k\), respectively, and n a positive integer number. The OFDM symbol is formed by summing up all of the modulated subcarriers.
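The orthogonality condition above can be checked numerically. The following is a minimal sketch, not taken from the paper: it approximates the normalized inner product of two complex exponentials over one symbol duration \(T\) and compares a spacing that is an integer multiple of \(1/T\) with one that is not. The function name `subcarrier_inner_product`, the sample count, and the numeric values are illustrative assumptions.

```python
import numpy as np

def subcarrier_inner_product(f_i, f_k, T, samples=1000):
    """Approximate (1/T) * integral over [0, T) of s_i(t) * conj(s_k(t)),
    where s_i and s_k are complex exponentials at f_i and f_k (hypothetical helper)."""
    t = np.arange(samples) * (T / samples)   # uniform grid over one symbol
    s_i = np.exp(2j * np.pi * f_i * t)
    s_k = np.exp(2j * np.pi * f_k * t)
    return np.mean(s_i * np.conj(s_k))

T = 1e-3                  # symbol duration in seconds (illustrative value)
spacing = 1.0 / T         # smallest orthogonal spacing, n = 1

# |f_i - f_k| = 2/T: integer multiple of 1/T, so the subcarriers are orthogonal
orth = subcarrier_inner_product(5 * spacing, 7 * spacing, T)
# |f_i - f_k| = 0.5/T: not an integer multiple of 1/T, so they are not orthogonal
non_orth = subcarrier_inner_product(5 * spacing, 5.5 * spacing, T)

print(abs(orth), abs(non_orth))
```

The first magnitude is numerically zero, while the second stays well away from zero, which is the leakage that appears as ICI when the spacing condition is violated.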

In practice, the OFDM symbol to be transmitted is obtained using an inverse discrete Fourier transform (IDFT). The transmitter applies the IDFT to the in-phase and quadrature components of all subcarrier modulating symbols. Thus, the transmitted signal is represented by

$$\begin{aligned} s(t) = \mathfrak {F}^{-1}\{c_i\}, \end{aligned}$$

in which \(\{c_i\} = \{I_i + j Q_i\}\) with \(I_i\) and \(Q_i\) being the in-phase and quadrature components of the modulating symbols, respectively, \(1 \le i \le N\). At the receiver, discrete Fourier transform (DFT) is applied to the received OFDM symbol to separate each subcarrier signal.
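As a minimal sketch of the IDFT/DFT pair described above, the snippet below builds one OFDM symbol from QPSK symbols \(c_i = I_i + jQ_i\) and recovers them at the receiver over an ideal (distortion-free) channel. The choice of QPSK, \(N = 64\), and the variable names are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64                                    # number of subcarriers (illustrative)

# In-phase and quadrature components of the modulating symbols (QPSK assumed)
I = rng.choice([-1.0, 1.0], size=N)
Q = rng.choice([-1.0, 1.0], size=N)
c = (I + 1j * Q) / np.sqrt(2)             # c_i = I_i + j Q_i, unit average power

s = np.fft.ifft(c)                        # transmitter: IDFT gives the time-domain symbol
c_rx = np.fft.fft(s)                      # receiver: DFT separates the subcarriers

print(np.allclose(c_rx, c))
```

With no channel and no noise, the DFT output matches the transmitted symbols up to floating-point precision. NumPy applies the \(1/N\) scaling in `ifft`, so no extra normalization is needed for the round trip.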

Although ideal OFDM modulation does not present ICI, consecutive symbols can interfere with one another over a dispersive channel, leading to interblock interference (IBI) [3, 7, 9]. This issue is handled using the CP, a copy of the end of the OFDM symbol inserted at its beginning [19, 37]. In the time domain, the CP adds a guard interval between OFDM symbols that avoids IBI as long as the guard interval is longer than the channel-imposed delay. The CP samples are redundant and absorb the interference from the previous symbol, so leaving them in the DFT window would disturb the orthogonality among the modulated subcarriers. Hence, the receiver removes the CP from the incoming signal before applying the DFT to separate each subcarrier signal.
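The role of the CP can be illustrated with a small simulation. The sketch below is an assumption-laden example, not the paper's setup: it sends two consecutive OFDM symbols through an illustrative three-tap channel and measures how far the second received symbol deviates from the ideal per-subcarrier product of channel response and transmitted symbol, once with a CP longer than the channel delay and once without any CP. The helper names, the channel taps, and the QPSK mapping are all hypothetical.

```python
import numpy as np

def add_cp(symbol, cp_len):
    """Prepend the last cp_len samples of the symbol (cp_len may be 0)."""
    return np.concatenate([symbol[len(symbol) - cp_len:], symbol])

def second_symbol_error(cp_len, h, c1, c2):
    """Send two OFDM symbols through channel h and return the largest deviation
    of the second received symbol from the ideal product H_i * c_i."""
    n_fft = len(c2)
    stream = np.concatenate([add_cp(np.fft.ifft(c1), cp_len),
                             add_cp(np.fft.ifft(c2), cp_len)])
    rx = np.convolve(stream, h)                # linear convolution with the channel taps
    start = (n_fft + cp_len) + cp_len          # skip symbol 1 and the CP of symbol 2
    y2 = np.fft.fft(rx[start:start + n_fft])   # CP removed, then DFT
    H = np.fft.fft(h, n_fft)                   # channel response on the subcarrier grid
    return np.max(np.abs(y2 - H * c2))

rng = np.random.default_rng(1)
N = 64

def qpsk(n):
    return (rng.choice([-1.0, 1.0], n) + 1j * rng.choice([-1.0, 1.0], n)) / np.sqrt(2)

c1, c2 = qpsk(N), qpsk(N)
h = np.array([1.0, 0.5, 0.25j])                # illustrative 3-tap channel (2-sample delay)

err_with_cp = second_symbol_error(8, h, c1, c2)     # CP longer than the channel delay
err_without_cp = second_symbol_error(0, h, c1, c2)  # no guard interval at all

print(err_with_cp, err_without_cp)
```

With the CP in place, the linear convolution seen inside the DFT window behaves like a circular one, so each subcarrier experiences a single complex gain. Without it, the tail of the first symbol leaks into the second and the per-subcarrier model no longer holds.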

After using the DFT to obtain each subcarrier signal at the receiver (\(y_i\)), differences are expected compared to each sent symbol (\(c_i\)). These differences are mainly caused by the channel influence and the receiver noise. These phenomena have been extensively studied and found to be stochastic, meaning that their impact cannot be precisely calculated but only assessed in terms of probability. Hence, the channel influence and the receiver noise over the sent signal will likely change as the communication system operates.

The channel affects the modulated transmitted symbol in a multiplicative manner. In other words, it introduces a complex gain over the symbol that scales its magnitude and rotates its phase. The additive white Gaussian noise (AWGN), intrinsic to every communication system, is added to the received symbol. Therefore, each received subcarrier symbol sample is represented by

$$\begin{aligned} y_i = h_i \times c_i + n_i, \end{aligned}$$

in which, \(h_i\) and \(n_i\) represent the channel frequency response (CFR) and the AWGN at each subcarrier, respectively.

The channel estimation block employs different techniques to estimate each \(h_i\) value and feed the equalization block. If those techniques are robust enough, the output of the equalization block assumes the form

$$\begin{aligned} y_{i_\text {Equalized}} = c_i + \frac{n_i}{h_i}, \end{aligned}$$

meaning that perfect estimation was achieved. When the channel estimation is imperfect, many issues arise, such as high bit error rate (BER), spectral inefficiency, an increase in the outage probability, and so forth [3, 7, 9, 19, 37].
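The per-subcarrier model and the equalized output can be reproduced in a few lines. This is a minimal sketch under stated assumptions: the channel coefficients \(h_i\) are drawn independently as complex Gaussian values, the noise level is arbitrary, and the equalizer simply divides by the true \(h_i\) (perfect estimation). None of these choices come from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 64                                               # number of subcarriers (illustrative)

# Transmitted symbols c_i (QPSK assumed)
c = (rng.choice([-1.0, 1.0], N) + 1j * rng.choice([-1.0, 1.0], N)) / np.sqrt(2)

# Channel frequency response h_i and AWGN n_i (both drawn at random for illustration)
h = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
noise_std = 0.01
n = noise_std * (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

y = h * c + n                                        # received samples: y_i = h_i * c_i + n_i

y_eq = y / h                                         # equalization with perfect knowledge of h_i
residual = y_eq - c                                  # what remains after removing c_i

print(np.allclose(residual, n / h))
```

The residual equals \(n_i / h_i\), which makes the noise enhancement visible: subcarriers in a deep fade (small \(|h_i|\)) see their noise amplified even when the channel estimate is perfect.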

The main focus of this paper is on the techniques employed to estimate the channel in a multicarrier system. Hence, the following sections address traditional and recently proposed techniques. The former is a collection of immutable methods of system functioning that are well known in the literature. The latter comprehends self-adjustable algorithms that have the potential to surpass traditional methods.

3 Channel estimation techniques for multicarrier systems

Fig. 3 Classification of channel estimation techniques

This section overviews different conventional channel estimation techniques for multicarrier systems. It classifies them into blind, non-blind, and semi-blind approaches, as shown in Fig. 3. The blind methods are divided into two main subgroups: statistical and deterministic. The non-blind techniques are subclassified as data-aided and decision-directed channel estimation (DDCE). The former only uses the training sequence or pilot symbols for channel estimation, while the latter also employs the detected data symbol. Combining non-blind and blind methods results in a set of techniques called semi-blind. Applying AI-based techniques to the channel estimation field gives rise to a fourth group that uses ML, including DL algorithms. Since this work is dedicated to exploring the application of AI-based methods in the channel estimation area for multicarrier systems, we discuss its general characteristics and definitions herein.

3.1 Blind-based channel estimation techniques

Blind-based channel estimation techniques are classified as statistical and deterministic. Statistical techniques explore the cyclic statistical properties of the received signal in the channel estimation process. As a result, they underperform with short data sample sequences because the statistics cannot be estimated reliably from few samples. On the other hand, deterministic methods rely on quantities of both the received signal and the channel coefficients. The computational complexity of deterministic methods is higher than that of the statistical ones and increases as the constellation order grows at the transmitter side. However, deterministic methods converge faster than statistical methods.

Statistical blind-based channel estimation methods are based on either the second-order statistics (SOS) or higher-order statistics (HOS) of the received signal [53, 54]. The SOS approach requires signals with cyclostationary characteristics or channel diversity with single-input single-output (SISO) [55, 56]. Also, it requires less data than the HOS approach to obtain reliable statistical estimates. In turn, the HOS approach has the advantage of providing system phase information without the need for channel diversity, at the cost of a large amount of data sampling and computational capacity [53, 54].

HOS applications mainly target single-carrier and MIMO systems [53, 54, 57, 58]. They leverage the functional properties of the impulse response channel matrix through third- or fourth-order cumulants. Meanwhile, earlier SOS algorithms have been applied to multicarrier systems, such as OFDM [53, 59]. The transmitter-induced cyclostationarity inserted by adding the CP allows for evaluating the received signal autocorrelation matrix using SOS [53, 59,60,61,62]. Transmitter-induced cyclostationarity techniques also rely on filterbanks and non-redundant linear precoding [53, 63,64,65,66]. These techniques are inserted before the MCM systems, enabling blind channel estimation at the system output through cross-correlation operations.

Blind channel estimation without the use of any statistics has also been proposed. The authors in [67] have shown that the channel matrix null space defines the channel parameters, forming the basis for the subspace blind channel estimation algorithm. These algorithms handle the orthogonality of the noise and the correlation matrix subspaces of the received signal to estimate the channel coefficient. The correlation matrix is also estimated through time-averaging over received samples. This technique outperforms several statistic-based methods, especially under a limited number of data. The concept of subspace-based techniques has been investigated in multicarrier direct-sequence code-division multiple-access (DS-CDMA) and multicarrier code-division multiple-access (MC-CDMA) systems to obtain timing and channel coefficient to deploy linear minimum mean squared error (MMSE) receivers [68,69,70].

Concerning OFDM systems, subspace-based channel estimation methods have been proposed to save bandwidth, either by removing or exploiting inherent redundancy or by reducing or eliminating the CP through the use of virtual subcarriers [71,72,73,74,75]. The subspace-based channel estimation method for SISO-OFDM systems is generalized for MIMO-OFDM systems in [76]. Furthermore, an approach combining subspace and SOS methods has been proposed for CP-MIMO-OFDM systems [77]. Since the channel must remain static during the estimation process, system performance is improved by reducing the time-averaging and exploiting the frequency correlation among adjacent OFDM subcarriers [78]. A reduction in the number of received blocks in CP-OFDM and zero-padding OFDM (ZP-OFDM) is achieved by obtaining the correlation matrix from the cyclostationary properties of the received signal [79]. Meanwhile, a second approach calculates the covariance matrix in the frequency domain from a selected group of subcarriers [60].

Blind channel estimation is also implemented based on the finite-alphabet property of the information-bearing transmitted symbols [80,81,82,83,84]. The finite-alphabet approach overcomes the loss of channel identifiability in a subspace-based algorithm when the channel has nulls in subcarriers. These algorithms have been proposed for multicarrier and MIMO multicarrier systems [80,81,82,83,84]. A blind channel shortening algorithm might also mitigate the ICI based on an adaptive time-domain equalizer (TEQ) [85, 86]. Other blind-based channel estimation methods exploit the expectation maximization (EM) algorithm, the maximum-likelihood principle, the minimum variance principle, and orthogonal space-time block codes (OSTBCs) [87,88,89,90,91]. Blind adaptive algorithms are implemented based on normalized least mean square (NLMS), recursive least square (RLS), and variable step size approaches [92, 93]. These algorithms adapt their filter parameters to minimize the mean squared error (MSE) between the filter output and the desired signal.

3.2 Data-aided channel estimation techniques

Data-aided channel estimation techniques are common in multicarrier communication systems. First, the known information is multiplexed within the data symbols at specific positions at the transmitter. Next, the receiver uses this information to estimate the related channel impulse response (CIR). Finally, it implements an interpolation process among these isolated CIRs to estimate the channel for those unknown data symbols.
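The three steps above (insert known symbols, estimate the channel at their positions, interpolate in between) can be sketched as follows. This is a simplified illustration that works on the per-subcarrier channel coefficients \(h_i\) of Sect. 2 rather than on the CIR, assumes a noise-free channel that varies linearly across subcarriers so that linear interpolation is exact, and uses a hypothetical helper name, pilot value, and pilot spacing.

```python
import numpy as np

def ls_interpolated_estimate(y, pilot_values, pilot_idx, n_sub):
    """Estimate the channel at the pilot positions, then linearly interpolate
    across all subcarriers (hypothetical helper, not from the paper)."""
    h_pilots = y[pilot_idx] / pilot_values           # estimate at known positions
    k = np.arange(n_sub)
    # np.interp handles real data, so real and imaginary parts are interpolated separately
    return (np.interp(k, pilot_idx, h_pilots.real)
            + 1j * np.interp(k, pilot_idx, h_pilots.imag))

rng = np.random.default_rng(3)
n_sub = 64
k = np.arange(n_sub)

# Illustrative channel that varies linearly over the subcarriers
h_true = (1.0 + 0.01 * k) + 1j * (0.5 - 0.005 * k)

pilot_idx = np.arange(0, n_sub, 7)                   # pilots at 0, 7, ..., 63
c = (rng.choice([-1.0, 1.0], n_sub) + 1j * rng.choice([-1.0, 1.0], n_sub)) / np.sqrt(2)
c[pilot_idx] = 1.0 + 0.0j                            # known pilot symbol (assumed value)

y = h_true * c                                       # noise-free reception for clarity
h_hat = ls_interpolated_estimate(y, c[pilot_idx], pilot_idx, n_sub)

print(np.max(np.abs(h_hat - h_true)))
```

For a channel that is not linear between pilots, or when noise is present, the interpolated values deviate from the true coefficients, which is the interpolation error this section refers to.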

Data-aided channel estimation techniques are implemented using two conventional strategies. The first is a training-based channel estimation technique that relies on periodically transmitting known information over one or more symbol periods. The second method considers sending known information within the data, giving rise to pilot-assisted channel estimation. Regardless of the approach, transmitting known information consumes a fraction of the signal bandwidth, reducing the spectral efficiency compared with other channel estimation techniques. In addition, the interpolation process introduces errors in channel estimation.

Regarding multicarrier systems, conventional data-aided channel estimation uses least square (LS), MMSE, or least mean square (LMS) methods to estimate the CFR in training or pilot mode. Most research deals with OFDM and MIMO-OFDM systems due to the extensive adoption of this MCM in wireless networks. Still, recent works are dedicated to generalizing the OFDM-specific data-aided channel estimation to other MCM schemes, such as FBMC and GFDM [94,95,96]. The training sequence, also called block-type pilots, allows for tracking only channel frequency variations (slow fading channel) due to the one-dimensional (1D) periodicity, estimating the channel response at each subcarrier. The conventional method assumes the channel is the same within the training sequence periodicity [97]. In this case, the estimated channel is used for the consecutive received symbols until another training sequence arrives. Time-domain linear interpolation or higher-order polynomials are considered under fast-fading channels, at the cost of increasing the system latency [98, 99].
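As a brief worked step, the LS estimate follows directly from the per-subcarrier model \(y_i = h_i \times c_i + n_i\) of Sect. 2 when the training symbol \(c_i\) is known at the receiver and nonzero. The notation \(\hat{h}_i^{\text {LS}}\) is introduced here for illustration and is not taken from the paper:

$$\begin{aligned} \hat{h}_i^{\text {LS}} = \frac{y_i}{c_i} = h_i + \frac{n_i}{c_i}. \end{aligned}$$

The estimation error is therefore the noise scaled by the inverse of the training symbol. This is why LS requires no channel statistics but is sensitive to noise, and why estimators that exploit such statistics, like MMSE, can improve on it at a higher computational cost.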

The pilot-assisted or comb-type pilot methods utilize scattered pilot patterns and, therefore, track time–frequency variation. The channel estimation accuracy depends on the pilot pattern and the interpolating algorithm. The former is defined by the two-dimensional time–frequency pilot spacing. The frequency-domain pilot spacing must ensure the estimation of channel frequency variation, which depends on the delay spread. On the other hand, the time-domain pilot placement is related to the Doppler spread. Several works have studied the optimum time–frequency pilot pattern to reduce the number of pilots while preserving the time–frequency variation sampling capabilities. Some OFDM and MIMO-OFDM approaches rely on optimally designing the pilot pattern to minimize the MSE during the channel estimation [100,101,102,103,104,105,106,107,108,109,110,111,112,113]. For instance, this has been accomplished through equipowered and equispaced pilots [100, 101], optimum power and pilot spacing related to the lower bound of the average channel capacity [102], heuristic algorithms [103, 104], a general interpolator [105], a convex optimization algorithm [106], nonuniform placement [107,108,109], optimum pilot power and phase selection [110, 111], an iterative algorithm [112], and a hopping pilot scheme [113]. In addition, grouping pilot tones into equispaced clusters can also improve the channel estimation under the MMSE criterion [114].
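As a rough worked illustration of the two spacing constraints, a common sampling-theorem rule of thumb bounds the pilot spacing in subcarriers, \(D_f\), and in symbols, \(D_t\). The symbols are assumed notation, not taken from the surveyed works: \(\tau_{\max}\) is the maximum delay spread, \(\Delta f\) the subcarrier spacing, \(f_{D,\max}\) the maximum Doppler frequency, and \(T_{\text{sym}}\) the symbol duration.

\[
D_f \le \frac{1}{\tau_{\max}\,\Delta f}, \qquad D_t \le \frac{1}{2\,f_{D,\max}\,T_{\text{sym}}}
\]

For illustrative values \(\Delta f = 15\) kHz and \(\tau_{\max} = 5\,\mu\)s, the first bound gives \(D_f \le 1/0.075 \approx 13.3\), i.e., at most 13 subcarriers between pilots. With \(f_{D,\max} = 100\) Hz and \(T_{\text{sym}} = 71.4\,\mu\)s, the second gives \(D_t \le 1/0.01428 \approx 70\) symbols. Practical designs usually place pilots more densely than these limits to leave a margin for noise and interpolation error.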

Analyzing the different pilot placement approaches points to the need for adaptive pilot allocation schemes. Pilots transmitted with a power higher than the data symbols improve the channel estimation accuracy. However, this gives rise to the power allocation issue [115]. Furthermore, the power of pilots at different subcarriers must remain equal to meet the MMSE criterion [100]. Superimposed training consists of transmitting data and pilot symbols within the same available resources with different power values, which avoids data rate loss [116,117,118,119,120]. Nonetheless, the channel estimation performance is decreased due to the interference introduced by the superimposed data symbols. Partial superimposed data is an alternative to improve the data rate, while channel estimation takes advantage of the aforementioned pilot-assisted methods [117, 119]. Other pilot design criteria rely on bit error minimization [121], MIMO preamble pilot design [122], channel tracking performance [123], channel capacity maximization [124, 125], multiuser pilot design [126, 127], and PAPR reduction [128,129,130].

The OFDM pilot-assisted methods can be extended to GFDM schemes due to their block-based modulation, as found in [131, 132] and the references therein. Regarding FBMC systems, the channel estimation techniques for FBMC offset quadrature amplitude modulation (OQAM) have been addressed. This FBMC scheme relies on real-valued OQAM symbols, relaxing the orthogonality of the ambiguity function to the real domain in the FBMC-OQAM system [133, 134]. However, due to the inherent interference problem, this characteristic is insufficient for channel estimation purposes in FBMC-OQAM. Thus, the estimation techniques fall into two main categories: scattered pilots and preamble-based approaches [94, 135,136,137,138]. The former consists of an auxiliary pilot (AP) or a pair of pilots (POP). The AP method allows for canceling the interference at the transmitter, while the POP combines the two adjacent pilots to estimate the channel's real and imaginary parts. The latter encompasses training symbols periodically transmitted over three symbols for interference control.

3.3 Decision-directed channel estimation techniques

The DDCE techniques use the data-aided strategy with the detected data symbols in the channel estimation process [19, 20]. First, the receiver employs the detected symbols to estimate the channel, which, in turn, is applied to detect the incoming data. Later, a channel estimation update uses this data, repeating the process until all the symbols are processed. The decision is based on a bitwise approach or forced constellation points, defining the soft [139] or hard techniques, respectively [140, 141]. Using detected symbols introduces some disadvantages to DDCE techniques under fast-fading channels. First, the estimation process is based on outdated data, decreasing the system performance. Since the current channel might not correspond to the one through which the incoming symbols have propagated, symbol detection errors are introduced. The erroneous symbols are then fed back into the process to update the channel, leading to error propagation in the estimation. In this case, the periodicity of the training symbol transmission must be adjusted according to the channel characteristics [19, 141].
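The feedback loop described above can be sketched as follows. This is a minimal hard-decision illustration under stated assumptions: QPSK symbols, a noiseless channel whose phase rotates slowly between symbols, and a simple exponential smoothing of the channel update. The names `ddce_track` and `alpha` are hypothetical and do not come from the cited works.

```python
import numpy as np

QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def hard_decision(symbols):
    """Force each equalized symbol to the nearest QPSK constellation point."""
    distances = np.abs(symbols[:, None] - QPSK[None, :])
    return QPSK[np.argmin(distances, axis=1)]

def ddce_track(rx_symbols, h_init, alpha=0.5):
    """Track the channel symbol by symbol using detected data as pilots.

    rx_symbols: array of shape (n_symbols, n_subcarriers)
    h_init:     initial channel estimate, e.g., from a training symbol
    alpha:      smoothing factor of the update (assumed value)
    """
    h = np.array(h_init, dtype=complex)
    decisions = np.empty_like(rx_symbols)
    for n, y in enumerate(rx_symbols):
        x_hat = hard_decision(y / h)                # equalize with the current estimate
        decisions[n] = x_hat
        h = (1 - alpha) * h + alpha * (y / x_hat)   # update using the decisions
    return decisions, h

# Toy scenario (assumed): the channel phase rotates by 0.05 rad per symbol.
rng = np.random.default_rng(0)
n_symbols, n_sc = 10, 16
h0 = np.fft.fft(np.array([1.0, 0.5j, 0.2]), n_sc)
tx = QPSK[rng.integers(0, 4, size=(n_symbols, n_sc))]
rotation = np.exp(1j * 0.05 * np.arange(n_symbols))[:, None]
rx = h0[None, :] * rotation * tx
decisions, h_final = ddce_track(rx, h0)
```

In this slowly varying case every decision is correct, so the estimate follows the channel. With faster variation or noise, a wrong decision would corrupt the update and could propagate to later symbols, which is the drawback noted above.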

The DDCE methods are addressed for OFDM systems with different approaches. They include joint estimation of carrier frequency offset (CFO) and sampling clock frequency offset [140], sample-spaced and fractionally spaced CIR [142], generalized M estimators for mitigating error propagation [141], LS and linear MMSE estimators [143], maximum a posteriori channel estimation [144], hard decision signal-to-noise ratio (SNR)-assisted residual CFO estimation [145], joint CIR and noise variance estimation [146], subspace algorithm [147], time-domain channel equalizer [144], and EM algorithm [81]. Performing soft DDCE based on selecting reliable data tones purified by inter-stream interference cancelation is proposed in [139]. The work in [148] considers a DDCE channel estimator based on OFDM packets consisting of a preamble followed by data symbols. The technique leverages the temporal correlation in channel responses over adjacent OFDM symbols. Further, pilot symbols extract correlation in the CFR across nearby subcarriers to decrease the effect of decision errors in the time domain through frequency-domain averaging. Other works focus on reducing the complexity of DDCE techniques for OFDM systems using transmit diversity [149,150,151].

3.4 Semi-blind channel estimation techniques

Semi-blind channel estimation techniques combine both non-blind and blind methods [152, 153]. The hybrid solution allows for better tracking of channel variations by sending training data at the beginning of the transmission interval to initialize the estimator. Similar to the previous techniques, most works have been dedicated to exploiting semi-blind channel estimation for OFDM and MIMO-OFDM systems. For instance, blind subspace algorithms combined with training sequences explore the signal SOS [74, 154,155,156]. Furthermore, first-order statistics of the received signal have been used for semi-blind channel estimation in pseudo-random postfix OFDM systems using weighted pseudo-random postfix sequences [157].

Semi-blind algorithms are accomplished using HOS or SOS within linear prediction like training-based LS [158]. SOS has been proven helpful in ICI and ISI suppression in a time-domain equalizer [159]. In [160], superimposed training is applied with the Gaussian maximum-likelihood criterion. Semi-blind estimation methods are also used with sparse channels [161,162,163]. Some approaches employ the signal SOS to express the received signal’s correlation matrix utilizing the most significant taps (MST) [161, 164, 165]. Thereafter, the MST estimation is performed based on a training-based LS criterion. Concerning the GFDM and FBMC systems, there are a few semi-blind approaches for channel estimation that are similar to those discussed for OFDM [166,167,168].

3.5 AI-aided channel estimation techniques

Fig. 4

AI-aided channel estimation aspects

The conventional channel estimation techniques use a model-based design, requiring accurate mathematical models according to the channel attributes. In addition, complex environments make it challenging to design mathematical channel estimation models, which might not correspond to reality. This impairment decreases the system performance due to the loss of accuracy in the channel estimation process. AI-based algorithms have the property of learning the relationship among different system variables without the knowledge of a mathematical model [30, 32]. This model-free approach allows leveraging the channel complexity without unrealistic assumptions, achieving better performance than conventional techniques under the same channel [24]. Figure 4 shows the system implementation, training process, strategy, and supervision learning-level aspects related to the AI-aided channel estimation techniques.

The traditional wireless communication systems design depends on block-type transmitter and receiver structures. The different system functions are deployed as independent blocks and can be described as mathematical models. This approach supports block-by-block optimization to enhance the overall system performance. The channel estimation process is also deployed as an independent block functionality. Consequently, the AI algorithms can be added to the model-based design to strengthen the channel estimation through channel parameter prediction, defining the AI-aided model-based block-type channel estimation (AMBCE) approach [31]. Further, the AI algorithm can replace the model-based channel estimation, resulting in the AI-aided block-type channel estimation (ABCE) methods.

The AI learning capacity enables different joint functions at the transmitter or the receiver, for instance, combining the channel estimation process with signal detection [31]. This approach is defined as AI-aided block-type channel estimation joining function (ABCEx). An extension of this concept comprises modeling both the transmitter and receiver as a unique AI network resembling autoencoder models [30, 45, 46, 50]. From the implementation point of view, the system is seen as an end-to-end solution with a single block, whereas the channel is a hidden layer. This strategy is also considered in the review process since channel estimation is an intrinsic function.

Regarding the supervision level, AI-based channel estimation algorithms are grouped into supervised, unsupervised, and reinforcement learning, a conventional classification of ML algorithms [21, 169, 170]. Supervised learning is efficient but requires a labeled dataset for training purposes. On the other hand, unsupervised learning observes an unlabeled dataset to extract patterns to model the process and predict its behavior. This type of AI algorithm is quite useful when the system data are vast. Finally, reinforcement learning introduces interactions between the system and its environment, using feedback rewards and penalties based on the experienced performance.

There are aspects related to the AI-aided channel estimation techniques concerning the training process. The standard AI networks are data-driven, with the network structure trained using a large amount of data. This approach is extended to the AI-aided channel estimation technique, giving rise to some impairments. Moreover, the standard algorithm requires a long training time, which may not be affordable for some wireless applications. Hence, the model-driven approach is an alternative to solve these drawbacks, comprising a model, an approach, and a network [22, 50, 171]. The model is based on physical mechanisms and domain knowledge, which guide the design of an algorithm as a solution. As a result, the AI network is deployed by unfolding the algorithm, which demands less training time and data.

Online and offline training strategies are considered for AI-aided channel estimation techniques [22, 50]. The former trains with a large amount of data as they arrive from different communication systems or reliable simulators, whereas the latter trains with a static dataset. Offline training is not suitable for complex environments because the static training model eliminates the capability of tracking channel variation effects. Also, the static characteristic restricts the trained AI network to communication systems resembling the training dataset. In contrast, online training introduces real-time updating of the AI network parameters by tracking the channel effects variation, extending the application range to different practical environments.

Fig. 5

Classification of AI-aided channel estimation techniques

The survey of AI-aided channel estimation strategies has revealed three practical approaches: classical ML, NNs, and reinforcement learning (RL). The first consists of applying regression, dimensionality reduction, and Bayesian learning to improve the performance of conventional channel estimation methods, as shown in Fig. 5. The estimator preserves the block-type structure based on the AMBCE approach with supervised learning. The NN-based estimator comprises AMBCE, ABCE, and ABCEx structures depending on the proposed approach. The data-driven or model-driven procedures are also recurrent. Different schemes are surveyed among the NN structures, as presented in Fig. 5. The autoencoder network is classified as ABCEx, with data-driven nature and online training. The reinforcement learning branch is also covered, with pioneering works studying the Q-learning technique. The following sections look at the relevant works in the field of AI-aided channel estimation for multicarrier systems.

4 Classical learning-aided channel estimation techniques

The classical learning techniques are discussed in this section, focusing on their applications to conventional model-based techniques. The linear, polynomial, and nonlinear regression algorithms are early basic applications of ML concepts for channel estimation. Support vector regression (SVR) has recently emerged as a potential regression strategy in AI-aided channel estimation techniques [172, 173]. Evolutionary algorithms have also been applied to channel estimation, with the genetic algorithm being more widely used than other evolutionary techniques. In parallel, dimensionality reduction has been investigated as an iterative algorithm estimator technique. It has been revisited in the AI era as an interesting technique to reduce voluminous datasets while preserving their information and refining DNN strategies [174]. Finally, the Bayesian learning techniques are also applied to iterative algorithms and are seen as primary stages of ML.

4.1 Regression

Regression algorithms consist of a statistical process defining a relationship between two dataset-related variables. The main concept is to find a function that best fits the training data behavior to perform predictions. For example, linear [175,176,177], polynomial [178], 2D nonlinear [99, 179, 180], and support vector [172, 173, 181,182,183,184,185,186,187] regressions have been employed in channel estimation for multicarrier systems. Regression algorithms fall under the supervised learning paradigm. The following works considered the regression strategy applied in an AMBCE manner.

4.1.1 Linear and polynomial regression

Linear regression involves finding a linear equation to predict the value of a dependent variable (y) according to a given data value, called an independent variable (x). The linear equation is given as \(y = ax+b\), where a is the slope of the linear function and b is the intercept with the y axis [188]. The method considers simple linear regression with a single-input variable or a multiple linear regression comprising multiple inputs. The line best fitting the dataset values is obtained using an approximation based on an error criterion such as the MSE. However, there are datasets for which a straight line does not represent the relationship between the independent and dependent variables. Therefore, the linear regression evolved into a polynomial regression by adding a polynomial of order higher than or equal to two [189]. The polynomial degree is a hyperparameter that must be determined to avoid dataset over- or underfitting.
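The difference between the linear model \(y = ax+b\) and a polynomial model can be seen in a minimal sketch. It assumes noiseless toy data generated from a quadratic curve and uses NumPy's least-squares polynomial fit; the helper names and the data are hypothetical.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit; degree 1 gives the line y = a*x + b."""
    return np.polyfit(x, y, degree)

def mse(coeffs, x, y):
    """Mean squared error between the fitted curve and the data."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

# Toy dataset (assumed): samples of a quadratic relationship.
x = np.linspace(-1.0, 1.0, 21)
y = 2.0 * x**2 - x + 0.5

linear_coeffs = fit_polynomial(x, y, degree=1)      # [a, b]
quadratic_coeffs = fit_polynomial(x, y, degree=2)   # [c2, c1, c0]

linear_mse = mse(linear_coeffs, x, y)
quadratic_mse = mse(quadratic_coeffs, x, y)
```

Here the straight line underfits the data, while the degree-two fit recovers the generating coefficients. With noisy data, raising the degree further would eventually fit the noise, which is why the degree is treated as a hyperparameter.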

In connection with channel estimation, the linear regression algorithm enhances the interpolation process in data-aided methods, where the channel is first estimated through comb-type schemes at the pilot subcarriers. For instance, the linear regression is combined with a pilot-assisted iterative channel estimation [175], an LS estimator [176], and a normalized MSE estimator [177]. An LS fitting (LSF) polynomial regression is derived from a linear MMSE to approximate the eigenvectors of the channel correlation matrix by orthogonal polynomials [178]. The MSE performance of the LSF is close to the linear MMSE when the polynomial degree is higher than or equal to two. The advantage of the LSF over the linear MMSE is its non-statistical strategy. Channel estimation and data detection are combined in a blind or semi-blind regression model approach [190]. The regression algorithm is applied to find the data sequence associated with the LS channel estimator within the set of possibilities.

4.1.2 Nonlinear regression

Nonlinear regression is a variation where the model function combines nonlinear parameters related to one or more independent variables. This regression is similar to the above-mentioned variations since they all seek a curve or surface best fitting a dataset [191]. A closer look at its application to channel estimation for multicarrier systems reveals that the nonlinear regression model is based on a time–frequency space (2D regression model) and combined with an initial channel estimation through an LS estimator [99, 179, 180]. First, the pilot carriers are applied to the LS estimator to calculate the channel at those taps. Next, the time–frequency plane is divided into blocks with the same structure. Then, the 2D nonlinear regression is applied to each block to find a 2D surface function that minimizes the Euclidean distance to the initial LS channel estimation at the pilot subcarriers. Finally, the regression function estimates the channel at the data symbol taps in the time–frequency domain grid. The BER results have revealed an excellent approximation to the perfect channel estimation.
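The block-wise surface fitting can be sketched as below. As an assumption made only for illustration, a second-order polynomial surface in the symbol index and the subcarrier index stands in for the regression function; the cited works may use a different model. The pilot positions, block size, and coefficient values are hypothetical.

```python
import numpy as np

def design_matrix(t, f):
    """Terms of a second-order 2D polynomial in time index t and subcarrier index f."""
    t = np.asarray(t, dtype=float)
    f = np.asarray(f, dtype=float)
    return np.column_stack([np.ones_like(t), t, f, t * f, t**2, f**2])

def fit_surface(t_pilots, f_pilots, h_pilots):
    """Fit the surface to the LS pilot estimates in the least-squares sense."""
    a = design_matrix(t_pilots, f_pilots).astype(complex)
    coeffs, *_ = np.linalg.lstsq(a, h_pilots, rcond=None)
    return coeffs

def evaluate_surface(coeffs, t, f):
    """Evaluate the fitted surface at arbitrary time-frequency positions."""
    return design_matrix(t, f) @ coeffs

# One block of 8 symbols by 12 subcarriers with a 3 x 3 pilot grid (assumed layout).
t_p, f_p = np.meshgrid([0, 3, 7], [0, 5, 11], indexing="ij")
t_p, f_p = t_p.ravel(), f_p.ravel()

# Stand-in for the LS estimates at the pilots: a smooth, noiseless toy channel.
true_coeffs = np.array([1.0 + 0.5j, 0.02j, -0.03, 0.001, -0.002j, 0.004])
h_ls = design_matrix(t_p, f_p) @ true_coeffs

coeffs = fit_surface(t_p, f_p, h_ls)
t_all, f_all = np.meshgrid(np.arange(8), np.arange(12), indexing="ij")
h_hat = evaluate_surface(coeffs, t_all.ravel(), f_all.ravel()).reshape(8, 12)
```

The fitted surface is then read at the data positions of the block. Because the toy channel is itself a second-order surface, the fit is exact here; a real channel would leave a residual that depends on the block size and the chosen model.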

4.1.3 Support vector regression

The support vector regression is an extension of the support vector machine algorithm for regression estimation problems [181, 192, 193]. This algorithm introduces error acceptance flexibility into the regression field [192]. Taking linear regression as an example, its goal is to minimize the squared error, whereas the SVR aims at minimizing the norm of the coefficient vector. Hence, the model absolute error is constrained to be lower than or equal to a maximum error (\(\epsilon\)). As a consequence, the model accuracy is handled by constrained specifications [193].
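For reference, the idea can be written in the standard textbook form of the linear \(\epsilon\)-insensitive SVR problem. The notation is assumed rather than taken from the cited works: \(\mathbf{w}\) and \(b\) are the model coefficients and offset, and \((\mathbf{x}_i, y_i)\), \(i = 1, \dots, N\), are the training pairs.

\[
\min_{\mathbf{w},\,b}\ \frac{1}{2}\lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
\left| y_i - \left(\mathbf{w}^{T}\mathbf{x}_i + b\right) \right| \le \epsilon, \qquad i = 1, \dots, N
\]

Practical formulations usually add slack variables with a penalty term so that a few samples may fall outside the \(\epsilon\) tube.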

The SVR has been used to estimate nonlinear channels in OFDM and MIMO-OFDM systems. The SVR was combined with a data-aided channel estimation method like the previous regression techniques. A multi-regressor SVR was proposed to track the relationship between transmitted and received data through channel estimation, with BER performance similar to the MMSE [181]. Using the same training dataset, a BER comparison between the proposal and the radial basis function network (RBFN) has shown that the multi-regressor SVR exhibits lower values.

Moreover, a complex LS-SVR channel estimator for pilot-assisted OFDM systems was formulated by observing the signal's time–frequency relationship, surpassing the LS estimator [182]. Next, the nonlinear SVR-based algorithm was extended to withstand highly selective channels for OFDM systems [184,185,186]. Notably, a method was proposed based on a learning and estimation phase process to get the frequency response of a MIMO-OFDM system. This approach comprises mapping trained data into a high-dimensional space and using the structural risk minimization principle to leverage the regression estimation for the CFR function [186].

By combining the MMSE with the nonlinear SVR, the authors in [187] accomplished better channel estimation than the LS-SVR. The proposal was to map the input data into a finite-dimensional space to enable a higher-dimensional Hilbert space, similar to the approaches in [184, 185]. A nonlinear SVR-based algorithm implemented with a radial basis function kernel for LTE systems leveraged the information in the pilot subcarriers to estimate the CFR [183]. The algorithm leads to lower BER under the same SNR compared with the LS and feedback estimators, closely approximating perfect estimation. The wavelet transform was used to obtain weights to improve twin SVR (TSVR) channel estimation in pilot-assisted OFDM systems operating in fast selective fading channels [172, 173]. The training samples are weighted according to their distance from the mean values filtered by the wavelet transform. The TSVR algorithm was evaluated in terms of BER and compared with other approaches, with the TSVR being the closest to the perfect estimation curve.

4.1.4 Complexity discussion

Complexity-wise, linear and nonlinear regression estimation shows low complexity and can be adapted to MIMO systems [99, 175, 176, 179, 180]. On the other hand, SVR algorithms demand higher computational complexity but still reserve room for improvement and outperform the other algorithms [181, 182, 184, 185]. Models combining MMSE and SVR also require high computational complexity, but the authors in [187] claim it can be reduced.

4.2 Evolutionary algorithm

Evolutionary algorithms are conventional ML methods based on biological evolution mechanisms, aiming at the global minimum while avoiding getting stuck in local minima. Some evolutionary algorithms are the genetic algorithm (GA), repeated weighted boosting search (RWBS), particle swarm optimization (PSO), differential evolution algorithms (DEA), and ant colony optimization [194, 195]. Among those approaches, GA [196,197,198,199,200,201,202], RWBS [203,204,205], and PSO [206,207,208,209] have been applied to channel estimation in multicarrier systems. Evolutionary algorithms are also exploited mainly in pilot pattern placement optimization, which is out of the scope of this work [210, 211]. This approach indirectly improves the performance of conventional estimators, defining the optimal pilot pattern without supporting the channel estimation process.

4.2.1 Genetic algorithm

Fig. 6

Genetic algorithm working principle

The GA solves a given optimization problem based on biological evolution, as shown in Fig. 6 [212]. First, the algorithm generates an initial population and evaluates each individual with a fitness value. After that, it selects the fittest individuals, discarding the others. Then, the remaining individuals are crossed-over to generate new ones, employing a mutation scheme to insert randomness. Finally, the new population is evaluated to rank the individuals for future replacement and selection. After reaching a given criterion or a predefined number of generations, the algorithm terminates.
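The loop described above can be sketched as follows. This is a minimal illustration, not the procedure of any cited work: it assumes a real-valued population, keeps the fittest half unchanged, and uses uniform crossover with Gaussian mutation. The toy cost (the squared error between received pilots and a candidate two-tap channel) and all parameter values are hypothetical.

```python
import numpy as np

def genetic_algorithm(cost, n_genes, pop_size=40, n_generations=60,
                      mutation_std=0.1, seed=0):
    """Minimize `cost` using selection, uniform crossover, and Gaussian mutation."""
    rng = np.random.default_rng(seed)
    pop = rng.normal(size=(pop_size, n_genes))         # initial random population
    n_keep = pop_size // 2
    history = []                                       # best cost per generation
    for _ in range(n_generations):
        scores = np.array([cost(ind) for ind in pop])  # fitness evaluation
        order = np.argsort(scores)
        history.append(scores[order[0]])
        survivors = pop[order[:n_keep]]                # selection: keep the fittest half
        # Crossover: each child takes every gene from one of two random parents.
        a = survivors[rng.integers(0, n_keep, pop_size - n_keep)]
        b = survivors[rng.integers(0, n_keep, pop_size - n_keep)]
        children = np.where(rng.random(a.shape) < 0.5, a, b)
        # Mutation: small random perturbation of the children.
        children = children + rng.normal(scale=mutation_std, size=children.shape)
        pop = np.vstack([survivors, children])         # survivors are kept unchanged
    scores = np.array([cost(ind) for ind in pop])
    history.append(scores.min())
    return pop[np.argmin(scores)], history

# Toy problem (assumed): find a 2-tap real channel that explains known pilots.
rng = np.random.default_rng(1)
pilot_matrix = rng.choice([-1.0, 1.0], size=(8, 2))
h_true = np.array([0.8, -0.3])
received = pilot_matrix @ h_true

def pilot_cost(h):
    return float(np.sum((received - pilot_matrix @ h) ** 2))

h_best, cost_history = genetic_algorithm(pilot_cost, n_genes=2)
```

Since the survivors are carried over unchanged, the best cost never increases from one generation to the next. The fixed number of generations plays the role of the termination criterion.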

The GA has been used for OFDM channel estimation based on AMBCE systems implementation. For instance, the GA was applied to yield weight optimization for NN, decreasing the iteration number, training time, and overall computational complexity [196]. The joint solution has outperformed the MMSE estimator. Beyond that, the interpolation process has been replaced by a GA to assist a pilot-aided OFDM system in estimating the CFR of non-pilot subcarriers [197]. However, the results have shown that the approach, claimed by the authors as a novel method, has not outperformed the conventional techniques. A blind channel estimator based on GA has been proposed using cyclostationarity and spectral factorization [201]. The solution has been shown to improve blind estimators by combining spectral factorization and GA compared to a subspace-based estimator from the literature.

Combining the LS and MMSE estimators has been accomplished using a GA [198]. The linear estimators generate the initial population to feed the GA and optimize the channel estimation. The GA allowed selecting the best channel estimation matrix among three candidates using a fitness function. Then, a mutation operation is applied to the LS and MMSE, followed by a crossover process and a second mutation. The method was evaluated by comparing its normalized MSE with the standalone LS and MMSE implementation. In conclusion, the approach exhibited better results than the conventional estimators for binary phase-shift keying (BPSK) with a few iterations, whereas quadrature phase-shift keying (QPSK) modulation performed better as the number of iterations grew. Joint GA-based channel estimation and multiuser detection have also been carried out in rank-deficient scenarios [194, 199, 200, 202].

4.2.2 Repeated weighted boosting search

The RWBS is a guided stochastic global search optimization algorithm to solve complex problems [213]. It requires an initial random population related to the potential solutions. Then, the population is updated by replacing the worst individuals with a convex combination of the potential solutions [214]. Based on this concept, some channel estimation techniques were obtained for OFDM systems [203,204,205]. For example, this algorithm was used and modified to generate a candidate CIR vector approximating the global optimum solution instead of summing the weighted candidate vectors [203]. This approach improves the convergence rate of the proposed estimator when compared with the conventional version (using the RWBS without modifying the generation process) at the cost of worse performance. Still, under scenarios with a limited number of subcarriers, an assessment revealed the algorithm's equivalent performance with faster convergence and low complexity.

Furthermore, joining channel estimation and multiuser detection for OFDM systems was accomplished by applying the RWBS algorithm to provide soft outputs to feed a forward error correction (FEC) decoder [204]. The joint solution iteratively estimates the CIR while trading information between the detector and estimator through the FEC capability. The results have shown the solution's potential to match the performance of the LS estimator and to approach that of maximum-likelihood multiuser detection using the perfect CIR. Despite the lack of comparison, the work under discussion might be a variation of a GA-based solution [199]. Lately, Hanzo’s research group has proposed a quantum-assisted RWBS algorithm for channel estimation with joint data detection [199]. They have claimed their quantum RWBS-based estimator differs from their previous work by adopting a different methodology for creating the individual population and maintaining the algorithm's complexity. Moreover, a comparative evaluation of their solutions has shown superior performance of the quantum RWBS algorithm [205].

4.2.3 Particle swarm optimization

Fig. 7

Particle swarm optimization principle

The PSO concept relies on the social behavior of insects and sociable animals. Such an approach defines group behavior while also considering individual intelligence. Once a particle finds an optimal solution, the others attempt to pursue it considering its position, as shown in Fig. 7.
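A minimal sketch of this principle is given below, using the widely known velocity update in which each particle is pulled toward its own best position and toward the best position found by the swarm. The inertia and acceleration values, the toy cost (fitting a two-tap real channel to known pilots), and all names are assumptions for illustration only.

```python
import numpy as np

def particle_swarm(cost, n_dims, n_particles=30, n_iters=100,
                   inertia=0.7, c_personal=1.5, c_social=1.5, seed=0):
    """Minimize `cost` with a basic global-best particle swarm."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=(n_particles, n_dims))  # particle positions
    v = np.zeros_like(x)                                    # particle velocities
    pbest = x.copy()                                        # individual best positions
    pbest_cost = np.array([cost(p) for p in x])
    g = np.argmin(pbest_cost)
    gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]      # swarm best position
    history = [gbest_cost]
    for _ in range(n_iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        # Each particle follows its own experience and the swarm leader.
        v = inertia * v + c_personal * r1 * (pbest - x) + c_social * r2 * (gbest - x)
        x = x + v
        current = np.array([cost(p) for p in x])
        improved = current < pbest_cost
        pbest[improved] = x[improved]
        pbest_cost[improved] = current[improved]
        g = np.argmin(pbest_cost)
        gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]
        history.append(gbest_cost)
    return gbest, history

# Toy problem (assumed): find a 2-tap real channel that explains known pilots.
rng = np.random.default_rng(1)
pilot_matrix = rng.choice([-1.0, 1.0], size=(8, 2))
h_true = np.array([0.8, -0.3])
received = pilot_matrix @ h_true

def pilot_cost(h):
    return float(np.sum((received - pilot_matrix @ h) ** 2))

h_best, cost_history = particle_swarm(pilot_cost, n_dims=2)
```

The personal-best term represents the individual intelligence mentioned above, and the swarm-best term represents the group behavior.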

Inspired by this behavioral algorithm, some works have applied it to channel estimation for OFDM and MIMO-OFDM systems. The channel parameters are estimated using iterative linear estimators and delivered to the PSO algorithm, which refines them to improve the BER [206]. The BER comparison against the LS and linear minimum mean squared error (LMMSE) estimators has shown that the proposal matches their performance, particularly that of the latter. Furthermore, using superimposed training symbols, a multi-objective PSO has been designed to join channel estimation and decoding in MIMO-OFDM systems [207]. The estimator analysis showed promising results under rank-deficient scenarios. Moreover, other approaches were proposed for joint channel estimation and data detection [208], or partial parallel interference cancelation through auxiliary PSO [209], with this last one outperforming the MMSE estimator.

4.2.4 Complexity discussion

GAs present more computational complexity than regression algorithms [196,197,198,199,200]. Notably, the GA-artificial neural network (ANN) requires \(10\%\) fewer iterations than the conventional Levenberg–Marquardt (LM) multilayer perceptron (MLP) channel estimator [202]. RWBS proved less complex and able to achieve perfect channel estimation without requiring a large dataset [203,204,205]. The computational complexity of PSO varies with the channel coefficients, demanding few iterations (100 at most) to outperform the LMMSE estimate [206,207,208,209]. A comparison among some of the discussed evolutionary algorithms is available in [194].

4.3 Dimensionality reduction

The dimensionality reduction ML algorithm includes techniques aiming to reduce dataset dimensions, yielding better predictions [174]. Among those strategies, principal component analysis (PCA) and independent component analysis (ICA) are applied to multicarrier systems channel estimation. The former relies on orthogonal transformation to convert correlated variables into uncorrelated variables [174]. It reduces the dataset dimension to principal components while maximizing the variance. The latter focuses on separating diverse independent sources while keeping the dataset dimension.
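The PCA step can be illustrated with a short sketch based on the eigendecomposition of the sample covariance matrix. It assumes real-valued data for simplicity, whereas channel data are generally complex, and the toy dataset and function name are hypothetical.

```python
import numpy as np

def pca(data, n_components):
    """Project data (n_samples, n_features) onto its leading principal components."""
    centered = data - data.mean(axis=0)
    covariance = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(covariance)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]               # reorder by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components]          # orthogonal transformation
    return components, eigvals[:n_components], centered @ components

# Toy dataset (assumed): 3D samples that vary mostly along one direction.
rng = np.random.default_rng(0)
direction = np.array([0.6, 0.8, 0.0])
samples = rng.normal(size=(500, 1)) * direction + 0.01 * rng.normal(size=(500, 3))

components, variances, projected = pca(samples, n_components=2)
```

The projected variables are uncorrelated, and the first component aligns with the direction of largest variance, so keeping only the leading components reduces the dimension while retaining most of the variance.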

Fig. 8

Channel estimation using the I-MSPCA proposed in [217]

The PCA has been applied to OFDM and MIMO-OFDM systems [174, 215,216,217]. The approaches use PCA to find the principal components of the dataset for channel estimation purposes. The data is arranged in a matrix to calculate the eigenvectors of the covariance [215], with the one associated with the largest eigenvalue defining the principal component eigenvector of the dataset used for channel estimation. An improved multi-scale PCA (I-MSPCA) is accomplished by combining a wavelet transform, as shown in Fig. 8 [217]. Wavelet decomposition is performed upon the received OFDM symbols to compute the covariance matrix of the wavelet coefficients, filtered by a threshold, and applied to the PCA algorithm. The computed principal components are passed to a cross-correlation block to correlate them with the received OFDM symbols. The largest of the maximum cross-correlation values is selected to define the principal component representing the CIR. The I-MSPCA was evaluated in a frequency-selective channel and performed better than the proposal in [215] and the traditional LS estimator.

A semi-blind method used the LS criterion within pilots to estimate the initial CIR and applied the PCA to track channel variations through a two-layer NN using an RLS variable step size [216]. A recent approach applied PCA as a dimensionality reduction transformation to create an ML synthesizer for CSI [174]. The PCA is used to assist in generating artificial samples from a real voluminous dataset while preserving the information. This approach aims to support DL models that require a large amount of training data.

Similarly to the PCA, the ICA has been studied in the context of OFDM and MIMO-OFDM systems channel estimation [218,219,220,221]. The ICA application enables blind signal separation, supporting blind equalizers design based on iterative layered space-time equalization [218] or MMSE and layered space-frequency equalization to enhance the system performance [219]. The proposal combining wavelet transformation and ICA is presented in [220], which is similar to the approach discussed in [217] for PCA. In [221], a semi-blind channel estimation strategy integrates ICA with pilot carriers. The pilots allow obtaining initial channel estimation that serves as the input data to the ICA algorithm. Notably, ICA usage has leveraged blind and semi-blind estimators outperforming the MMSE estimator even when using perfect CSI [219, 220].

Complexity-wise, dimensionality reduction ML algorithms represent a simple and fast approach to assist the decision block of multicarrier systems. Moreover, they accelerate the convergence of the decision block after enriching the training dataset. However, the decision block technique mainly impacts the overall computational complexity of the system.

4.4 Bayesian learning

Bayesian learning algorithms rely on the Bayes theorem, in which the a posteriori probability of a variable is obtained by updating its a priori probability with the observation of a known input variable [222]. The model is initialized with a prior belief, which is updated after the learning algorithm extracts information from the data. Regarding the channel estimation area, Bayesian learning estimates the channel parameters upon the received signal observation [222, 223]. This method yields a model-based design approach, which underlies several works, including those on multicarrier systems. Furthermore, some works have recently addressed the Bayesian learning theory to enable iterative channel estimation in multicarrier systems [223,224,225,226]. These strategies combine the Bayes theorem with an iterative technique, which can be seen as a precursor to ML Bayesian algorithms for channel estimation. Thus, their findings are briefly discussed.

The Bayesian clustered-sparse channel estimation (BCS-CE) method is applied to frequency-selective fading channels to exploit the cluster correlation in the training matrix, improving the estimation performance [223]. The BCS-CE was compared with traditional sparse channel estimators and with an LS estimator that knows the channel, which serves as the lower bound. The average MSE results show that the proposed estimator outperforms the traditional methods and converges to the lower bound at higher SNR. Despite that, its complexity is higher than that of traditional estimators. Besides, Bayesian learning has been combined with binary particle swarm optimization to provide optimal pilot design and channel estimation through the mutual incoherence property (MIP) criterion [224]. The proposed estimator was evaluated assuming 16 and 24 pilots optimally positioned according to the MIP criterion and compared with the 124 equidistant pilots applied to the LS estimator. The estimator achieves better performance than the LS with 16 pilots, while the 24-pilot case shows better performance at higher SNR.

EM and Bayesian learning algorithms have also enhanced channel estimation in OFDM systems [225]. Bayesian learning allows the construction of a prior sparse signal model whose parameters are updated by the EM algorithm. In [226], a joint model- and data-driven strategy is proposed to derive a trainable, theoretically interpretable, and flexible model. It is accomplished using a Gaussian mixture model adapted to evolve based on the stochastic behavior of the received signal [222, 223]. In addition, Bayesian learning allows for estimating the posterior distribution of the channel parameters.
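
As an illustration of how a posterior distribution of the channel parameters can be obtained, the sketch below treats the simplest conjugate case: a linear observation model with a zero-mean Gaussian prior on the channel taps and Gaussian noise. This is a textbook special case and not the method of [223,224,225,226]; the prior covariance, pilot matrix, noise variance, and the function name `gaussian_posterior` are assumptions.

```python
import numpy as np

def gaussian_posterior(y, X, R, noise_var):
    """Posterior mean and covariance of h for y = X h + n, assuming a prior
    h ~ CN(0, R) and noise n ~ CN(0, noise_var * I)."""
    S = X @ R @ X.conj().T + noise_var * np.eye(X.shape[0])  # covariance of y
    G = R @ X.conj().T @ np.linalg.inv(S)                    # Bayesian gain
    mean = G @ y
    cov = R - G @ X @ R
    return mean, cov

rng = np.random.default_rng(1)
n_taps, n_pilots = 4, 16                       # assumed sizes
R = np.diag(np.exp(-np.arange(n_taps)))        # assumed exponentially decaying prior power
h = np.sqrt(np.diag(R) / 2) * (rng.normal(size=n_taps) + 1j * rng.normal(size=n_taps))
X = (rng.normal(size=(n_pilots, n_taps)) + 1j * rng.normal(size=(n_pilots, n_taps))) / np.sqrt(2)
noise_var = 1e-4
noise = np.sqrt(noise_var / 2) * (rng.normal(size=n_pilots) + 1j * rng.normal(size=n_pilots))
y = X @ h + noise

h_post, C_post = gaussian_posterior(y, X, R, noise_var)
```

The posterior covariance `C_post` quantifies the remaining uncertainty after the observation; its trace is smaller than that of the prior covariance `R`, reflecting the belief update described above.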

Regarding Bayesian learning algorithms, optimal Bayesian estimation involves a heavy computational load [222, 223]. Hence, the proposals used supporting algorithms to enable the Bayes theorem-based channel estimator to infer the channel parameters [222]. However, those algorithms demanded a higher computational load than the LS estimator, and an optimum pilot design is required to reduce the computational complexity [224, 225]. Indeed, Bayesian learning algorithms for channel estimation can outperform traditional methods at the cost of higher computational complexity [223, 226].

5 Neural network-aided channel estimation techniques

This section addresses NN applications in the channel estimation process. The NN schemes have been grouped into different sections considering their common features or training methods. The surveyed works on this subject show an effort to design DNNs and optimize the volume of the training dataset. At the same time, recent approaches are emerging to circumvent the training issue by building different networks and ML dimensionality reduction strategies.

5.1 Neural network concepts

Before surveying the works on this subject, it is essential to define some concepts related to NNs to support the discussion in the following sections. First, regarding artificial NNs, the computation units are called neurons, which are arranged in layers to form the network [227]. The neurons are connected through weights that scale the neuron input and alter the function computed at the neuron. Hence, the functions employ those weights as parameters to propagate the inputs to the outputs [227, 228]. Second, NN learning comes from changing the weights at each iteration based on external stimuli referred to as training sequences or datasets. Here, the learning process is classified as supervised, unsupervised, or reinforcement learning, with the definitions presented in Sect. 3 [227,228,229]. During the training process, the output provides prediction errors as feedback, which allows the weights in the NN to be adjusted according to the learning process to pursue a better prediction in the next iteration [227, 230].

The weighted input sum at each neuron is applied to an activation function, or transfer function, responsible for introducing nonlinear operations to the prediction process [229]. This function is essential for NN learning of complex tasks. Regardless of the number of layers, it prevents the network from collapsing into a cascade of linear operations, which would be equivalent to a linear regression model. The activation function might be linear or nonlinear. There is a set of nonlinear activation functions, such as the sigmoid or logistic, hyperbolic tangent, rectified linear unit (ReLU), Gaussian error linear unit (GELU), softmax, and so forth. These nonlinear activation functions present advantages and limitations, which are beyond the scope of this work and can be found in [227,228,229].

Fig. 9 Perceptron basic structure

The architecture of an NN is related to the layer design. Based on this principle, the primary NN architectures are classified as single layer and multilayer. The single-layer NN comprises a set of inputs (\(x_1, x_2, \dots , x_n\)) weighted by (\(w_1, w_2, \dots , w_n\)) and directly mapped to the output through an activation function, as shown in Fig. 9. This structure is commonly referred to as a perceptron [227]. In addition, the perceptron might have an input that does not depend on the data, defined as a bias, which sets the activation threshold. The multilayer NN architecture integrates layers of neurons arranged in an input and an output layer connected by single or multiple intermediate layers, defined as hidden layers. For instance, Fig. 10 shows a multilayer structure. Once again, the neurons in the hidden layer might also have a bias weight.
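
The perceptron mapping can be written in a few lines. The sketch below is only illustrative: the hard-threshold activation and the weights realizing a logical AND are assumptions chosen to show how the bias sets the activation threshold.

```python
import numpy as np

def step(z):
    """Hard-threshold activation: 1 if z >= 0, else 0."""
    return np.where(z >= 0, 1, 0)

def perceptron(x, w, b, activation=np.tanh):
    """Single-layer perceptron: activation(w . x + b)."""
    return activation(np.dot(w, x) + b)

# Hypothetical weights realizing a logical AND on binary inputs.
w = np.array([1.0, 1.0])
b = -1.5  # the bias sets the activation threshold

patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [int(perceptron(np.array(p, dtype=float), w, b, step)) for p in patterns]
```

Only the input pattern (1, 1) yields a weighted sum above the threshold, so the perceptron fires for that pattern alone.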

Finally, another essential aspect to discuss is the algorithm used for NN training. The training algorithms are related to the function applied to update the weights among the network layers, seeking to improve the learning process at each iteration [227, 229]. Two concepts are defined: incremental and batch training. The former updates the weights immediately after each input is presented, while batch training updates the weights after all the inputs are inserted in the NN [230]. The error is computed from the network output prediction and the expected output, and the training algorithms rely on the error back-propagation mechanism. In other words, the algorithm implements a set of steps to update the weights starting from the output layer in the direction of the input layer.
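
The difference between incremental and batch training can be sketched for a single linear neuron trained with a gradient rule. The data, learning rates, and epoch counts below are assumptions; the example only contrasts when the weight update happens.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))          # training inputs (assumed)
w_true = np.array([0.5, -1.0, 2.0])    # weights to be learned (assumed)
d = X @ w_true                         # noiseless desired outputs

def train_incremental(X, d, lr=0.05, epochs=20):
    """Update the weights immediately after each training sample."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_i, d_i in zip(X, d):
            e = d_i - w @ x_i          # error for this sample
            w += lr * e * x_i
    return w

def train_batch(X, d, lr=0.5, epochs=200):
    """Update the weights once per pass, after all samples are presented."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        e = d - X @ w                  # errors for the whole training set
        w += lr * X.T @ e / len(d)
    return w

w_inc = train_incremental(X, d)
w_bat = train_batch(X, d)
```

In this noiseless toy problem both schedules reach the same weights; they differ in how often the weights change and, in practice, in convergence behavior and memory use.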

Notably, the NN families are classified according to their structural aspects, such as the number of hidden layers and neuron connections. Thus, this survey considered grouping the proposed approaches regarding the NN classification while introducing each type appropriately.

5.2 Back-propagation neural network

When back-propagation algorithms are used in NN training, back-propagation neural networks (BPNNs) are created [231]. The basic algorithm concept is backward propagating the network error from the output to the input layer and adjusting the weights to reduce the network error through the steepest descent approach. This BPNN is deployed to work with real-domain data. Since the CIR is a complex-type signal, the channel estimation is also a complex-valued process.

Fig. 10 Multilayer neural network basic structure elements

Fig. 11 General complex-valued BPNN for channel estimation [231, 232]

Complex-valued BPNNs have been designed based on three layers (input, hidden, and output) of NN for channel estimation purposes, as shown in Fig. 11 [231, 232]. The complex signal is decomposed into real and imaginary parts that are fed forward through the network. At the end, the outputs are summed to compose the estimated channel sample.
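
A minimal sketch of this idea follows: a three-layer real-valued network receives the real and imaginary parts of the received samples, is trained by back-propagation with steepest descent, and its two outputs are recombined into a complex value. The toy task (undoing a single complex gain on QPSK pilots), the layer sizes, and the learning rate are assumptions and do not reproduce the networks in [231, 232].

```python
import numpy as np

rng = np.random.default_rng(3)

def to_real(z):
    """Stack real and imaginary parts: complex (N,) -> real (N, 2)."""
    return np.stack([z.real, z.imag], axis=1)

# Toy task (assumption): recover QPSK pilots distorted by one complex gain h.
h = 0.8 * np.exp(1j * 0.6)
x = (rng.choice([-1, 1], 256) + 1j * rng.choice([-1, 1], 256)) / np.sqrt(2)
y = h * x + 0.05 * (rng.normal(size=256) + 1j * rng.normal(size=256))
inputs, targets = to_real(y), to_real(x)

# Three layers: 2 inputs -> 8 tanh hidden neurons -> 2 linear outputs.
W1, b1 = 0.5 * rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = 0.5 * rng.normal(size=(8, 2)), np.zeros(2)
lr = 0.05

def forward(a):
    hid = np.tanh(a @ W1 + b1)
    return hid, hid @ W2 + b2

losses = []
for _ in range(500):
    hid, out = forward(inputs)
    err = out - targets
    losses.append(np.mean(np.sum(err ** 2, axis=1)))
    # Back-propagate the error from the output layer toward the input layer.
    g_out = 2 * err / len(inputs)
    g_hid = (g_out @ W2.T) * (1 - hid ** 2)   # tanh derivative
    W2 -= lr * hid.T @ g_out
    b2 -= lr * g_out.sum(axis=0)
    W1 -= lr * inputs.T @ g_hid
    b1 -= lr * g_hid.sum(axis=0)

# Recombine the real and imaginary outputs into complex estimates.
x_hat = forward(to_real(y))[1] @ np.array([1, 1j])
```

The list `losses` records the training MSE per iteration, which is the quantity the steepest descent update reduces.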

The BPNN in Fig. 11 has been used for channel estimation and equalization in FBMC [233] and OFDM systems [231, 232, 234,235,236,237,238], employing supervised learning through training sequences. The number of perceptrons used is specific to each proposal. The BPNN performance has been assessed in terms of BER and MSE compared to conventional channel estimation approaches. Concerning the MMSE, LS, and LMS methods, the BPNN has underperformed the MMSE while outperforming the LS and LMS [231, 232, 236].

Complexity-wise, the BPNN is less complex than the MMSE algorithm, although it underperforms it [232]. The BPNN exhibited a loss of about 2 dB compared with the MMSE for the 0 dB SNR scenario. The BPNN was also tested against semi-blind channel estimation and presented a \(96\%\) to \(97\%\) BER enhancement at the cost of an \(86\%\) to \(87\%\) increase in complexity [236]. In general, BPNN estimation approaches do not require complicated matrix computations, and the optimum result happens when the number of hidden neurons is almost equal to the channel length [231, 233,234,235, 237, 238].

5.3 Feed-forward neural network

The FFNN is characterized by connections among the neurons that do not form a cycle and do not link neurons within the same layer. The data flows from the input to the output layer through single or multiple hidden layers. When there is one single hidden layer, the FFNN is known as MLP. Linear operations are realized in each perceptron, and the result is applied to an activation function before being propagated to the adjacent layer. The use of a radial basis activation function defines the RBFN subgroup.

Fig. 12 Complex-valued MLP network for channel estimation proposed in [239]

FFNNs have been applied to FBMC, OFDM, and MIMO-OFDM systems. The networks are data-driven and use an ABCE and ABCEx approach. The training process is supervised by issuing pilot sequences online or offline. Concerning the MLP, a channel estimator has been implemented for a preamble-based FBMC system using a complex-valued two-hidden-layer NN, which is trained offline with simulated datasets [239]. Figure 12 shows the proposed network, where the ReLU and tanh nonlinear activation functions are used in the hidden and output layers, respectively.

Furthermore, the initial MLP network was modified by inserting an MSE loss function to update the network. The two proposals were evaluated in terms of BER, with lower rates than the traditional LS. The Levenberg–Marquardt training algorithm allowed the design of a two-layer complex-valued MLP for OFDM systems [240]. Subsequently, it was extended to a MIMO-OFDM system, where the training also considers the one-step secant strategy [241]. The performance analysis showed that the MLP results in more diversity gain than the conventional channel estimation approaches.

Initially, RBFN estimators were implemented with a single hidden layer and analyzed against the LMS, MMSE, and zero-forcing (ZF) estimators, outperforming them [242]. The network structure resembles the one shown in Fig. 11, considering the discussion context. RBFNs were tested for OFDM systems by exploring the channel correlation in the time and time–frequency domains [243, 244]. The former considers estimating the channel for each subcarrier independently through the network. The latter cooperatively estimates the channels at different subcarriers. The performance of the two strategies has been shown to be similar in terms of BER. Meanwhile, the one-dimensional RBFN has been compared with an interpolation RBFN using fewer pilot subcarriers as training inputs. This second approach offered lower BER than the first one.

Tracking channel fluctuations in pilot-aided OFDM systems operating in a noisy environment using an RBFN has been shown to work well compared to traditional interpolation approaches [245]. In parallel, a Gaussian radial basis function interpolation was applied for fast-fading channel estimation. The LS method provides the initial estimation, and the channel response estimation is assisted by the Gaussian one-hidden-layer RBFN [246, 247]. The proposed scheme was applied to comb-type OFDM systems for analysis purposes, generating lower MSE than the LS and other RBFN estimators. Lately, the RBFN has been applied to a coherent optical OFDM system to implement an RBFN-based nonlinear equalizer [248]. The network weights are updated based on a two-step process. First, a K-means clustering algorithm is used to adjust the hidden layer weights. Then, the least mean square algorithm updates the output layer weights. Finally, a Q-factor assessment has been performed to compare the proposal against other works, resulting in a 4-dB performance improvement.
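
The combination of an initial LS estimate at the pilots with Gaussian radial basis function interpolation can be sketched as follows. For brevity, the sketch uses exact interpolation with one Gaussian center per pilot and solves for the output weights directly, which is a simplification and not the trained networks of [246, 247]; the channel taps, pilot spacing, kernel width, and function names are assumptions.

```python
import numpy as np

def gaussian_rbf(r, width):
    """Gaussian radial basis function of the distance r."""
    return np.exp(-(r / width) ** 2)

def rbf_interpolate(pilot_idx, h_pilot, query_idx, width):
    """Exact Gaussian RBF interpolation with one center per pilot subcarrier."""
    Phi = gaussian_rbf(pilot_idx[:, None] - pilot_idx[None, :], width)
    weights = np.linalg.solve(Phi, h_pilot)           # output-layer weights
    Phi_q = gaussian_rbf(query_idx[:, None] - pilot_idx[None, :], width)
    return Phi_q @ weights

n_sub = 64
k = np.arange(n_sub)
taps = np.array([1.0, 0.5 + 0.3j, 0.2j])              # assumed 3-tap CIR
H = np.fft.fft(taps, n_sub)                           # corresponding CFR
pilot_idx = np.arange(0, n_sub, 4)                    # comb-type pilots, spacing 4
h_ls = H[pilot_idx]                                   # noiseless LS estimate at pilots (assumption)

H_hat = rbf_interpolate(pilot_idx, h_ls, k, width=4.0)
```

By construction, `H_hat` reproduces the LS estimates at the pilot subcarriers and fills in the data subcarriers with a smooth combination of Gaussian kernels; the kernel width controls the smoothness of that interpolation.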

The MIMO-OFDM channel estimation based on RBFN was evaluated in [245, 247,248,249,250,251,252,253,254]. The RBFN structure is replicated to each antenna branch connected to the input layer. Thus, N inputs are forward connected to the next layer to demodulate the signals. A semi-blind technique has been improved by updating the function iteration based on an RBFN [249]. Further, evolutionary algorithms (PSO and GA) were employed to enhance the network parameters. Despite the mixture of techniques, there was no comparison to the conventional estimator for assessment purposes. In [250], the RBFN estimates the initial values of the MIMO channel supporting the particle filter method, which eliminates the need for more training pilots since it tracks the channel variation.

Furthermore, joint channel estimation and signal detection was performed using an RBFN optimized by a genetic algorithm [251]. The approach was close to the MMSE estimator in terms of BER. Cyclic delay diversity OFDM systems were also addressed with an RBFN, which was introduced to solve interpolation problems in an uneven-pilot-based system [252]. Meanwhile, the Gaussian radial basis function has been extended to the MIMO-OFDM scenarios to leverage RBFN solutions [253, 254]. The solutions have returned better performance than the LS and LMS estimators, close to the MLP network BER.

Regarding the complexity, the FFNN adds computational latency while improving the BER. For example, the proposal in [240] contributes to a gain of 1.2 dB at a BER of \(10^{-3}\). Also, [241] concludes that a training data length of 16 symbols or more produces remarkable results and better performance than the conventional LS, meaning that a compromise between performance and computational complexity must be reached [242]. Interpolation RBFN-based techniques exhibit complexity and performance trade-offs [244, 246]. The most complex estimation methods are proposed in [250, 251], which achieve optimal performance in terms of BER and spectral efficiency at the cost of higher computational complexity.

5.4 Extreme learning machine

Fig. 13 ELM NN for channel estimation in MIMO-OFDM systems

An ELM is an FFNN based on fast learning and one-shot training, reducing the training time with low computational complexity. The weights are set through the Moore–Penrose generalized inverse matrix. This learning technique has been applied in the channel estimation field for OFDM and MIMO-OFDM systems. The evaluated ELM networks are a single-hidden layer with an implementation based on the AMBCE [255], ABCE [256,257,258,259,260,261,262,263,264], and ABCEx [265, 266] approaches. The referred works employ a network comprising p input and m output neurons, as shown in Fig. 13. These network variables have different meanings according to the system design. For instance, p is equal to the number of receiving antennas, while m is related to the number of transmitting antennas for MIMO-OFDM systems. The number of hidden layer neurons (l) defines the Moore–Penrose generalized inverse matrix dimension.
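
The one-shot training principle can be illustrated with a short sketch: the hidden-layer weights are drawn at random and kept fixed, and the output weights follow from the Moore–Penrose generalized inverse of the hidden-layer output matrix. The toy regression task, the number of hidden neurons, and the function names are assumptions, not the configuration of any surveyed work.

```python
import numpy as np

rng = np.random.default_rng(4)

def elm_train(X, T, n_hidden):
    """One-shot ELM training: random hidden layer, least-squares output weights."""
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights, never updated
    b = rng.normal(size=n_hidden)                 # random hidden biases
    H_mat = np.tanh(X @ W + b)                    # hidden-layer output matrix
    beta = np.linalg.pinv(H_mat) @ T              # Moore-Penrose generalized inverse
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

def toy_target(X):
    """Assumed smooth nonlinear mapping from p = 2 inputs to m = 1 output."""
    return np.sin(2 * X[:, :1]) * X[:, 1:]

X_train = rng.uniform(-1, 1, size=(400, 2))
W, b, beta = elm_train(X_train, toy_target(X_train), n_hidden=40)

X_test = rng.uniform(-1, 1, size=(100, 2))
mse = np.mean((elm_predict(X_test, W, b, beta) - toy_target(X_test)) ** 2)
```

No iterative weight update is involved; the cost of training is dominated by the pseudoinverse, whose dimension grows with the number of hidden neurons.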

Real-valued ELM networks have been applied to joint channel equalization and symbol detection [265, 266]. This scheme has two input layer neurons corresponding to the real and imaginary parts of the received symbol. In [265], the training process uses an LS solution, while the ELM algorithm in [266] employs pilot blocks. Complex-valued ELM schemes were also investigated for channel estimation with p equal to the training sequence length [256]. The online trained network has been evaluated in a nonlinear channel condition, outperforming the LS and MMSE estimators in terms of BER. Furthermore, the network performs similarly to the scheme without nonlinearities. The nonlinear distortion has been addressed in [258] to enhance the performance of OFDM systems with insufficient CP. The offline trained network was deployed online using an initial LS estimator to obtain the features of the CFR.

A technique to reduce the number of training pilots was developed based on the ensemble learning theory [257]. This method generates and combines different models to find an optimal predictive model. The ensemble approach comprised weighted averaging and median of the ELM model predictions based on the training error and pruning generated models, including combinations thereof. The BER results demonstrated the proposal's effectiveness, with a lower rate than ELM schemes and a similar performance compared to the MMSE.

A semi-supervised ELM has been proposed for channel estimation and equalization in vehicle-to-vehicle communications [264]. The training phase considered a labeled training dataset with the same length as the unlabeled dataset. Afterward, the system implementation applies an LS pre-equalization after the FFT is conducted, with the output delivered to the semi-supervised ELM. The evaluation has demonstrated BER performance close to the LS and other ELM-based estimators. However, the algorithm execution time has been the longest among the compared methods. On the other hand, an ELM-based equalizer for OFDM-based radio-over-fiber systems was evaluated in [263]. The authors proposed a multilayer generalized complex-valued ELM built by extending the ELM algorithm to achieve an ELM-autoencoder. The network has outperformed other ELMs from the literature, while the authors noted that the proposal increased the computational cost.

Regarding MIMO systems, a semi-blind channel estimation process based on ELM networks has outperformed the BPNN, MLP, and RBFN. The scheme encompasses estimating the CFR at the pilot subcarriers and applying it to the training of the real-valued network. In addition, an ELM scheme with training based on symbol construction is proposed in [259]. The approach reduced the training sequence length and kept the performance, providing a better estimation than the MMSE. Another attempt to reduce the training time has combined manifold learning with ELM. Manifold learning is a nonlinear dimensionality reduction technique grouped with the PCA and ICA schemes presented in Sect. 4. This approach has also outperformed the MMSE estimator.

Recently, an ELM-based detector relying on online training has been proposed for pilot-assisted mMIMO-OFDM systems at millimeter-wave frequencies [262]. The network resembles that shown in Fig. 13, with the pilots being applied to the online training to support the subsequent symbol detection. The BER assessment highlighted the ELM network performance over the MMSE estimator. Despite that, a lack of comparison among the ELM network solutions has been identified.

The complexity appraisal shows that complex-valued ELM can involve only one hidden layer, outperform offline DNN in terms of complexity and performance, and reduce the training time [256,257,258, 260, 266]. Furthermore, the ELM was found to require the same number of neurons in the hidden layers as the number of antennas at the base station (BS) to achieve higher spectral efficiency than linear mMIMO receivers [261]. An attempt to apply unsupervised learning to an ELM has been shown to increase the computational time cost with no performance improvement [264]. Besides, an ELM-autoencoder solution has significantly improved performance with a high computational cost. In contrast to complex-valued ELM, real-valued ELM demands less computation than FFNN and complex-valued ELM due to real-domain values instead of complex domain ones [255].

5.5 Recurrent neural network

Fig. 14 RNN architecture and work principle

The RNN consists of a network structure with one-step temporal dependence among the input data [267, 268]. The hidden layers receive the incoming information from the previous ones together with their own output returned through a feedback loop, as shown in Fig. 14. Consequently, the network can learn over time in a cumulative process. Taking the unfolded example, the output at \(t-1\) is fed back to the input at time t, and the output at the current instant is provided as input at time \(t+1\). Thus, this NN learns not only from the incoming input but also by considering the influence of past information.
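
The unfolded recurrence can be written as a short loop. The sketch below assumes a simple tanh recurrence with random weights; the sizes and names are illustrative and do not correspond to a specific surveyed network.

```python
import numpy as np

def rnn_forward(x_seq, W_in, W_rec, b, h0=None):
    """Unfolded RNN: h_t = tanh(W_in x_t + W_rec h_{t-1} + b)."""
    h = np.zeros(W_rec.shape[0]) if h0 is None else h0
    states = []
    for x_t in x_seq:
        h = np.tanh(W_in @ x_t + W_rec @ h + b)   # the previous state is fed back
        states.append(h)
    return np.array(states)

rng = np.random.default_rng(5)
W_in = rng.normal(size=(4, 2))
W_rec = 0.5 * rng.normal(size=(4, 4))
b = np.zeros(4)
x_seq = rng.normal(size=(6, 2))

states = rnn_forward(x_seq, W_in, W_rec, b)

# Perturb only the first input and rerun: the state at the next time step
# also changes, because it depends on the fed-back previous state.
x_alt = x_seq.copy()
x_alt[0] += 1.0
states_alt = rnn_forward(x_alt, W_in, W_rec, b)
```

Comparing `states` and `states_alt` shows how a change in a past input propagates to later hidden states through the feedback loop.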

The RNN features are suitable for tackling time variations in channel estimation. This NN has been used to estimate channel response in OFDM, FBMC, and MIMO-OFDM systems [48, 267,268,269,270,271,272,273]. It has been deployed in an ABCE approach with supervised learning. The RNN was designed as a mapping function to assist pilot-aided OFDM systems [268]. The RNN was trained with the pilot subcarriers and then used to find the channel estimation at the data position. Lately, a bidirectional RNN has been proposed to enhance the system performance. A similar approach has been considered in training an RNN to provide signal recovery in an OFDM system operating under an interference environment. For instance, the network in [269] could predict 50 lost subcarriers based on channel estimation under severe interference with a root-mean-square error (RMSE) of 0.37065 and 0.24596 after 100 iterations and training epochs.

Moreover, the RNN was applied to track channel variations in MIMO-OFDM systems [267]. The proposal attempted to design an RNN for estimating channel response using signals with tightly coupled real and imaginary parts. Thus, a split-complex activation RNN was accomplished by allowing the network to learn to estimate the real and imaginary parts separately and combining them through the time average of the input information over a time window. The work has been improved by adding a self-organized map-based optimization to obtain a complex time delay fully RNN block for MIMO-OFDM systems [270]. The BER assessment has shown that the performance of the proposed network is close to the perfect CSI case, surpassing the MMSE estimator.

Besides, a SoftMax RNN using frequency index modulation was proposed to perform channel estimation on MIMO-OFDM systems [271]. The network provided lower BER values than the LS estimator and the ELM algorithm found in [256]. However, the comparison lacks an evaluation of the involved complexity. Reducing the ISI in MIMO-OFDM systems has been carried out by an Elman RNN for channel estimation [272]. The network evaluation has demonstrated its applicability to channel estimation, providing low PAPR and BER, with high capacity and throughput. The comparison included a convolutional neural network (CNN) and DNN, with the Elman RNN outperforming those networks. The RNN has also been used to design DNNs, such as the ChanEstNet DNN, which is later discussed [273]. Moreover, the RNN performance has been recently evaluated in MIMO-OFDM systems [48].

Fig. 15 LSTM unit cell detail

The channel estimation field has also investigated a derivation of the RNN called long short-term memory (LSTM). The LSTM is designed to yield good performance in long sequence approaches and solve the vanishing and exploding gradient issues in conventional RNNs [274, 275]. This network can capture long-term dependencies, enabling learning from long past sequences. Figure 15 shows an LSTM unit cell composed of a forget, output, and input gate responsible for the data flow regulation inside the cell. The forget gate decides what kind of information is thrown away or kept in the cell state based on observing the past state and the current data. Therefore, \(\sigma _f\) assumes values between 0 (throw away) and 1 (accept the information). The candidate cell allows storing certain information in the current cell state, scaling it by the \(\sigma _c\) value. According to the decided value, the input from the gate is added to the current state. Finally, the output gate controls what is computed as the output value, considering that the cell state is scaled into the range -1 to 1.
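
One step of an LSTM cell can be sketched as below. The sketch follows the commonly used gate equations, whose naming may differ slightly from the symbols in Fig. 15; the stacked weight layout and the sizes are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step.

    W: (4*n, n + d) stacked weights for the forget, input, candidate, and
    output paths (assumed layout); b: (4*n,) stacked biases.
    """
    n = h_prev.size
    z = W @ np.concatenate([h_prev, x_t]) + b
    f = sigmoid(z[:n])             # forget gate: 0 discards, 1 keeps the old state
    i = sigmoid(z[n:2 * n])        # input gate: how much of the candidate is stored
    g = np.tanh(z[2 * n:3 * n])    # candidate cell values
    o = sigmoid(z[3 * n:])         # output gate
    c = f * c_prev + i * g         # new cell state
    h = o * np.tanh(c)             # cell state squashed into (-1, 1), then gated
    return h, c

rng = np.random.default_rng(8)
n_hidden, n_in = 3, 2
W = 0.3 * rng.normal(size=(4 * n_hidden, n_hidden + n_in))
b = np.zeros(4 * n_hidden)
h_t, c_t = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(5, n_in)):   # process a short input sequence
    h_t, c_t = lstm_step(x_t, h_t, c_t, W, b)
```

Because the cell state is updated additively through the forget and input gates, information can be carried across many time steps without being repeatedly squashed.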

The LSTM network has been combined with conventional RNN, CNN, and MLP networks [274,275,276,277]. The inherent imaginary interference channel estimation problem in FBMC systems was approached by combining a bidirectional LSTM and an RNN [274]. The network has worked well under fast time-varying scenarios and outperformed a DNN algorithm. Meanwhile, the LSTM was joined with a CNN to support channel estimation in time-varying scenarios for OFDM systems [275]. The CBR-Net (CNN batch normalization RNN) provided lower BER than the conventional estimator and other DNN architectures. A similar hybrid solution, the CNN-LSTM algorithm, achieved lower BER than other NNs [276]. An MLP-LSTM network is found in [277], with the joint solution working well under high-mobility scenarios with a velocity of up to 150 km/h. Recently, bidirectional LSTM network architectures have been proposed and evaluated on MIMO-OFDM systems [278,279,280]. The evaluation has confirmed that they outperform conventional estimators. In addition, the researchers have claimed low complexity due to using a DNN architecture to combine massive LSTM units, adding a bidirectional arrangement.

Furthermore, a variant of the LSTM concept is named gated recurrent unit (GRU). It comprises a cell unit in which the forget and input gates are merged into an update gate that controls the amount of information to be retained or updated. This network type has been used to design a data-driven model for channel estimation in an OFDM system applied to a fog radio scenario [281]. The performance was compared with the orthogonal matching pursuit channel estimation strategy, showing promising results. The GRU network performance was also investigated under the FBMC system [282] to deal with the inherent imaginary interference channel estimation problem. Resembling the bidirectional LSTM architecture, a GRU network called BiGRU has been proposed for a MIMO FBMC-OQAM system [283]. The training process is based on an offline stage followed by an online prediction. The BER assessment uses different time-varying channel models to compare the BiGRU performance against the interference approximation channel estimation method, with an improvement in the FBMC system employing the former.

Fig. 16 ESN architecture for [284] and [286]

An RNN with random connections among the neurons of the hidden layer is defined as an echo state network (ESN), with a network architecture as shown in Fig. 16. This network is typically designed with a single hidden layer called a reservoir. It is an NN that avoids training through the back-propagation mechanism. The ESN has been recently investigated to leverage the channel estimation process in OFDM and MIMO-OFDM systems [284,285,286,287,288,289,290,291].
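
A minimal ESN sketch is given below: the reservoir weights are random and fixed, and only a linear readout is fitted by regularized least squares, so no back-propagation is needed. The toy task (one-step-ahead prediction of a sinusoid), reservoir size, spectral radius, washout length, and ridge factor are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
n_res = 50                                             # reservoir size (assumed)
W_in = rng.uniform(-0.5, 0.5, size=n_res)              # fixed random input weights
W_res = rng.uniform(-0.5, 0.5, size=(n_res, n_res))    # fixed random recurrent weights
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))  # set spectral radius to 0.9

def run_reservoir(u):
    """Drive the reservoir with the scalar sequence u and collect its states."""
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = np.tanh(W_in * u_t + W_res @ x)
        states.append(x)
    return np.array(states)

# Toy task (assumption): predict the next sample of a sinusoid.
u = np.sin(0.2 * np.arange(400))
S = run_reservoir(u[:-1])
target = u[1:]

washout = 50                                           # discard the initial transient
A, t = S[washout:], target[washout:]
ridge = 1e-6
W_out = np.linalg.solve(A.T @ A + ridge * np.eye(n_res), A.T @ t)  # trained readout

pred = A @ W_out
nmse = np.mean((pred - t) ** 2) / np.var(t)            # error on the fitting data
```

Only `W_out` is learned; `W_in` and `W_res` stay at their random initial values, which is what keeps the training cost low.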

The ESN was used for channel estimation purposes [284]. First, the real and imaginary parts of the OFDM symbol are separated and delivered to two ESNs. After that, the network outputs were combined. Then, the ESN was trained in a supervised manner and analyzed by comparing the desired results with the estimated ones, although the work lacks a performance analysis regarding system implementation. Moreover, an adaptive elastic ESN has been designed for channel estimation on IEEE 802.11ah systems employing the OFDM modulation [285]. The hybrid network architecture comprises an ESN and an adaptive elastic network. The latter has been added to handle ill-conditioned solutions of the LS and applied to obtain the frequency-domain CSI. The ill-conditioned solution arises from the collinearity problem in the input of the basic ESN model [285]. Therefore, the adaptive elastic network replaces the LS method to calculate the frequency-domain CSI. The results regarded the RMSE evaluation of adaptive elastic networks against auto-regression and support vector machine algorithms, highlighting the network's superior performance.

A three-layer estimator for the MIMO-OFDM system was designed considering a feature, enhancement, and output layer [286]. The feature layer comprised a pool of parallel ESNs connected with the enhancement layer by weights and biases. These layers extract feature information to feed the output layer, leveraging the channel estimation process. Besides, a supervised learning ESN has been proposed for nonlinear MIMO-OFDM systems for joint channel estimation and symbol detection, with BER results close but inferior to the LMMSE estimator [288, 289]. Thereafter, the symbol detection was based on a deep ESN, surpassing the LMMSE estimator performance and showing results close to a shallow ESN [290]. Meanwhile, an ESN was designed to detect symbols using comb and scattered patterns in a standard LTE system with MIMO. The network evaluation has demonstrated superior performance with fewer pilots [291].

Complexity-wise, the RNN leverages the training dataset to overcome the trade-offs of other NNs between accuracy and complexity. For example, RNNs have been shown to require 218 epochs to achieve an average precision of \(96\%\), while the MLP requires 326 epochs to achieve an average precision of \(94\%\) [267]. They also demand less computation due to low overhead using layers of simple matrix–vector multiplications and nonlinear activation functions [268]. However, DL-based RNN still has a challenging complexity, although its robustness allows it to estimate even fast time-varying channels [274, 275, 277]. As a solution to reduce RNN intricacy, reservoir computing (RC) has been used to generate random synaptic weights [284,285,286,287,288,289,290,291].

5.6 Deep neural network

Fig. 17 DNN architecture proposed in [292]

DNNs consist of multiple layers between the input and the output layers, as shown in Fig. 17 [23, 292, 293]. The multiple layers are hidden and can contain the same number of neurons or decrease towards the output layer. The layers are fully connected because each neuron is connected to all the neurons of the subsequent layer. The input value reaching a given neuron is the summation of the weighted outputs of the previous layer neurons and the bias value. A given neuron output is the value of a nonlinear activation function, such as the ReLU or the sigmoid functions. Hence, the output sequences of the DNN are a cascaded nonlinear transformation of its input sequences.
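
The cascaded nonlinear transformation can be sketched as a forward pass through fully connected layers. The layer sizes, the random weights, and the choice of ReLU in the hidden layers with a sigmoid at the output are assumptions for illustration and do not reproduce the network in [292].

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dnn_forward(x, layers):
    """Fully connected forward pass: ReLU in hidden layers, sigmoid at the output."""
    a = x
    for idx, (W, b) in enumerate(layers):
        z = a @ W + b                       # weighted sum of previous-layer outputs plus bias
        a = sigmoid(z) if idx == len(layers) - 1 else relu(z)
    return a

rng = np.random.default_rng(7)
sizes = [16, 32, 16, 8]                     # input, two hidden layers, output (assumed)
layers = [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

batch = rng.normal(size=(5, 16))            # five input vectors
out = dnn_forward(batch, layers)
```

Each row of `out` is obtained by applying the same sequence of affine maps and activations to the corresponding input row.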

The general DNN has been used for channel estimation for multicarrier systems [292, 294, 295]. For instance, a general DNN has been proposed to estimate CSI, allowing for joint channel estimation and symbol detection in an OFDM system with performance close to the MMSE estimator [292]. In [294], DNN is applied to the received signal to yield a less noisy signal and estimate the channel based on the generated signal. It has been shown that the proposed DNN channel estimator approaches MMSE estimation to within 1 dB. The authors in [295] have combined the conventional channel estimation technique for an OFDM receiver with a DNN to surpass MMSE estimation in terms of normalized MSE.

Researchers have proposed variations of the DNN for estimating the channel in multicarrier systems [293, 296,297,298]. A deep learning residual framework (ResNet) consisting of two short-connected layers and two fully connected hidden layers was used for channel estimation and equalization in FBMC/OQAM systems [293]. The ResNet uses a long real-valued sequence of a filtered frequency-domain complex sequence of the received signal as the training dataset. Accordingly, the channel estimation performance is better than the general DNN. Meanwhile, a DNN cascading with a zero-forcing preprocessor called Cascade-Net was proposed for detecting OFDM symbols, outperforming the zero-forcing method [296]. Model-driven DNN subnets, ComNet, replaced the usual OFDM channel estimation and symbol detection receiver blocks, surpassing general DNN by offering to refine inputs [297]. A variation of the ComNet receiver includes a compensating network called SwitchNet that outperforms the ComNet [298].

Fig. 18 CNN architecture

A DNN hidden layer in which each neuron is connected to only a small portion of the previous layer neurons is called a convolutional layer [299]. In addition, the convolutional layer neurons share the same parameters. General CNNs therefore significantly reduce the total number of training parameters. Their architecture comprises an input layer and a convolution layer, followed by a pooling stage and fully connected layers until the output layer is reached, as shown in Fig. 18 [227, 228]. The convolution layer gathers local patterns in the input data. Meanwhile, the pooling layers summarize this information, reducing the data dimensionality while retaining the relevant information. Finally, the classification stage is conducted by the fully connected layers.
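The local connectivity, parameter sharing, and pooling described above can be illustrated with a one-dimensional toy example. This is a sketch under assumed values: the function names, the input sequence, and the three-tap kernel are hypothetical and are not taken from any of the cited works.

```python
def conv1d(signal, kernel):
    # Each output neuron sees only len(kernel) neighbouring inputs,
    # and every output neuron reuses the same kernel weights.
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def max_pool1d(values, size):
    # Summarize each non-overlapping window by its maximum.
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

signal = [1.0, 3.0, 2.0, 5.0, 4.0, 8.0]  # hypothetical input features
kernel = [-1.0, 0.0, 1.0]                # 3 shared weights

features = conv1d(signal, kernel)        # 4 local pattern responses
pooled = max_pool1d(features, 2)         # reduced to 2 values
```

Here the convolutional layer uses 3 trainable weights, whereas a fully connected layer mapping the same 6 inputs to 4 outputs would need 24.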

A CNN has been exploited to recover information from OFDM signals without relying on explicit DFT or IDFT computations and performed better than channel estimators based on linear MMSE [300]. In [299], the authors added a CNN between preprocessing modules to develop a CNN-based detector that adapts to large systems or wide bands. The authors in [301] have joined CNN and image super-resolution to create a channel estimation method that, after offline training, outperforms the MMSE estimator and can potentially save spectrum.

Joining CNN and DNN can boost channel estimation. The authors of [302] have proposed intelligent signal detection comprising DNN and CNN for OFDM with index modulation. The signal detector uses pilots to achieve semi-blind channel estimation and reconstructs the transmitted symbols based on CSI. In [303], a hybrid NN-based fading channel predictor has been designed by connecting CNN and DNN layers. The hybrid channel predictor adds robustness to systems operating over frequency-selective channels such as MIMO-OFDM. The authors in [273] have developed a channel estimation method for high-speed scenarios using a combination of CNN and RNN. The new network, ChanEstNet, extracts the channel response feature vectors for channel estimation, exhibiting low computational complexity compared to traditional channel estimation methods.

Regarding the complexity issue, DNNs depend on extensive training datasets and apply matrix multiplication between sequential layers. For example, the adaptive DNN complexity investigated in [295] is equivalent to the accurate LMMSE channel estimation scheme, but its performance is much better. To reduce DNN complexity, the authors in [294] have combined the deep image prior (DIP) model, diminishing the training overhead and only needing pilot symbols during channel estimation. Also, a sliding structure based on the signal-to-interference power has been designed for computational complexity reduction compared to a single deep detection network [296]. Furthermore, by splitting the receiver into different subnets, DNNs demand less memory and computation than LMMSE-MMSE methods [297,298,299]. Instead of reducing the DNN-aided detector complexity, some researchers have traded it for better capabilities. For instance, the complexity has been swapped for the ability to replace DFT with a linear transformation [300]. Finally, merging LSTM and CNN creates a hybrid network that was shown to be able to predict channel characteristics [273].

5.7 Autoencoder-aided end-to-end systems

Fig. 19 Autoencoder architecture

Autoencoders apply unsupervised learning to replace an end-to-end communication system. Hence, from the block-structure communication system point of view, autoencoders substitute the whole structure composed of the serial-to-parallel converter, lookup table, modulator, detector, symbol estimation, parallel-to-serial converter, and so forth. Autoencoders take advantage of the input data statistics to communicate them through the channel so that the fewest possible data is sent. Still, it allows the receiver to understand the input data completely [304]. Autoencoders reconstruct the input data through a series of latent representations, typically using an MMSE objective and a stochastic gradient descent (SGD) solver to find the network weights, achieving a practical regression [305]. Figure 19 depicts a general autoencoder architecture, which is taken as the basis for autoencoder systems implementation in the following discussion.
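The reconstruction objective and the SGD solver mentioned above can be sketched with a tiny linear autoencoder. This is an illustrative toy, not the system of [305]: it assumes a squared-error loss, a 2-D input compressed to a 1-D latent value, no channel between encoder and decoder, and hypothetical names, learning rate, and data.

```python
def train_autoencoder(data, lr=0.1, epochs=200):
    enc = [0.5, 0.5]  # encoder weights: 2-D input -> 1-D latent value
    dec = [0.5, 0.5]  # decoder weights: 1-D latent value -> 2-D output

    def mean_squared_error():
        total = 0.0
        for x in data:
            z = enc[0] * x[0] + enc[1] * x[1]
            total += sum((dec[i] * z - x[i]) ** 2 for i in range(2))
        return total / len(data)

    initial_loss = mean_squared_error()
    for _ in range(epochs):
        for x in data:  # one SGD step per sample
            z = enc[0] * x[0] + enc[1] * x[1]
            err = [dec[i] * z - x[i] for i in range(2)]
            # Gradient of 0.5 * ||err||^2 w.r.t. the latent value,
            # computed before the decoder weights are changed.
            grad_z = dec[0] * err[0] + dec[1] * err[1]
            for i in range(2):
                dec[i] -= lr * err[i] * z
                enc[i] -= lr * grad_z * x[i]
    return enc, dec, initial_loss, mean_squared_error()

# Hypothetical data lying on a line, so one latent value is enough.
data = [[t * 0.6, t * 0.8] for t in (-1.0, -0.5, 0.5, 1.0)]
enc, dec, initial_loss, final_loss = train_autoencoder(data)
```

Because the toy data are exactly one-dimensional, the reconstruction error can be driven close to zero. In an end-to-end communication system, the latent representation would additionally pass through a channel model during training.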

DNN and CNN are used to construct autoencoders. On the transmitter side, they learn the mapping from bits to waveforms. On the receiver side, they learn the synchronization, parameter estimation, and demapping from waveforms to bits. Some channel impairments are considered to train the autoencoder: noise, time and rate of the signal arrival, carrier frequency, phase offset, and the received signal delay spread [305]. Although it may seem that an extensive dataset is required for training autoencoders, they usually need only a tiny portion of the code space, with a ratio as small as \(2.9387359 \times 10^{-34}\). Thus, autoencoders help save resources [306]. The trained autoencoder results in transmit and receive signals that resemble those of MCM communication systems.

The end-to-end autoencoder-based communication system can compete with mature systems such as OFDM, FBMC, GFDM, and UFMC without any prior mathematical modeling or analysis [307, 308]. In [307], the DNN and CNN-based autoencoder of [305] has been enhanced to deal with synchronization and ISI. For synchronization, an additional NN is responsible for separating the infinite sequence of received samples into different probable block groups and estimating the probability of each group. For ISI, the autoencoder assumes during the training phase that the received messages are affected by ISI, so that it learns to resolve this impairment. The enhanced autoencoder has been tested against real channels and demonstrated a performance 2 dB worse than that of the MMSE method. In [308], the proposed DNN-based autoencoder exhibited fast convergence when operating over an aggressive Rayleigh fading channel. The autoencoder transmitter and receiver parts were trained alternately until the loss stopped decreasing. The authors claimed that the autoencoder could be applied to any channel without analysis.

Instead of competing with well-established MCM systems, autoencoders can be combined with them, bringing more reliability [309, 310]. DNN-based autoencoders have been proposed to mitigate synchronization errors and simplify equalization over multipath channels [309]. The proposed model has also shown flexibility regarding imprecise knowledge about the channel and reduced complexity compared to conventional OFDM systems. The authors in [310] have combined autoencoders with an OFDM system under single-bit quantization. The OFDM data detection loss under that constraint was reduced using an unsupervised autoencoder, competing with unquantized OFDM at SNR values smaller than 6 dB.

Autoencoders have also been compared with MIMO systems [311, 312]. The authors in [311] have obtained an autoencoder that outperforms the Alamouti space-time block code (STBC) [313] operating over the Rayleigh fading channel for SNR values greater than 15 dB. The scenarios considered are perfectly known CSI, quantized CSI, and no CSI. The optimum autoencoder was achieved using NN-based regression, considering channel estimation on both the transmitter and receiver sides. In [312], the authors combined autoencoders and ELM and proposed a novel detection scheme for MIMO-OFDM. In this approach, the autoencoders refine the input data before transmitting it, and ELM is employed to classify the received signal based on regular features. The BER performance of the novel MIMO-OFDM detector is similar to that of maximum-likelihood detection (MLD).

Autoencoders have also been applied to mMIMO, the extension of MIMO. The proposed network in [314] employs CNN to learn the channel structure effectively from training samples to recover CSI even in low compression regions. This autoencoder is mainly investigated for multicarrier systems where the BS receives the CSI from the users. The autoencoder can transform the channel matrix into a lower-dimensional vector and vice versa. Even though the new sensing and recovery mechanism beats existing compressive sensing-based methods, the authors claimed it could be enhanced by applying advanced DL strategies.

In terms of complexity, autoencoders require a large dataset for training and to reach the optimum solution, thus resulting in a trade-off between performance and computation. Some works have addressed power demand reduction as the attractiveness of their proposed method. For example, tensor-based processing can reduce power requirements by lowering clock rates, increasing algorithm concurrency, and adapting, as pointed out in [305]. The PAPR could also be reduced using a network based on an autoencoder architecture of DL [306, 309]. Other works implement different training strategies to reduce the intrinsic trade-off between the performance and computation of autoencoders [308, 310]. For example, in [307], the authors have used a two-phase training: the architecture is trained with simulated channels in the first phase, and the receiver is fine-tuned over realistic channels in the second phase.

5.8 Other neural networks

Generative adversarial network (GAN) [315,316,317,318], general regression neural network (GRNN) [319, 320], and fuzzy neural network (FNN) [321, 322] have also been investigated in the channel estimation subject. Likewise, the least mean error [323], meta-learning [324], k-means clustering [325], and LS [326] techniques were applied to leverage NN training. Regarding these training techniques, the survey has shown that ML might also be an interesting approach to overcoming the voluminous training dataset problems in DNN.

5.8.1 Generative adversarial network

Fig. 20 GAN working principles

A GAN comprises two networks: a generative network and an adversarial network. These networks operate competitively, as shown in Fig. 20. The generative network aims to retrieve the original information through training. On the other hand, the adversarial network discriminates the incoming samples of the first network, labeled as fake, by comparing them with real data. In other words, the adversarial network must learn to distinguish false patterns from true ones, and the generative network must learn to deceive it. In this way, the generative network is trained to fool the adversarial network by passing off its samples as true [316].
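The competition between the two networks is commonly expressed through a pair of loss functions. The sketch below assumes the widely used binary cross-entropy formulation with the non-saturating generator loss; the cited works may use different objectives, and the function names and probability values are hypothetical.

```python
import math

def discriminator_loss(d_real, d_fake):
    # The adversarial network is rewarded for scoring real samples
    # close to 1 and generated (fake) samples close to 0.
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)

def generator_loss(d_fake):
    # The generative network is rewarded when the adversarial
    # network scores its samples as real (close to 1).
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs (probability of "real").
d_real = [0.9, 0.8]
d_fake = [0.2, 0.3]
loss_d = discriminator_loss(d_real, d_fake)
loss_g = generator_loss(d_fake)
```

When the adversarial network can no longer tell the two sources apart, its outputs approach 0.5 for every sample and its loss approaches \(2 \ln 2\).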

This concept was applied to reshape the ResEsNet [315, 327] by considering the channel response with known pilot positions as a low-resolution image. Thus, the GAN was applied to estimate the CSI in a super-resolution approach. First, the generator comprises convolution layers and residual blocks with pre-residual activation units. Then, batch normalization is applied at the beginning and the end to map/remap the data to the scale of the model. Finally, the fake samples feed the discriminator, which is also formed by convolution, batch normalization, and Leaky ReLU layers [315]. The super-resolution GAN has outperformed the ResEsNet estimation while presenting better performance than the LMMSE estimator. Furthermore, a GAN-based channel estimation approach was proposed for high-speed mobile scenarios [317]. The goal of the method was to reduce the complexity of the channel estimation process by training a discriminator to learn and extract time-varying channel features. Afterward, the generator acts upon the samples to generate and restore the channel information.

The GAN approach has also been modeled to reduce the number of pilots in MIMO-OFDM and OFDM systems [316, 318]. The first network proposal exploited the generative network to learn how to produce channel samples based on training on real data [316]. After that, the trained model was used to obtain current channel samples according to the received signal. The results have been compared with a supervised learning ResNet model, exhibiting better performance. However, it could not outperform the LMMSE estimator. Meanwhile, the GAN has been used to map a low-dimensional channel space into a high-dimensional one, reducing the number of pilots in an OFDM system [318]. As a result, the designed network could track the CIR of different channels after training, outperforming the LMMSE and ChannelNet estimators.

5.8.2 General regression neural network

Fig. 21 GRNN architecture

The GRNN has been proposed as an enhanced version of the RBFN founded on nonparametric regression [319, 320, 328]. The network falls into the probabilistic NN category. The GRNN architecture comprises four layers known as the input, pattern, summation, and output layers, as shown in Fig. 21. The input and output layers are classical structures of NN architectures. The pattern layer is the single learning layer of the network, and it is fully connected with the neurons of the input layer [328]. The pattern output is fully connected to the s-summation and the d-summation neurons of the summation layer. The s-summation neuron computes the weighted sum of the pattern layer outputs, and the d-summation neuron computes their unweighted sum. Thereafter, the output layer divides the s-summation result by the d-summation result.
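The four-layer computation described above can be sketched for a scalar input as follows. This is a minimal illustration that assumes a Gaussian radial basis function in the pattern layer; the function name, the spread `sigma`, and the training pairs are hypothetical and are not taken from [319, 320].

```python
import math

def grnn_predict(x, train_x, train_y, sigma):
    # Pattern layer: one Gaussian kernel per stored training input.
    pattern = [math.exp(-((x - xi) ** 2) / (2.0 * sigma ** 2))
               for xi in train_x]
    # Summation layer: weighted (s) and unweighted (d) sums.
    s_sum = sum(k * yi for k, yi in zip(pattern, train_y))
    d_sum = sum(pattern)
    # Output layer: ratio of the two sums.
    return s_sum / d_sum

# Hypothetical samples, e.g. a channel gain known at a few positions.
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [0.2, 0.9, 0.4, -0.3]
smoothed = grnn_predict(1.5, train_x, train_y, sigma=0.5)
```

No iterative training is involved: the training pairs are simply stored, and `sigma` controls how strongly the estimate is smoothed. A very small `sigma` reproduces the nearest training value, while a very large one tends to the mean of all training outputs.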

This neural network approach has been applied to channel estimation using partial CSI obtained from data-aided decision feedback channel estimation, showing more accurate interpolation results [319, 320, 329]. The network structure has four layers: the input, pattern, summation, and output layers. The pattern layer includes the radius of the radial basis function, which controls the smoothness of the regression results. The summation layer computes two sums of the pattern layer outputs, one weighted by the desired results and one unweighted, which are then combined in the output layer. This network was first applied to time-domain smoothing [319] and later extended to a frequency-domain strategy [320]. The latter has outperformed the former and conventional pilot-aided channel estimation.

5.8.3 Fuzzy neural network

Fuzzy logic was applied through a fuzzy controller that periodically adjusts the step size of an LMS algorithm for OFDM systems [321]. The results showed faster convergence and robust tracking of channel variations when compared with the LMS under different channel conditions. Furthermore, a functional link FNN estimator was developed [322]. The network comprises a functional link NN integrated with fuzzy rules, where each rule is a sub-functional link NN with a function expansion of the input variables. The network performance was close to that of the MMSE estimator.

5.8.4 Reduction training techniques for neural networks

Regarding the training approach, the least mean error algorithm was applied to an NN with two sub-networks to identify amplitude gain and phase variation [323]. Moreover, the LS algorithm was integrated into a black box NN [326]. The process uses the LS to estimate the channel at the pilot subcarriers and then feeds this estimate to the network to predict the channel response at the data subcarriers. This approach might be seen as an ML interpolation strategy with results similar to the MMSE and some of the other NNs discussed. A similar channel estimation method, found in [330], uses a multiple variable regression approach to design an ML algorithm that does not require any initial information or statistics about the channel. It uses the SGD algorithm for parameter optimization. This proposal has been compared with the LS and MMSE estimators, outperforming the conventional estimators while providing performance similar to perfect estimation.
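The two-step structure of the LS-aided approach, an LS estimate at the pilot subcarriers followed by prediction at the data subcarriers, can be sketched as below. Note the substitution: plain linear interpolation stands in for the trained network of [326], purely to keep the example self-contained. The function names, pilot positions, and noise-free toy channel are all assumptions.

```python
def ls_estimate(rx, tx, pilot_idx):
    # LS estimate per pilot subcarrier: H[k] = Y[k] / X[k].
    return {k: rx[k] / tx[k] for k in pilot_idx}

def interpolate_linear(h_pilots, n_subcarriers):
    # Stand-in for the learned predictor: linear interpolation between
    # neighbouring pilots. Assumes pilots on the first and last subcarrier.
    idx = sorted(h_pilots)
    h = [0j] * n_subcarriers
    for left, right in zip(idx[:-1], idx[1:]):
        for k in range(left, right + 1):
            w = (k - left) / (right - left)
            h[k] = (1.0 - w) * h_pilots[left] + w * h_pilots[right]
    return h

n_sub = 9
pilot_idx = [0, 4, 8]                                   # hypothetical pilot grid
h_true = [complex(1.0, 0.1 * k) for k in range(n_sub)]  # toy channel
tx = [1 + 0j] * n_sub                                   # known transmitted symbols
rx = [h * x for h, x in zip(h_true, tx)]                # noise-free reception

h_hat = interpolate_linear(ls_estimate(rx, tx, pilot_idx), n_sub)
max_error = max(abs(a - b) for a, b in zip(h_hat, h_true))
```

The toy channel varies linearly across subcarriers, so linear interpolation recovers it almost exactly. A learned predictor is attractive precisely when the true variation between pilots is not this simple.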

The K-means clustering algorithm was proposed to support a semi-blind channel estimator for cell-free mMIMO [325]. The algorithm clusters the received signal to optimize the channel estimation process. Meanwhile, meta-learning has been exploited in a two-stage method named robust channel estimation with meta-learning neural networks (RoemNet) for OFDM symbols [324]. The proposed network can learn general characteristics from multiple channels, gathering meta-knowledge for training purposes. Furthermore, this approach allows the RoemNet to be applied to different unknown channels, with fast refinement of its weights using a few pilot symbols through the meta-update process. The RoemNet has proved able to learn and better estimate the channel with a few pilots, outperforming the MMSE estimator. However, as the number of pilots increases, the two estimators yield similar results. Also, it was shown that the RoemNet trained with sequences 8 pilots long yields a lower BER than the LS estimator with sequences 128 pilots long.

5.8.5 Complexity discussion

Regarding the complexity, GANs can reduce it during training while improving the performance compared with residual NNs [315, 316]. Additionally, the GAN-based estimation proposed in [316] does not require retraining, even if the number of clusters and rays changes considerably, and lowers the number of necessary pilot tones. Complexity-wise, the network approaches in [316, 318] have the lowest complexity compared to the LS and LMMSE. Meanwhile, the network algorithm complexity of [318] was compared with that of the MMSE estimator, resulting in a linear and a cubic relationship with the number of pilots, respectively. FNN could not reduce the complexity of well-known estimators while improving performance [322]. In [321], the FNN used showed a steeper learning curve than the LMS but slightly increased the computational load. As an example of its computational complexity, GRNN demanded only 0.0534 ms of processing time for channel estimation at an SNR of 30 dB to achieve a BER of \(1.2 \times 10^{-4}\) [319]. However, it kept the trade-off between performance and complexity, requiring 0.4206 ms to reach a BER of \(1.8 \times 10^{-5}\). GRNN could reduce this trade-off for other NN-based estimation methods. For example, GRNN replaces ANN in [320] to eliminate the iterative training process and diminish the computational complexity as the BER decreases. Other techniques, such as least mean error, meta-learning, k-means clustering, and LS, focus on reducing the training overhead to demand less computation.

6 Reinforcement learning

Fig. 22 Reinforcement learning working problem

Reinforcement learning is a training approach, as mentioned in Sect. 3, that defines an emerging branch of ML. The algorithms under this classification learn according to the reward maximization hypothesis [331]. An agent executes actions in an environment whose states change over time [332], as shown in Fig. 22. The state change caused by the actions taken results in a reward or a penalty for the agent. The algorithm must establish a strategy, also known as a policy, that defines the actions needed to achieve a specific goal and maximize the expected cumulative reward. The environment is commonly modeled as a Markov decision process (MDP) that describes the agent's sequence of actions, the present reward, and the future state and reward [331, 333]. The Q-learning approach learns the quality (Q-value) of each action in a given state, from which the policy is derived [332, 333]. Unlike other ML algorithms, reinforcement learning succeeds without an explicit training process, learning through a mix of exploration and exploitation of the environment in a trial-and-error manner.
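The interplay of states, actions, rewards, and the exploration/exploitation mix can be sketched with tabular Q-learning on a toy MDP. Everything specific here is an assumption made for illustration: the four-state chain environment, the reward of 1 at the goal, the epsilon-greedy rule, and the hyperparameters. None of it comes from [332] or [333].

```python
import random

N_STATES = 4      # states 0..3; state 3 is the terminal goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right

def step(state, action):
    # Deterministic toy environment: reward 1 only when the goal is reached.
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def q_learning(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:   # exploration
                action = rng.choice(ACTIONS)
            else:                        # exploitation
                action = max(ACTIONS, key=lambda a: q[state][a])
            nxt, reward, done = step(state, action)
            # Q-learning update towards reward + discounted best next value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q

q_table = q_learning()
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a])
                 for s in range(N_STATES - 1)]
```

No labeled dataset is involved: the table is filled purely from the rewards observed while interacting with the environment, and the learned greedy policy moves right towards the goal from every non-terminal state.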

Within multicarrier systems, RL-based channel estimation has been investigated for OFDM schemes. A model-free Q-learning technique is applied to select the best CIR predictor [332]. First, the CIR prediction is constructed using an adaptive RLS estimator without pilot signals. The RLS estimator predicts one or more future CIR block coefficients using previously estimated ones. Then, the agent interacts with the algorithm (environment) to enable dynamic reinforcement learning in this context. The results have shown that the Q-learning-based estimator outperforms the conventional RLS. Besides, a denoising method for channel estimation in MIMO-OFDM systems has been modeled as an MDP based on channel curvature computation [333]. The channel curvature allows for identifying the unreliable estimates for the future MDP. The reward function is defined to reduce the MSE. Finally, Q-learning is used for the channel estimation process. This estimator has shown better results than the LS estimator but worse BER values than the MMSE estimator.

Deep reinforcement learning (DRL) algorithms arise from combining an MDP with a DNN that approximates the strategy (i.e., the policy) [334,335,336]. Compared with reinforcement learning, the weights of the DNN are used as extra input parameters, and the SGD optimizer is employed to update them. Although DRL might suffer from instability and divergence, its recent upgrades, deep Q-network [337] and AlphaGo [337], have been able to represent the environment even with high-dimensional sensory inputs, e.g., pixels of an image. Those two developments were based on games, the former achieving a level comparable to that of a professional human gamer across a set of 49 Atari 2600 games [338] and the latter defeating a human professional player in the full-sized game of Go for the first time.

A few works have addressed channel estimation for multicarrier systems [339, 340]. Double deep Q learning (DDQL) has been proposed for channel estimation in industrial wireless networks as an alternative to DNN and Q-learning approaches [339]. It aims to circumvent the long training data sequences required by DNNs while eliminating the overestimation of action values in Q-learning. Therefore, the DDQL has been exploited for channel estimation to adapt to the Rician channel model of the dynamic industrial wireless network. The DDQL network comprised five hidden layers of fully connected neurons with tanh activation functions and a linear activation layer at the output. Finally, the DDQL performance has been compared against some MMSE-based estimators. The authors’ proposal estimates the channel better than the other estimators, except for the ideal MMSE [339].
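The way double Q-learning counters the overestimation of action values can be sketched through its update target. This follows the commonly used double deep Q-learning rule, in which one value estimate selects the next action and a second one evaluates it. It is not claimed to be the exact formulation of [339], and the function names and numbers are hypothetical.

```python
def q_learning_target(reward, next_q, gamma, done):
    # Standard target: the same estimate both selects and evaluates the
    # best next action, which tends to overestimate noisy values.
    return reward if done else reward + gamma * max(next_q)

def double_q_target(reward, next_q_online, next_q_target, gamma, done):
    # Double Q-learning target: the online estimate selects the action,
    # the separate target estimate evaluates it.
    if done:
        return reward
    best = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best]

# Hypothetical next-state action values from the two estimators.
next_q_online = [1.0, 3.0]
next_q_target = [2.0, 0.5]

single = q_learning_target(1.0, next_q_target, 0.9, False)
double = double_q_target(1.0, next_q_online, next_q_target, 0.9, False)
```

In this toy case the standard target uses the largest target-network value, while the double target evaluates the action preferred by the online estimate and is therefore smaller.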

The pilot contamination problem in mMIMO systems was addressed using DRL to build a pilot assignment strategy that adapts to the channel variations and keeps the pilot contamination effect modest [340]. First, the system was modeled as an OFDM system; consequently, each sub-channel was treated as a flat-fading model. Next, the reward was modeled as a cost function based on the user’s angle of arrival information. The channel characteristics and the maximum cost function allowed the states, actions, and rewards to be defined. Thereafter, the agent learned the pilot assignment policy, adapting it to the channel variations to minimize the cost function. Finally, the DRL was implemented with a six-hidden-layer deep residual network (ResNet) as a Q-neural network (QNN). The proposed DRL results have demonstrated lower system overhead than other approaches, such as soft pilot reuse (SPR) [340].

Concerning complexity, the training dataset does not need to be labeled, making reinforcement learning practical and adaptive to time-varying channels. However, these advantages come with other issues that may increase the complexity when combining reinforcement learning and channel estimation, as shown in [332]. For example, dominant CIR tap index identification adds to the overall computational load. Nevertheless, other solutions succeeded in reducing the complexity. For instance, the strategy proposed in [333] operates in the frequency domain instead of the time domain, which reduces the need to perform DFTs. Meanwhile, DRL has been raised as an approach to reduce the complexity of DNN in channel estimation, comprising a field of opportunities in the context of multicarrier systems and their extension to MIMO schemes.

7 Discussions and research directions

This section discusses how AI-aided channel estimation strategies have been proposed for multicarrier systems, highlighting some lessons learned. Afterward, future research directions based on recent findings are pointed out. Classical ML, NNs, and RL are examined in terms of how the surveyed works have modeled them to support channel estimation in multicarrier system scenarios, and how those works have striven to improve on standalone conventional channel estimation techniques, such as blind, data-aided, decision-directed, and semi-blind estimators.

Regarding classical ML, regression algorithms have been joined with conventional estimators, such as pilot-assisted iterative channel estimation, an LS estimator, and a normalized MSE estimator. Meanwhile, the estimator block-type structures were preserved based on the AMBCE approach with supervised learning. The regression algorithms mainly enhanced the interpolation process in data-aided methods. Therefore, the channel was first estimated through data-aided schemes and then delivered to the trained algorithms to estimate the channel at the data subcarriers. Under this assumption, OFDM and MIMO-OFDM were the system models to which the regression-based estimators were applied. Some target channels were fast time-varying, highly selective, and doubly selective fading environments.

Recently, research on estimators based on regression algorithms has been scarce, which is understandable given the growth of NN and RL solutions. However, some research has accomplished promising results regarding the SVR for OFDM and MIMO-OFDM systems [172, 173]. Therefore, applying these estimators to OFDM variations or other multicarrier systems, and extending them to MIMO schemes, is a research direction. In addition, joining regression algorithms with blind and semi-blind estimators is an open and barely exploited field, which may lead to methods that do not require information about the channel statistics [190].

Evolutionary algorithms, namely the GA, RWBS, and PSO, are mainly exploited for channel estimation in OFDM and MIMO-OFDM systems. The GA has provided means to design estimators based on AMBCE system implementations. Some approaches replace the interpolation process to aid a pilot-aided channel estimation scheme, while others build a blind channel estimator based on the GA. Beyond that, the GA has also been used to combine the LS and MMSE estimators. At the same time, the RWBS-based algorithms were devoted to combining the channel estimation block with multiuser and data detection functions, comprising an ABCEx approach. Likewise, PSO allowed for joint channel estimation and decoding in MIMO-OFDM systems while also being used to enhance iterative estimator performance. These evolutionary algorithms may also be extended to other multicarrier systems and their variants to unveil their potential and validate them against other classical ML techniques, NNs, and RL. Therein, computational complexity and processing time may be assessed along with an MMSE performance comparison.

Bayesian learning has recently been considered for channel estimation in OFDM and MIMO-OFDM systems addressing sparse channels [222,223,224,225,226]. However, the approaches mostly follow a model-based design, with some works including a joint model- and data-driven strategy. On the other hand, Bayesian learning has also been combined with PSO to jointly perform optimal pilot design and channel estimation [224]. Likewise, the Bayesian learning-based channel estimator performance has only been compared against conventional estimators, outperforming them at the cost of higher computational complexity. Hence, comparing the Bayesian learning performance with those of NN and RL algorithms under the same channel assumptions is necessary to validate its computational complexity since some works claimed their proposal was close to the MMSE estimator [226].

Taking NNs into account, they have been mainly employed using AMBCE and ABCE approaches. In other words, they have been used to aid the channel estimation block or to replace it. Furthermore, they have proved able to assist with semi-blind and data-aided channel estimation techniques, even in scenarios involving only a few pilots [244]. Also, the inputs used for training and estimation can be complex- or real-valued symbols. Overall, these capabilities rendered NN adaptive to fast-fading, high mobility, and vehicular-to-vehicular communication cases.

This survey has covered several different NN models used for channel estimation in multicarrier systems. They were distinguished by their hidden layer structures and training methods. Extensive work was found using the following NN models: BPNN, FFNN, ELM, RNN, DNN, autoencoders, GAN, GRNN, and FNN. In general, they all exhibited more complexity than classical learning AI algorithms, along with a trade-off between complexity and performance. Hence, strategies to overcome this impairment are welcome and represent a research direction. Specifically, some NNs lack complexity analysis, such as the joint FFNN and GA and the ELM for mMIMO-OFDM. Also, a complexity comparison between RNN and ELM would enrich this topic. These issues are all considered an open field to be investigated. In addition, NNs have been implemented to estimate the channel in the following multicarrier systems: OFDM, FBMC, and MIMO-OFDM. However, only ELM and autoencoders have been used with other communication systems, such as vehicular-to-vehicular, OFDM-based radio-over-fiber, mMIMO, GFDM, UFMC, and MIMO. This last observation signifies another open area for investigation, which would help consolidate NN applicability for channel estimation in multicarrier systems.

Regarding RL, this survey found it used only for channel estimation in OFDM and MIMO-OFDM systems. A few approaches have been considered using an AMBCE or an ABCE design. RL estimators relied on model-free Q-learning by exploiting a highly mobile and dynamic propagation environment. In the meantime, DRL was proposed based on DDQL, aiming at avoiding the long training data sequences of DNNs and the overestimation of action values in Q-learning. In addition, a deep residual network (ResNet) was also used as a QNN to accomplish channel estimation in the mMIMO-OFDM system.

Although RL channel estimation has handled OFDM and MIMO systems, this survey concludes that the application of RL to channel estimation in multicarrier systems remains an open research field, barely exploited. Hence, there are opportunities to address its variations along with other multicarrier systems, including a performance comparison with the MMSE and other NN-based estimators. Note that recent surveys devoted to investigating the use of RL with MIMO systems have also confirmed the lack of work in the channel estimation field [341].

Besides estimating the channel, iterative algorithms and human brain-inspired networks are capable of predicting and equalizing it in multicarrier systems. They all approximate the MMSE estimation. Indeed, most works compared the respective AI-aided channel estimation technique performance with the MMSE estimation [178, 231, 241, 294]. Although comparisons might consider other estimators, the MMSE estimator is the most popular due to its performance in minimizing the mean squared error. Some iterative methods also depend on probabilistic knowledge of the channel model to exhibit a suitable performance. Neither NN-based channel estimation techniques nor the RL approach relied on those probabilistic models. Therefore, those learning algorithms can be better candidates for channel estimation in complex and fluctuating environments.

Regarding RL, there is a wide-open field in applying this strategy to MCM and evaluating its performance and complexity relative to NNs. NN-based estimators are the best option when the multicarrier system operates in a hard-to-model channel or when the goal is to provide a less human-dependent channel estimation method. They are also more capable of imitating real-world data. On the other hand, RL techniques learn by interacting with the environment in a trial-and-error manner, eliminating the need for explicit training processes and labeled datasets.

Combining different ML algorithms can outperform strategies that use only one of them. The algorithms can be arranged so that one processes the incoming data and provides the input to another, or so that one controls the overall operation of the multicarrier system. Configuring the algorithms so that one feeds another can reduce the processing time required by a single algorithm [303]. On the other hand, controlling the system means allocating power, pilots, and other resources [299]. The combination of different ML algorithms for channel estimation in multicarrier systems remains a wide-open field that could reveal new strategies and models to solve this problem.

A common trade-off among ML algorithms is that the estimation accuracy increases at the cost of expanding the training dataset, which increases the computational complexity. These learning strategies need a large number of training samples to approach the MMSE estimator’s performance. Note, however, that after training, the AI model can be less complex than the MMSE, which requires regular estimates of channel parameters (e.g., noise variance). Moreover, dimensionality reduction [174], Bayesian learning [223], ELM networks [257], k-means clustering [325], meta-learning [324], LS-based ML algorithms [326], and DNN [296] have been investigated as candidate solutions for reducing training sequences. Further research can consider reducing the computational work required by each AI-aided channel estimation technique. Furthermore, combining the ELM network with distinct dimensionality reduction techniques has not yet been investigated.
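The trade-off between training-set size and accuracy can be sketched with the simplest possible learned estimator: a single complex weight fitted by linear regression on labeled pairs of LS estimates and true channel coefficients. This is an illustrative assumption rather than a method from the surveyed works. The names are hypothetical, and the data come from the same simplified model of independent Rayleigh coefficients, unit-power pilots, and white Gaussian noise.

```python
import random

def make_pairs(n, channel_var, noise_var, rng):
    """Return (LS estimate, true channel) training pairs for unit-power pilots."""
    def cgauss(var):
        # Circularly symmetric complex Gaussian sample with total variance var.
        s = (var / 2) ** 0.5
        return complex(rng.gauss(0, s), rng.gauss(0, s))

    pairs = []
    for _ in range(n):
        h = cgauss(channel_var)
        pairs.append((h + cgauss(noise_var), h))
    return pairs

def fit_weight(pairs):
    """Least-squares fit of w minimizing sum |h - w * h_ls|^2 over the training pairs."""
    num = sum(h * z.conjugate() for z, h in pairs)
    den = sum(abs(z) ** 2 for z, _ in pairs)
    return num / den

def train(n, channel_var=1.0, noise_var=0.5, seed=1):
    """Fit the weight on n synthetic pairs; the variances are only used to generate data."""
    return fit_weight(make_pairs(n, channel_var, noise_var, random.Random(seed)))
```

Under these assumptions, the fitted weight fluctuates when few training pairs are available and approaches the LMMSE gain, channel_var / (channel_var + noise_var), as the training set grows. Once trained, the estimator is a single multiplication and needs no noise-variance estimate at inference time.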

Although conventional OFDM presents some drawbacks, it is still largely used as the primary multicarrier system to assess the performance of AI-aided channel estimation techniques. OFDM variations have not yet been thoroughly investigated with AI-aided channel estimation techniques. Therefore, applying the methods presented throughout this survey to the OFDM variations can lead to discoveries that result in more mature versions of the aforementioned AI-aided channel estimation strategies. The performance of the AI-aided channel estimation approaches employed with conventional OFDM can be compared with that obtained with the OFDM variations. The performance comparisons might include the MMSE estimation, but other metrics, such as computational complexity, processing time, and manufacturing cost, can also be analyzed.

Multicarrier systems employing variations of OFDM, FBMC, GFDM, and UFMC can also be used for testing more AI channel estimation methods. New performance results can be obtained, and even better multicarrier systems can be designed. Simpler models can arise by investigating the combination of OFDM variations and AI mechanisms created to solve the same drawbacks as the OFDM variations. Different AI models can be combined with FBMC to improve spectral efficiency or reduce its intrinsic high PAPR. AI can control the allocation of pilots or reduce the ICI sensitivity of the GFDM and UFMC multicarrier systems.

Finally, a broader range of channels can be explored in multicarrier systems with AI-aided channel estimation. Impulsive noise, flexible short-term fading, arbitrarily correlated short-term fading, shadowed fading, arbitrarily correlated shadowed fading, and cascaded fading channels can be used as different fluctuating environments. They can bring more robustness to the AI-aided channel estimation methods or help address limitations. For example, the required number of input parameters can reveal a drawback or affect the processing time when the multicarrier system operates in a more aggressive environment.

8 Conclusion

This paper extensively investigates AI applications for estimating the channel in MCM systems. Previous surveys on the same subject have been reviewed, but only a few have addressed AI usage in estimating the channel. In addition, most of them have been devoted to analyzing OFDM and mMIMO-OFDM systems. Therefore, the first contribution of the present survey was detailing AI techniques used for channel estimation in MCM systems. Generally, the following families of AI methods have been presented: classical learning, neural networks, and reinforcement learning. Specifically, the following AI models have been described: regression, evolutionary algorithm, dimensionality reduction, Bayesian learning, FFNN, ELM, RNN, DNN, CNN, RBFN, autoencoders, GAN, FNN, GRNN, and Q-learning. The second contribution was to present use-case examples of AI for channel estimation in MCM systems beyond OFDM, such as FBMC, GFDM, UFMC, STBC, MIMO-OFDM, FBMC-OQAM, and mMIMO-OFDM. A third contribution encompassed collecting conventional channel estimation techniques for MCM systems, such as non-blind, semi-blind, and blind techniques. Lastly, this survey points out open issues and highlights future research topics that can help evolve channel estimation not only for MCM communication systems but also for single-carrier communication systems. Given the large number of references herein, the main contribution of the paper is to serve as a basis for guiding researchers on the current state of development and on openings for new and improved AI-aided channel estimators for MCM communication systems.

Availability of data and materials

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.


  1. 5G NR: Architecture, Technology, Implementation, and Operation of 3GPP New Radio Standards

  2. O.E. Ijiga, O.O. Ogundile, A.D. Familua, D.J.J. Versfeld, Review of channel estimation for candidate waveforms of next generation networks. Electronics (2019).

  3. L. Jiang, H. Zhang, S. Cheng, H. Lv, P. Li, An overview of FIR filter design in future multicarrier communication systems. Electronics (2020).

  4. A. Racz, A. Temesvary, N. Reider, Handover Performance in 3GPP Long Term Evolution (LTE) Systems, in 2007 16th IST Mobile and Wireless Communications Summit (2007), pp. 1–5.

  5. N. Shaik, P.K. Malik, A comprehensive survey 5G wireless communication systems: open issues, research challenges, channel estimation, multi carrier modulation and 5G applications. Multimed. Tools Appl. 80, 28789–28827 (2021).

  6. S. Research, 6G: The Next Hyper-connected Experience for All, Technical report (2020)

  7. A. Sahin, R. Yang, E. Bala, M.C. Beluri, R.L. Olesen, Flexible DFT-S-OFDM: solutions and challenges. IEEE Commun. Mag. 54(11), 106–112 (2016).

  8. G. Berardinelli, K.I. Pedersen, T.B. Sorensen, P. Mogensen, Generalized DFT-spread-OFDM as 5G waveform. IEEE Commun. Mag. 54(11), 99–105 (2016).

  9. B. Farhang-Boroujeny, OFDM versus filter bank multicarrier. IEEE Signal Process. Mag. 28(3), 92–112 (2011).

  10. K. Choi, Alamouti coding for DFT spreading-based low PAPR FBMC. IEEE Trans. Wirel. Commun. 18(2), 926–941 (2019).

  11. B. Farhang-Boroujeny, Filter bank multicarrier modulation: a waveform candidate for 5G and beyond. IEEE Signal Process. Mag. 2014, 1–26 (2014).

  12. C.-L. Tai, T.-H. Wang, Y.-H. Huang, An overview of generalized frequency division multiplexing (GFDM). ArXiv abs/2008.08947 (2020)

  13. Z. Guo, Q. Liu, W. Zhang, S. Wang, Low complexity implementation of universal filtered multi-carrier transmitter. IEEE Access 8, 24799–24807 (2020).

  14. L. Zhang, A. Ijaz, P. Xiao, K. Wang, D. Qiao, M.A. Imran, Optimal filter length and zero padding length design for universal filtered multi-carrier (UFMC) system. IEEE Access 7, 21687–21701 (2019).

  15. Y.-Y. Wang, C.-A. Lai, On the CFO estimation of the OFDM: a frequency domain approach. J. Franklin Inst. 351(5), 2489–2503 (2014)

  16. V. Savaux, Y. Louet, LMMSE channel estimation in OFDM context: a review. IET Signal Proc. 11(2), 123–134 (2017).

  17. Y. Liu, Z. Tan, H. Hu, L.J. Cimini, G.Y. Li, Channel estimation for OFDM. IEEE Commun. Surv. Tutor. 16(4), 1891–1908 (2014).

  18. F.A. Dietrich, W. Utschick, Pilot-assisted channel estimation based on second-order statistics. IEEE Trans. Signal Process. 53(3), 1178–1193 (2005).

  19. M.K. Ozdemir, H. Arslan, Channel estimation for wireless OFDM systems. IEEE Commun. Surv. Tutor. 9(2), 18–48 (2007).

  20. O.O. Oyerinde, S.H. Mneney, Review of channel estimation for wireless communication systems. J. Theor. Appl. Inf. Technol. 29(4), 282–298 (2012)

  21. R. Shafin, L. Liu, V. Chandrasekhar, H. Chen, J. Reed, J.C. Zhang, Artificial intelligence-enabled cellular networks: a critical path to beyond-5G and 6G. IEEE Wirel. Commun. 27(2), 212–217 (2020).

  22. S. Zhang, J. Liu, T.K. Rodrigues, N. Kato, Deep learning techniques for advancing 6G communications in the physical layer. IEEE Wirel. Commun. (2021).

  23. H. Huang, S. Guo, G. Gui, Z. Yang, J. Zhang, H. Sari, F. Adachi, Deep learning for physical-layer 5G wireless techniques: opportunities, challenges and solutions. IEEE Wirel. Commun. 27(1), 214–222 (2020).

  24. Q. Hu, F. Gao, H. Zhang, S. Jin, G.Y. Li, Deep learning for channel estimation: interpretation, performance, and comparison. IEEE Trans. Wirel. Commun. 20(4), 2398–2412 (2021).

  25. V.P. Rekkas, S. Sotiroudis, P. Sarigiannidis, S. Wan, G.K. Karagiannidis, S.K. Goudos, Machine learning in beyond 5G/6G networks: state-of-the-art and future trends. Electronics 10(22), 2786 (2021)

  26. A.I. Salameh, M. El Tarhuni, From 5G to 6G: challenges, technologies, and applications. Future Internet 14(4), 117 (2022)

  27. M.Z. Chowdhury, M. Shahjalal, S. Ahmed, Y.M. Jang, 6G wireless communication systems: applications, requirements, technologies, challenges, and research directions. IEEE Open J. Commun. Soc. 1, 957–975 (2020)

  28. A. Dogra, R.K. Jha, S. Jain, A survey on beyond 5G network with the advent of 6G: architecture and emerging technologies. IEEE Access 9, 67512–67547 (2020)

  29. K. Hassan, M. Masarra, M. Zwingelstein, I. Dayoub, Channel estimation techniques for millimeter-wave communication systems: achievements and challenges. IEEE Open J. Commun. Soc. 1, 1336–1363 (2020).

  30. Z. Liu, L. Zhang, Z. Ding, Overcoming the channel estimation barrier in massive MIMO communication via deep learning. IEEE Wirel. Commun. 27(5), 104–111 (2020).

  31. Z. Qin, H. Ye, G.Y. Li, B.-H.F. Juang, Deep learning in physical layer communications. IEEE Wirel. Commun. 26(2), 93–99 (2019).

  32. H. Yang, X. Xie, M. Kadoch, Machine learning techniques and a case study for intelligent wireless networks. IEEE Netw. 34(3), 208–215 (2020).

  33. C. Jiang, H. Zhang, Y. Ren, Z. Han, K.-C. Chen, L. Hanzo, Machine learning paradigms for next-generation wireless networks. IEEE Wirel. Commun. 24(2), 98–105 (2017).

  34. B. Hassan, S. Baig, H.M. Asif, S. Mumtaz, S. Muhaidat, A survey of FDD-based channel estimation schemes with coordinated multipoint. IEEE Syst. J. (2021).

  35. P. Sure, C.M. Bhuma, A survey on OFDM channel estimation techniques based on denoising strategies. Int. J. Eng. Sci. Technol. 20(2), 629–636 (2017).

  36. A. Angelo Missiaggia Picorone, T. Rodrigues Oliveira, M. Vidal Ribeiro, PLC channel estimation based on pilots signal for OFDM modulation: a review. IEEE Lat. Am. Trans. 12(4), 580–589 (2014).

  37. T. Hwang, C. Yang, G. Wu, S. Li, G. Ye Li, OFDM and its wireless applications: a survey. IEEE Trans. Veh. Technol. 58(4), 1673–1694 (2009).

  38. S.G. Kang, Y.M. Ha, E.K. Joo, A comparative investigation on channel estimation algorithms for OFDM in mobile communications. IEEE Trans. Broadcast. 49(2), 142–149 (2003).

  39. Q. Mao, F. Hu, Q. Hao, Deep learning for intelligent wireless networks: a comprehensive survey. IEEE Commun. Surv. Tutor. 20(4), 2595–2621 (2018).

  40. M. Zamanipour, A survey on deep-learning based techniques for modeling and estimation of massive MIMO channels. ArXiv abs/1910.03390 (2020)

  41. C. Zhang, P. Patras, H. Haddadi, Deep learning in mobile and wireless networking: a survey. IEEE Commun. Surv. Tutor. 21(3), 2224–2287 (2019).

  42. L. Dai, R. Jiao, F. Adachi, H.V. Poor, L. Hanzo, Deep learning for wireless communications: an emerging interdisciplinary paradigm. IEEE Wirel. Commun. 27(4), 133–139 (2020).

  43. F. Tang, B. Mao, N. Kato, G. Gui, Comprehensive survey on machine learning in vehicular network: technology, applications and challenges. IEEE Commun. Surv. Tutor. 23(3), 2027–2057 (2021).

  44. Q.-V. Pham, N.T. Nguyen, T. Huynh-The, L. Le Bao, K. Lee, W.-J. Hwang, Intelligent radio signal processing: a survey. IEEE Access 9, 83818–83850 (2021).

  45. T. O’Shea, J. Hoydis, An introduction to deep learning for the physical layer. IEEE Trans. Cognit. Commun. Netw. 3(4), 563–575 (2017).

  46. D. Gunduz, P. de Kerret, N.D. Sidiropoulos, D. Gesbert, C.R. Murthy, M. van der Schaar, Machine learning in the air. IEEE J. Sel. Areas Commun. 37(10), 2184–2199 (2019).

  47. K. Mei, J. Liu, X. Zhang, N. Rajatheva, J. Wei, Performance analysis on machine learning-based channel estimation. IEEE Trans. Commun. 69(8), 5183–5193 (2021).

  48. W. Jiang, H.D. Schotten, Neural network-based fading channel prediction: a comprehensive overview. IEEE Access 7, 118112–118124 (2019).

  49. Y. Fan, D. Dan, Y. Li, Z. Wang, Z. Liu, Intelligent communication: application of deep learning at the physical layer of communication, in 2021 IEEE 4th Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC), vol. 4 (2021), pp. 1339–1345.

  50. H. He, S. Jin, C.-K. Wen, F. Gao, G.Y. Li, Z. Xu, Model-driven deep learning for physical layer communications. IEEE Wirel. Commun. 26(5), 77–83 (2019).

  51. T. Wang, C.-K. Wen, H. Wang, F. Gao, T. Jiang, S. Jin, Deep learning for wireless physical layer: opportunities and challenges. China Commun. 14(11), 92–111 (2017).

  52. L. Sakkas, E. Stergiou, G. Tsoumanis, C.T. Angelis, 5G UFMC scheme performance with different numerologies. Electronics 10(16), 1915 (2021)

  53. G.B. Giannakis, Filterbanks for blind channel identification and equalization. IEEE Signal Process. Lett. 4(6), 184–187 (1997).

  54. J. Liang, Z. Ding, Blind MIMO system identification based on cumulant subspace decomposition. IEEE Trans. Signal Process. 51(6), 1457–1468 (2003).

  55. L. Tong, G. Xu, T. Kailath, Blind identification and equalization based on second-order statistics: a time domain approach. IEEE Trans. Inf. Theory 40(2), 340–349 (1994).

  56. H.H. Zeng, L. Tong, Blind channel estimation using the second-order statistics: asymptotic performance and limitations. IEEE Trans. Signal Process. 45(8), 2060–2071 (1997).

  57. S. Chen, Y. Wu, S. McLaughlin, Genetic algorithm optimization for blind channel identification with higher order cumulant fitting. IEEE Trans. Evol. Comput. 1(4), 259–265 (1997).

  58. J.K. Tugnait, Identification and deconvolution of multichannel linear non-Gaussian processes using higher order statistics and inverse filter criteria. IEEE Trans. Signal Process. 45(3), 658–672 (1997).

  59. B. Muquet, M. de Courville, Blind and semi-blind channel identification methods using second order statistics for OFDM systems, in 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing Proceedings. ICASSP99 (Cat. No.99CH36258), vol. 5 (1999), pp. 2745–2748.

  60. H. Bolcskei, R.W. Heath, A.J. Paulraj, Blind channel identification and equalization in OFDM-based multiantenna systems. IEEE Trans. Signal Process. 50(1), 96–109 (2002).

  61. R.W. Heath, G.B. Giannakis, Exploiting input cyclostationarity for blind channel identification in OFDM systems. IEEE Trans. Signal Process. 47(3), 848–856 (1999).

  62. M. de Courville, P. Duhamel, P. Madec, J. Palicot, Blind equalization of OFDM systems based on the minimization of a quadratic criterion, in Proceedings of ICC/SUPERCOMM ’96 - International Conference on Communications, vol. 3 (1996), pp. 1318–1322.

  63. A. Petropulu, R. Zhang, R. Lin, Blind OFDM channel estimation through simple linear precoding. IEEE Trans. Wirel. Commun. 3(2), 647–655 (2004).

  64. S. Yatawatta, A.P. Petropulu, Blind channel estimation in MIMO OFDM systems with multiuser interference. IEEE Trans. Signal Process. 54(3), 1054–1068 (2006).

  65. F. Gao, A. Nallanathan, Blind channel estimation for MIMO OFDM systems via nonredundant linear precoding. IEEE Trans. Signal Process. 55(2), 784–789 (2007).

  66. J. Gao, X. Zhu, A.K. Nandi, Non-redundant precoding and PAPR reduction in MIMO OFDM systems with ICA based blind equalization. IEEE Trans. Wirel. Commun. 8(6), 3038–3049 (2009).

  67. E. Moulines, P. Duhamel, J.-F. Cardoso, S. Mayrargue, Subspace methods for the blind identification of multichannel FIR filters. IEEE Trans. Signal Process. 43(2), 516–525 (1995).

  68. J. Namgoong, T.F. Wong, J.S. Lehnert, Subspace multiuser detection for multicarrier DS-CDMA. IEEE Trans. Commun. 48(11), 1897–1908 (2000).

  69. F. Verde, Subspace-based blind multiuser detection for quasi-synchronous MC-CDMA systems. IEEE Signal Process. Lett. 11(7), 621–624 (2004).

  70. H. Cheng, S.C. Chan, Blind linear MMSE receivers for MC-CDMA systems. IEEE Trans. Circuits Syst. I Regul. Pap. 54(2), 367–376 (2007).

  71. S. Roy, C. Li, A subspace blind channel estimation method for OFDM systems without cyclic prefix. IEEE Trans. Wirel. Commun. 1(4), 572–579 (2002).

  72. S. Wang, J.H. Manton, Blind channel estimation for non-CP OFDM systems using multiple receive antennas. IEEE Signal Process. Lett. 16(4), 299–302 (2009).

  73. S. Wang, J.H. Manton, A cross-relation-based frequency-domain method for blind SIMO-OFDM channel estimation. IEEE Signal Process. Lett. 16(10), 865–868 (2009).

  74. B. Muquet, M. de Courville, P. Duhamel, Subspace-based blind and semi-blind channel estimation for OFDM systems. IEEE Trans. Signal Process. 50(7), 1699–1712 (2002).

  75. C. Li, S. Roy, Subspace-based blind channel estimation for OFDM by exploiting virtual carriers. IEEE Trans. Wirel. Commun. 2(1), 141–150 (2003).

  76. C. Shin, R.W. Heath, E.J. Powers, Blind channel estimation for MIMO-OFDM systems. IEEE Trans. Veh. Technol. 56(2), 670–685 (2007).

  77. F. Gao, Y. Zeng, A. Nallanathan, T.-S. Ng, Robust subspace blind channel estimation for cyclic prefixed MIMO OFDM systems: algorithm, identifiability and performance analysis. IEEE J. Sel. Areas Commun. 26(2), 378–388 (2008).

  78. C.-C. Tu, B. Champagne, Subspace-based blind channel estimation for MIMO-OFDM systems with reduced time averaging. IEEE Trans. Veh. Technol. 59(3), 1539–1544 (2010).

  79. J.-G. Kim, J.-H. Oh, J.-T. Lim, Subspace-based channel estimation for MIMO-OFDM systems with few received blocks. IEEE Signal Process. Lett. 19(7), 435–438 (2012).

  80. S. Zhou, G.B. Giannakis, Finite-alphabet based channel estimation for OFDM and related multicarrier systems. IEEE Trans. Commun. 49(8), 1402–1414 (2001).

  81. C.H. Aldana, E. de Carvalho, J.M. Cioffi, Channel estimation for multicarrier multiple input single output systems using the EM algorithm. IEEE Trans. Signal Process. 51(12), 3280–3292 (2003).

  82. I. Ghaleb, O.A. Alim, K. Seddik, A new finite alphabet based blind channel estimation for OFDM systems, in IEEE 5th Workshop on Signal Processing Advances in Wireless Communications, vol. 2004 (2004), pp. 102–105.

  83. Z. Hou, V.K. Dubey, Improved finite-alphabet based channel estimation for OFDM systems, in The Ninth International Conference on Communications Systems, 2004. ICCS 2004 (2004), pp. 155–159.

  84. Z. Chen, T. Zhang, Z. Gong, Finite-alphabet and decision-feedback based channel estimation for space-time coded OFDM systems, in Joint IST Workshop on Mobile Future, 2006 and the Symposium on Trends in Communications. SympoTIC ’06 (2006), pp. 64–67.

  85. R.K. Martin, J. Balakrishnan, W.A. Sethares, C.R. Johnson, A blind adaptive TEQ for multicarrier systems. IEEE Signal Process. Lett. 9(11), 341–343 (2002).

  86. J. Balakrishnan, R.K. Martin, C.R. Johnson, Blind, adaptive channel shortening by sum-squared auto-correlation minimization (SAM). IEEE Trans. Signal Process. 51(12), 3086–3093 (2003).

  87. G.A. Al-Rawi, T.Y. Al-Naffouri, A. Bahai, J. Cioffi, Exploiting error-control coding and cyclic-prefix in channel estimation for coded OFDM systems. IEEE Commun. Lett. 7(8), 388–390 (2003).

  88. M.C. Necker, G.L. Stuber, Totally blind channel estimation for OFDM on fast varying mobile radio channels. IEEE Trans. Wirel. Commun. 3(5), 1514–1525 (2004).

  89. T.-H. Chang, W.-K. Ma, C.-Y. Chi, Maximum-likelihood detection of orthogonal space-time block coded OFDM in unknown block fading channels. IEEE Trans. Signal Process. 56(4), 1637–1649 (2008).

  90. H. Li, Blind channel estimation for multicarrier systems with narrowband interference suppression. IEEE Commun. Lett. 7(7), 326–328 (2003).

  91. N. Sarmadi, S. Shahbazpanahi, A.B. Gershman, Blind channel estimation in orthogonally coded MIMO-OFDM systems: a semidefinite relaxation approach. IEEE Trans. Signal Process. 57(6), 2354–2364 (2009).

  92. X.G. Doukopoulos, G.V. Moustakides, Blind adaptive channel estimation in OFDM systems. IEEE Trans. Wirel. Commun. 5(7), 1716–1725 (2006).

  93. L. Deng, Y.M. Huang, Q. Chen, Y. He, X. Sui, Collaborative blind equalization for time-varying OFDM applications enabled by normalized least mean and recursive square methodologies. IEEE Access 8, 103073–103087 (2020).

  94. W. Li, D. Qu, T. Jiang, An efficient preamble design based on comb-type pilots for channel estimation in FBMC/OQAM systems. IEEE Access 6, 64698–64707 (2018).

  95. V.K. Singh, M.F. Flanagan, B. Cardiff, Generalized least squares based channel estimation for FBMC-OQAM. IEEE Access 7, 129411–129420 (2019).

  96. D. Ren, J. Li, G. Lu, J. Ge, Per-subcarrier RLS adaptive channel estimation combined with channel equalization for FBMC/OQAM systems. IEEE Wirel. Commun. Lett. 9(7), 1036–1040 (2020).

  97. C.-S. Yeh, Y. Lin, Channel estimation using pilot tones in OFDM systems. IEEE Trans. Broadcast. 45(4), 400–409 (1999).

  98. S. Coleri, M. Ergen, A. Puri, A. Bahai, Channel estimation techniques based on pilot arrangement in OFDM systems. IEEE Trans. Broadcast. 48(3), 223–229 (2002).

  99. M.-X. Chang, Y.T. Su, Model-based channel estimation for OFDM signals in Rayleigh fading. IEEE Trans. Commun. 50(4), 540–544 (2002).

  100. R. Negi, J. Cioffi, Pilot tone selection for channel estimation in a mobile OFDM system. IEEE Trans. Consum. Electron. 44(3), 1122–1128 (1998).

  101. I. Barhumi, G. Leus, M. Moonen, Optimal training design for MIMO OFDM systems in mobile wireless channels. IEEE Trans. Signal Process. 51(6), 1615–1624 (2003).

  102. S. Ohno, G.B. Giannakis, Average-rate optimal PSAM transmissions over time-selective fading channels. IEEE Trans. Wirel. Commun. 1(4), 712–720 (2002).

  103. J.K. Moon, S.I. Choi, Performance of channel estimation methods for OFDM systems in a multipath fading channels. IEEE Trans. Consum. Electron. 46(1), 161–170 (2000).

  104. H. Steendam, On the pilot carrier placement in multicarrier-based systems. IEEE Trans. Signal Process. 62(7), 1812–1821 (2014).

  105. J.-W. Choi, Y.-H. Lee, Optimum pilot pattern for channel estimation in OFDM systems. IEEE Trans. Wirel. Commun. 4(5), 2083–2088 (2005).

  106. R.J. Baxley, J.E. Kleider, G.T. Zhou, Pilot design for OFDM with null edge subcarriers. IEEE Trans. Wirel. Commun. 8(1), 396–405 (2009).

  107. D. Hu, L. Yang, Y. Shi, L. He, Optimal pilot sequence design for channel estimation in MIMO OFDM systems. IEEE Commun. Lett. 10(1), 1–3 (2006).

  108. P. Fertl, G. Matz, Channel estimation in wireless OFDM systems with irregular pilot distribution. IEEE Trans. Signal Process. 58(6), 3180–3194 (2010).

  109. Q. Li, M. Wen, Y. Zhang, J. Li, F. Chen, F. Ji, Information-guided pilot insertion for OFDM-based vehicular communications systems. IEEE Internet Things J. 6(1), 26–37 (2019).

  110. J.-H. Oh, J.-G. Kim, J.-T. Lim, On the design of pilot symbols for OFDM systems over doubly-selective channels. IEEE Commun. Lett. 15(12), 1335–1337 (2011).

  111. Y. Chen, L. You, A.-A. Lu, X. Gao, X.-G. Xia, Channel estimation and robust detection for IQ imbalanced uplink massive MIMO-OFDM with adjustable phase shift pilots. IEEE Access 9, 35864–35878 (2021).

  112. Z. Sheng, H.D. Tuan, H.H. Nguyen, Y. Fang, Pilot optimization for estimation of high-mobility OFDM channels. IEEE Trans. Veh. Technol. 66(10), 8795–8806 (2017).

  113. M.R. Raghavendra, S. Bhashyam, K. Giridhar, Exploiting hopping pilots for parametric channel estimation in OFDM systems. IEEE Signal Process. Lett. 12(11), 737–740 (2005).

  114. K. Kim, H. Park, H.M. Kwon, Optimum clustered pilot sequence for OFDM systems under rapidly time-varying channel. IEEE Trans. Commun. 60(5), 1357–1370 (2012).

  115. J. Wang, H. Yu, Y. Wu, F. Shu, J. Wang, R. Chen, J. Li, Pilot optimization and power allocation for OFDM-based full-duplex relay networks with IQ-imbalances. IEEE Access 5, 24344–24352 (2017).

  116. K. Chen-Hu, M.J.F.-G. Garcia, A.M. Tonello, A.G. Armada, Pilot pouring in superimposed training for channel estimation in CB-FMT. IEEE Trans. Wirel. Commun. 20(6), 3366–3380 (2021).

  117. H. Zhang, B. Sheng, An enhanced partial-data superimposed training scheme for OFDM systems. IEEE Commun. Lett. 24(8), 1804–1807 (2020).

  118. J.C. Estrada-Jimenez, B.G. Guzman, M.J. Fernandez-Getino García, V.P.G. Jimenez, Superimposed training-based channel estimation for MISO optical-OFDM VLC. IEEE Trans. Veh. Technol. 68(6), 6161–6166 (2019).

  119. J.C. Estrada-Jimenez, M.J. Fernandez-Getino García, Partial-data superimposed training with data precoding for OFDM systems. IEEE Trans. Broadcast. 65(2), 234–244 (2019)

  120. Q. Wang, G. Dou, X. He, R. Deng, J. Gao, Novel OFDM system using data-nulling superimposed pilots with subcarrier index modulation. IEEE Commun. Lett. 22(10), 2164–2167 (2018).

  121. X. Cai, G.B. Giannakis, Error probability minimizing pilots for OFDM with M-PSK modulation over Rayleigh-fading channels. IEEE Trans. Veh. Technol. 53(1), 146–155 (2004).

  122. E.G. Larsson, J. Li, Preamble design for multiple-antenna OFDM-based WLANs with null subcarriers. IEEE Signal Process. Lett. 8(11), 285–288 (2001).

  123. M. Dong, L. Tong, B.M. Sadler, Optimal pilot placement for channel tracking in OFDM. Proc. MILCOM 1, 602–606 (2002).

  124. S. Adireddy, L. Tong, H. Viswanathan, Optimal placement of training for frequency-selective block-fading channels. IEEE Trans. Inf. Theory 48(8), 2338–2353 (2002).

  125. X. Ma, L. Yang, G.B. Giannakis, Optimal training for MIMO frequency-selective fading channels. IEEE Trans. Wirel. Commun. 4(2), 453–466 (2005).

  126. M. Dong, L. Tong, Optimal design and placement of pilot symbols for channel estimation. IEEE Trans. Signal Process. 50(12), 3055–3069 (2002).

  127. C. Budianu, L. Tong, Channel estimation for space-time orthogonal block codes, in ICC 2001. IEEE International Conference on Communications. Conference Record (Cat. No.01CH37240), vol. 4 (2001), pp. 1127–1131.

  128. A. Aggarwal, T.H. Meng, Minimizing the peak-to-average power ratio of OFDM signals using convex optimization. IEEE Trans. Signal Process. 54(8), 3099–3110 (2006).

  129. X. Guo, J. Zhang, S. Chen, C. Zhu, J. Yang, Optimal uplink pilot-data power allocation for large-scale antenna array-aided OFDM systems. IEEE Trans. Veh. Technol. 69(1), 428–442 (2020).

  130. N. Chen, G.T. Zhou, Peak-to-average power ratio reduction in OFDM with blind selected pilot tone modulation. IEEE Trans. Wirel. Commun. 5(8), 2210–2216 (2006).

  131. S. Ehsanfar, M. Matthe, M. Chafii, G.P. Fettweis, Pilot- and CP-aided channel estimation in MIMO non-orthogonal multi-carriers. IEEE Trans. Wirel. Commun. 18(1), 650–664 (2019).

  132. Z. Na, Z. Pan, M. Xiong, X. Liu, W. Lu, Y. Wang, L. Fan, Turbo receiver channel estimation for GFDM-based cognitive radio networks. IEEE Access 6, 9926–9935 (2018).

  133. M.D. Nisar, W. Anjum, F. Junaid, Preamble design for improved noise suppression in FBMC-OQAM channel estimation. IEEE Wirel. Commun. Lett. 9(9), 1471–1475 (2020).

  134. A.I. Perez-Neira, M. Caus, R. Zakaria, D. Le Ruyet, E. Kofidis, M. Haardt, X. Mestre, Y. Cheng, MIMO signal processing in offset-QAM based filter bank multicarrier systems. IEEE Trans. Signal Process. 64(21), 5733–5762 (2016).

  135. M. Fuhrwerk, S. Moghaddamnia, J. Peissig, Scattered pilot-based channel estimation for channel adaptive FBMC-OQAM systems. IEEE Trans. Wirel. Commun. 16(3), 1687–1702 (2017).

  136. W. Liu, S. Schwarz, M. Rupp, T. Jiang, Pairs of pilots design for preamble-based channel estimation in OQAM/FBMC systems. IEEE Wirel. Commun. Lett. 10(3), 488–492 (2021).

  137. D. Kong, P. Liu, Q. Wang, J. Li, X. Li, X. Cheng, Preamble-based MMSE channel estimation with low pilot overhead in MIMO-FBMC systems. IEEE Access 8, 148926–148934 (2020).

  138. W. Cui, D. Qu, T. Jiang, B. Farhang-Boroujeny, Coded auxiliary pilots for channel estimation in FBMC-OQAM systems. IEEE Trans. Veh. Technol. 65(5), 2936–2946 (2016).