A multi-layered OFDM system with parallel transmission for multicell cooperative cellular networks

Recent developments in multicell cooperation pose significant technical challenges to the design of robust and flexible transmission techniques. A new multi-layered orthogonal frequency-division multiplexing (ML-OFDM) system is proposed in this article to provide a dynamic platform for multicell cooperation with efficient base station (BS) coordination capability. The proposed enhanced layers (ELs), which are overlaid with the cellular communication data (the base layer) in both the frequency and time domains, can be used for several specific purposes indispensable to multicell cooperation. They provide an efficient way of sharing the necessary information, e.g., channel state information, user data and other transmission parameters, between the collaborative BSs without requiring additional signaling or control channels. Overall network efficiency is substantially enhanced due to the reduction of radio resource overhead. Furthermore, cross-BS synchronization and multimedia broadcast multicast service for next-generation cellular networks can be simultaneously achieved by the proposed parallel orthogonal ELs. The transceiver design for the ML-OFDM system, particularly the modulation/demodulation of the ELs and the EL-induced interference cancelation, is presented. Overall system performance is further optimized by proposing a power distribution scheme with a set of practical constraints. The performance of the ML-OFDM system is analyzed and verified through numerical simulations.


I. Introduction
The ever-increasing demand for broadband mobile multimedia applications brings significant challenges to the design of future-generation cellular networks. Because users increasingly require mixed services, the cellular hybrid concept [1][2][3][4] is becoming an attractive way to simultaneously support different services, e.g., the two main types: point-to-point unicast (conventional cellular communications) and point-to-multipoint broadcast. However, today's design methodology of cellular networks is characterized by uncoordinated approaches in several aspects. On one hand, the integration of unicast and broadcast services is usually realized by utilizing separate wireless infrastructures or orthogonal multiplexing techniques, including time/frequency-division multiplexing (TDM/FDM), leading to inefficient hardware or resource utilization.
On the other hand, conventional cellular communications based on single cell processing (SCP) allow very limited sharing of spectrum resources because of the resulting large inter-cell interference, which prevents the potential enhancement of network throughput and coverage [5]. Although SCP generally served well in past 2G/3G networks, the growing popularity of high-speed wireless applications in recent years exposes the performance limitation of the existing methodology. This necessitates a new transmission paradigm referred to as multicell cooperation, which exploits the inter-cell interference cooperatively by enabling joint signal processing among several interfering base stations (BSs).
Multicell cooperation, sometimes also known as distributed antenna system or multicell multiple-input-multiple-output (MIMO), is a technique which aims to eliminate the capacity-limiting factor of conventional cellular networks and to improve the overall system performance [6][7][8][9]. Such a system prescribes coordinated signaling strategies such as power allocation, beamforming directions, user scheduling, and joint encoding/decoding of the transmitted/received signals at the BSs, depending on the required level of cooperation [6]. Recently, it has attracted considerable attention from both the industrial and academic communities. For instance, in the 3GPP LTE-Advanced standard [10], where network coordination is known as coordinated multi-point (CoMP) transmission, standardization of signaling schemes for this technique has been called for since September 2010 for possible consideration in future releases of LTE-Advanced. Although many pioneering studies in the literature evaluate the performance of multicell cooperation through various information-theoretic models with simplified assumptions [11][12][13][14][15][16][17][18][19][20][21], implementation-related issues still pose significant technical challenges to the design of transmission schemes for this new technique.
(1) Backhaul issues: Current multicell cooperation techniques assume the presence of a backhaul network with unlimited capacity and zero delay, which connects the BSs with each other or with a central processor. This assumption is quite unrealistic, especially for large-scale networks: the limited-capacity backhaul within the pre-existing infrastructure may be unable to reliably transmit channel state information (CSI) and user data among the collaborative BSs. As a result, the desired transmission technique should provide robust signaling for transmitting the required information between collaborative BSs. The desired signaling can be used as a supplement to reduce the burden on the existing limited backhaul network, or as a stand-alone scheme when the backhaul network is unavailable.
(2) Network latency: Due to the large overhead of global CSI/user data and the limited transmission capability of the backhaul network, the distribution of the necessary information among BSs must be achieved by well-designed cross-layer algorithms, including media access control (MAC) layer scheduling as well as physical (PHY) layer transmission strategies [22][23][24][25]. The communication between the PHY and upper-layer protocols, together with traffic routing, naturally introduces excessive time delay, causing dramatic performance degradation, especially when the delay exceeds the coherence time of the downlink channels.
(3) BS synchronization: To guarantee the mitigation of inter-cell interference, the desired signal components transmitted from different collaborative BSs to the target mobile user must arrive synchronously. Efficient and accurate cross-BS synchronization is another fundamental enabling technology for multicell cooperation, since imperfect timing advance will inevitably have negative impacts, e.g., power degradation of the desired signal and additional intersymbol interference (ISI) [6]. Hence, achieving tight synchronization between BSs through an alternative signaling scheme is quite challenging, especially in situations where the global positioning system (GPS) signal is unreliable (e.g., indoor or dense urban areas).
The motivation of this article is to address the aforementioned challenges with the proposed multi-layered orthogonal frequency-division multiplexing (ML-OFDM) system. OFDM is envisioned as a key technology for broadband wireless communications due to its high spectral efficiency, robustness to multipath distortions, and simple receiver design [26]. Since most broadband systems (DVB-T, WiMAX and LTE) are already OFDM-based, we propose an ML-OFDM system which provides a robust, efficient and flexible platform specially tailored for the emerging multicell cooperative cellular networks. The principle of the ML-OFDM system is shown in Figure 1, where the base layer (BL) provides conventional orthogonal frequency-division multiple access (OFDMA)-based unicast services for cellular users and the enhanced layers (ELs) offer several other important functionalities for the cellular network. First, the proposed EL can provide a dedicated over-the-air link among different BSs for exchanging the available information, including the CSI pertaining to all relevant direct and interfering links, data symbols sent to the target mobile user, and other transmission parameters such as power, beamforming coefficients, time slot and subcarrier usage. This information can be sent concurrently with the user data-carrying signal (the signal of the BL) for dynamic BS coordination purposes. Such a coordination protocol can be realized by solely exploiting the proposed EL or by using the signaling to enhance the pre-existing finite-capacity backhaul. Second, another highly desired feature in upcoming 3GPP systems, multimedia broadcast multicast service (MBMS), can be supported by other parallel ELs. Target applications include mobile TV, radio broadcasting, as well as public messaging and emergency alerts for all the silent receivers within the cell.
As the ELs and the BL are overlaid across both the entire frequency and time domains, the tedious procedure to establish separate infrastructures or design orthogonal multiplexing could be avoided, which significantly reduces the implementation cost and radio resource overhead. In addition, cross BS synchronization can be efficiently designed based on the alternative EL. Each BS sends its unique beacon signal which carries timing and frequency information to its surrounding BSs. By detecting the signals transmitted from other BSs and comparing the timing and frequency information with its local reference, the offsets are frequently compensated [27]. The major difference between the proposed ELs and conventional control channels for the aforementioned purposes is that the ELs can support multiple functionalities simultaneously without occupying additional radio resources.
The rest of the article is structured as follows. The principle and architecture of the proposed ML-OFDM system are presented in Sections 2 and 3. A new modulation scheme for the proposed ELs and the corresponding transmitter/receiver design are studied. Based on the interference analysis for the proposed ML-OFDM, an efficient EL-induced interference cancelation algorithm is also proposed. In Section 4, we analyze the system performance, including the error probability of the proposed EL and its impact on the capacity of the BL. Based on these analyses, a power distribution scheme which optimizes the overall system performance under some practical constraints is proposed in Section 5. Simulation results are provided and discussed in Section 6 to evaluate and validate the performance and feasibility of the proposed system. The article is finally concluded in Section 7.

II. Transmitter design for the proposed system
A. Overall signal structure
The transmitter block diagram of the proposed ML-OFDM system is shown in Figure 2a. Let $X(k)$ denote the complex-valued user data on the $k$th subcarrier of the BL and $N$ denote the total number of subcarriers. The corresponding OFDM block of the BL is given by

$$\mathbf{X} = [X(0), X(1), \ldots, X(N-1)]^T. \qquad (1)$$

Without loss of generality, we assume that a total of $K$ ELs and the BL are overlaid across both the frequency and time domains. The $K$ ELs are designed to provide multiple functionalities including BS cooperation signaling, synchronization and MBMS. The data streams on these ELs are first modulated by the proposed scheme (described in the following subsection) and then superimposed onto $\mathbf{X}$ with different powers. Therefore, the overall frequency-domain signal can be formulated according to

$$\mathbf{S} = \mathbf{X} + \sum_{i=1}^{K} \sqrt{P_i}\,\mathbf{E}_i, \qquad (2)$$

where $\mathbf{E}_i$ denotes the signal vector of the $i$th EL and $P_i$ denotes its corresponding transmit power. Each time-domain data block is then generated by an $N$-point inverse discrete Fourier transform (IDFT),

$$s(n) = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} S(k)\, e^{j 2\pi k n / N}, \quad n = 0, 1, \ldots, N-1. \qquad (3)$$

Note that, by assuming that the cyclic prefix (CP) in the system is longer than the maximum channel delay spread, the transmitted symbols are free of ISI; therefore, the insertion and removal of the CP are not included in the following discussions throughout the article.
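As an illustration of the superposition and IDFT steps described above, the following is a minimal sketch rather than the transmitter of the article. It assumes that each EL is scaled in amplitude by the square root of its power and that the IDFT uses a unitary 1/sqrt(N) normalization; the function names `superimpose_layers`, `idft` and `dft` are hypothetical, and the toy block uses only N = 4 subcarriers.

```python
import cmath
import math


def superimpose_layers(base_layer, enhanced_layers, powers):
    """Return S(k) = X(k) + sum_i sqrt(P_i) * E_i(k).

    The sqrt(P_i) amplitude scaling is an assumption of this sketch.
    """
    combined = list(base_layer)
    for layer, power in zip(enhanced_layers, powers):
        amplitude = math.sqrt(power)
        for k, value in enumerate(layer):
            combined[k] += amplitude * value
    return combined


def idft(freq):
    """Naive N-point inverse DFT with an assumed unitary 1/sqrt(N) scaling."""
    n_sub = len(freq)
    scale = 1.0 / math.sqrt(n_sub)
    return [
        scale * sum(freq[k] * cmath.exp(2j * math.pi * k * n / n_sub)
                    for k in range(n_sub))
        for n in range(n_sub)
    ]


def dft(time):
    """Matching forward DFT, used here only to check the round trip."""
    n_sub = len(time)
    scale = 1.0 / math.sqrt(n_sub)
    return [
        scale * sum(time[n] * cmath.exp(-2j * math.pi * k * n / n_sub)
                    for n in range(n_sub))
        for k in range(n_sub)
    ]


# Toy block: N = 4 subcarriers of 4-QAM data and one EL with power P_1 = 0.25.
base = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
layer = [1, 1j, -1, -1j]
freq_block = superimpose_layers(base, [layer], [0.25])
time_block = idft(freq_block)
```

The naive O(N^2) transform keeps the sketch dependency-free; a practical system would use an FFT.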

B. Modulation of the ELs
In this subsection, we propose a modified code shift keying (MCSK) scheme for the ELs. In conventional CSK, M different cyclic phase shifts of a signature sequence with length longer than M are employed as M-ary signaling to transmit data symbols [28][29][30]. CSK offers very high noise and interference immunity, so that extremely robust transmission can be achieved, which is highly desirable for multicell cooperation and MBMS applications. However, the main limitation of conventional CSK is that its data rate is bounded by the length of the signature sequence. Hence, we propose to increase the data rate by using a shorter signature sequence such that multiple data symbols can be transmitted within one data block.
Without loss of generality, we describe the modulation of the $j$th EL. Denoting the signature sequence used on this layer as $\mathbf{z}^{(j)}$ with length $M$ and assuming that $N/M = \Psi$ is an integer, we can collect a total of $\Psi$ data symbols within one data block, with each symbol mapped to a modulated sequence. As shown in Figure 3, the input data stream is first grouped into $k$-bit ($k = \log_2 M$) data symbols and thus can be represented as

$$\mathbf{d}^{(j)} = [d_0^{(j)}, d_1^{(j)}, \ldots, d_{\Psi-1}^{(j)}]. \qquad (4)$$

According to the symbol value of $d_m^{(j)}$, the signature sequence $\mathbf{z}^{(j)}$ is cyclically shifted by a unique phase $O_m$ determined by the symbol value; the shifted sequence is denoted by $\mathbf{z}^{(j)}_{O_m}$. The phase $O_m$ can take a value in $\{0, 1, \ldots, M-1\}$; therefore, $\log_2 M$ bits can be transmitted within each data symbol, and a total of $\Psi \cdot \log_2 M$ bits within one data block. The related transmission parameters of the MCSK are summarized in Table 1.
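After the one-to-one mapping between each data symbol and a modulated sequence, the output signal on the $j$th EL is the concatenation of the $\Psi$ shifted sequences,

$$\mathbf{E}_j = [\mathbf{z}^{(j)}_{O_0}, \mathbf{z}^{(j)}_{O_1}, \ldots, \mathbf{z}^{(j)}_{O_{\Psi-1}}]. \qquad (6)$$

The above steps can be repeated for the other ELs, except that the signature sequence $\mathbf{z}^{(j)}$ is replaced by $\mathbf{z}^{(i)}$, $i = 1, \ldots, K$, $i \neq j$, for modulation purposes.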
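The MCSK mapping described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the modulator of the article: the phase $O_m$ is taken to be the decimal value of the data symbol (most significant bit first), the cyclic shift is taken as a left rotation, and the Zadoff-Chu generator uses the common textbook definition. The names `zadoff_chu`, `cyclic_shift` and `mcsk_modulate` are hypothetical.

```python
import cmath
import math


def zadoff_chu(length, root=1):
    """Zadoff-Chu sequence (textbook definition; root must be coprime to length)."""
    if math.gcd(root, length) != 1:
        raise ValueError("root and length must be coprime")
    parity = length % 2  # exponent uses n*(n+1) for odd lengths, n*n for even
    return [cmath.exp(-1j * math.pi * root * n * (n + parity) / length)
            for n in range(length)]


def cyclic_shift(sequence, shift):
    """Left-rotate the sequence by `shift` positions (assumed shift direction)."""
    shift %= len(sequence)
    return sequence[shift:] + sequence[:shift]


def mcsk_modulate(bits, signature):
    """Map each group of k = log2(M) bits to one cyclically shifted signature."""
    m = len(signature)
    if m < 2 or m & (m - 1):
        raise ValueError("sequence length M must be a power of two")
    k = m.bit_length() - 1
    if len(bits) % k:
        raise ValueError("number of bits must be a multiple of log2(M)")
    block = []
    for start in range(0, len(bits), k):
        # Assumed mapping: the shift O_m equals the decimal symbol value.
        symbol = int("".join(str(b) for b in bits[start:start + k]), 2)
        block.extend(cyclic_shift(signature, symbol))
    return block


# Toy example: M = 8 (3 bits per symbol), two symbols with values 0 and 3.
signature = zadoff_chu(8)
el_block = mcsk_modulate([0, 0, 0, 0, 1, 1], signature)
```

With M = 64 and a 1024-subcarrier block, the same routine would accept 96 bits and return 16 concatenated segments.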
We now consider the design of the signature sequences. Denote the set of signature sequences used by the $K$ ELs by $\{\mathbf{z}^{(1)}, \mathbf{z}^{(2)}, \ldots, \mathbf{z}^{(K-1)}\}$.
Due to the superposition of parallel ELs, mutual layer interference is inevitably introduced. To eliminate its impact, the signals on different ELs should be designed to be orthogonal. Therefore, the ideal design of the signature sequences should meet the criteria in (8), which indicate that ideally each signature sequence should be orthogonal to its own cyclically shifted versions and to every other sequence in the set regardless of shift, such that the mutual interference between different ELs is completely eliminated. However, sequences with such perfect correlation properties do not exist. Recently, the Zadoff-Chu sequence has received considerable attention and has been adopted by the 3GPP LTE air interface for the primary synchronization signal (PSS) and the random access preamble (PRACH). One important property of the Zadoff-Chu sequence is that it has perfect cyclic auto-correlation and small cyclic cross-correlation values. By assigning nearly orthogonal sequences to different ELs, the mutual layer interference is significantly reduced while different ELs can be uniquely identified at the receiver. In addition, its constant amplitude reduces the complexity of the radio power amplifier design compared with other sequences. Therefore, it is preferred for CSK modulation and used as the signature sequence in the proposed system. For clarity of exposition, consider an example where N = 1024 and M = 64: it is then possible to transmit 6 bits with each sequence, and a maximum of 6 × 16 = 96 bits can be transmitted on the EL within one data block. In the case of the 3GPP LTE evolved universal terrestrial radio access (E-UTRA) air interface, where the symbol duration is 66.7 μs, the data rate of this EL can be as high as R b = 96 bits / 66.7 μs ≈ 1.4 Mbps. This rate is sufficiently high for MBMS, e.g., a video-conferencing-quality stream (128-384 kbps) or a VCD-quality stream (1.15 Mbps max), as well as for information sharing for multicell cooperation purposes.
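The perfect cyclic auto-correlation and constant amplitude of the Zadoff-Chu sequence mentioned above can be checked numerically. The sketch below assumes the common textbook definition of the sequence with root index 1 and length M = 64; the helper names `zadoff_chu` and `cyclic_correlation` are hypothetical, and cross-correlation between different roots is deliberately not examined here.

```python
import cmath
import math


def zadoff_chu(length, root=1):
    """Zadoff-Chu sequence (textbook definition; root must be coprime to length)."""
    if math.gcd(root, length) != 1:
        raise ValueError("root and length must be coprime")
    parity = length % 2
    return [cmath.exp(-1j * math.pi * root * n * (n + parity) / length)
            for n in range(length)]


def cyclic_correlation(a, b, lag):
    """Cyclic correlation sum_i a[(i + lag) mod N] * conj(b[i])."""
    n = len(a)
    return sum(a[(i + lag) % n] * b[i].conjugate() for i in range(n))


z = zadoff_chu(64)
# Zero-lag peak: equals the sequence length M because every sample has unit modulus.
peak = abs(cyclic_correlation(z, z, 0))
# Largest magnitude over all nonzero lags: vanishes up to rounding error.
worst_sidelobe = max(abs(cyclic_correlation(z, z, lag)) for lag in range(1, 64))
```

The zero side lobes are what allow each of the M cyclic shifts to act as a distinct, mutually orthogonal signaling waveform within one EL.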

C. Advantages of the proposed modulation
The proposed MCSK is capable of providing a robust, efficient and flexible way to transmit additional data on the ELs. Its advantages can be summarized as follows:
1) The MCSK offers significantly higher robustness than conventional quadrature amplitude modulation (QAM). On this basis, effective EL-induced interference cancelation can be designed to reduce the impact of the ELs on the cellular communications.
2) No additional dedicated radio resources are needed, as the transmitted signals on the ELs and the cellular users' data are fully overlapped.
3) The data rate of the ELs can be flexibly adjusted according to the system requirements. For example, in emergency communications, which require a low data rate but very high reliability, each data symbol can be repeated several times in one data block, i.e., (4) can be reformulated as in (9). The benefit of this strategy is that, at the receiver side, coherent combining can be performed prior to data detection to enhance the signal-to-noise ratio (SNR) of the link. The same mechanism can also be applied to multicell cooperation when the overhead of the shared CSI/user data is reduced through the use of quantization techniques [31,32].
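The rate arithmetic behind this flexibility can be reproduced with a short sketch. It assumes that repeating each data symbol R times simply divides the number of distinct symbols per block by R; the function names `el_bits_per_block` and `el_data_rate` are hypothetical.

```python
def el_bits_per_block(n_subcarriers, seq_len, repetition=1):
    """Bits carried by one EL per data block: (N / M / R) * log2(M)."""
    if seq_len < 2 or seq_len & (seq_len - 1):
        raise ValueError("sequence length M must be a power of two")
    if n_subcarriers % seq_len:
        raise ValueError("N must be a multiple of M")
    segments = n_subcarriers // seq_len
    if segments % repetition:
        raise ValueError("repetition factor must divide the number of segments")
    bits_per_symbol = seq_len.bit_length() - 1  # log2(M) for a power of two
    return (segments // repetition) * bits_per_symbol


def el_data_rate(bits_per_block, symbol_duration_s):
    """Data rate in bit/s for one block per OFDM symbol duration."""
    return bits_per_block / symbol_duration_s


# Example from the text: N = 1024, M = 64, symbol duration 66.7 microseconds.
full_rate_bits = el_bits_per_block(1024, 64)        # 16 segments * 6 bits
full_rate = el_data_rate(full_rate_bits, 66.7e-6)   # roughly 1.4 Mbps
# Each symbol repeated 8 times: 2 distinct symbols * 6 bits per block.
robust_bits = el_bits_per_block(1024, 64, repetition=8)
```

The repetition factor thus trades data rate for combining gain within the same block structure.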

III. Receiver design for the proposed system
In order to implement the proposed ML-OFDM system, some modifications to the traditional OFDM receiver are necessary. As the overlaid ELs' signals appear as large interference to the BL's signal, the first step in guaranteeing the service quality of the cellular communications is to demodulate the data on the ELs. The receiver can then remove the interference induced by the demodulated ELs based on the estimated channel and the regenerated ELs' signals.

A. Data detection of the ELs
Consider a block fading multipath channel $\mathbf{h} = [h_0, h_1, h_2, \ldots, h_{L-1}]$ whose frequency response is denoted by $\mathbf{H}$, with elements given by

$$H(k) = \sum_{l=0}^{L-1} h_l\, e^{-j 2\pi k l / N}, \quad k = 0, 1, \ldots, N-1. \qquad (10)$$

The received signal after time-to-frequency-domain conversion can be represented as

$$Y(k) = H(k)\, S(k) + W(k), \qquad (11)$$

where $W(k)$ denotes the additive white Gaussian noise (AWGN) sample on the $k$th subcarrier with zero mean and variance $\sigma_n^2$. For analytical simplicity, we assume that the channel is estimated and compensated; therefore, the signal after frequency-domain equalization can be written as

$$S'(k) = Y(k) / H(k) = S(k) + W'(k), \qquad (12)$$

where $W'(k) \triangleq W(k)/H(k)$ represents the AWGN on the $k$th subcarrier scaled by the channel frequency response. To be consistent with Section 2.B, we now discuss the demodulation of the $m$th data symbol of the $j$th EL. The cyclic phase embedded in the sequence is detected by computing, as in (13), the frequency-domain cross-correlations between the corresponding signal segment and the locally generated signature sequence of the $j$th EL with all the possible cyclic phase shifts $0, 1, \ldots, M-1$.
Substituting (12) into (13), the correlation output can be further expanded as in (14). Owing to the use of the Zadoff-Chu sequences, the mutual layer interference between the ELs can be assumed to be negligible when compared with the strong interference from the BL and the AWGN. The local signature sequence whose cyclic phase shift leads to the maximum correlation output identifies the phase encoded from the transmitted data bits. With the one-to-one mapping between the decimal value of the input data symbol and the cyclic phase shift $O_m$, the original data $d_m^{(j)}$ in (4) can be retrieved.
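The correlation-based detection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the detector of the article: the received segment is assumed to be already equalized, the decision uses the real part of the correlation output, the shift is a left rotation whose index equals the symbol value, and the names `zadoff_chu`, `cyclic_shift` and `mcsk_demodulate` are hypothetical.

```python
import cmath
import math


def zadoff_chu(length, root=1):
    """Zadoff-Chu sequence (textbook definition; root must be coprime to length)."""
    if math.gcd(root, length) != 1:
        raise ValueError("root and length must be coprime")
    parity = length % 2
    return [cmath.exp(-1j * math.pi * root * n * (n + parity) / length)
            for n in range(length)]


def cyclic_shift(sequence, shift):
    """Left-rotate the sequence by `shift` positions (assumed shift direction)."""
    shift %= len(sequence)
    return sequence[shift:] + sequence[:shift]


def mcsk_demodulate(equalized, signature):
    """Detect one symbol per length-M segment by maximum correlation.

    Each segment is correlated with all M cyclic shifts of the local
    signature; the shift with the largest real correlation is taken as
    the transmitted symbol value (assumed identity mapping).
    """
    m = len(signature)
    symbols = []
    for start in range(0, len(equalized) - m + 1, m):
        segment = equalized[start:start + m]

        def correlation(shift):
            return sum(segment[i] * signature[(i + shift) % m].conjugate()
                       for i in range(m)).real

        symbols.append(max(range(m), key=correlation))
    return symbols


# Toy example: M = 16, three symbols, plus a small constant disturbance.
signature = zadoff_chu(16)
sent = [5, 0, 15]
clean = [v for s in sent for v in cyclic_shift(signature, s)]
received = [v + 0.1 for v in clean]
detected = mcsk_demodulate(received, signature)
```

In the full system the disturbance would consist of the BL data and the scaled noise rather than a constant offset; the correlation peak of height M is what provides the robustness margin against them.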

B. Interference cancelation of ELs
The superposition of the ELs' signals may cause large interference to the demodulation of the OFDM data of the BL. However, owing to the large gap between the operating SNR ranges of the ELs and the BL, the bit error rates (BERs) of the ELs are already very low at the SNR of a traditional OFDM receiver, such that the probability that the regenerated ELs' signals satisfy $\mathbf{E}'_i = \mathbf{E}_i$ is close to unity. Hence, $\mathbf{E}'_i$ can be directly subtracted from the received signal according to (17). In practical wireless communications, the interference cancelation can be imperfect due to channel estimation and data detection errors, resulting in a certain residual interference to the BL after the subtraction. If we denote the estimated channel frequency response as $\mathbf{H}'$, the residual interference $\mathbf{I}$ can be written as in (18), where $\odot$ denotes element-by-element multiplication, and $\Delta\mathbf{H} \triangleq \mathbf{H} - \mathbf{H}'$ and $\Delta\mathbf{E}_i \triangleq \mathbf{E}_i - \mathbf{E}'_i$ denote the errors of the channel estimate and of the regenerated signal, respectively. Note that the last term in (18) is small in magnitude and can be neglected. The variance of $\mathbf{I}$ can then be calculated as in (19), where $\sigma_{\Delta H}^2$ and $\sigma_H^2$ denote the variance of the channel estimation error and the total channel power, respectively, and $P_{e,i}$ represents the symbol error rate (SER) of the $i$th EL. The residual interference is influenced by both the channel estimation and the data detection results. Nevertheless, as $P_{e,i}$ is already sufficiently small ($\approx 10^{-6}$), the second term on the right-hand side of (19) can be significantly weaker than the first term. Equation (19) also implies that the accumulated residual interference from all the ELs may dramatically reduce the BL's performance. As a result, a power distribution scheme is required to optimize the overall system performance, as discussed in the following sections.
Based on the interference-suppressed signal S'', hard decisions are made upon S'' and the original data on the BL can be obtained.
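The subtraction and hard-decision steps can be illustrated with the following sketch. It assumes an already equalized signal, perfectly regenerated EL signals, amplitude scaling by the square root of each EL power, and unnormalized 4-QAM points (±1 ± j) on the BL; the function names `cancel_enhanced_layers` and `hard_decision_4qam` are hypothetical.

```python
import math


def cancel_enhanced_layers(equalized, regenerated_layers, powers):
    """Subtract sum_i sqrt(P_i) * E'_i(k) from the equalized signal."""
    cleaned = list(equalized)
    for layer, power in zip(regenerated_layers, powers):
        amplitude = math.sqrt(power)
        for k, value in enumerate(layer):
            cleaned[k] -= amplitude * value
    return cleaned


def hard_decision_4qam(samples):
    """Map each sample to the nearest point of the set {+-1 +- 1j}."""
    return [complex(1.0 if s.real >= 0 else -1.0,
                    1.0 if s.imag >= 0 else -1.0)
            for s in samples]


# Toy example: one strong EL (P_1 = 4) overlaid on four BL symbols, no noise.
bl_data = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
el_signal = [1, -1, 1j, -1j]
overlaid = [x + 2.0 * e for x, e in zip(bl_data, el_signal)]

# Without cancelation the EL flips one BL decision; after cancelation it does not.
decisions_without = hard_decision_4qam(overlaid)
decisions_with = hard_decision_4qam(
    cancel_enhanced_layers(overlaid, [el_signal], [4.0]))
```

With imperfect channel estimates or EL detection errors, a residual term would remain after the subtraction, which is the quantity analyzed in the text.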

IV. Performance evaluation of the proposed system
A. Error performance analysis of the ELs
As the proposed MCSK modulation is very similar to conventional M-ary orthogonal signaling, the BER expression given in [33] can be used to evaluate the performance of the proposed MCSK. However, this expression has no closed form, so using the exact BER as a constraint would make the power distribution design computationally very expensive. To keep the subsequent power distribution scheme efficient, a simple BER upper bound is derived in this subsection.
In order to evaluate the robustness of the proposed modulation scheme, the peak-to-noise ratio (PNR) of the correlation output is first analyzed. For analytical simplicity, the correlation output in (14) can be rewritten as in (20), where $A$ and $n$ denote the ideal peak gain and the associated interference and noise term, respectively. $n$ can be considered a Gaussian random variable. For analytical simplicity, we consider signal propagation environments where one dominant path exists in the multipath channel and the maximum channel delay spread is short compared with the OFDM symbol duration. In this case, the channel frequency response has relatively low frequency selectivity, and $\sigma_w^2$ can be approximated in terms of $\gamma \triangleq \sigma_n^2 / (|H(0)|^2 \sigma_d^2)$, which is defined as the inverse SNR. The PNR of the correlation output follows from these quantities. It is worth mentioning that in the case of (9), where each segment on the EL is repeated $R$ times for robustness enhancement, the corresponding segments can be coherently combined prior to correlation detection. The correlation output in (14) is then computed from the averaged OFDM symbol, whose variance is $\sigma_d^2/R$, and the averaged noise, whose variance is $\sigma_w^2/R$, so that the corresponding PNR is scaled by the factor $R$.
As we can see from (20), the correct detection of the cyclic phase shift which matches the maximum correlation output should meet the criterion $A > n + n$ for the $M - 1$ comparisons. Now let us consider a new variable $y = 2n$, whose probability density follows from that of $n$. Correct detection requires that $y < A$ for all the $M - 1$ comparisons of the correlation output, which gives the false detection probability for one comparison. The overall error probability of peak detection is then upper bounded by its union bound. Assume that all the data bits in one data symbol $d_i$ are equally likely. Since one symbol is composed of $k$ bits, the BER of the $j$th EL is therefore upper bounded as in (29), where we define the normalized PNR of the $j$th EL as $\mathrm{NPNR}_j = \mathrm{PNR}_j / P_j$.

B. Capacity loss analysis of BL
The average channel capacity of the BL is derived in this subsection to study the impact of the ELs' transmission on the overall system performance. The effective received signal after interference cancelation can be represented as in (30). Assume that least square or minimum mean square error estimators using training sequences are used for channel estimation in the proposed system. If the length of the training sequence is $N_t$, then when $L/N_t$ is sufficiently small, the residual interference term $\mathbf{I}$ in (30) is essentially uncorrelated with $\mathbf{W}$. Therefore, $\mathbf{I} + \mathbf{W}$ in (30) can be considered a Gaussian vector with zero mean and a covariance matrix proportional to $\mathbf{I}_N$, the identity matrix of order $N$. Based on the analysis of the average channel capacity for flat fading channels given in [34], we derive the average channel capacity of the BL, $C_{BL}$, by averaging the capacity of each subcarrier over all the subcarriers as in (31), where $P_B$ denotes the transmit power of the BL. For analytical simplicity, by introducing the Gaussian random variable with zero mean and unit variance $g \triangleq H_k / \sqrt{\mathrm{Var}(H_k)}$, (31) can be reformulated as in (32), where $P_{total} = P_B + P_{E,total}$ represents the total transmit power of the overall system and $P_{E,total} = \sum_{i=1}^{K} P_i$ denotes the total transmit power of the ELs. Furthermore, the assumption that the normalized channel estimation error $\sigma_{\Delta H}^2 / \sigma_H^2$ is sufficiently small holds when accurate channel estimation techniques are adopted; therefore, $\mathrm{Var}(H_k)$ can be approximated by $\sigma_H^2$, and the capacity can be further approximated as in (33). In the meantime, the upper bound of the BL's capacity in the absence of the ELs' transmission can be written as in (34). By introducing the maximum allowed capacity loss $\Delta C$, the ELs' transmission is enabled only if the constraint in (35) is satisfied. This constraint is referred to as the BL's capacity loss constraint and can be reformulated as in (36), where $\bar{C}$ is the capacity upper bound in the absence of the ELs' transmission reduced by $\Delta C$.
The above constraint is essential for the ML-OFDM system design, as it reflects the impact of the ELs' transmission on the BL. If the capacity loss is larger than the BL can tolerate, no ELs' transmission is allowed. Therefore, the constraint will further be used in the power distribution scheme discussed in the following section.

V. Power distribution for ELs' transmission
The power distribution scheme for the ELs is discussed in this section. The objective is to optimize the overall system performance by balancing the tradeoff between the proposed ELs and the BL. Therefore, we propose to minimize the overall BER of the ELs subject to the constraint on the BL's capacity loss. Furthermore, to take the different reliability/coverage requirements of different ELs into account, we introduce proportional reliability constraints into our system. The benefit of these constraints is that we can flexibly control the reliability of different ELs for different purposes, thereby ensuring that each signaling is able to achieve its target service quality given sufficient available transmit power and a tolerable BL capacity loss.
The power distribution problem can be expressed mathematically as in (37), subject to constraints that include

$$C_{BL} \ge \bar{C}, \qquad (39)$$

$$P_{b,1} : P_{b,2} : \cdots : P_{b,K} = \gamma_1 : \gamma_2 : \cdots : \gamma_K,$$

where $\bar{C}$ is the minimum BL capacity required by the capacity loss constraint, $P_{b,i}$ denotes the BER upper bound of the $i$th EL as derived in (29), $k_i$ denotes the total effective number of transmitted bits of the $i$th EL, $P_{E,total}$ is the total transmit power budget allocated to the ELs, and $\{\gamma_i\}_{i=1}^{K}$ is the set of predefined values used for the proportional reliability constraints. Note that the nonlinear inequality constraint $-C_{BL} \le -\bar{C}$ makes the optimization problem in (37) nonconvex. Iterative methods, such as Newton-Raphson or quasi-Newton methods, can be used to obtain the solutions, but at a large computational cost. Fortunately, under certain approximations, the optimization problem can be relaxed to a convex problem, so that optimal or near-optimal solutions can be found with low complexity. Due to the MCSK used on the ELs, the operating SNR range of the ELs is much lower than that of the BL. Therefore, we analyze the low SNR case, where certain approximations can be made.
In the relatively low SNR range, the SERs of the ELs, $P_{e,i}$, $i = 1, 2, \ldots, K$, are already very low and significantly smaller than the variance of the channel estimation error. The corresponding term in (33) can therefore be neglected, and the capacity of the BL given by (33) can be further simplified. The approximated capacity is then a concave function w.r.t. $P_i$, which makes the optimization problem convex. Its global optimal solution can be obtained from the Karush-Kuhn-Tucker (KKT) conditions [35], in which $\lambda$, $\mu$, and $\varpi_i$ are the Lagrange multipliers, with $\lambda \ge 0$, $\mu \ge 0$, and $\varpi_i \ge 0$ $\forall i \in \{1, \ldots, K\}$.
Note that the capacity given by (33) is a monotonic function of $P_i$; therefore, the constraint in (39) can be reformulated as a bound on the total transmit power of the ELs, whose right-hand side is denoted by $c$. The condition in (44) can then be rewritten accordingly as (47). From (43) and (47), we note that $\lambda$ and $\mu$ cannot be simultaneously nonzero when $c \neq P_{E,total}$. Therefore, the optimization problem can be discussed in the following circumstances:

A. When c < P E,total
With this condition, the total transmit power of the ELs stays strictly below $P_{E,total}$. According to (43), we must then have $\lambda = 0$. Therefore, (42) can be further expanded and solved, where $[x, y]^+ \triangleq \max\{x, y\}$. Note that when $\mu = \lambda = 0$, one obtains $P_i \to \infty$, which is impossible in a real implementation. Therefore, this circumstance is not allowed to occur: $\mu$ and $\lambda$ must not be simultaneously zero.

B. When c ≥ P E,total
Similarly, the capacity loss constraint is inactive in this circumstance, and thus $\mu = 0$. The solution is given by the same expression as in the previous case with $\mu$ replaced by $\lambda$. Again, $\mu$ and $\lambda$ are not allowed to be simultaneously zero.
The optimal power distribution solution for the ELs implies that the power level of each EL depends on the parameters $\lambda$, $\mu$, and $\varpi_i$, $i = 2, \ldots, K$. First, $\lambda$ is the dual variable associated with the total transmit power budget. A larger transmit power budget results in a smaller $\lambda$ and thus a higher power level, and vice versa. Second, $\mu$ is the dual variable associated with the tolerable capacity loss of the BL. If the BL can accommodate a larger residual interference introduced by the transmission of the ELs, $\mu$ is smaller and the power level is therefore higher, and vice versa. For instance, in the extreme case where the BL cannot accommodate any interference, in other words where the capacity loss constraint for the BL is zero, $\mu$ approaches infinity and the resulting zero power level indicates that no ELs' transmission is allowed. A similar analysis applies to $\varpi_i$, $i = 2, \ldots, K$, which are associated with the proportional reliability constraints for the ELs.
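The role of the bound $c$ on the total EL power can be illustrated numerically. The sketch below does not use the capacity expressions of the article; it assumes a hypothetical toy model with a flat unit-gain channel in which the residual interference power equals the channel estimation error variance times the total EL power, so that the BL capacity falls as the EL power grows, and it finds by bisection the largest total EL power that keeps the relative capacity loss within a given fraction. All function and parameter names are assumptions of this sketch.

```python
import math


def base_layer_capacity(p_base, p_el_total, noise_var, est_err_var):
    """Toy BL capacity in bit/s/Hz (hypothetical model, flat unit-gain channel).

    Residual EL interference is modeled as est_err_var * p_el_total.
    """
    residual = est_err_var * p_el_total
    return math.log2(1.0 + p_base / (noise_var + residual))


def max_el_power(p_base, noise_var, est_err_var, max_loss_fraction, p_upper=1e6):
    """Largest total EL power whose capacity loss stays within the given fraction."""
    no_el = base_layer_capacity(p_base, 0.0, noise_var, est_err_var)
    target = (1.0 - max_loss_fraction) * no_el
    if base_layer_capacity(p_base, p_upper, noise_var, est_err_var) >= target:
        return p_upper  # constraint is inactive over the whole search range
    low, high = 0.0, p_upper
    for _ in range(200):  # bisection on a monotonically decreasing function
        mid = 0.5 * (low + high)
        if base_layer_capacity(p_base, mid, noise_var, est_err_var) >= target:
            low = mid
        else:
            high = mid
    return low


# 10% tolerated capacity loss, estimation error variance 0.01, unit BL power.
c_low_snr = max_el_power(1.0, noise_var=1.0, est_err_var=0.01, max_loss_fraction=0.1)
c_high_snr = max_el_power(1.0, noise_var=0.1, est_err_var=0.01, max_loss_fraction=0.1)
```

Comparing the resulting bound with the available EL power budget determines which of the two circumstances discussed above applies. Under this toy model the bound shrinks as the noise variance decreases.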

VI. Simulation results and discussions
Computer simulations have been carried out to evaluate the performance of the proposed ML-OFDM system. An OFDM system with 1024 subcarriers and a CP of length 1/8 of the symbol duration is considered in the simulation. The modulation scheme for the BL is chosen as 4-QAM. In addition, two ELs are considered in the demonstration system, designated for the multicell cooperation signaling and the MBMS, respectively. According to Section 2.B, two nearly orthogonal Zadoff-Chu sequences with length M = 64 are used as the signature sequences for the two ELs. To show the flexibility of the ML-OFDM system, we assume that the global vector quantization approach [32] is adopted to reduce the overhead of the shared CSI/user data between the collaborative BSs; therefore, the data rate required for the multicell cooperation signaling is very low. For power-saving purposes, each data symbol is repeated 8 times on the first EL, so that the effective number of transmitted bits becomes $k_1 = 12$ bits. The second EL is intended to provide MBMS at a relatively high data rate, e.g., mobile TV. The signal structures of the two ELs are formulated accordingly: the first EL carries two distinct data symbols, each repeated 8 times, while the second EL carries 16 distinct data symbols $d_0^{(2)}, d_1^{(2)}, \ldots, d_{15}^{(2)}$ (52).

A. The impact of BL's capacity loss constraint
We first examine the impact of the BL's capacity loss constraint on the ELs' transmission. Figure 4 shows the analytical results of the maximum allowed total transmit power for the ELs under different BL capacity loss constraints. It can be observed that as the capacity loss constraint becomes looser, higher transmit power can be allocated to the ELs, which significantly improves the reliability/coverage of each EL; however, the performance of cellular communications is correspondingly degraded. It is also apparent that an increase in SNR results in a decrease in the attained total transmit power. This is because, as the variance of the AWGN reduces, the capacity loss is dominated by the residual interference introduced by the ELs.

If there exists a large channel estimation error, then a small variation in $P_{E,\mathrm{total}}$ causes a dramatic reduction in the BL's capacity, and therefore no EL transmission is beneficial in this case. We therefore subsequently assess the consequences of different channel estimation accuracies on the BL's capacity. From Figure 5, it is observed that the capacity loss increases as the channel estimation becomes less accurate. In particular, at an SNR of 0 dB, when the variance of the channel estimation error is larger than 0.1, the capacity loss can be larger than 35%.
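The trend described above can be reproduced with a simplified model. The sketch below is not the capacity expression derived in this article: it assumes, for illustration only, that the BL power is normalized to 1, that the residual EL interference after cancelation equals the channel estimation error variance times the total EL power, and that the capacity loss is measured as a fraction of the interference-free capacity. The function names `bl_capacity` and `max_el_power` are hypothetical.

```python
import math

def bl_capacity(snr_db, p_el, sigma_e2):
    """BL capacity (bit/s/Hz) under the assumed model: BL power 1, AWGN variance 1/SNR,
    residual EL interference sigma_e2 * p_el."""
    snr = 10 ** (snr_db / 10)
    noise = 1.0 / snr
    return math.log2(1 + 1.0 / (noise + sigma_e2 * p_el))

def max_el_power(loss, snr_db, sigma_e2):
    """Largest total EL power keeping the fractional BL capacity loss at most `loss`
    (closed form obtained by inverting the assumed capacity model)."""
    snr = 10 ** (snr_db / 10)
    target_sinr = (1 + snr) ** (1 - loss) - 1  # SINR that yields (1 - loss) * C0
    return (1.0 / target_sinr - 1.0 / snr) / sigma_e2

for snr_db in (0, 10, 20):
    p = max_el_power(loss=0.10, snr_db=snr_db, sigma_e2=0.1)
    print(f"SNR {snr_db:2d} dB: max total EL power = {p:.4f}")
```

Under these assumptions the admissible EL power grows with the tolerated capacity loss and shrinks as the SNR increases, matching the qualitative behavior reported for Figure 4.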

B. Power distribution using the proposed algorithm for the ELs
To show the near-optimality of the proposed power distribution in Section 5, the solution is compared with an optimal power distribution scheme under different channel estimation accuracies in Figure 6. We assume that the ELs' total available transmit power budget is sufficiently large, such that the optimization problem is constrained by the BL's capacity loss and the ELs' proportional reliability. We further set the BL's capacity loss to 10% and the proportional reliability constraint to $g_1 : g_2 = 1 : 2$. The optimal power distribution solves the problem by using the exact BER expressions of the ELs [33] instead of the derived BER upper bounds. These exact expressions can only be evaluated numerically, and therefore an exhaustive search with extremely high computational complexity must be carried out to find the optimal solution $P_1$ and $P_2$ for the two ELs. Figure 6 shows that the gap between the proposed and the optimal power distribution is almost invisible, which confirms the near-optimality of the proposed power distribution.
The BERs of the ELs obtained by the proposed power distribution are plotted in Figure 7. It can be seen that the BERs of the two ELs are well differentiated according to the proportional constraints. For comparison, we also show the BERs obtained with equal power distribution and with the conventional optimal power distribution without the proportional reliability constraints [36]. For equal power distribution, because of the different data rates (hence different PNRs), the first EL achieves an excessive BER gain over the second EL, which leads to unfair resource allocation. Meanwhile, the conventional waterfilling scheme in [36] tends to allocate power such that the performance of the two ELs is similar. It is worth mentioning that the BER ratio of the two ELs may not be strictly equal to the predefined proportional constraints because of the use of the BER upper bounds; however, the computational burden is greatly reduced, which is more meaningful for realistic cellular network scenarios.
Figure 6 The allocated power to different ELs with the proposed power distribution scheme and the optimal power distribution scheme. The BL's capacity loss is 10% and the ELs' proportional reliability constraint is $g_1 : g_2 = 1 : 2$. The total available transmit power budget for the ELs is assumed to be sufficiently large. The variance of channel estimation error is $\sigma_H^2 = 0.1$. The optimal power distribution uses the exact BER expressions given by [33] instead of the derived BER upper bounds for the proportional reliability constraints.

To evaluate more intuitively how well the proposed power distribution scheme satisfies the proportional reliability constraints, a new metric is defined as follows.
Then the variance of the reliability proportion for the kth Monte Carlo run is denoted by $V_k$. The average deviation over a total of $I$ Monte Carlo runs, denoted by $D = \frac{1}{I}\sum_{k=1}^{I} \sqrt{V_k}$, is reported in Table 2.
Note that the ideal $D$ is close to zero if the allocated power strictly satisfies the constraints. It can be observed that the deviations of the reliability proportion obtained by the proposed power distribution scheme are orders of magnitude smaller than those obtained by equal power distribution and the conventional waterfilling algorithm.
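For concreteness, the averaging step of the metric can be sketched as follows. Only the final averaging of the per-run variances $V_k$ into $D$ is shown; the per-run values below are made-up placeholders rather than simulation results, and the helper name `average_deviation` is hypothetical.

```python
import numpy as np

def average_deviation(v):
    """Average deviation D = (1/I) * sum_k sqrt(V_k) over I Monte Carlo runs."""
    v = np.asarray(v, dtype=float)
    return np.sqrt(v).sum() / v.size

# Placeholder per-run variances (illustrative only, not taken from Table 2).
v_runs = [0.04, 0.01, 0.09]
d = average_deviation(v_runs)
print(d)
```

A value of $D$ near zero indicates that the achieved reliability proportions stay close to the prescribed ones in every run.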

C. The impact of ELs' transmission on BL
After the power distribution of the ELs, the SER of the BL, which indicates the service quality of cellular communications, is examined in Figure 8 in the presence of the ELs' transmission with different channel estimation accuracies. The curve labeled "ideal coherent detection" refers to the SER obtained with perfect channel estimation and interference cancelation. When a highly accurate channel estimation scheme is used, the SER degradation of the BL is almost indistinguishable, which indicates that the proposed ELs for multicell cooperation signaling and other purposes have virtually no impact on the cellular users.

Figure legend (BER): EL #1 and EL #2 with conventional waterfilling in [36], equal power distribution, and the proposed power distribution.

D. An option of channel estimation for accuracy improvement
It can be seen from the previous simulation results that the accuracy of channel estimation has a large impact on the overall system performance. In slot-transmission-based cellular networks, a preamble signal is periodically inserted in the transmitted data stream for initial synchronization and channel estimation purposes. The overhead of the preamble signals should usually be kept small in order to improve the transmission efficiency of the network, and therefore the accuracy of channel estimation can be limited by the length of the preamble signal. We propose an iterative decision-directed channel estimation in which the accuracy of the initial estimate is improved with the aid of the detected data. The process is briefly described as follows. First, the initial channel estimate is obtained from the preamble signal and is then applied to the coherent detection of the subsequent OFDM symbols.
The time-domain transmitted signal is then regenerated by IDFT and combined with the original preamble signal to form an extended preamble, whose length and total power are substantially larger than those of the original preamble. This new "preamble" is utilized to update the channel estimate with improved accuracy. The above process is iterated to simultaneously provide more accurate channel and data estimation. Figure 9 presents the mean square error of the iterative decision-directed channel estimation under a 10-tap multipath channel with a uniform power delay profile, where the total channel energy $\sigma_H^2$ is normalized to 1. The preamble is a pseudo random sequence with 40 samples. In Figure 9, the curve labeled "0 iteration" represents the conventional channel estimation that utilizes only the preamble signal. The lower bound represents the case where the transmitted signal is perfectly regenerated and subsequently used as part of the extended preamble. It can be observed that the iterative decision-directed channel estimation substantially outperforms the conventional preamble-based channel estimation. With an increase in iterations, the mean square error of channel estimation reduces and achieves the lower bound at approximately 20 dB.
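The iterative procedure described above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the simulation code behind Figure 9: a single 4-QAM OFDM symbol with 256 subcarriers follows the 40-sample preamble, no ELs are superimposed, the transmitter is silent before the preamble, the channel length is known, and a time-domain least-squares estimator is used. All function names and default parameters are hypothetical.

```python
import numpy as np

def conv_matrix(s, n_taps):
    """Convolution matrix X with X[n, l] = s[n - l] (zeros before the burst)."""
    X = np.zeros((len(s), n_taps), dtype=complex)
    for l in range(n_taps):
        X[l:, l] = s[:len(s) - l]
    return X

def ls_estimate(known, r, n_taps):
    """Least-squares channel estimate from known samples and the matching received samples."""
    h_hat, *_ = np.linalg.lstsq(conv_matrix(known, n_taps), r[:len(known)], rcond=None)
    return h_hat

def qam4_decide(z):
    """Hard decision onto unit-power 4-QAM points."""
    return (np.where(z.real >= 0, 1, -1) + 1j * np.where(z.imag >= 0, 1, -1)) / np.sqrt(2)

def run_trial(rng, n_sub=256, n_cp=32, n_pre=40, n_taps=10, snr_db=20.0, n_iter=3):
    """Return the channel-estimation squared error after 0, 1, ..., n_iter iterations."""
    # 10-tap channel, uniform power delay profile, unit average total energy.
    h = (rng.standard_normal(n_taps) + 1j * rng.standard_normal(n_taps)) / np.sqrt(2 * n_taps)
    pre = rng.choice([-1.0, 1.0], size=n_pre).astype(complex)   # pseudo-random preamble
    bits = rng.integers(0, 2, size=(2, n_sub)) * 2 - 1
    d = (bits[0] + 1j * bits[1]) / np.sqrt(2)                   # 4-QAM data
    x = np.fft.ifft(d) * np.sqrt(n_sub)                         # unit-power OFDM symbol
    s = np.concatenate([pre, x[-n_cp:], x])                     # preamble + CP + symbol
    sigma2 = 10 ** (-snr_db / 10)
    noise = np.sqrt(sigma2 / 2) * (rng.standard_normal(len(s)) + 1j * rng.standard_normal(len(s)))
    r = np.convolve(s, h)[:len(s)] + noise

    h_hat = ls_estimate(pre, r, n_taps)                         # "0 iteration": preamble only
    err = [np.sum(np.abs(h_hat - h) ** 2)]
    start = n_pre + n_cp
    for _ in range(n_iter):
        # Coherent detection with the current channel estimate.
        Y = np.fft.fft(r[start:start + n_sub]) / np.sqrt(n_sub)
        d_hat = qam4_decide(Y / np.fft.fft(h_hat, n_sub))
        # Regenerate the time-domain signal and build the extended preamble.
        x_hat = np.fft.ifft(d_hat) * np.sqrt(n_sub)
        extended = np.concatenate([pre, x_hat[-n_cp:], x_hat])
        h_hat = ls_estimate(extended, r, n_taps)
        err.append(np.sum(np.abs(h_hat - h) ** 2))
    return np.array(err)

def average_mse(n_trials=50, seed=0, **kwargs):
    """Mean squared estimation error per iteration, averaged over independent trials."""
    rng = np.random.default_rng(seed)
    return np.mean([run_trial(rng, **kwargs) for _ in range(n_trials)], axis=0)

if __name__ == "__main__":
    print(average_mse())  # entry 0 is the preamble-only estimate
```

The extended preamble is several times longer than the original 40 samples, so the least-squares fit averages the noise over many more observations; occasional decision errors add some residual error, which is why iterating can help.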

VII. Conclusion
A new ML-OFDM system supporting multicell cooperative networks is proposed in this article. Besides the basic function of the BL as a multiple access technique for the cellular users, multiple functionalities for multicell cooperation purposes are derived from the overlaid ELs. By encoding the required information into these layers, cooperation-related parameters can be transmitted concurrently with the cellular users' data, and therefore the complicated procedure of establishing a backhaul network or additional signaling can be substantially simplified. Cross BS synchronization can also be obtained by the proposed EL, which significantly reduces the complexity in the design of signal transmission strategies for multicell cooperation. The corresponding transceiver is designed for the proposed ML-OFDM system based on the proposed multi-layer interference cancelation. A practical power distribution scheme is also proposed to optimize the overall system performance. The performance of the ML-OFDM system is theoretically analyzed and verified through computer simulations. With the ML-OFDM platform, multicell cooperation as well as various wireless demands can be met more efficiently and flexibly.

Figure 9 Mean square error of channel estimation using the proposed iterative decision-directed scheme. A slot transmission mode is assumed where a preamble signal is periodically multiplexed with data-carrying OFDM symbols for initial synchronization and channel estimation purposes. A pseudo random sequence with 40 samples is used as the preamble signal. A 10-tap multipath channel with normalized uniform average power delay profile is used.

Endnotes
a For instance, the broadcast capabilities are achieved by allocating dedicated broadcast time slots in WiMAX [37,38].