### A. Error performance analysis of the ELs

As the proposed MCSK modulation is very similar to conventional \mathcal{M}-ary orthogonal signaling, the BER expression given in [33, Sec. 4.4-1] can be used to evaluate its performance. However, that expression has no closed form, so using the exact BER as a constraint would make the power distribution design computationally expensive. To keep the subsequent power distribution scheme efficient, a simple BER upper bound is derived in this subsection.

In order to evaluate the robustness of the proposed modulation scheme, the peak-to-noise ratio (PNR) of the correlation output is first analyzed. For analytical simplicity, the correlation output in (14) can be rewritten as

R\left(\phi \right)=\left\{\begin{array}{cc}\hfill \mathcal{A}+n\hfill & \hfill \mathsf{\text{if}}\phantom{\rule{2.77695pt}{0ex}}{O}_{m}={O}_{\phi}\hfill \\ \hfill n\hfill & \hfill \mathsf{\text{if}}\phantom{\rule{2.77695pt}{0ex}}{O}_{m}\ne {O}_{\phi}\hfill \end{array}\right.

(20)

where \mathcal{A}=\sqrt{{P}_{j}}{\sum}_{k=0}^{\mathcal{M}-1}{\left|{Z}_{{O}_{m}}^{\left(j\right)}\left(k\right)\right|}^{2}=\sqrt{{P}_{j}}\mathcal{M} and n={\sum}_{k=m\mathcal{M}}^{\left(m+1\right)\mathcal{M}-1}\left[X\left(k\right)+{W}^{\prime}\left(k\right)\right]{\left({Z}_{{O}_{\phi}}^{\left(j\right)}\left(k\right)\right)}^{*} denote the ideal peak gain and the associated interference and noise term, respectively. *n* can be considered a Gaussian random variable with the distribution,

n\sim \mathcal{C}\mathcal{N}\left(0,{\sigma}_{w}^{2}\right),

(21)

where {\sigma}_{w}^{2}=\mathcal{M}{\sigma}_{d}^{2}+{\sum}_{k=m\mathcal{M}}^{\left(m+1\right)\mathcal{M}-1}\frac{{\sigma}_{n}^{2}}{{\left|H\left(k\right)\right|}^{2}}. For analytical simplicity, we consider signal propagation environments in which one dominant path exists in the multipath channel and the maximum channel delay spread is short compared with the OFDM symbol duration. In this case, the channel frequency response has relatively low frequency selectivity and {\sigma}_{w}^{2} can be approximated by

{\sigma}_{w}^{2}\approx \mathcal{M}{\sigma}_{d}^{2}+\mathcal{M}\frac{{\sigma}_{n}^{2}}{{\left|H\left(0\right)\right|}^{2}}=\mathcal{M}{\sigma}_{d}^{2}\left(1+\gamma \right),

(22)

where \gamma \triangleq \frac{{\sigma}_{n}^{2}}{{\left|H\left(0\right)\right|}^{2}{\sigma}_{d}^{2}} is defined as the inverse SNR.
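As a rough numerical illustration of the approximation in (22), the following Python sketch compares the exact sum over one segment with the flat-channel form. It reads the noise term in (22) as {\sigma}_{n}^{2}/{\left|H\left(0\right)\right|}^{2}, matching the per-subcarrier weighting in the expression for {\sigma}_{w}^{2} above. The function names and the sample channel gains are assumptions made for illustration only.

```python
def sigma_w2_exact(sigma_d2, sigma_n2, h_abs2):
    """Exact variance: M*sigma_d^2 + sum_k sigma_n^2 / |H(k)|^2 over one segment.

    h_abs2 holds |H(k)|^2 for the M subcarriers of the segment.
    """
    m = len(h_abs2)
    return m * sigma_d2 + sum(sigma_n2 / h for h in h_abs2)


def sigma_w2_flat(m, sigma_d2, sigma_n2, h0_abs2):
    """Flat-channel approximation (22): M*sigma_d^2*(1 + gamma)."""
    gamma = sigma_n2 / (h0_abs2 * sigma_d2)  # inverse SNR
    return m * sigma_d2 * (1.0 + gamma)


if __name__ == "__main__":
    # Hypothetical, mildly frequency-selective gains around |H(0)|^2 = 1.0.
    gains = [1.0, 0.95, 1.05, 0.9, 1.1, 1.0, 0.98, 1.02]
    print(sigma_w2_exact(1.0, 0.1, gains))
    print(sigma_w2_flat(len(gains), 1.0, 0.1, gains[0]))
```

For a perfectly flat channel the two functions agree. The gap between them grows with the spread of the gains, which is why the approximation is restricted to channels with low frequency selectivity.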

Therefore the PNR of the correlation output can be represented by

\mathsf{\text{PN}}{\mathsf{\text{R}}}_{j}\triangleq \frac{{\mathcal{A}}^{2}}{{\sigma}_{w}^{2}}=\frac{{P}_{j}{\mathcal{M}}^{2}}{\mathcal{M}{\sigma}_{d}^{2}\left(1+\gamma \right)}.

(23)

It is worth mentioning that in the case of (9), where each segment on the EL is repeated \mathcal{R} times for robustness enhancement, the corresponding segments can be coherently combined prior to correlation detection. The correlation output in (14) can then be rewritten as

R\left(\phi \right)=\sqrt{{P}_{j}}\sum _{k=0}^{\mathcal{M}-1}{\left|{Z}_{{O}_{m}}^{\left(j\right)}\left(k\right)\right|}^{2}+\sum _{k=0}^{\mathcal{M}-1}\left[\overline{X\left(k\right)}+\overline{{W}^{\prime}\left(k\right)}\right]{\left({Z}_{{O}_{\phi}}^{\left(j\right)}\left(k\right)\right)}^{*}\phantom{\rule{2.77695pt}{0ex}}\phantom{\rule{2.77695pt}{0ex}}\mathsf{\text{if}}\phantom{\rule{2.77695pt}{0ex}}{O}_{m}={O}_{\phi}

(24)

where \overline{X\left(k\right)} is the averaged OFDM symbol with variance {\sigma}_{d}^{2}/\mathcal{R} and \overline{{W}^{\prime}\left(k\right)} is the averaged noise with variance {\sigma}_{w}^{2}/\mathcal{R}. The corresponding PNR then becomes

\mathsf{\text{PN}}{\mathsf{\text{R}}}_{j}=\frac{{P}_{j}{\mathcal{M}}^{2}}{\mathcal{M}{\sigma}_{d}^{2}\left(1+\gamma \right)}\cdot \mathcal{R}.

(25)
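The PNR expressions in (23) and (25) can be evaluated with a few lines of Python. The sketch below is only an illustration: the function names `pnr` and `to_db` and the sample parameter values are assumptions, and the repetition factor defaults to 1 so that (23) is recovered as a special case of (25).

```python
import math


def pnr(p_j, m, sigma_d2, gamma, r=1):
    """PNR of the correlation output for the j-th EL.

    p_j      : transmit power of the j-th EL
    m        : segment length (the script M in the text)
    sigma_d2 : variance of the OFDM data symbols
    gamma    : inverse SNR as defined after (22)
    r        : repetition factor; r=1 gives (23), r>1 gives (25)
    """
    return r * p_j * m**2 / (m * sigma_d2 * (1.0 + gamma))


def to_db(x):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(x)


if __name__ == "__main__":
    # Hypothetical operating point, chosen only to exercise the formula.
    for r in (1, 2, 4):
        print(r, to_db(pnr(p_j=0.01, m=64, sigma_d2=1.0, gamma=0.1, r=r)))
```

Each doubling of the repetition factor adds about 3 dB of PNR, at the cost of a proportionally lower EL data rate.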

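As can be seen from (20), the cyclic phase shift is detected correctly when the correlation peak at the matching shift exceeds each of the other \mathcal{M}-1 correlation outputs. This leads to the criterion \mathcal{A}>n+n for each of the \mathcal{M}-1 comparisons, where the two noise terms are independent.

Now consider a new variable *y* = 2*n*, understood as the sum of two independent noise terms distributed as *n*. Its probability density follows from the convolution of the two densities: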

\begin{array}{ll}\hfill {p}_{Y}\left(y\right)& =\underset{-\infty}{\overset{\infty}{\int}}{p}_{N}\left(n\right){p}_{N}\left(y-n\right)dn\phantom{\rule{2em}{0ex}}\\ =\underset{-\infty}{\overset{\infty}{\int}}\frac{1}{\sqrt{2\pi {\sigma}_{w}^{2}}}{e}^{-\frac{{n}^{2}}{2{\sigma}_{w}^{2}}}\frac{1}{\sqrt{2\pi {\sigma}_{w}^{2}}}{e}^{-\frac{{\left(y-n\right)}^{2}}{2{\sigma}_{w}^{2}}}dn\phantom{\rule{2em}{0ex}}\\ =\frac{1}{\sqrt{2\pi {\sigma}_{w}^{2}}}{e}^{-\frac{{y}^{2}}{2{\sigma}_{w}^{2}}}\underset{-\infty}{\overset{\infty}{\int}}\frac{1}{\sqrt{2\pi {\sigma}_{w}^{2}}}{e}^{-\frac{2{\left(n-y/2\right)}^{2}-{y}^{2}/2}{2{\sigma}_{w}^{2}}}dn\phantom{\rule{2em}{0ex}}\\ =\frac{1}{2\sqrt{\pi}{\sigma}_{w}}{e}^{-\frac{{y}^{2}}{4{\sigma}_{w}^{2}}}.\phantom{\rule{2em}{0ex}}\end{array}

(26)
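The closed form in (26) is a zero-mean Gaussian density with variance 2{\sigma}_{w}^{2}. It can be checked numerically by evaluating the convolution integral directly. The following Python sketch does this with a trapezoidal rule; the function names, integration range and step count are assumptions chosen for illustration, and the densities are treated as real-valued Gaussians, as in (26).

```python
import math


def gaussian_pdf(x, sigma):
    """Zero-mean real Gaussian density with standard deviation sigma."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / math.sqrt(2.0 * math.pi * sigma * sigma)


def sum_pdf_closed_form(y, sigma_w):
    """Last line of (26): Gaussian density with variance 2*sigma_w^2."""
    return math.exp(-y * y / (4.0 * sigma_w**2)) / (2.0 * math.sqrt(math.pi) * sigma_w)


def sum_pdf_numeric(y, sigma_w, half_width=10.0, steps=4000):
    """Trapezoidal evaluation of the convolution integral in the first line of (26)."""
    lo = -half_width * sigma_w
    hi = half_width * sigma_w
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        n = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * gaussian_pdf(n, sigma_w) * gaussian_pdf(y - n, sigma_w)
    return total * h


if __name__ == "__main__":
    for y in (0.0, 1.0, 3.0):
        print(y, sum_pdf_numeric(y, 1.0), sum_pdf_closed_form(y, 1.0))
```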

Correct detection requires y<\mathcal{A} in all \mathcal{M}-1 comparisons of the correlation output. Therefore, the false detection probability for one comparison is given by

\begin{array}{ll}\hfill {P}_{c}& =\underset{\mathcal{A}}{\overset{\infty}{\int}}\frac{1}{2\sqrt{\pi}{\sigma}_{w}}{e}^{-\frac{{y}^{2}}{4{\sigma}_{w}^{2}}}dy\phantom{\rule{2em}{0ex}}\\ =\mathcal{Q}\left(\frac{\mathcal{A}}{\sqrt{2}{\sigma}_{w}}\right)=\mathcal{Q}\left(\sqrt{\frac{\mathsf{\text{PNR}}}{2}}\right).\phantom{\rule{2em}{0ex}}\end{array}

(27)
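The per-comparison error probability in (27) can be cross-checked by simulation. The sketch below is a minimal Monte Carlo experiment under the same modeling assumption as (26), namely that *y* is the sum of two independent zero-mean real Gaussian terms with variance {\sigma}_{w}^{2}. The function names, trial count and seed are hypothetical choices, not part of the original analysis.

```python
import math
import random


def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def pairwise_error_closed_form(pnr):
    """P_c = Q(sqrt(PNR / 2)) as in (27), with PNR = A^2 / sigma_w^2."""
    return q_function(math.sqrt(pnr / 2.0))


def pairwise_error_monte_carlo(peak, sigma_w, trials=200_000, seed=0):
    """Estimate P(y > A), where y is the sum of two independent N(0, sigma_w^2) draws."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        y = rng.gauss(0.0, sigma_w) + rng.gauss(0.0, sigma_w)
        if y > peak:
            errors += 1
    return errors / trials


if __name__ == "__main__":
    peak, sigma_w = 2.0, 1.0
    print(pairwise_error_closed_form(peak**2 / sigma_w**2))
    print(pairwise_error_monte_carlo(peak, sigma_w))
```

The two printed values should agree to within the Monte Carlo sampling error, which shrinks as `trials` grows.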

The overall error probability of peak detection is then upper bounded by its union bound,

{P}_{f}=\left(\mathcal{M}-1\right){P}_{c}.

(28)

Assume that all the data bits in one data symbol **d**_{i} are equally likely. Since one symbol is composed of *k* bits, the BER of the *j* th EL is therefore upper bounded by

\begin{array}{ll}\hfill {P}_{b,j}& =\frac{{2}^{k-1}}{{2}^{k}-1}{P}_{f}\phantom{\rule{2em}{0ex}}\\ ={2}^{k-1}\mathcal{Q}\left(\sqrt{\frac{{P}_{j}\cdot \mathsf{\text{NPN}}{\mathsf{\text{R}}}_{j}}{2}}\right)\phantom{\rule{2em}{0ex}}\\ \le \frac{\mathcal{M}-1}{2}\text{exp}\left(-\frac{{P}_{j}\cdot \mathsf{\text{NPN}}{\mathsf{\text{R}}}_{j}}{4}\right),\phantom{\rule{2em}{0ex}}\end{array}

(29)

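where we define the normalized PNR of the *j* th EL as \mathsf{\text{NPN}}{\mathsf{\text{R}}}_{j}=\mathsf{\text{PN}}{\mathsf{\text{R}}}_{j}/{P}_{j}, so that the product {P}_{j}\cdot \mathsf{\text{NPN}}{\mathsf{\text{R}}}_{j} equals the PNR that appears in (27).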
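To show how the two bounds in (29) compare, the following Python sketch evaluates both the \mathcal{Q}-function form and the looser exponential form. It takes the product {P}_{j}\cdot \mathsf{\text{NPN}}{\mathsf{\text{R}}}_{j} as a single argument, exactly as it appears in (29), and assumes \mathcal{M}={2}^{k}. The function and parameter names are illustrative only.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ber_union_bound(p_times_npnr, k):
    """Second line of (29): 2^(k-1) * Q(sqrt(P_j * NPNR_j / 2))."""
    return 2 ** (k - 1) * q_function(math.sqrt(p_times_npnr / 2.0))


def ber_exp_bound(p_times_npnr, k):
    """Third line of (29): (M - 1) / 2 * exp(-P_j * NPNR_j / 4), assuming M = 2^k."""
    m = 2 ** k
    return (m - 1) / 2.0 * math.exp(-p_times_npnr / 4.0)


if __name__ == "__main__":
    k = 4  # hypothetical number of bits per symbol
    for x_db in (10, 13, 16, 19):
        x = 10.0 ** (x_db / 10.0)
        print(x_db, ber_union_bound(x, k), ber_exp_bound(x, k))
```

The exponential form is looser but contains no special functions, which makes it convenient to invert for a required power level when it is used as a design constraint.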

### B. Capacity loss analysis of BL

The average channel capacity of the BL is derived in this subsection to study the impact of the ELs' transmission on the overall system performance. The effective received signal after interference cancellation can be represented as

\mathbf{Y}=\mathbf{X}\cdot \mathbf{H}+\mathbf{I}+\mathbf{W}.

(30)

Assume that least-squares or minimum mean square error estimators based on training sequences of length *N*_{t} are used for channel estimation in the proposed system. When *L*/*N*_{t} is sufficiently small, the residual interference term **I** in (30) is essentially uncorrelated with **W**. Therefore, **I** + **W** in (30) can be considered a zero-mean Gaussian vector with covariance matrix \left({\sum}_{i=1}^{K}\left({P}_{i}{\sigma}_{\Delta H}^{2}+2{P}_{e,i}{P}_{i}{\sigma}_{H}^{2}\right)+{\sigma}_{n}^{2}\right){\mathbf{I}}_{N}, where **I**_{N} is the identity matrix of order *N*. Based on the analysis of the average channel capacity for flat fading channels given in [34], we derive the average channel capacity of the BL, *C*_{BL}, by averaging the capacity of each subcarrier over all the subcarriers,

{C}_{BL}=\frac{1}{N}\sum _{k=0}^{N-1}\mathsf{\text{E}}\left[\text{log}\left(1+\frac{{P}_{B}\cdot {\left|{H}_{k}^{\prime}\right|}^{2}}{{P}_{B}{\sigma}_{\Delta H}^{2}+{\sum}_{i=1}^{K}\left({P}_{i}{\sigma}_{\Delta H}^{2}+2{P}_{e,i}{P}_{i}{\sigma}_{H}^{2}\right)+{\sigma}_{n}^{2}}\right)\right],

(31)

where *P*_{B} denotes the transmit power of the BL. For analytical simplicity, we introduce the zero-mean, unit-variance Gaussian random variable g\triangleq {H}_{k}^{\prime}/\sqrt{\mathsf{\text{Var}}\left({H}_{k}^{\prime}\right)}, so that (31) can be reformulated as

{C}_{BL}=\frac{1}{N}\sum _{k=0}^{N-1}\mathsf{\text{E}}\left[\text{log}\left(1+\frac{{P}_{B}\cdot \mathsf{\text{Var}}\left({H}_{k}^{\prime}\right){\left|g\right|}^{2}}{{P}_{total}{\sigma}_{\Delta H}^{2}+\sum _{i=1}^{K}2{P}_{e,i}{P}_{i}{\sigma}_{H}^{2}+{\sigma}_{n}^{2}}\right)\right],

(32)

where *P*_{total} = *P*_{B} + *P*_{E,total} represents the total transmit power of the overall system and {P}_{E,total}={\sum}_{i=1}^{K}{P}_{i} denotes the total transmit power of the ELs. Furthermore, when accurate channel estimation techniques are adopted, the normalized channel estimation error {\sigma}_{\Delta H}^{2}/{\sigma}_{H}^{2} is sufficiently small and \mathsf{\text{Var}}\left({H}_{k}^{\prime}\right) can be approximated by {\sigma}_{H}^{2}. The capacity can then be further approximated as

\begin{array}{ll}\hfill {C}_{BL}& \approx \frac{1}{N}\sum _{k=0}^{N-1}\mathsf{\text{E}}\left[\text{log}\left(1+\frac{{P}_{B}\cdot {\sigma}_{H}^{2}{\left|g\right|}^{2}}{{P}_{total}{\sigma}_{\Delta H}^{2}+\sum _{i=1}^{K}2{P}_{e,i}{P}_{i}{\sigma}_{H}^{2}+{\sigma}_{n}^{2}}\right)\right]\phantom{\rule{2em}{0ex}}\\ =\text{log}\left(1+\frac{{P}_{B}\cdot {\sigma}_{H}^{2}}{{P}_{total}{\sigma}_{\Delta H}^{2}+{\sum}_{i=1}^{K}2{P}_{e,i}{P}_{i}{\sigma}_{H}^{2}+{\sigma}_{n}^{2}}\right).\phantom{\rule{2em}{0ex}}\end{array}

(33)

Meanwhile, the upper bound of the BL's capacity, attained in the absence of the ELs' transmission, can be written as

{\bar{C}}_{BL}=\frac{1}{N}\sum _{k=0}^{N-1}\text{log}\left(1+\frac{{P}_{B}\cdot {\sigma}_{H}^{2}}{{P}_{B}{\sigma}_{\Delta H}^{2}+{\sigma}_{n}^{2}}\right).

(34)

By introducing the maximum allowed capacity loss Δ*C*, the ELs' transmission is enabled only if the following constraint is satisfied

{\bar{C}}_{BL}-{C}_{BL}\le \Delta C.

(35)

The above constraint is referred to as the BL's capacity loss constraint and can be reformulated as

{C}_{BL}\ge C,

where C\triangleq {\bar{C}}_{BL}-\Delta C. This constraint is essential for the ML-OFDM system design as it reflects the impact of the ELs' transmission on the BL. If the capacity loss exceeds what the BL can tolerate, no EL transmission is allowed. The constraint is therefore used in the power distribution scheme discussed in the following section.
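The capacity loss check built from (33), (34) and (35) can be sketched in Python as follows. This is an illustrative sketch only. The function names are hypothetical, the logarithm is assumed to be natural (the text does not state a base), and the values {P}_{e,i} are passed in as given, since they enter (31) as known quantities.

```python
import math


def bl_capacity(p_b, p_el, p_err, sigma_h2, sigma_dh2, sigma_n2):
    """Approximate BL capacity per (33), in nats (natural log assumed).

    p_el  : list of EL transmit powers P_i
    p_err : list of the corresponding P_{e,i} values
    """
    p_total = p_b + sum(p_el)
    residual = sum(2.0 * pe * p for pe, p in zip(p_err, p_el)) * sigma_h2
    denom = p_total * sigma_dh2 + residual + sigma_n2
    return math.log(1.0 + p_b * sigma_h2 / denom)


def bl_capacity_no_el(p_b, sigma_h2, sigma_dh2, sigma_n2):
    """Upper bound (34): BL capacity without any EL transmission."""
    return math.log(1.0 + p_b * sigma_h2 / (p_b * sigma_dh2 + sigma_n2))


def el_transmission_allowed(p_b, p_el, p_err, sigma_h2, sigma_dh2, sigma_n2, delta_c):
    """Capacity loss constraint (35): C_bar - C_BL <= delta_c."""
    c_bar = bl_capacity_no_el(p_b, sigma_h2, sigma_dh2, sigma_n2)
    c_bl = bl_capacity(p_b, p_el, p_err, sigma_h2, sigma_dh2, sigma_n2)
    return c_bar - c_bl <= delta_c


if __name__ == "__main__":
    # Hypothetical parameters: one EL carrying 5% of the BL power.
    print(el_transmission_allowed(1.0, [0.05], [0.01], 1.0, 0.01, 0.1, delta_c=0.05))
```

With an empty list of ELs the two capacity functions coincide, so the loss is zero and the constraint holds for any non-negative Δ*C*.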