For a given transmitter–receiver link with N subcarriers in one OFDM symbol, let
\mathit{X}_{i}={\left[X_{i}[0],X_{i}[1],\dots ,X_{i}[N-1]\right]}^{T}
(18)
be the amplitudes of the N subcarriers from the i th transmit antenna. At the transmitter, each X_{i} undergoes an \mathcal{L}N-point IDFT to produce the \mathcal{L}-times oversampled time domain baseband signal, expressed as
{\mathit{x}}_{i}=\mathit{\Gamma}{\mathit{X}}_{i}
(19)
where Γ is an \mathcal{L}N\times N DFT matrix with
\Gamma_{t,k}=\frac{1}{\sqrt{\mathcal{L}N}}e^{j\frac{2\pi kt}{\mathcal{L}N}},\quad t\in [0,\mathcal{L}N-1],\quad k\in [0,N-1].
(20)
Note that \mathcal{L} denotes the oversampling factor, taken sufficiently large to make the signal x_{i} as close as possible to the continuous signal [5]. The average power of the signal is defined as
E\left\{{\left|\mathit{x}_{i}\right|}^{2}\right\}=\frac{1}{\mathcal{L}N}\int_{0}^{\mathcal{L}N}{\left|x_{i}(t)\right|}^{2}\,dt=\sum_{n=0}^{N-1}{\left|X_{i}[n]\right|}^{2},
(21)
where E{·} denotes the expectation operation. The peak-to-average power ratio (PAPR) of the transmitted signal in (19) is defined as
\mathit{\text{PAPR}}_{i}=\frac{{\left\Vert \mathit{x}_{i}\right\Vert}_{\infty}^{2}}{E\left\{{\left|\mathit{x}_{i}\right|}^{2}\right\}},
(22)
where {\left\Vert \mathit{x}_{i}\right\Vert}_{\infty} is the infinity norm of the time domain signal. By this definition, the PAPR is the maximum instantaneous power normalized by the average power among all possible signal patterns.
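As an illustration of (19)–(22), the following minimal sketch applies the matrix Γ of (20) to a vector of subcarrier amplitudes and evaluates the PAPR. It uses only the Python standard library; the function names and the default \mathcal{L} = 4 are assumptions made for this example, and the average power is estimated here as the sample mean of |x_{i}[t]|^2 over the \mathcal{L}N samples, which is one way of evaluating the expectation in (22).

```python
import cmath
import math


def oversampled_idft(X, L=4):
    """Apply the LN x N matrix of (20): x[t] = sum_k X[k] e^{j 2 pi k t / (LN)} / sqrt(LN)."""
    N = len(X)
    LN = L * N
    scale = 1.0 / math.sqrt(LN)
    return [
        scale * sum(X[k] * cmath.exp(2j * math.pi * k * t / LN) for k in range(N))
        for t in range(LN)
    ]


def papr(x):
    """Peak instantaneous power divided by the sample-mean power, following (22)."""
    powers = [abs(v) ** 2 for v in x]
    return max(powers) / (sum(powers) / len(powers))


def papr_db(x):
    """PAPR expressed in decibels."""
    return 10.0 * math.log10(papr(x))


# All subcarriers in phase add coherently at t = 0, so the PAPR equals N.
x_worst = oversampled_idft([1.0] * 8, L=4)
print(papr(x_worst))       # approximately 8

# A single active subcarrier has constant envelope, so the PAPR is 1 (0 dB).
x_tone = oversampled_idft([0.0, 1.0, 0.0, 0.0], L=4)
print(papr(x_tone))        # approximately 1
```

The direct matrix product costs O(\mathcal{L}N^2) operations and is written this way only to mirror (19) and (20); a zero-padded FFT would normally be used instead.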
To avoid nonlinear distortion in the power amplifiers and, in turn, the generation of undesired out-of-band radiation, the PAPR of all N_{t} transmit signals should be simultaneously as small as possible [20]. Since the performance is governed by the worst-case PAPR, we define PAPR_{MIMO} as the maximum of the PAPRs over all N_{t} MIMO paths [4, 20]. Thus,
\mathit{\text{PAP}}{R}_{\mathit{\text{MIMO}}}=\underset{i=1,\dots ,{N}_{t}}{\text{max}}\mathit{\text{PAP}}{R}_{i}.
(23)
Note that the PAPR is a random variable, and a suitable description is its complementary cumulative distribution function (CCDF), which gives the probability υ_{o} of exceeding a specified threshold γ, i.e.,
{\upsilon}_{o}=\mathit{\text{Pr}}\phantom{\rule{0.3em}{0ex}}[\mathit{\text{PAPR}}>\gamma ].
(24)
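The definitions in (23) and (24) can be sketched as follows. The PAPR values in the example are hypothetical numbers chosen only to exercise the functions, and the CCDF is estimated empirically as the fraction of observed OFDM symbols whose worst-case PAPR exceeds γ.

```python
def papr_mimo(papr_per_antenna):
    """Worst-case PAPR over the transmit antennas, as in (23)."""
    return max(papr_per_antenna)


def empirical_ccdf(samples, gamma):
    """Fraction of samples exceeding the threshold gamma, an estimate of (24)."""
    return sum(1 for s in samples if s > gamma) / len(samples)


# Hypothetical linear-scale PAPR values: one row per OFDM symbol,
# one column per transmit antenna (N_t = 2).
trials = [[4.0, 6.5], [7.2, 5.1], [3.9, 4.4], [8.8, 6.0]]

worst = [papr_mimo(row) for row in trials]   # [6.5, 7.2, 4.4, 8.8]
print(empirical_ccdf(worst, 7.0))            # 0.5
```

In a simulation, the rows would come from many randomly generated OFDM symbols, and the estimate would be evaluated over a range of thresholds to trace the CCDF curve.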
In the following, we will demonstrate some techniques that can be used for PAPR reduction of OFDM signals.
PAPR reduction using phase information of the pilot symbols and TR techniques
For an OFDM symbol that consists of unused, pilot, and data subcarriers, the phase information of the pilot tones together with a certain number of unused subcarriers can be utilized to mitigate the problem of high PAPR in OFDM systems. Careful design of the phase information of the pilot symbols can substantially reduce the peak levels of the time domain OFDM signals [21]. Alongside the pilot phase design, the tone reservation (TR) technique can be utilized: it makes use of some reserved or unused subcarriers and inserts dummy symbols on them that minimize the peak levels of the sampled time domain OFDM signals.
In this article, a method that combines the TR technique with the phase information of the pilot symbols to efficiently reduce the PAPR of MIMO–OFDM signals is presented. First, transmitters that reduce the PAPR by utilizing the phase information of the pilot symbols are introduced. We adopt the techniques proposed in [21] to design the phase information of the pilot symbols, which are primarily dedicated to channel estimation (see Section IV and [3]), so as to reduce the PAPR of an OFDM symbol. Then, the TR technique is employed to design dummy symbols that effectively reduce the PAPR. The optimal power distribution and phase information of the dummy symbols are determined by solving a convex optimization problem.
Let \mathcal{K}_{v} be the set of dummy symbols for each transmit antenna link, and denote the number of dummy symbols by N_{v}=\left|\mathcal{K}_{v}\right|. Suppose that we define Γ_{d}, Γ_{p_{i}} and Γ_{v} as the DFT submatrices of Γ corresponding to the \mathcal{K}_{d}, \mathcal{K}_{p_{i}} and \mathcal{K}_{v} subcarriers, respectively. Then, we can decompose the expression in (19) as
{\mathit{x}}_{i}={\mathit{\Gamma}}_{d}{\mathit{X}}_{{d}_{i}}+{\mathit{\Gamma}}_{{p}_{i}}{\mathit{X}}_{{p}_{i}}+{\mathit{\Gamma}}_{v}{\mathit{X}}_{{v}_{i}},
(25)
where \mathit{X}_{d_{i}}, \mathit{X}_{p_{i}} and \mathit{X}_{v_{i}} are the vectors containing the data, pilot, and dummy symbols, respectively. In the following sections, we discuss the techniques used to counteract the PAPR problem.
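The decomposition in (25) follows from the linearity of the transform: selecting the columns of Γ that belong to each index set and summing the three partial signals reproduces the full time domain signal. A minimal numerical check is sketched below; the index sets and symbol values are hypothetical and chosen only for illustration.

```python
import cmath
import math


def partial_idft(symbols, carriers, N, L=4):
    """Time domain contribution of `symbols` placed on the subcarrier indices
    `carriers`, i.e. the product of the selected columns of Gamma with the symbols."""
    LN = L * N
    scale = 1.0 / math.sqrt(LN)
    return [
        scale * sum(s * cmath.exp(2j * math.pi * k * t / LN)
                    for s, k in zip(symbols, carriers))
        for t in range(LN)
    ]


N = 16
K_v = [0, 15]                # hypothetical dummy (reserved) subcarriers
K_p = [2, 6, 10, 14]         # hypothetical pilot subcarriers
K_d = [k for k in range(N) if k not in K_v + K_p]   # remaining data subcarriers

# Arbitrary complex amplitudes, used only to test the identity.
X = [complex((-1) ** k, (k % 3) - 1) for k in range(N)]

x_full = partial_idft(X, range(N), N)
parts = [partial_idft([X[k] for k in K], K, N) for K in (K_d, K_p, K_v)]
x_sum = [sum(vals) for vals in zip(*parts)]

max_err = max(abs(a - b) for a, b in zip(x_full, x_sum))
print(max_err < 1e-12)       # True
```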
Pilot phase design for PAPR reduction
For channel or carrier frequency offset estimation, it is sufficient to design the placement and power of each pilot symbol, and there are no particular requirements on the phase information. Thus, we can utilize the phase information of the pilot symbols to achieve a reasonable PAPR reduction. We consider a set of frequency domain pilot symbols designed for channel estimation of the received OFDM symbol from the i th transmit antenna and design phase information that lowers the peak amplitudes of the pilot symbols in the time domain.
From Equation (25), for a given set of pilot symbols in frequency domain, the corresponding time domain representation of the pilot symbols with phase information can be written as
{\mathit{x}}_{{p}_{i}}={\mathit{\Gamma}}_{{p}_{i}}\text{diag}\left({\mathit{X}}_{{p}_{i}}\right){e}^{j{\varphi}_{\mathit{i}}},
(26)
where ϕ_{i} is an N_{p}×1 vector containing the phase information of the pilot symbols from the i th transmit antenna link.
From (26), we can write the peak minimization problem as
\underset{\varphi_{i}}{\min}{\left\Vert \mathit{x}_{p_{i}}\right\Vert}_{\infty}=\underset{\varphi_{i}}{\min}{\left\Vert \mathit{\Gamma}_{p_{i}}\text{diag}\left(\mathit{X}_{p_{i}}\right)e^{j\varphi_{i}}\right\Vert}_{\infty}
(27)
which is the minimization of the maximum amplitude of the time domain signal \mathit{x}_{p_{i}}.
Phase design to reduce the PAPR is a nonconvex and nonlinear optimization problem [21]. This nonconvex problem is addressed in [22], where an exhaustive search is employed to design the phase information of the pilot symbols. However, exhaustive search becomes computationally prohibitive, especially for pilot sets with a large number of subcarriers, and its performance depends on the search granularity. In [21], the phase information of the pilot symbols is obtained by CE optimization techniques. Compared to the exhaustive search, the algorithm in [21] converges quickly to a near-optimal solution, which makes it a practical candidate for phase design in different applications. Thus, we resort to the latter approach and slightly modify it for the design of pilot phase information when multiple transmit antennas are employed, while limiting the number of iterations without improvement.
CE-based phase optimization techniques
We utilize the CE optimization method, which samples candidate phases at random, to design the phase information of a given set of pilot symbols so as to reduce the PAPR of the time domain pilot symbols. In most practical cases, the power loaded on the pilot symbols in the frequency domain is higher than the power loaded on the data subcarriers. Thus, by lowering the peak levels of the pilot symbols, the PAPR of the whole OFDM symbol can be slightly reduced.
The basic idea behind the CE method is to transform the original (combinatorial) optimization problem to an associated stochastic optimization problem, and then to tackle the stochastic problem efficiently by an adaptive sampling algorithm. By doing so, one constructs a random sequence of solutions which converges (probabilistically) to the optimal or at least a reasonable solution. Once the associated stochastic optimization is defined, the CE method alternates the following two phases:

1. Generation of a sample of random data according to a specified random mechanism.

2. Update of the parameters of the random mechanism, on the basis of the sample data, in order to produce a better sample in the next iteration.
From the problem formulated in (27), if we generate M sample vectors of random phases, then the optimization problem can be expressed as
\mathit{\Psi}\left(\varphi_{i}\right)=\underset{\varphi_{i}\in \mathit{\Phi}}{\min}{\left\Vert \mathit{x}_{p_{i}}\right\Vert}_{\infty}=\underset{\varphi_{i}\in \mathit{\Phi}}{\min}{\left\Vert \mathit{\Gamma}_{p_{i}}\text{diag}\left(\mathit{X}_{p_{i}}\right)e^{j\varphi_{i}}\right\Vert}_{\infty}.
(28)
Once the samples have been generated, the next step is to use them to modify the parameters of the random mechanism, in order to produce a better sample in the next iteration. The method focuses on the observations with the best objective function values in order to bias the sampling process.
The proposed CE method employs M samples, the cutoff point ρ for high-quality observations, the smoothing constant α for updating the mean and standard deviation of the samples, the limit \mathcal{Z} on the number of iterations without improvement, the limit \mathcal{J} on the total number of iterations, and the limit ε on the standard deviation.
Algorithm 1. CE algorithm
1: Initialize \widehat{\mu}_{0} and \widehat{\sigma}_{0} and the CE parameters α, β and ρ.
2: Initialize \mathit{\Psi}\left(\varphi^{\ast}\right)={\left\Vert \mathit{x}_{p}^{0}\right\Vert}_{\infty} and set the iteration counters t = 0 and t^{\prime}=0.
3: while (t^{\prime}<\mathcal{Z} and t<\mathcal{J} and \underset{k}{\text{max}}\left(\widehat{\sigma}_{k,t}\right)>\epsilon)
{
4: Generate a random sample ϕ_{1},…,ϕ_{M} from the distributions \mathcal{N}(\widehat{\mu}_{k,t-1},\widehat{\sigma}_{k,t-1}^{2}).
5: Compute Ψ(ϕ) and order the sample such that Ψ(ϕ^{1})≤Ψ(ϕ^{2})≤…≤Ψ(ϕ^{M}).
6: Select the index set \mathcal{I} of the M_{\ell} best performing samples, and calculate their mean and standard deviation: \mu_{k,t}=\frac{1}{M_{\ell}}\sum_{i\in \mathcal{I}}\varphi_{k,i} and \sigma_{k,t}^{2}=\frac{1}{M_{\ell}}\sum_{i\in \mathcal{I}}{(\varphi_{k,i}-\mu_{k,t})}^{2}.
7: Smooth the mean and standard deviation of the best performing samples using \widehat{\mu}_{k,t}=\alpha \mu_{k,t}+(1-\alpha )\widehat{\mu}_{k,t-1} and \widehat{\sigma}_{k,t}=\alpha \sigma_{k,t}+(1-\alpha )\widehat{\sigma}_{k,t-1} for k = 0,…,N_{p}−1.
8: if Ψ(ϕ^{∗})<Ψ(ϕ^{1}) then
9: t^{\prime}\leftarrow t^{\prime}+1
10: else
11: Ψ(ϕ^{∗})=Ψ(ϕ^{1}) and t^{\prime}=0
12: end if
13: Increment t\leftarrow t+1
14: }
The pseudocode of the CE method is presented in Algorithm 1. The algorithm starts by initializing the mean and standard deviation as μ_{0} and σ_{0}, respectively. The best objective function value is initialized to {\left\Vert \mathit{x}_{p}^{0}\right\Vert}_{\infty}, which is the peak amplitude of the zero-phase pilot symbols, and the iteration counters t (total number of iterations) and t^{\prime} (iterations without improvement) are initialized to zero (see steps 1–2 in the pseudocode). The main loop includes the two main tasks that every CE method must perform, namely, the generation of a sample and the updating of the parameters associated with the chosen probability distribution. Step 4 generates M sample vectors {\left\{\Phi_{m}\right\}}_{m=1}^{M} of length N_{p} using a family of normal probability density functions (PDFs) \mathcal{N}(\mu_{k},\sigma_{k}^{2}) for k = 0,…,N_{p}−1.
The ordering of the sample in step 5 is such that the best observation is placed in the first position of the list and the worst in the last position. The mean and standard deviation values calculated in step 6 correspond to the variables of the top M_{ℓ} solutions in the current sample. Note that the fixed number M_{ℓ} of best performing samples, referred to as the elite samples, is given by M_{ℓ}=ρM. In step 7, we obtain the smoothed mean \widehat{\mu}_{k,t} and standard deviation \widehat{\sigma}_{k,t} by using a fixed smoothing parameter α, where 0<α<1.
Step 9 increments the counter of iterations without improvement, whereas step 11 updates the best solution found and resets that counter. The global iteration counter is updated in step 13. Note that the mean \widehat{\mu}_{k,t} converges to ϕ^{∗} and the standard deviation \widehat{\sigma}_{k,t} converges to zero. In brief, we obtain a degenerate PDF with all mass concentrated in the vicinity of the vector ϕ^{∗}.
At each stage t of the CE procedure, we draw a sample ϕ from an \mathcal{N}(\widehat{\mu}_{t-1},\widehat{\sigma}_{t-1}^{2}) distribution and update \widehat{\mu}_{t} and \widehat{\sigma}_{t} from the M_{ℓ} best samples.
The pseudocode in Algorithm 1 is faster than the CE method proposed in [21], as it limits the number of iterations without improvement. The algorithm can be utilized to design the pilot phase information for all transmit antennas by repeating the same procedure for each set of pilot symbols.
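The CE procedure described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: the parameter defaults (M, ρ, α, \mathcal{Z}, \mathcal{J}, ε), the initial mean of zero and initial standard deviation of π, the pilot positions, and the function names are all assumptions made for the example. The loop continues while the largest standard deviation is still above ε, in line with the remark that the standard deviation shrinks toward zero as the search converges.

```python
import cmath
import math
import random


def pilot_peak(pilots, positions, phases, N, L=4):
    """Objective of (27): peak amplitude of the time domain pilot signal."""
    LN = L * N
    scale = 1.0 / math.sqrt(LN)
    rotated = [p * cmath.exp(1j * ph) for p, ph in zip(pilots, phases)]
    return max(
        abs(scale * sum(s * cmath.exp(2j * math.pi * k * t / LN)
                        for s, k in zip(rotated, positions)))
        for t in range(LN)
    )


def ce_phase_search(pilots, positions, N, L=4, M=30, rho=0.2, alpha=0.7,
                    Z=8, J=40, eps=1e-3, seed=0):
    """Cross-entropy style search over pilot phases (all defaults are assumed values)."""
    rng = random.Random(seed)
    Np = len(pilots)
    mu = [0.0] * Np                  # assumed initial means
    sigma = [math.pi] * Np           # assumed initial standard deviations
    best_phi = [0.0] * Np
    best_val = pilot_peak(pilots, positions, best_phi, N, L)   # zero-phase peak
    n_elite = max(1, round(rho * M))                           # M_l = rho * M
    t = stall = 0
    while stall < Z and t < J and max(sigma) > eps:
        # Step 4: draw M phase vectors, one Gaussian per pilot.
        sample = [[rng.gauss(mu[k], sigma[k]) for k in range(Np)] for _ in range(M)]
        # Step 5: evaluate the objective and sort, best (lowest peak) first.
        scored = sorted(((pilot_peak(pilots, positions, phi, N, L), phi)
                         for phi in sample), key=lambda pair: pair[0])
        elite = [phi for _, phi in scored[:n_elite]]
        # Steps 6-7: elite statistics, then smoothing with alpha.
        for k in range(Np):
            m = sum(phi[k] for phi in elite) / n_elite
            s = math.sqrt(sum((phi[k] - m) ** 2 for phi in elite) / n_elite)
            mu[k] = alpha * m + (1 - alpha) * mu[k]
            sigma[k] = alpha * s + (1 - alpha) * sigma[k]
        # Steps 8-12: count stalls, or keep the new best and reset the counter.
        if best_val < scored[0][0]:
            stall += 1
        else:
            best_val, best_phi = scored[0]
            stall = 0
        t += 1                        # Step 13
    return best_phi, best_val


# Hypothetical example: 8 equal-power pilots, equally spaced among N = 32 subcarriers.
N = 32
positions = [2, 6, 10, 14, 18, 22, 26, 30]
pilots = [1.0] * len(positions)
zero_phase_peak = pilot_peak(pilots, positions, [0.0] * len(pilots), N)
best_phi, best_val = ce_phase_search(pilots, positions, N)
print(zero_phase_peak, best_val)
```

For multiple transmit antennas, the same function would simply be called once per pilot set, as described in the text. By construction, the returned peak never exceeds the zero-phase peak, because the zero-phase solution is the initial incumbent.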
TR for PAPR reduction
The TR technique is one of the promising approaches for PAPR reduction in OFDM systems [5, 23–25], where the transmitter mitigates the PAPR problem by sending dummy symbols (i.e., symbols not conveying information) on some reserved subcarriers [26]. The advantage of TR-based schemes is that no specific PAPR reduction information needs to be communicated to the receiver. However, one problem with TR techniques lies in the computationally efficient determination of dummy symbols that effectively minimize the PAPR [23]. The amount of PAPR reduction depends on factors such as the location of the dummy symbols, the number of dummy symbols, and the power allowed on these dummy symbols.
Some issues to be considered before using the TR techniques include PAPR reduction capacity, power increase in transmit signals and the loss in data rate. In the proposed TR technique, reduction in PAPR is achieved at the expense of increasing the total transmitted power of an OFDM symbol.
To reduce the PAPR we need to minimize {\left\Vert \mathit{x}_{i}\right\Vert}_{\infty}, which is the peak of the signal x_{i} in (25). The peak minimization problem can be written as
\underset{\mathit{X}_{v_{i}}}{\min}{\left\Vert \mathit{x}_{i}\right\Vert}_{\infty}=\underset{\mathit{X}_{v_{i}}}{\min}{\left\Vert \mathit{\Gamma}_{d}\mathit{X}_{d_{i}}+\mathit{\Gamma}_{p_{i}}\mathit{X}_{p_{i}}+\mathit{\Gamma}_{v}\mathit{X}_{v_{i}}\right\Vert}_{\infty}.
(29)
Note that, for simplicity, we assume that the same set of null subcarriers is reserved for PAPR reduction on every transmit antenna link, whereas the power and phase information differ from link to link, i.e., \mathcal{K}_{v}=\mathcal{K}_{v_{1}}=\mathcal{K}_{v_{2}}=\dots =\mathcal{K}_{v_{N_{t}}}, and \mathit{X}_{v_{1}}\ne \mathit{X}_{v_{2}}\ne \dots \ne \mathit{X}_{v_{N_{t}}}.
For an OFDM symbol with N_{G} null edge (guard) subcarriers, we need to select a set \mathcal{K}_{v} of subcarriers and optimally allocate the power as well as the phase information to ensure that the designed dummy symbols \mathit{X}_{v_{i}} significantly reduce the PAPR of each transmit antenna link.
The placement of the dummy symbols
In practice, a considerable portion of the subcarriers is reserved as guard subcarriers to avoid interference from neighboring communication channels. In the proposed method, we utilize some of these guard subcarriers for PAPR reduction. Optimal placement can improve the efficiency of the dummy symbols in reducing the PAPR. To obtain the optimal locations of the dummy symbols for a fixed number of guard subcarriers N_{G} and a given number of dummy symbols N_{v}, an exhaustive search can be utilized [27]. In [27], it has been shown that scattered dummy symbols with maximum distance between adjacent dummy symbols outperform contiguous placement. However, exhaustive methods for obtaining the optimal placement are not tractable for OFDM symbols with a large number of subcarriers. For example, in the IEEE 802.16e standard an OFDM symbol consists of 256 subcarriers, out of which 55 are nulled at the edges of the block. Suppose 16 subcarriers out of 55 are to be used for PAPR reduction; then there are \left(\begin{array}{c}55\\ 16\end{array}\right)\approx 2.97495\times 10^{13} possible combinations. Furthermore, the optimal placement of dummy symbols to minimize the PAPR of an OFDM symbol depends on various factors, such as the information loaded on the active subcarriers, the power distribution over the dummy symbols, and the phase information of these dummies. This calls for an online update of the placement for each transmitted OFDM symbol, which increases the complexity, especially for a system with a large number of guard subcarriers. Since there is no closed-form expression relating the placement of the dummy symbols to the PAPR, we propose a simple symmetrical placement of the dummy symbols with maximum distance between adjacent dummy symbols. Then, we introduce separate algorithms for power loading and phase information.
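The size of the exhaustive search space in the IEEE 802.16e example can be checked directly with the standard library (the variable name is chosen for this example only):

```python
import math

# Number of ways to choose 16 dummy positions out of 55 guard subcarriers.
n_placements = math.comb(55, 16)
print(f"{n_placements:.3e}")   # on the order of 3e13
```

Even at one million candidate placements evaluated per second, a search of this size would take roughly a year per OFDM symbol, which is why a fixed placement rule is attractive.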
The maximum distance between the adjacent dummy symbols is given by
d=\left\lfloor \frac{N_{G}}{N_{v}-1}\right\rfloor
(30)
where ⌊c⌋ rounds c to the nearest integer less than or equal to c. The dummy symbols are located at \left[\frac{N_{G}-(2\mathit{a}-1)d}{2},\;N-\frac{N_{G}-(2\mathit{a}-1)d}{2}\right], where \mathit{a}=\left[\frac{N_{v}}{2},\frac{N_{v}}{2}-1,\dots ,1\right]. For N_{v} = 16 (with N = 256 and N_{G} = 55 as in the example above, so that d = 3), the dummy symbols are located at {5,8,11,14,17,20,23,26,230,233,236,239,242,245,248,251}.
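The symmetric placement rule can be evaluated with a short function. This is a sketch under two assumptions that the text does not state: N_{v} is even, and any fractional position is rounded down. The function name is hypothetical.

```python
def dummy_positions(N, N_G, N_v):
    """Spacing d from (30) and the symmetric dummy positions.

    Assumes an even N_v; fractional positions (if any) are floored.
    """
    d = N_G // (N_v - 1)
    # Left-hand positions for a = N_v/2, N_v/2 - 1, ..., 1.
    left = [(N_G - (2 * a - 1) * d) // 2 for a in range(N_v // 2, 0, -1)]
    # Right-hand positions mirror the left-hand ones about N.
    right = [N - p for p in reversed(left)]
    return d, left + right


d, positions = dummy_positions(N=256, N_G=55, N_v=16)
print(d)           # 3
print(positions)   # [5, 8, 11, 14, 17, 20, 23, 26, 230, 233, 236, 239, 242, 245, 248, 251]
```

With N = 256, N_{G} = 55 and N_{v} = 16, the rule yields eight positions on each side, spaced d = 3 subcarriers apart.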
Note that, it is also possible to start from the edges by placing the dummy symbols on right and left ends of the guard band subcarriers. In the following subsection we will introduce methods that can be used to load power and phase to the dummy subcarriers.
Convex optimization-based TR techniques
The main differences among TR techniques lie in the selection of the (convex) cost function, the possible constraint set, and the algorithm used to obtain an optimal solution. In [26], a TR technique that uses the adaptive projected subgradient method to obtain dummy symbols that minimize the PAPR of each symbol is proposed, while in [28], a subgradient optimization-based framework for iterative PAPR reduction is proposed. Both [26] and [28] utilize iterative algorithms to obtain the peak-canceling symbols. The accuracy of the iterative methods depends on the number of iterations and the selection of the updating parameters. The approach in [28] minimizes the peak magnitude of the OFDM symbol vector by using tone values of the reserved subcarriers, which are iteratively updated through a subgradient search. The algorithms have very simple update rules and low computational complexity. However, the number of updates required to reach a satisfactory peak level tends to be high. Convex optimization techniques for PAPR reduction are also addressed in [29] and the references therein.
In our proposed TR technique, we employ the convex optimization package (cvx) [30] to efficiently design the dummy symbols (or peak-canceling symbols) under certain constraints without employing such iterative methods.
From the objective function in (29), it is desired to optimize the dummy symbols \mathit{X}_{v_{i}} to reduce the peaks of the time domain signal x_{i} to the lowest possible amplitude level. By combining the pilot phase information in (26), our objective function can be expressed as
\Omega \left(\mathit{X}_{v_{i}}\right)=\underset{\mathit{X}_{v_{i}}}{\min}{\left\Vert \mathit{\Gamma}_{d}\mathit{X}_{d_{i}}+\mathit{\Gamma}_{p_{i}}\text{diag}\left(\mathit{X}_{p_{i}}\right)e^{j\varphi_{i}}+\mathit{\Gamma}_{v}\mathit{X}_{v_{i}}\right\Vert}_{\infty}.
(31)
Only \mathit{X}_{v_{i}} is allowed to change; thus we need to introduce constraints that describe the desired characteristics of the convex set of admissible \mathit{X}_{v_{i}}, as well as signal amplitude constraints that limit the PAPR to an acceptable level.
Practically, one cannot select arbitrary values for \mathit{X}_{v_{i}}, since they must obey the power spectral density (PSD) constraints imposed by the standards of different applications for spectral compatibility reasons (see [5, 26, 28]). Therefore, the average power levels of the reserved tones are constrained by the PSD mask levels. Apart from the PSD constraints in the frequency domain, we can add a signal peak level reduction requirement in the time domain as well. Thus, we can write the peak minimization problem as the constrained optimization problem
\begin{array}{cl}\underset{\mathit{X}_{v_{i}}}{\text{minimize}}& \Omega \left(\mathit{X}_{v_{i}}\right)\\ \text{subject to}& \left|X_{v_{i}}\right|\le \chi ,\\ & \Omega \left(\mathit{X}_{v_{i}}\right)\le \beta {\left\Vert \mathit{\Gamma}_{d}\mathit{X}_{d_{i}}\right\Vert}_{\infty}\end{array}
(32)
where χ is the PSD mask level constraint in frequency domain and β is a fraction value representing the target peak level to be attained. From the constraints above, the phase of the dummy symbols can either be 0° or 180°.
The convex optimization problem in (32) can be solved efficiently by using the cvx optimization package [30]. Note that the TR approach does not require any modifications at the receiver; the receiver can easily recover the transmitted data symbols by discarding the subcarriers loaded with dummy symbols.
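A min-max problem of the form (32) is commonly handed to a convex solver in epigraph form, in which an auxiliary real variable bounds every time domain sample. The formulation below is a sketch of that standard rewriting, not a statement of how [30] is invoked in this work; the auxiliary variable τ and the sample index n are introduced here for illustration only.

```latex
\begin{array}{cl}
\underset{\mathit{X}_{v_{i}},\,\tau}{\text{minimize}} & \tau \\
\text{subject to} & \left| \left[ \mathit{\Gamma}_{d}\mathit{X}_{d_{i}}
  + \mathit{\Gamma}_{p_{i}}\text{diag}\left(\mathit{X}_{p_{i}}\right)e^{j\varphi_{i}}
  + \mathit{\Gamma}_{v}\mathit{X}_{v_{i}} \right]_{n} \right| \le \tau ,
  \quad n = 0,\dots ,\mathcal{L}N-1, \\
& \left| X_{v_{i}}[k] \right| \le \chi , \quad k \in \mathcal{K}_{v}, \\
& \tau \le \beta {\left\Vert \mathit{\Gamma}_{d}\mathit{X}_{d_{i}} \right\Vert}_{\infty}
\end{array}
```

Each magnitude constraint is a second-order cone constraint in the real and imaginary parts of \mathit{X}_{v_{i}} and in τ, so the problem is a second-order cone program. The last constraint makes the problem infeasible when the target fraction β cannot be met under the PSD mask χ.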