Low PAPR Reference Signal Transceiver Design for 3GPP 5G NR Uplink

Low peak-to-average-power ratio (PAPR) transmissions significantly improve cell coverage because they enable high-power transmissions without saturating the power amplifier. A new modulation scheme, namely π/2-BPSK, was introduced in the Rel-15 3GPP 5G NR specifications to support low PAPR transmissions using the DFT-spread-OFDM waveform in the uplink. To enable data demodulation with this modulation scheme, Zadoff-Chu sequences are used as reference signals. However, the PAPR of Zadoff-Chu sequences is higher than that of the π/2-BPSK data. Therefore, even though the data transmissions have low PAPR, the high PAPR of the reference signal limits the uplink cell coverage of the Rel-15 3GPP 5G NR design. In this paper, we propose a transceiver design that minimizes the PAPR of the reference signals to avoid this issue. We show via simulations that the proposed architecture results in more than 2 dB PAPR reduction when compared to the existing design. In addition, when multiple-stream transmission is supported, we show that the PAPR of the reference signal transmission remains the same for any stream (also referred to as a baseband antenna port in 3GPP terminology) when the proposed transceiver design is employed, which is not the case for the current 3GPP 5G NR design.


I. INTRODUCTION
For a cellular network, uplink transmissions define the coverage area. This is because the transmission power in the uplink is limited to 23 dBm at the user equipment (UE) owing to hardware limitations (such as battery size) and regulatory constraints, as opposed to 43 dBm at the base station in the downlink [1]. This limited uplink transmission power must therefore be used carefully to enhance cell coverage without incurring the CAPEX/OPEX costs of deploying more cell sites. The uplink design of a cellular standard is thus crucial in enabling uplink transmissions at high power without saturating the power amplifier, which would otherwise result in unwanted non-linear distortions.
To address the above issues and to enhance the cell coverage of the newly designed 3GPP 5G NR when compared to 4G LTE, a new modulation scheme, namely π/2-BPSK, was introduced for the uplink data channel (physical uplink shared channel, PUSCH) and control channel (physical uplink control channel, PUCCH) transmissions. This waveform, when combined with appropriate spectrum shaping, enables low peak-to-average-power ratio (PAPR) transmissions without compromising the error rate performance [2]-[4]. Specifically, the PAPR of this modulation scheme with the DFT-spread-OFDM waveform and spectrum shaping is smaller than 2 dB. Moreover, it is shown in [4], [5] that the power amplifier can be driven to saturation (the adjacent channel leakage ratio (ACLR) and error vector magnitude (EVM) will still be within the required specification limits) and yet the error rate performance of this modulation scheme is not compromised. Hence, this modulation scheme plays a crucial role in significantly enhancing the cell coverage for 3GPP 5G NR-based cellular networks.
The demodulation reference signals (DMRS) employed in Rel-15 for coherent demodulation of the PUSCH and PUCCH are generated using Zadoff-Chu (ZC) sequences or QPSK-based computer generated sequences (CGS), as specified in Section 5.2.2 in [2] and Section 6.2.2 in [3]. The PAPR of these sequences is around 3.5-4 dB when spectrum shaping is employed, which is higher than that of the spectrum-shaped data transmissions [6]-[8]. Therefore, even though the data transmissions have low PAPR and potentially allow for larger coverage, the DMRS design still limits the cell size due to its high PAPR in Rel-15 3GPP 5G NR. Note that the performance of the PUSCH and PUCCH channels directly depends on the quality of the channel estimates obtained using these DMRS sequences. Hence, when the DMRS sequences are transmitted at lower power to avoid PA saturation, the coverage of the PUSCH and PUCCH channels is automatically limited.
For this reason, 3GPP introduced a new study item in Rel-16 to design new reference signal sequences with lower PAPR [9]. The sequences in [10]-[13] were agreed to be used as low-PAPR reference sequences. In this paper, we use them as the reference signal sequences for the proposed reference signal transceiver design.
The Rel-15 specifications for 3GPP 5G NR also support multiple-stream transmissions using the DFT-spread-OFDM waveform. In other words, a single user can be scheduled to transmit multiple streams, or multiple users can be configured simultaneously to transmit multiple streams, depending on the channel conditions. To support these multiple-stream (also known as layers in 3GPP terminology) MIMO transmissions, multiple orthogonal DMRS sequences are necessary, one for each stream. This is achieved by introducing the concept of a baseband antenna port, where one single port is assigned for the demodulation of each stream/layer [2, Sec. 6.3.1.3]. Since the DMRS of each stream must be independently decoded for channel estimation of that stream, these DMRS sequences must be orthogonally separated to avoid any interference. In the 3GPP specifications, the orthogonality across the ports is achieved by frequency division multiplexing (FDM) or code division multiplexing (CDM). In the CDM method, distinct orthogonal DMRS sequences, each corresponding to an antenna port, share the same time-frequency resources as shown in Fig. 1, where $\mathbf{r}_0$, $\mathbf{r}_1$ are two distinct DMRS sequences corresponding to antenna port 0 and antenna port 1 respectively. In the FDM method, the same sequence is employed for all the antenna ports but is frequency multiplexed as shown in Fig. 1b. It can be seen that in FDM the length of the DMRS on each port will be $M/P$ rather than $M$, where $P$ indicates the number of antenna ports multiplexed in the frequency domain. It is agreed in 3GPP that Rel-16 NR [10] supports only two layers via FDM, and hence the length of the DMRS on each port will be $M/2$ for a data allocation of $M$ sub-carriers. We show in Section III that this reduction of the DMRS length to $M/2$ does not reduce the channel estimation quality, and that the $M$-length channel estimate vector corresponding to the $M$-length data allocation can be reconstructed perfectly.
When multiple-stream transmissions are supported, the current 3GPP Rel-15 specifications do not clearly specify the spectrum shaping implementation for the π/2-BPSK data and DMRS sequences. For instance, when multiple users, each with one layer, are configured to transmit simultaneously, an $M/P$-length DMRS sequence corresponding to each user's $M$-length data will be transmitted on one of the $P$ ports. In such a case, the spectrum shaping has to be aligned between the data and DMRS transmissions so that the channel can be estimated correctly; otherwise, the receiver implementation becomes imperfect, causing a loss of the data exchanged. In addition, if proper design choices are not made, the same DMRS sequence, when mapped to two different baseband antenna ports (for example, as shown in Fig. 1b), may behave differently with respect to (w.r.t.) PAPR and auto- and/or cross-correlation, which eventually impacts the channel estimation performance (immunity to inter-cell interference) and subsequently the data demodulation. Therefore, in this paper we propose two transceiver architectures which generate a low PAPR DMRS waveform and also result in identical channel estimation performance on all the baseband antenna ports. Specifically, we show that the sequences designed in [10]-[13] to have low PAPR will have the same error rate performance on any stream in the case of multiple-stream transmissions.
Notation: The following notation is used in this paper. Upper-case letters $\mathbf{X}$ denote matrices, bold lower-case letters $\mathbf{x}$ denote vectors, non-bold letters represent scalars, and $\mathbf{x}_t$, $\mathbf{y}_f$ indicate the time-domain and frequency-domain vectors $\mathbf{x}$ and $\mathbf{y}$ respectively. $\mathbf{x}^T$ and $\mathbf{X}^\dagger$ represent the transpose and Hermitian operations on the vector $\mathbf{x}$ and matrix $\mathbf{X}$ respectively. We use the symbol $\mathbf{x}$ to denote the data symbols and $\mathbf{r}$ to denote reference signal symbols.

II. TRANSMITTER ARCHITECTURE FOR π/2-BPSK DATA AND DMRS GENERATION
In this section, we present transmitter designs to generate low PAPR data and DMRS waveforms. We first describe the system model, including the design of the DFT-s-OFDM waveform as per the current 3GPP 5G NR specifications, and then discuss the proposed transmitter designs.

A. DFT-s-OFDM Signal Model
In the current NR specifications [2], [3], discrete Fourier transform-spread orthogonal frequency-division multiplexing (DFT-s-OFDM) [14] is used for uplink transmission, especially in coverage-limited scenarios. This waveform is also referred to as the single-carrier FDM (SC-FDM) waveform in the literature. In 3GPP 5G NR, QAM symbols with modulation order 4, 16, 64 or 256 can be transmitted using DFT-s-OFDM. When compared to LTE, a new modulation scheme, namely π/2-BPSK, was introduced in 5G NR. This is a constellation-rotated BPSK modulation in which the even-numbered symbols are transmitted as in BPSK and the odd-numbered symbols are phase rotated by π/2, as given below:
$$x_t^p(m) = e^{i (m \bmod 2)\frac{\pi}{2}}\, x_t(m), \quad m = 0, 1, \ldots, M-1, \tag{1}$$
where $i = \sqrt{-1}$ and $M$ is the length of the BPSK sequence $x_t(m)$. Here the index $p$ in $x_t^p(m)$ indicates a phase-rotated sequence and the subscript $t$ indicates a time-domain sequence. The π/2 phase rotation can be equivalently expressed in vector notation as
$$\mathbf{x}_t^p = \mathbf{P}\,\mathbf{x}_t, \tag{2}$$
where $\mathbf{x}_t$ is an $M$-length BPSK vector and $\mathbf{P}$ is an $M \times M$ diagonal matrix with diagonal entries $p_{mm} = e^{i (m \bmod 2)\frac{\pi}{2}}$.
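The rotation in (1) can be illustrated with a minimal sketch. The bit-to-BPSK mapping (0 to +1, 1 to -1) and the function name `pi2_bpsk` are assumptions made for this example only; they are not taken from the specification text above.

```python
import cmath

def pi2_bpsk(bits):
    """Map bits to BPSK and rotate odd-indexed symbols by pi/2, following (1)."""
    bpsk = [1 - 2 * b for b in bits]  # assumed mapping: 0 -> +1, 1 -> -1
    return [cmath.exp(1j * (m % 2) * cmath.pi / 2) * s for m, s in enumerate(bpsk)]

bits = [0, 1, 1, 0, 0, 0, 1, 1]
symbols = pi2_bpsk(bits)

# Even-indexed symbols lie on the real axis and odd-indexed ones on the
# imaginary axis, so consecutive symbols always differ in phase by +/- pi/2
# and the trajectory between them never passes through the origin.
phase_steps = [abs(cmath.phase(symbols[m + 1] / symbols[m]))
               for m in range(len(symbols) - 1)]
print(phase_steps)
```

Each entry of `phase_steps` equals π/2 up to rounding, which is the property that removes zero-crossing transitions.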
The π/2-BPSK modulation scheme, when transmitted using DFT-s-OFDM, has a low PAPR when compared to higher-order modulation schemes including QPSK, as the zero-crossing transitions are avoided. The PAPR for various modulation schemes is shown in Fig. 2, which clearly shows the low PAPR behavior of the π/2-BPSK modulation scheme. Note that, although the constellation is similar to QPSK, only 1 bit can be transmitted on one π/2-BPSK modulation symbol.

B. Spectrum shaping
Spectrum shaping is a data-independent PAPR reduction technique which can be performed either in the time domain or in the frequency domain [4], [5]. In the case of frequency-domain processing, spectrum shaping can be performed by means of a spectrum-shaping function $\mathbf{w}_f = \mathbf{D}_M \mathbf{w}_t$, where $\mathbf{w}_t$ is the zero-padded time-domain impulse response of the $L$-tap spectrum shaping filter, i.e., $\mathbf{w}_t = [w(0), w(1), \ldots, w(L-1), 0, \ldots, 0]^T$. Commonly used spectrum shaping filters with 2-tap and 3-tap impulse responses are shown in Fig. 3.
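As a small illustration of $\mathbf{w}_f = \mathbf{D}_M \mathbf{w}_t$, the sketch below zero-pads a short impulse response and takes its $M$-point DFT. The taps `(1.0, -1.0)` are a placeholder chosen for this example, not the filters of Fig. 3, and the DFT is left unnormalized as an assumption.

```python
import cmath
import math

def dft(x):
    """Unnormalized N-point DFT (direct O(N^2) form, adequate for small N)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def shaping_response(taps, M):
    """Zero-pad the L-tap impulse response to length M and return w_f."""
    w_t = list(taps) + [0.0] * (M - len(taps))
    return dft(w_t)

M = 12                                   # one resource block
w_f = shaping_response((1.0, -1.0), M)   # hypothetical 2-tap filter
magnitudes = [abs(v) for v in w_f]

# For these placeholder taps the closed form is |w_f(k)| = 2 |sin(pi k / M)|.
closed_form = [2 * abs(math.sin(math.pi * k / M)) for k in range(M)]
print(max(abs(a - b) for a, b in zip(magnitudes, closed_form)))
```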
Remark on the length of the spectrum shaping filter: In a recent study [12], a joint optimization of the rotation angle (other than π/2) and the spectrum shaping function is considered to further reduce the PAPR of BPSK-based DFT-s-OFDM waveforms beyond what is achieved using the filters shown in Fig. 3. The spectrum shaping filter obtained via optimization in [12] has a length ranging between 8 and 24. To estimate the channel at the receiver, it is assumed in [12] that the spectrum shaping filter is perfectly known at the receiver, and the impulse response of the wireless channel is then estimated for data demodulation. This violates the 3GPP design, wherein it is clearly mentioned that the spectrum shaping filter is implementation-specific [1] and is therefore unknown at the receiver. In such cases, the receiver has to estimate the joint impulse response of the spectrum shaping filter and the wireless channel (explained in detail in Section III-2). Note that a worst-case wireless channel impulse response will be of length ≤ 3 for an allocation of 12 sub-carriers (i.e., 1 resource block in 3GPP terminology) as per the 3GPP channel models [17]. Now, if the spectrum shaping filter is unknown at the receiver, a minimum of 11-27 samples is needed to estimate the joint impulse response as per the design in [12], which forces the data allocation to be a minimum of 2-4 resource blocks (RB). Again, this contradicts the 3GPP design, where the minimum allocation size is 1 RB. Hence, the length of the spectrum shaping filter has to be less than or equal to 3 [1], assuming two CDM groups with 6 DMRS samples per CDM group in an RB. Therefore, in this paper we restrict our analysis and simulations to filters with length ≤ 3.

C. DMRS Signal Structure
As discussed in Section I, multiple DMRS sequences are transmitted on frequency division multiplexed antenna ports [2], [3] to support MIMO transmissions. It should be noted that if spectrum shaping is performed on the data symbols, identical spectrum shaping should also be performed on the DMRS sequences to facilitate proper channel estimation and thereby equalization. However, if this spectrum shaping is not done in the right manner, it will alter the properties of the DMRS waveform depending on the antenna port on which the DMRS sequence is transmitted. This may subsequently result in non-identical channel estimation (and thereby equalization and demodulation) performance across the antenna ports, which is not desirable.
Hence the DMRS transmitter design, besides minimizing the PAPR of the waveform, should also ensure that the characteristics of the waveform (such as auto-correlation and cross-correlation) are similar for spectrum-shaped DMRS sequences across all the antenna ports. In this paper, we propose two transmitter designs such that the PAPR of the DMRS waveform is low and the characteristics of the waveform are uniform across all the baseband antenna ports.
In the current 3GPP specifications [10], 2 MIMO streams are supported when the π/2-BPSK modulation scheme is used. To support two MIMO streams, two FDM DMRS ports are most commonly used as opposed to CDM (wherein the code orthogonality may be impacted in heavy delay spread channels). For the case of CDM, the DMRS sequences are mapped on the same antenna port, and hence both DMRS ports are identical in terms of sequence generation and mapping and have the same PAPR. The FDM case presents a challenging problem that needs to be addressed, as will be discussed below. For FDM, an $M$-length data sequence on a given antenna port is associated with a corresponding $M/2$-length DMRS sequence.

D. Transmission Method-1
In this section, we present data and DMRS transmission method-1 wherein the spectrum shaping is performed in the frequency domain.
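1) Data waveform design method-1: Let $\mathbf{x}_t^p$ denote an $M \times 1$ vector of π/2-BPSK modulated data symbols generated as per (1). For transmission via DFT-s-OFDM, the π/2-BPSK data symbols are first DFT-precoded as
$$x_f(k) = \sum_{m=0}^{M-1} x_t^p(m)\, e^{-i 2\pi m k / M}, \quad k = 0, 1, \ldots, M-1. \tag{3}$$
The subscript $f$ in $x_f(k)$ indicates a frequency-domain sequence. The DFT precoding shown in (3) can be equivalently represented in vector notation as
$$\mathbf{x}_f = \mathbf{D}_M\, \mathbf{x}_t^p,$$
where $\mathbf{D}_M$ is an $M \times M$ DFT matrix with entries $[\mathbf{D}_M]_{k,m} = e^{-i 2\pi k m / M}$ (up to a normalization constant). The spectrum shaping is performed on the DFT-precoded data vector as
$$\mathbf{x}_f^s = \mathbf{w}_f \odot \mathbf{x}_f,$$
where $\odot$ denotes element-wise multiplication. The spectrum-shaped vector is then mapped to sub-carriers via a mapping matrix $\mathbf{M}_f$, which can be constructed such that it allocates the $M$ sub-carriers in a localized or interleaved manner. Finally, the output of this mapping operation is converted to the $N \times 1$ time-domain signal $\mathbf{s}_t$ as
$$\mathbf{s}_t = \mathbf{D}_N^\dagger\, \mathbf{M}_f\, \mathbf{x}_f^s,$$
where $\mathbf{D}_N^\dagger$ is an inverse DFT matrix and $N$ is the total number of sub-carriers corresponding to the system bandwidth. A cyclic prefix of appropriate length is added to $\mathbf{s}_t$ to generate $s_t(t)$ as given in equation (5.3.1) in the 3GPP spec [2]. This transmitter architecture for data waveform generation is shown in Fig. 4.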
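The chain described above (phase rotation, DFT precoding, frequency-domain shaping, sub-carrier mapping, inverse DFT) can be sketched end to end as follows. This is a minimal illustration under stated assumptions: the shaping taps, the IFFT size of 64, the localized mapping starting at index `first_sc` and the helper names are all hypothetical, and the cyclic prefix is omitted.

```python
import cmath
import math
import random

def dft(x):
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * m / n) for k in range(n)) / n
            for m in range(n)]

def papr_db(s):
    """Peak-to-average power ratio of one symbol, in dB."""
    power = [abs(v) ** 2 for v in s]
    return 10 * math.log10(max(power) / (sum(power) / len(power)))

def method1_data_waveform(bits, w_taps, n_fft, first_sc=0):
    M = len(bits)
    # pi/2-BPSK as in (1); bit mapping 0 -> +1, 1 -> -1 is an assumption
    x_p = [cmath.exp(1j * (m % 2) * cmath.pi / 2) * (1 - 2 * b)
           for m, b in enumerate(bits)]
    x_f = dft(x_p)                                        # DFT precoding
    w_f = dft(list(w_taps) + [0.0] * (M - len(w_taps)))   # shaping response
    x_s = [w * x for w, x in zip(w_f, x_f)]               # element-wise shaping
    grid = [0j] * n_fft
    grid[first_sc:first_sc + M] = x_s                     # localized mapping
    return idft(grid)                                     # N-point inverse DFT

random.seed(0)
bits = [random.randint(0, 1) for _ in range(24)]
s_t = method1_data_waveform(bits, w_taps=(1.0, -1.0), n_fft=64)
print(len(s_t), round(papr_db(s_t), 2))
```

The printed PAPR is for a single symbol with placeholder taps; a CCDF such as the one in Fig. 2 would require many random symbols and the actual filters.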
2) DMRS waveform design method-1: The CCDF of the PAPR of a DFT-s-OFDM waveform with spectrum-shaped π/2-BPSK data symbols and with the commonly used Zadoff-Chu based DMRS sequences [2, Section 5.2.2], [3, Section 5.5] is shown in Fig. 5. It can be seen that the PAPR of π/2-BPSK is lower than that of the ZC sequences by over 2 dB. The high PAPR of ZC-based DMRS sequences will therefore limit the cell coverage, as is currently the case in Release 15 3GPP 5G NR. Hence there is a need for designing new reference signal sequences (DMRS) such that the PAPR of the DMRS is similar to or lower than that of the data waveform. For this reason, 3GPP designed new DMRS sequences with low PAPR in [9]-[13]. We next describe how to use these sequences and design a transceiver that maintains the low PAPR for DMRS transmissions. As mentioned earlier, we assume that 2 MIMO streams are supported and that the DMRS are multiplexed in an FDM manner for these streams. Hence, we assume that $M/2$-length DMRS sequences will be transmitted for an $M$-length data allocation.
In this architecture, the transmitter design is such that a given time-domain DMRS signal $\mathbf{r}_t$ will result in an identical frequency-domain signal $\mathbf{r}_f$ for any of the antenna ports. This subsequently results in similar auto- and cross-correlation properties and hence produces identical channel estimation performance at the receiver. The system model of the architecture is shown in Figs. 6 and 7, and the summary is tabulated in Table I.
DMRS waveform generation for Port 0: Let $\mathbf{r}_t$ be a predetermined $M/2$-length DMRS sequence with BPSK modulated symbols chosen as per the designs in [9]-[13]. This is cyclically extended to an $M$-length vector $\tilde{\mathbf{r}}_t$ as follows:
$$\tilde{r}_t(n) = r_t\!\left(n \bmod \tfrac{M}{2}\right), \quad n = 0, 1, \ldots, M-1.$$
Using $\mathbf{P}$ defined in (2), a π/2 phase rotation is applied on $\tilde{\mathbf{r}}_t$ to give $\tilde{\mathbf{r}}_t^p = \mathbf{P}\tilde{\mathbf{r}}_t$. The resultant π/2-BPSK signal is DFT precoded as $\mathbf{r}_f^{p0} = \mathbf{D}_M \tilde{\mathbf{r}}_t^p$. The resulting DFT output has a comb-like structure with non-zero entries only at odd locations (the 1st, 3rd, ... entries, i.e., indices $k = 0, 2, \ldots$ when counted from zero), which is equivalent to the port-0 mapping shown in Fig. 1 (and hence the notation $\mathbf{r}_f^{p0}$). The DFT-precoded DMRS symbols are now spectrum-shaped using $\mathbf{w}_f$ defined in Section II-B to give the spectrum-shaped port-0 DMRS as
$$\mathbf{r}_f^{s0} = \mathbf{w}_f \odot \mathbf{r}_f^{p0}.$$
DMRS waveform generation for Port 1: As per the 3GPP specifications, in FDM-based multiplexing of multiple antenna ports, the DMRS sequence should be identical on both the ports, i.e., the input BPSK sequence $\mathbf{r}_t$ and the resulting π/2-BPSK sequence $\tilde{\mathbf{r}}_t^p$ have to be the same for both port-0 and port-1. However, different from port-0, to generate the spectrum-shaped frequency-domain DMRS sequence on port-1, the following additional steps need to be performed:
• A precoder $\mathbf{T}$ is applied on $\tilde{\mathbf{r}}_t^p$, where $\mathbf{T}$ is an $M \times M$ diagonal matrix with diagonal entries $T_{mm} = e^{i 2\pi m / M}$, followed by DFT precoding as shown below:
$$\mathbf{r}_f^{p1} = \mathbf{D}_M\, \mathbf{T}\, \tilde{\mathbf{r}}_t^p.$$
The resulting $\mathbf{r}_f^{p1}$ has a comb-like structure with non-zero entries only at even sub-carriers (the 2nd, 4th, ... entries, i.e., indices $k = 1, 3, \ldots$ when counted from zero), equivalent to the port-1 mapping given in Fig. 1.
Note: Only when this precoder $\mathbf{Z}$ is applied on the DMRS of port-1 will the effect of spectrum shaping on the data (shown in Fig. 4) and on the DMRS of ports 0 and 1 (shown in Figs. 6, 7) be identical, so that the data can be demodulated. In the absence of the precoder $\mathbf{Z}$, the non-zero entries of the spectrum-shaped outputs $\mathbf{r}_f^{s0}$, $\mathbf{r}_f^{s1}$ will not be identical, as shown in Fig. 8. This results in non-identical PAPR and channel estimation performance on port-0 and port-1, which is not acceptable in any MIMO system.
Using the proposed architecture, it can be shown that the non-zero entries at the output of the spectrum shaping filter are identical for both the ports, i.e.,
$$r_f^{s0}(2k) = r_f^{s1}(2k+1) = w_f(2k)\, r_f(2k), \quad k = 0, 1, \ldots, \tfrac{M}{2}-1, \tag{7}$$
where $r_f(k)$ is the $M$-point DFT of the π/2-BPSK signal $\tilde{\mathbf{r}}_t^p$. Therefore, the same reference signal is transmitted on each baseband antenna port as per the 3GPP 5G NR specifications. We further show in Section III that the channel impulse response estimated on both the ports will be identical. The spectrum-shaped DMRS vectors $\mathbf{r}_f^{s0}$, $\mathbf{r}_f^{s1}$ are mapped to a set of sub-carriers in the frequency domain as discussed in Section II-B. The resulting output is converted to the time domain via an inverse-DFT operation, similar to the method employed for data transmission, as shown below:
$$\mathbf{s}_t^0 = \mathbf{D}_N^\dagger\, \mathbf{M}_f\, \mathbf{r}_f^{s0}, \qquad \mathbf{s}_t^1 = \mathbf{D}_N^\dagger\, \mathbf{M}_f\, \mathbf{r}_f^{s1}.$$
Using the above, the overall time-domain baseband signals $s_t^0(t)$, $s_t^1(t)$ with an appropriate cyclic prefix are generated as given by equation (5.3.1) in the 3GPP spec [2].
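The comb structure on the two ports and the claimed equality of their non-zero shaped entries can be checked numerically with a short sketch. Two things here are assumptions of the example rather than statements of the method: the BPSK sequence `r_t` and the taps `(1.0, -1.0)` are placeholders, and the port-1 shaping response is taken to be the port-0 response circularly shifted by one bin, which is one way of obtaining identical non-zero entries.

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

M = 12
r_t = [1, 1, 1, -1, -1, -1]                        # hypothetical M/2-length BPSK DMRS
r_ext = [r_t[n % (M // 2)] for n in range(M)]      # cyclic extension to length M
r_p = [cmath.exp(1j * (n % 2) * cmath.pi / 2) * v  # phase rotation by P
       for n, v in enumerate(r_ext)]

r_p0 = dft(r_p)                                    # port 0: DFT only
r_p1 = dft([cmath.exp(2j * cmath.pi * n / M) * v   # port 1: precoder T, then DFT
            for n, v in enumerate(r_p)])

w_f = dft([1.0, -1.0] + [0.0] * (M - 2))           # placeholder shaping response
r_s0 = [w_f[k] * r_p0[k] for k in range(M)]
# ASSUMPTION: port-1 uses the shaping response shifted by one bin.
r_s1 = [w_f[(k - 1) % M] * r_p1[k] for k in range(M)]

port0_leak = max(abs(r_p0[k]) for k in range(1, M, 2))   # odd bins, port 0
port1_leak = max(abs(r_p1[k]) for k in range(0, M, 2))   # even bins, port 1
mismatch = max(abs(r_s0[2 * k] - r_s1[2 * k + 1]) for k in range(M // 2))
print(port0_leak, port1_leak, mismatch)
```

All three printed values are zero up to rounding: port 0 occupies only the zero-based even bins, port 1 only the odd bins, and the shaped non-zero entries coincide.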

E. Transmission Method-2
In the method-1 based transmitter design, the π/2-BPSK data and DMRS sequences are spectrum shaped in the frequency domain. Further, the DFT-precoded DMRS sequences corresponding to each antenna port are generated and spectrum-shaped independently. In the method-2 based design, we propose a low-complexity design where spectrum shaping is performed in the time domain for both the data and DMRS sequences via a circular convolution operation. Specifically, a single DMRS sequence is spectrum-shaped in the time domain and mapped to both the antenna ports. The architecture of this transmitter design for the data and DMRS is shown in Figs. 9 and 10 respectively.
1) Data waveform design method-2: Let $\mathbf{x}_t$ be the $M$-length data vector to be transmitted from the UE to the base station, which undergoes a π/2 phase rotation through an $M \times M$ diagonal matrix $\mathbf{P}$. Here, $\mathbf{P}$ is the same matrix used in method-1. This results in an $M$-length data vector $\mathbf{x}_t^p = \mathbf{P}\mathbf{x}_t$ with π/2-BPSK symbols. Note that in this method, the spectrum shaping of the π/2-BPSK data is performed in the time domain through a circular convolution with the zero-padded $\mathbf{w}_t$ to produce the spectrum-shaped data as
$$\mathbf{x}_t^s = \mathbf{x}_t^p \circledast \mathbf{w}_t.$$
The spectrum-shaped data sequence is DFT precoded by means of an $M$-point DFT as $\mathbf{x}_f^s = \mathbf{D}_M \mathbf{x}_t^s$. The DFT-precoded spectrum-shaped data vector is mapped to a set of sub-carriers in the frequency domain via a mapping matrix $\mathbf{M}_f$ (described in Sec. II-B). Finally, this mapped sequence is converted to the time domain via an inverse-DFT operation as
$$\mathbf{s}_t = \mathbf{D}_N^\dagger\, \mathbf{M}_f\, \mathbf{x}_f^s.$$
Using the above, the overall time-domain baseband signal $s_t(t)$ with a cyclic prefix of appropriate length is generated as per equation (5.3.1) in the 3GPP spec [2].
2) DMRS waveform design method-2: Let $\mathbf{r}_t$ be the predetermined $M/2$-length DMRS sequence (as mentioned earlier in method-1), which undergoes a π/2 phase rotation through an $\frac{M}{2} \times \frac{M}{2}$ diagonal matrix $\mathbf{P}_1$ with diagonal entries given by $e^{i (m \bmod 2)\frac{\pi}{2}}$. This results in an $M/2$-length DMRS vector $\mathbf{r}_t^p = \mathbf{P}_1 \mathbf{r}_t$ with π/2-BPSK symbols. The spectrum shaping of the DMRS symbols is performed in the time domain through a circular convolution with the zero-padded $\mathbf{w}_t$ to produce the spectrum-shaped DMRS sequence as
$$\mathbf{r}_t^s = \mathbf{r}_t^p \circledast \mathbf{w}_t.$$
The spectrum-shaped DMRS sequence is DFT precoded by means of an $M/2$-point DFT, and the output is mapped to both antenna ports to give $\mathbf{r}_f^{s0}$ and $\mathbf{r}_f^{s1}$, which indicate the frequency-domain DMRS sequences on port-0 and port-1 respectively. It can be seen that with the proposed architecture the non-zero entries of the DMRS sequence are exactly identical for both the ports, i.e., on either port the $k$th non-zero entry equals
$$w_f(k)\, r_f^p(k), \quad k = 0, 1, \ldots, \tfrac{M}{2}-1, \tag{12}$$
where $r_f^p(k)$, $w_f(k)$ are the $M/2$-point DFT outputs of the π/2-BPSK DMRS symbols $\mathbf{r}_t^p$ and the filter $\mathbf{w}_t$ respectively. The DFT-precoded spectrum-shaped data and DMRS vector of each port is mapped to a set of sub-carriers in the frequency domain via a mapping matrix $\mathbf{M}_f$ (described in Sec. II-B). Finally, this mapped sequence is converted to the time domain via an inverse-DFT operation. Using the above, the overall time-domain baseband signals for DMRS transmission, i.e., $s_t^0(t)$, $s_t^1(t)$, with a cyclic prefix of appropriate length are generated as per equation (5.3.1) in the 3GPP spec [2].
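Method-2 relies on the fact that circular convolution in time corresponds to element-wise multiplication of DFTs. The sketch below checks this for an $M/2$-length π/2-BPSK DMRS; the sequence and filter taps are placeholders chosen for illustration, and the DFT is unnormalized.

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def circ_conv(x, w):
    """Circular convolution of x with a short filter w (implicitly zero-padded)."""
    n = len(x)
    return [sum(w[l] * x[(m - l) % n] for l in range(len(w))) for m in range(n)]

r_t = [1, 1, 1, -1, -1, -1]                          # hypothetical BPSK DMRS
r_p = [cmath.exp(1j * (n % 2) * cmath.pi / 2) * v    # pi/2 rotation (P_1)
       for n, v in enumerate(r_t)]
w_taps = [1.0, -1.0]                                 # placeholder shaping filter

# Time-domain shaping followed by an M/2-point DFT (method-2 ordering)
shaped_then_dft = dft(circ_conv(r_p, w_taps))

# Equivalent frequency-domain product w_f(k) * r_f^p(k)
w_f = dft(w_taps + [0.0] * (len(r_p) - len(w_taps)))
product = [a * b for a, b in zip(w_f, dft(r_p))]

max_err = max(abs(a - b) for a, b in zip(shaped_then_dft, product))
print(max_err)
```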

F. Summary of the transmission methods
We presented two transmission methods for the data and DMRS waveform generation. Specifically, in method-1 the processing happens in the frequency domain, while in method-2 the processing happens in the time domain via the circular-convolution operation. Also, in method-1 an $M$-length DMRS sequence is spectrum shaped in the frequency domain, whereas in method-2 an $M/2$-length DMRS sequence is spectrum shaped in the time domain. Irrespective of this difference, we show that both methods allow the channel to be estimated perfectly. Further, consider the DFT property that the even-indexed samples of the $M$-point DFT of a sequence are given by an $M/2$-point DFT: for a sequence zero-padded from length at most $M/2$ (such as $\mathbf{w}_t$) they equal its $M/2$-point DFT, and for a cyclically extended sequence (such as $\tilde{\mathbf{r}}_t^p$) they equal the $M/2$-point DFT of one period up to a constant scaling. Using (7) and this property, it can be shown that the even-indexed $M$-point DFT samples appearing in (7) coincide with the corresponding $M/2$-point DFT outputs (14). Using (14), we can rewrite (7) in terms of $M/2$-point DFTs, which is exactly identical to (12). Since the inputs to the IDFT, $\mathbf{r}_f^{s0}$ and $\mathbf{r}_f^{s1}$, are identical for both the methods, we conclude that the inverse DFT outputs of port-0 and port-1 and the subsequent baseband signals generated through method-1 will be identical to those generated using method-2. Using an example, we show in the Appendix that the channel estimation performance remains the same when these different transmitter methods are used.
Remark: The current 3GPP specifications for 5G NR do not mention how the spectrum shaping and transmission of data and DMRS must be done for single as well as multiple antenna ports, as this is left as an implementation choice. However, as we have shown, this causes ambiguity at both the transmitter and the receiver if not done in the right manner. Hence, to avoid this ambiguity and to avoid data loss on any antenna port, the designs mentioned above must be used.
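The relation between $M$-point and $M/2$-point DFTs used above can be checked with a short sketch. It assumes an unnormalized DFT, under which the cyclically extended DMRS picks up a factor of 2 on the even bins; with a different normalization the constant changes. The sequence and taps are placeholders.

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

M = 12
half = M // 2

# Zero-padded filter: even bins of the M-point DFT equal the M/2-point DFT.
w_taps = [1.0, -1.0]                                  # placeholder taps
w_M = dft(w_taps + [0.0] * (M - len(w_taps)))
w_half = dft(w_taps + [0.0] * (half - len(w_taps)))
filter_err = max(abs(w_M[2 * k] - w_half[k]) for k in range(half))

# Cyclically extended pi/2-BPSK DMRS: even bins equal 2x the M/2-point DFT,
# and odd bins vanish (M/2 is even, so the rotation pattern repeats too).
r_t = [1, 1, 1, -1, -1, -1]                           # hypothetical BPSK DMRS
r_p = [cmath.exp(1j * (n % 2) * cmath.pi / 2) * v for n, v in enumerate(r_t)]
r_M = dft(r_p + r_p)
r_half = dft(r_p)
dmrs_err = max(abs(r_M[2 * k] - 2 * r_half[k]) for k in range(half))
odd_leak = max(abs(r_M[2 * k + 1]) for k in range(half))

print(filter_err, dmrs_err, odd_leak)
```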

III. RECEIVER DESIGN
The receiver procedure explained next is common for both the transmission methods explained in previous sections. Hence, we do not distinguish between the transmission method-1 and method-2 in this section.
The receiver front-end operations such as sampling, synchronization, CP removal and FFT are similar to those of a conventional DFT-s-OFDM-based system, as shown in Fig. 11. Further, the ISI introduced by the propagation channel is assumed to be shorter than the CP length. Therefore, after CP removal and DFT, the data and DMRS signals on the $k$th sub-carrier can be represented as (without loss of generality we consider only the initial $M$ sub-carriers of the DFT output, i.e., $k \in [0, M-1]$)
$$y_d(k) = h_f^0(k)\, x_f^{s0}(k) + h_f^1(k)\, x_f^{s1}(k) + v(k),$$
$$y_{DMRS}^0(k) = h_{f,DMRS}^0(k)\, r_f^{s0}(k) + v^0(k), \tag{15}$$
$$y_{DMRS}^1(k) = h_{f,DMRS}^1(k)\, r_f^{s1}(k) + v^1(k). \tag{16}$$
In the above, $\mathbf{y}_d$ corresponds to the received data vector with data symbols from both the ports (recall that 2 antenna ports can support 2 MIMO stream transmissions). $\mathbf{y}_{DMRS}^0$, $\mathbf{y}_{DMRS}^1$ correspond to the received DMRS vectors on port-0 and port-1, $\mathbf{h}_f^0$, $\mathbf{h}_f^1$ correspond to the frequency responses of the time-domain wireless channel impulse responses $\mathbf{h}_t^0$ on port-0 and $\mathbf{h}_t^1$ on port-1 respectively, and $\mathbf{x}_f^{s0}$, $\mathbf{x}_f^{s1}$, $\mathbf{r}_f^{s0}$, $\mathbf{r}_f^{s1}$ are the transmitted data and DMRS sequences defined in Section II. The noise vectors $\mathbf{v}$, $\mathbf{v}^0$ and $\mathbf{v}^1$ are i.i.d. complex Gaussian random vectors with zero mean and covariance $\sigma^2 \mathbf{I}$, where $\mathbf{I}$ is an identity matrix and $\sigma^2$ is a constant indicating the variance of each noise sample. In practice, for low to medium user speeds the time variations of the multipath channel across consecutive OFDM symbols, as shown in Fig. 12, will be minimal, and hence without loss of generality we consider that $h_{f,DMRS}^0(k) = h_f^0(k)$ and $h_{f,DMRS}^1(k) = h_f^1(k)$. This is a common assumption made in the design of 4G and 5G cellular systems.
1) Channel estimation: As per the 3GPP specifications, the spectrum shaping filter $\mathbf{w}_t$ is implementation-specific, i.e., different UEs can use different filters based on their hardware implementation, and hence the exact filter being used is unknown at the base station receiver [1], [3]. Hence, the channel estimation module at the receiver should estimate the impulse response of the filter and the wireless channel jointly.
In our work, we use a DFT-based channel estimation technique to estimate the joint channel impulse response for the $M$ allocated sub-carriers. A simple least-squares based technique with tone averaging or linear interpolation, which assumes that the channel is constant across consecutive sub-carriers, does not work well in this case: the spectrum shaping filter considerably changes the effective channel across consecutive sub-carriers, according to the shape of the filter shown in Fig. 3.
As already mentioned in Section II, an $M$-length data vector is associated with an $M/2$-length DMRS vector. Firstly, we show that an $M$-length frequency-domain channel vector (as the data allocation is $M$, the channel on all of these $M$ tones must be estimated for demodulation) corresponding to the $M$-length data symbol can be perfectly constructed from the $M/2$-length DMRS sequence for both ports.
2) Channel estimation on port-0: As mentioned earlier, port-0 carries DMRS only on even-numbered sub-carriers, which are extracted and expressed in terms of the π/2-BPSK DMRS as follows:
$$\tilde{y}_{DMRS}(k) = y_{DMRS}^0(2k) = h_{f,DMRS}^0(2k)\, r_f^{s0}(2k) + v^0(2k) \tag{17}$$
$$\qquad\qquad = h_{f,DMRS}^0(2k)\, w_f(2k)\, r_f(2k) + v^0(2k), \quad k = 0, 1, \ldots, \tfrac{M}{2}-1, \tag{18}$$
where (17) results from (15), and (18) results from (7). Invoking the equivalence between the $M$-point and $M/2$-point DFTs (14), the above equation can be represented in terms of the circular convolution of the shaping filter, the channel impulse response and the DMRS sequence, where $\circledast$ indicates the circular-convolution operation, $\mathbf{h}_{t,DMRS}^0$ is the impulse response of the wireless channel on port-0, and $\tilde{r}_t^p(n)$ is defined in Section II. We perform channel estimation on $\tilde{\mathbf{y}}_{DMRS}$ as follows: we first perform a least-squares based channel estimation, and then take an $M/2$-point IDFT of the resulting output. This gives the joint impulse response of the filter and the wireless channel, $\mathbf{h}_{eff}$, which equals $\mathbf{w}_t \circledast \mathbf{h}_{t,DMRS}^0$ plus a noise term (20). The length of $\mathbf{h}_{eff}$ will be $\max\left(\text{length}(\mathbf{w}_t), \text{length}(\mathbf{h}_{t,DMRS}^0)\right)$. Irrespective of the pulse shaping filter, the reference signal design should ensure that the DMRS sequence length is at least twice that of the impulse response of the wireless channel, i.e., the length of $\mathbf{h}_{t,DMRS}^0$ is assumed to be less than $M/2$ [15], which is typically the case for the practical wireless channel models considered by 3GPP [17]. From the above we conclude that $\mathbf{h}_{eff}$ completely captures the joint impulse response of the spectrum shaping filter and the wireless channel.
A de-noising time-domain filter [16] is then applied to reduce the noise in (20). This filter $f(n)$ is defined as
$$f(n) = \begin{cases} 1, & 0 \le n < f_c, \\ 0, & \text{otherwise,} \end{cases}$$
where $f_c$ is the "cut-off" point of the time-domain filter, which is commonly chosen as the length of the wireless channel, $\text{length}(\mathbf{h}_{t,DMRS}^0)$, if it is known a priori, or is set to the cyclic prefix length in case no knowledge about the wireless channel is available. The rest of the samples are set to 0. This filter helps to extract only the useful samples of the CIR while reducing the noise in the rest of the samples. For more details, please see [16]. The effective impulse response after de-noising is given as $\hat{h}_{eff}(n) = f(n)\, h_{eff}(n)$. Lastly, the time-domain filtered samples are transformed via an $M$-point DFT to recover the frequency-domain channel estimates on each sub-carrier $k \in [0, M-1]$ as $\hat{\mathbf{h}}_f^{eff} = \mathbf{D}_M \hat{\mathbf{h}}_{eff}$. This can be further used for port-0 data demodulation using well-known techniques.
Fig. 13: Magnitude of the estimated channel impulse response on port-0 and port-1.
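The port-0 estimation steps (least squares on the comb, $M/2$-point IDFT, de-noising window, $M$-point DFT) can be sketched in a noise-free setting as follows. The DMRS sequence, the shaping taps, the 2-tap channel and the cut-off `f_c = 3` are all hypothetical values chosen for illustration; the received comb is modelled directly in the $M/2$-point domain, and the DMRS was picked so that its spectrum has no zeros.

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * m / n) for k in range(n)) / n
            for m in range(n)]

def circ_conv(x, w):
    n = len(x)
    return [sum(w[l] * x[(m - l) % n] for l in range(len(w))) for m in range(n)]

M, half = 12, 6
r_t = [1, 1, 1, -1, -1, -1]                         # hypothetical BPSK DMRS
r_p = [cmath.exp(1j * (n % 2) * cmath.pi / 2) * v for n, v in enumerate(r_t)]
r_f = dft(r_p)

w_t = [1.0, -1.0]                                   # filter, unknown to the receiver
h_t = [0.9, 0.3 - 0.2j]                             # hypothetical 2-tap channel
h_eff_true = circ_conv(h_t + [0.0] * (half - len(h_t)), w_t)  # joint response

# Noise-free received DMRS on the port-0 comb (M/2 samples)
y = [H * R for H, R in zip(dft(h_eff_true), r_f)]

h_ls = [yk / rk for yk, rk in zip(y, r_f)]          # 1) least squares per tone
h_eff = idft(h_ls)                                  # 2) M/2-point IDFT
f_c = 3                                             # 3) de-noising cut-off (assumed)
h_hat = [v if n < f_c else 0j for n, v in enumerate(h_eff)]
h_f_hat = dft(h_hat + [0j] * (M - half))            # 4) M-point DFT on all M tones

est_err = max(abs(a - b) for a, b in zip(h_hat, h_eff_true))
print(est_err, len(h_f_hat))
```

Without noise the joint impulse response is recovered exactly, and zero-padding before the final DFT yields estimates on all $M$ tones even though only $M/2$ DMRS samples were observed.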
3) Channel estimation on port-1: As mentioned earlier, port-1 carries DMRS only on odd sub-carriers, which are extracted as $\tilde{y}_{DMRS}^1(k) = y_{DMRS}^1(2k+1)$. Using (16), this can be written as
$$\tilde{y}_{DMRS}^1(k) = r_f^{s1}(2k+1)\, h_{f,DMRS}^1(2k+1) + v^1(2k+1). \tag{21}$$
Assuming that the wireless channel remains constant across consecutive sub-carriers (again a common assumption in 3GPP designs), we have
$$h_{f,DMRS}^1(2k+1) = h_{f,DMRS}^1(2k). \tag{22}$$
Using (7) and (22), (21) can be expressed as
$$\tilde{y}_{DMRS}^1(k) = w_f(2k)\, r_f(2k)\, h_{f,DMRS}^1(2k) + v^1(2k+1).$$
Further processing steps such as the least-squares based channel estimation, de-noising and transforming the effective impulse response to the frequency domain are identical to the procedure followed for channel estimation on port-0. For the case of an AWGN channel, i.e., $h_{f,DMRS}^0(k) = h_{f,DMRS}^1(k) = 1\ \forall k$, the estimated joint impulse response $\mathbf{h}_{eff}$ on port-0 and port-1 is shown in Fig. 13. It can be noticed that the estimated impulse response is identical for both the ports.

4) Equalization and data demodulation: The estimated channels on port-0 and port-1 are employed for channel equalization of the data streams. Specifically, we construct an MMSE equalization filter using the channel estimates obtained previously and then equalize the received signal samples on all the receive antennas of the base station. The equalized data streams are demodulated to generate soft log-likelihood ratio values, which are given as input to the channel decoder module for further bit-level processing.
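A per-sub-carrier linear MMSE equalizer for two streams can be sketched as below. The text does not spell out the equalizer, so this uses the textbook form $(\mathbf{H}^\dagger \mathbf{H} + \sigma^2 \mathbf{I})^{-1} \mathbf{H}^\dagger \mathbf{y}$ as an assumption; the function name, channel values and symbols are hypothetical.

```python
def mmse_equalize_2stream(H, y, sigma2):
    """Linear MMSE estimate of two streams on one sub-carrier.

    H: list of n_rx rows [h_r0, h_r1] (channel from stream 0/1 to antenna r)
    y: list of n_rx received samples
    Returns [x0_hat, x1_hat] = (H^H H + sigma2 I)^-1 H^H y.
    """
    # Gram matrix G = H^H H + sigma2 I (2x2, Hermitian)
    g00 = sum(abs(r[0]) ** 2 for r in H) + sigma2
    g11 = sum(abs(r[1]) ** 2 for r in H) + sigma2
    g01 = sum(r[0].conjugate() * r[1] for r in H)
    g10 = g01.conjugate()
    # Matched-filter output z = H^H y
    z0 = sum(r[0].conjugate() * v for r, v in zip(H, y))
    z1 = sum(r[1].conjugate() * v for r, v in zip(H, y))
    det = g00 * g11 - g01 * g10
    # Closed-form inverse of the 2x2 matrix G applied to z
    return [(g11 * z0 - g01 * z1) / det, (g00 * z1 - g10 * z0) / det]

# Hypothetical 2x2 channel and one pi/2-BPSK symbol per stream
H = [[1 + 0j, 0.5j], [0.2 - 0.1j, 1 + 0j]]
x = [1 + 0j, -1j]
y = [row[0] * x[0] + row[1] * x[1] for row in H]

x_hat_noiseless = mmse_equalize_2stream(H, y, sigma2=0.0)
x_hat_regularized = mmse_equalize_2stream(H, y, sigma2=0.1)
print(x_hat_noiseless, x_hat_regularized)
```

With `sigma2 = 0` the filter reduces to zero forcing and returns the transmitted symbols exactly; a positive `sigma2` shrinks the estimates, trading residual interference against noise enhancement.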

IV. NUMERICAL RESULTS
In this section, we present various numerical results that show
• The PAPR comparison between the π/2-BPSK based DMRS sequences and the existing 3GPP ZC-based DMRS sequences.
• The link-level block error rate (BLER) comparison for data transmissions employing π/2-BPSK based DMRS sequences and the existing 3GPP ZC-based DMRS sequences, for various sequence lengths and various bandwidth allocations.
• The BLER performance for data transmissions on port-0 and port-1 in the case of MIMO two-stream transmissions.
Unless otherwise mentioned, the simulation assumptions shown in Table II are used throughout this paper.
The CCDF of the PAPR for ZC and π/2-BPSK sequences is shown in Fig. 14. The ZC sequences considered in this case are as defined in [2, section 5.2.2] with length 96. The PAPR of the ZC sequences is shown both with and without spectrum shaping. As can be seen from the figure, the 3GPP ZC sequences without spectrum shaping have a PAPR (at the $10^{-3}$ CCDF point) that is 2.8 dB higher than that of the π/2-BPSK sequences. When spectrum shaping is applied to the ZC-DMRS, the PAPR is slightly reduced relative to the unfiltered ZC sequences. However, the PAPR of the filtered ZC sequence is still 2.0 dB larger than the PAPR of the π/2-BPSK sequences with the same spectrum shaping. Moreover, as we increase the number of allocated sub-carriers for data transmission, the PAPR gap between the 3GPP ZC sequence and π/2-BPSK increases even further.

Fig. 15: PAPR of length-12 3GPP CGS and π/2-BPSK DMRS sequences
The CCDF of the PAPR for ZC and π/2-BPSK sequences of smaller length (N = 12) is shown in Fig. 15. As discussed in Section II, for smaller lengths (N < 30), 3GPP employs computer generated sequences (CGS) as DMRS. It can be seen from the figure that the PAPR of the spectrum-shaped CGS sequences is almost 1.2 dB larger than the PAPR of the π/2-BPSK sequences. Moreover, it can also be noticed that for CGS the PAPR increases further with filtering. Hence, these results indicate that the π/2-BPSK sequences designed in [13] have a clear PAPR advantage over the existing sequences, which helps improve cell coverage.
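The PAPR and CCDF quantities discussed above can be sketched as follows, assuming the PAPR is measured on a 4x oversampled time-domain symbol obtained from the frequency-domain DMRS, without spectrum shaping. The helper names, the oversampling factor, and the choice of root indices are hypothetical; the length-96 ZC sequences are taken as cyclic extensions of a prime-length-89 sequence. The sketch does not reproduce the figures in this paper.

```python
import cmath
import math


def zc_sequence(q, n_zc, length):
    """Zadoff-Chu sequence with root q and prime length n_zc, cyclically extended."""
    return [
        cmath.exp(-1j * math.pi * q * (n % n_zc) * ((n % n_zc) + 1) / n_zc)
        for n in range(length)
    ]


def papr_db(freq_seq, oversample=4):
    """PAPR in dB of the time-domain symbol built from freq_seq (zero-padded inverse DFT)."""
    n_fft = oversample * len(freq_seq)
    powers = []
    for t in range(n_fft):
        s = sum(x * cmath.exp(2j * math.pi * k * t / n_fft) for k, x in enumerate(freq_seq))
        powers.append(abs(s) ** 2)
    return 10 * math.log10(max(powers) / (sum(powers) / n_fft))


def ccdf(values, threshold):
    """Empirical CCDF: fraction of values strictly above the threshold."""
    return sum(v > threshold for v in values) / len(values)


# Arbitrary set of roots, chosen only for illustration.
paprs = [papr_db(zc_sequence(q, 89, 96)) for q in range(1, 11)]
for thr in (2.0, 3.0, 4.0, 5.0):
    print(f"P(PAPR > {thr:.1f} dB) = {ccdf(paprs, thr):.2f}")
```

Reading the CCDF at a fixed probability level (for example $10^{-3}$, given a large enough ensemble) yields the single PAPR figure that is compared between sequence families.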
The block error rate performance for a single-stream PUSCH transmission is shown in Fig. 16. Here, DMRS is transmitted on port-0. Note that ZC sequences are used as the baseline for the BLER comparison because these sequences have a frequency-flat power density; they therefore treat every sub-carrier equally and allow the channel to be estimated equally well across the entire bandwidth. Hence, the goal for the newly designed sequences is to match the performance of these ZC sequences. In this figure, the results are shown for the cases where the base station receiver employs 2 and 4 receive antennas. From Fig. 16, it can be clearly seen that, irrespective of the number of receive antennas, the link-level performance of the π/2-BPSK DMRS is equivalent to that of the 3GPP ZC sequences, although the newly designed sequences are not frequency-flat like ZC.
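The frequency-flatness contrast mentioned above can be checked numerically with a short sketch, assuming the standard π/2-BPSK mapping $d(i) = e^{j\frac{\pi}{2}(i \bmod 2)}\,(1-2b(i))\,(1+j)/\sqrt{2}$ followed by a unitary DFT. The bits here are random and are not the sequences of [13]; the ZC root, the seed, and the helper names are hypothetical.

```python
import cmath
import math
import random


def pi2_bpsk(bits):
    """Map bits to pi/2-BPSK symbols: e^{j(pi/2)(i mod 2)} (1 - 2b) (1 + j)/sqrt(2)."""
    return [
        cmath.exp(1j * (math.pi / 2) * (i % 2)) * (1 - 2 * b) * (1 + 1j) / math.sqrt(2)
        for i, b in enumerate(bits)
    ]


def unitary_dft(x):
    """Naive unitary DFT, adequate for short sequences."""
    n = len(x)
    return [
        sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x)) / math.sqrt(n)
        for k in range(n)
    ]


def spectral_peak_to_mean(freq_seq):
    """Ratio of the largest sub-carrier power to the mean sub-carrier power."""
    p = [abs(v) ** 2 for v in freq_seq]
    return max(p) / (sum(p) / len(p))


random.seed(1)
n = 96
bits = [random.randint(0, 1) for _ in range(n)]  # random bits, not the sequences of [13]
spec_bpsk = unitary_dft(pi2_bpsk(bits))

# ZC DMRS is defined directly in the frequency domain (root 7 chosen arbitrarily).
zc = [cmath.exp(-1j * math.pi * 7 * (m % 89) * ((m % 89) + 1) / 89) for m in range(n)]

print("ZC peak-to-mean sub-carrier power:", spectral_peak_to_mean(zc))
print("pi/2-BPSK peak-to-mean sub-carrier power:", spectral_peak_to_mean(spec_bpsk))
```

A ratio of 1 means every sub-carrier carries the same reference power; a larger ratio means some sub-carriers are probed with less energy than others, which is why matching the ZC BLER is a meaningful target for a sequence that is not frequency-flat.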
We next consider the performance of the proposed transmitter designs for two-stream MIMO transmission, where DMRS is transmitted on both port-0 and port-1. First, we show the drawbacks of the existing 3GPP design in Figs. 17 and 18. When the 3GPP transceiver is used, there is a clear difference in performance, in terms of both PAPR and BLER, between port-0 and port-1. This is highly undesirable, as the data on the two ports will behave differently, rendering port-1 practically useless. This problem is addressed by our proposed transceiver design, as claimed earlier. We next show that this is indeed the case.
In Figs. 17 and 19, we show the PAPR and BLER performance for two-stream MIMO transmission, where DMRS is transmitted on both port-0 and port-1 using our proposed method-1 transceiver design. It can be seen that both the PAPR and the BLER are identical for the two ports, confirming that the proposed transmitter design produces identical DMRS sequences on both ports.
In Fig. 20, the PAPR of the DMRS sequences on port-0 and port-1 generated by method-1 and method-2 is shown. It can be seen that the PAPR is exactly the same for port-0 and port-1 under both methods, confirming that the proposed transmitter designs are equivalent. The same is the case for the BLER performance. Therefore, the proposed methods 1 and 2 have been shown to be equivalent both analytically and numerically.

V. CONCLUSION
In this paper, we proposed a low PAPR reference signal transceiver design for 3GPP 5G NR π/2-BPSK based uplink transmissions. Using the proposed design, the PAPR of the reference signal is significantly reduced compared to the current design of Rel-15 5G NR systems. Such a design considerably helps to improve the coverage of 5G systems. Specifically, we have shown a frequency-domain and a time-domain transceiver design, both of which are equivalent and result in the same system performance in terms of PAPR and BLER. We have also shown how the proposed design can be extended to MIMO transmission without causing any discrepancy between the MIMO streams, which is not the case for the current Rel-15 3GPP 5G NR uplink design.