On the feasibility of a secondary service transmission over an existent satellite infrastructure: design and analysis

In this paper, we present a realistic use case in order to investigate the feasibility of a secondary service transmission over an existent satellite infrastructure. By introducing the overlay cognitive radio paradigm towards satellite communications, we compute a theoretical achievable data rate greater than 16 kbps for the secondary service, which is suitable for most M2M applications. Using simulation results, we show that this can be achieved while preserving the primary service performance. In addition, a system design framework is discussed in order to dimension such systems.


Introduction
It can be emphatically stated that the access to space is easier, cheaper, and faster than ever before. This is the widespread view among the main satellite players with respect to the unique opportunistic time currently experienced by the space segment. Actually, the favorable projections to this segment could be sustained especially today, since the demand for the rising new services has increased considerably [1] (multicast, broadcast, high mobility, and wide coverage).
In this sense, a typical current example is the use of satellites to support machine-to-machine (M2M) communications, providing connectivity to the end-users anytime, anywhere, for any media and device [2]. M2M communications are one of the central use cases in the fifth-generation (5G) mobile network [3], as they play a major role in the Internet of Things (IoT). In fact, it is predicted in [4] the deployment of
Concerning technological advances, the consolidation of this satellite era is reinforced (though not exclusively) by the recent maturity reached in the manufacturing process (cheaper and faster production, powerful and sophisticated payloads). To keep up with the new challenges of M2M communications, satellite communication systems need to push the boundaries towards more efficient technical solutions. For this purpose, the search for power and bandwidth efficiency, as well as the current trend towards low-complexity systems, must be a paramount concern for system designers.
For a given service provided to the end-users, the design requirements of the communication system are, in general, set by (i) the service availability, which ultimately specifies the required bit error rate (BER); (ii) the allocated spectral band, which is assigned by the International Telecommunication Union Radio Regulations (ITU-RR) [5]; and (iii) the range of the received carrier-to-noise ratio (CNR), measured in the occupied bandwidth, which is limited by the large path loss as well as the nonlinear behavior of the satellite channel. For instance, a CNR dynamic range between −3 and 20 dB is typical, in accordance with the extension recently introduced in the DVB-S2X standard [6].
Basically, within these bounds, the system can be designed to be efficient in (i) power, by decreasing the received power (or equivalently the CNR) necessary to reach the specified BER, for instance by adding redundancy bits in a digital encoding system, or (ii) spectrum, by increasing the number of bits per hertz in the occupied bandwidth, which ultimately increases the transmitted data rate. Last but not least, (iii) the system complexity should be carefully evaluated in order to reduce as much as possible the number of processing operations, especially when on-board systems are considered.
Apart from that, since the available radio spectrum is today a scarce resource (cf. [7], for example), another challenge is to develop techniques that enable better coordination between legacy and future services, especially considering this new machine-type communication environment and its large-scale implementation. Although spectrum regulation and policy have been considered a dry subject since the earliest days of radio communication, the need to reconsider the static long-term exclusivity of the spectrum via licensed regulation procedures, as well as the encouragement of techniques that enable the coexistence of different networks, has become a key element to permit a proper expansion of these new services [8].
It is within this framework that cognitive radio (CR) techniques have also become attractive for space applications [9]. Based on the recent developments in space-qualified software-defined radios (SDR) [10] and on the acceptance of concepts such as flexible [11,12] and hosted payloads, these techniques allow smarter spectrum management. In addition, valuable research [13] has been conducted on spectral awareness and spectral exploitation techniques, which has made cognitive satellite communications a promising approach.
In a nutshell, the cognitive user (CU), which in our context is unlicensed or has lower priority to operate in a specific spectrum band, senses the environment around it and adapts its transmission as a function of the interference, by adjusting the frequencies, waveforms, and protocols in order to access the licensed primary user (PU) spectrum efficiently. Without going into further details, three paradigms classify the CU operation [14]:
• Interweave, which is based on the idea of opportunistic transmission. In short, the CU observes the white spaces not used by the PU transmission (in space, time, or frequency) and adjusts its operating parameters. Ideally, there is no coexistence between users and, consequently, no power control of the CU transmission is required;
• Underlay, associated with the gray spectrum space, where the CU, by means of partial knowledge of the PU signal characteristics and channel, adjusts its parameters in such a way as to respect an acceptable interference threshold. As greater knowledge of the PU signal is required, more sophisticated spectral sensing techniques must be employed, such as signal-to-noise ratio (SNR) estimation. It is worth noting that the interweave paradigm can be seen as a special case of underlay, where the threshold does not allow any interference. Some examples of techniques found in the literature are dynamic resource control (power and frequency), beamforming with multiple antennas, and spectral spreading;
• Overlay, where the CU, from full and noncausal knowledge of the PU waveform, message, and channel, uses advanced coding and modulation strategies to transmit simultaneously while mitigating the interference. The occupied spectrum space in this last paradigm is called black, because it is occupied by the interfering signals and noise.
The first two schemes were well studied in [13,15]. In this paper, we investigate the third scheme.
The main reason to propose the overlay paradigm for satellite communications lies in the feasibility of transmitting both unlicensed and licensed services simultaneously from the same satellite towards their respective terminals. We emphasize that, due to the priority among users, the superposition coding strategy is required [16] to relay the PU transmission, unlike the technical solutions adopted for the broadcast channel [17]. Furthermore, dirty paper coding (DPC) [18] is implemented to adapt the cognitive signal to the direction of the PU interference.
Concerning the DPC, from a brief historical perspective, the first practical scheme was proposed by Erez, Shamai, and Zamir [19]. It pointed to Tomlinson-Harashima precoding (THP) for intersymbol interference (ISI) cancellation, which can be seen as a DPC application for frequency-selective channels. In this technique, the modulo operation is used to pre-subtract the interference with a minimum power increase. Also in that work, the THP losses, i.e., the shaping loss in the high signal-to-noise ratio (SNR) regime and the combined modulo and power losses in the intermediate and low SNR regimes, were well characterized. Moreover, Eyuboglu and Forney, in their seminal paper [20], generalized the combination of trellis shaping (TS) [21], trellis-coded modulation (TCM), and THP for Gaussian ISI channels. The so-called trellis precoding (TP) performs interference pre-subtraction and allows recovery of the shaping loss. Likewise, and closer to our application, an extension of TP for multiuser interference was proposed in [22] and [23] to recover the shaping loss with a sufficiently high constellation expansion, where the TS technique acts as a vector quantizer, replacing the modulo operator. In view of the above, the techniques presented in the recent publications [24,25] concern the design of overlay paradigm transmission for satellite communication systems.
This paper is an extended version of the author's work [26]. In addition to the link budget evaluation and system dimensioning introduced in [26], we provide here a design framework to implement such cognitive systems, as well as a detailed analysis of the effect of different parameters on the overall system performance. Firstly, a DPC encoder is proposed involving TS and TCM concepts along with a proper constellation expansion combined with THP. The discussion leads to a trade-off between power efficiency, through the reduction of the modulo loss, and complexity, both key levers for satellite onboard processing and terminals. Secondly, as a general contribution, we focus on the feasibility of a low data rate secondary transmission. In this sense, a practical use case is investigated, which considers commercial off-the-shelf (COTS) parts [27] and assumes realistic link budget parameters in its evaluation. The discussions and results contained herein could be seen as part of a "preliminary phase" of an engineering process plan [28].
Following this introduction, Section 2 presents the overlay model description. Subsequently, in Section 3, the employed methodology is introduced, with reference to the superposition and dirty paper coding (DPC) designs. The paper contributions on these techniques are detailed in Section 4, emphasizing the satellite context. Next, the results and discussion are presented, by investigation of a realistic use case in Section 5. Finally, Section 6 is dedicated to the conclusions, with suggestions for future works.

Overlay model description
The following scenarios are provided as examples where the overlay CR techniques might be applied to satellite communications. In the first case, presented in Fig. 1a, an ordinary low Earth orbit (LEO) or medium Earth orbit (MEO) satellite provides two different services towards different terminals. In this context, a single licensed user (PU) takes priority over the added unlicensed CU. The interference present at both terminals should be mitigated by a properly designed CU encoder, without any changes in the PU transmission chain.
In the same way, the GEO multibeam satellite is illustrated in Fig. 1b. In this case, considering the frequency reuse, the CU is able to transmit by using, for instance, the blue frequency (or polarization) in the red spot footprint, as long as the interference among adjacent beams is resolved. It is also worth noting that all the different PU transmissions, represented by the several blue spots, should be taken into account in the interference mitigation design. In this way, the total satellite capacity could be increased and the spectrum resources better managed.
Equally suitable for both scenarios, the interference model with side information, adapted from [16], is presented in Fig. 2. Assuming that the signals are onboard the satellite, the cognitive encoder has full and noncausal knowledge of the signal and message of each ith PU, which addresses the main requirement of the overlay paradigm. In this sense, the encoded cognitive signal $X_c^n$ is a function of both the primary and cognitive messages, $m_{p,i}$ and $m_c$.
Without loss of generality, considering the ith PU and the added CU, the channel gains $|h_{yx,i}|$ (from the transmitter $x$ to the receiver $y$) are defined by the losses of the direct paths ($|h_{cc}|$ and $|h_{pp,i}|$) and of the interfering paths ($|h_{pc,i}|$ and $|h_{cp,i}|$); Fig. 2 summarizes this notation. In our context, these gains are computed as a function of the link budget of each transmission.
The channel outputs at the PU and CU receivers are described by Eqs. (1) and (2), where $n$ refers to the nth symbol. Since the terminals may be located in different geographical sites, the Gaussian noise component $Z_{p,i}^n$ (resp. $Z_s^n$) is assumed to follow the normal law $\mathcal{N}(0, N_{p,i})$ (resp. $\mathcal{N}(0, N_s)$). Also, the power constraints to be satisfied are $E[|X_{p,i}^n|^2] = P_{p,i}$ and $E[|X_c^n|^2] \le P_c$, respectively.
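For reference, one form of the two channel-output relations that is consistent with the gain and noise definitions above is sketched below. It is a reconstruction under stated assumptions, not a quotation: each receiver is assumed to see its direct path, the interfering path(s), and additive Gaussian noise; the gains are assumed to enter as magnitudes; and inter-PU interference is assumed to be already resolved.

```latex
\begin{align}
Y_{p,i}^n &= |h_{pp,i}|\, X_{p,i}^n + |h_{pc,i}|\, X_c^n + Z_{p,i}^n, \\
Y_s^n     &= |h_{cc}|\, X_c^n + \sum_{i=1}^{N} |h_{cp,i}|\, X_{p,i}^n + Z_s^n .
\end{align}
```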
Finally, since each PU has the same transmission priority, we highlight that the interference among them could be solved by precoding techniques as proposed, for instance, in DVB-S2X standard [6]. Under this assumption, this work provides a design method to permit a secondary service transmission without affecting the PU transmission performance.

Superposition strategy
The purpose of the superposition technique is to ensure that the signal-to-noise ratio (SNR) at each PU receiver is not decreased in the presence of interference. To accomplish this goal, the CU shares part of its power to relay each PU message, and $\hat{X}_c^n$ is a modified version of the CU message, as detailed in the next section. Based on that operation, the CU transmitted signal combines $\hat{X}_c^n$ with the relayed PU signals, where $\alpha_i \in [0, 1]$ is the fraction of $P_c$ shared to relay each PU message.
Under the assumption that all signals are statistically independent, the new power constraint can be defined as $E[|\hat{X}_c^n|^2] \le (1 - \sum_{i=1}^{N} \alpha_i) P_c$. The signal-to-interference-plus-noise ratio (SINR) at the kth primary receiver is given by Eq. (4). In this context, the superposition factor $\alpha_k \in [0, 1]$ that maximizes Eq. (4) under the interference condition ($|h_{pc,k}| > 0$) is given by Eq. (5), which is a generalized form of [16, Eq. 14]. Hence, by applying Eq. (5) in Eq. (4), we obtain Eq. (6). By inspection of Eq. (3), we emphasize that the CU transmission is feasible only if the condition $\sum_{i=1}^{N} \alpha_i < 1$ is satisfied. Consequently, the CU data rate must decrease when aggressive frequency reuse scenarios are considered.
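The role of the superposition factor can be illustrated numerically. The sketch below assumes a single PU, a transmitted signal of the form $X_c = \hat{X}_c + \sqrt{\alpha P_c / P_p}\, X_p$, and chooses $\alpha$ so that the PU SINR equals its interference-free SNR, in the spirit of [16, Eq. 14]. The function names and the exact closed form are assumptions made for illustration, not the paper's Eqs. (4) to (6).

```python
import math

def superposition_factor(h_pp, h_pc, P_p, P_c, N_p):
    """Power fraction alpha (single PU) that keeps the PU SINR equal to
    its interference-free SNR. Assumed closed form, see lead-in."""
    snr = abs(h_pp) ** 2 * P_p / N_p   # PU SNR without the CU
    inr = abs(h_pc) ** 2 * P_c / N_p   # CU-to-PU interference-to-noise ratio
    if inr == 0:
        return 0.0                     # no interfering path: nothing to relay
    # Positive root of (1 + snr) x^2 + 2 sqrt(snr) x - snr * inr = 0,
    # with x = sqrt(alpha * inr).
    x = math.sqrt(snr) * (math.sqrt(1 + inr * (1 + snr)) - 1) / (1 + snr)
    return x * x / inr

def pu_sinr(alpha, h_pp, h_pc, P_p, P_c, N_p):
    """SINR at the PU receiver when the CU relays the PU signal coherently."""
    signal = (abs(h_pp) * math.sqrt(P_p) + abs(h_pc) * math.sqrt(alpha * P_c)) ** 2
    interference = abs(h_pc) ** 2 * (1 - alpha) * P_c
    return signal / (N_p + interference)

alpha = superposition_factor(1.0, 0.5, 10.0, 4.0, 1.0)
print(alpha, pu_sinr(alpha, 1.0, 0.5, 10.0, 4.0, 1.0))
```

With several PUs, one factor per PU would be computed and the feasibility condition on their sum checked afterwards.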

Dirty paper coding
Once the superposition factors are computed and the CU partially shares its power to relay each PU signal, the next step is to design $\hat{X}_c^n$ efficiently, in order to minimize the interference of the PU transmission on the CU receiver. The optimal strategy employs the theoretical results presented by Costa [18]. On the assumption that the interference (PU signal) is noncausally known at the CU transmitter, a transmitter-based interference presubtraction can be implemented, without any power increase, reaching the AWGN capacity.
By rearranging Eq. (2) and considering the superposition, we obtain Eq. (7). Without loss of generality, given that the signals in Eq. (7) are statistically and mutually independent, the implemented model considers a single Gaussian-distributed PU constellation, with respect to the total interfering power received at the CU terminal. In addition, in order to simplify the notation throughout this paper, Eq. (7) is normalized by the direct path attenuation factor $|h_{cc}|$. In the resulting expression for the signal at the CU receiver, the factor $b$ represents the normalized interfering path and $S^n$ represents the total channel interference.
Figure 3 presents the basic diagram of the DPC encoder, where the THP is used to presubtract the multiuser interference. In this configuration, assuming low and intermediate SNR regimes, partial interference presubtraction (PIP) is implemented [18]. In this way, the signal $\hat{X}_c^n$ is designed as $\hat{X}_c^n = \mathrm{MOD}_{\Delta}(X_{cc}^n - \lambda S^n)$, where $X_{cc}^n$ is the coded signal and the factor $\lambda$, to be properly chosen, controls the fraction of the interference to be presubtracted. Also, MOD is the complex-valued modulo operation, implemented to limit the transmitted power. The modulo region is defined by $\Delta = \sqrt{M}\, d_{\min}$, where $M$ is the number of points of the square QAM constellation and $d_{\min}$ is the minimum intersymbol distance.
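The partial interference presubtraction with modulo can be sketched in a few lines. The sketch assumes that the modulo folds the real and imaginary parts independently into $[-\Delta/2, \Delta/2)$ with $\Delta = \sqrt{M}\, d_{\min}$; the function names `complex_mod` and `thp_presubtract` are hypothetical.

```python
import numpy as np

def complex_mod(x, delta):
    """Fold real and imaginary parts of x into [-delta/2, delta/2)."""
    x = np.asarray(x, dtype=complex)
    re = (x.real + delta / 2) % delta - delta / 2
    im = (x.imag + delta / 2) % delta - delta / 2
    return re + 1j * im

def thp_presubtract(x_cc, s, lam, M, d_min):
    """Presubtract the scaled interference lam * s from the coded symbols
    x_cc, then apply the modulo to bound the transmitted amplitude."""
    delta = np.sqrt(M) * d_min          # modulo region of a square M-QAM
    return complex_mod(np.asarray(x_cc) - lam * np.asarray(s), delta)

# 16-QAM with d_min = 2 (points at +-1, +-3), so delta = 8.
x_hat = thp_presubtract([3 + 3j, -1 + 1j], [10 - 7j, 0.5 + 0.5j], 0.8, 16, 2.0)
print(x_hat)
```

Whatever the interference amplitude, the output stays inside the square of side $\Delta$, which is what bounds the transmitted power.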
When compared to a QAM signal on an AWGN channel, the performance of a THP system is degraded by three separate sources of loss: power loss, modulo loss, and shaping loss [22]. From this point on, our goal is to design the coded signal $X_{cc}^n$ in order to mitigate these losses. Figure 4 presents the implemented encoder for the cognitive user. Three gains can be achieved by this system: (i) coding gain, represented by the upper part of the diagram; (ii) shaping gain, achieved by the trellis shaping code in the lower part; and (iii) prediction gain (a term coined in [29]), achieved by the modulo operation jointly with the shaping code.

Main encoding blocks
The two codes work independently. The input bit sequence is split into two parts. The upper part is handled by the coset select code $C_c$, described by the generator matrix $G_c$, which encodes the $k_c$ message bits $x_c$ into $n_c$ coded bits $y_c$. In the lower part, the $r_s$-bit syndrome sequence $s$ passes through an inverse syndrome former $H_s^{-T}$ for the shaping code $C_s$. The resulting initial sequence $t$, jointly with the channel-coded sequence $y_c$ and the interference $\lambda S^n$, is fed into the Viterbi decoder, which selects the shaping codeword $y_s$ according to a well-chosen branch metric. After that, the shaped sequence $z_s$ is obtained by the XOR operation between $t$ and $y_s$. Note that $z_s$ and $t$ are within the same coset, according to the trellis shaping on regions strategy detailed in [21]. Finally, the output shaped sequence $X_{cc}^n$ is obtained by mapping the d symbols as a function of $y_c$ and the sign-mapped bits $z$.
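The coset property used here (adding any shaping codeword leaves the syndrome unchanged) can be checked with polynomial arithmetic over GF(2). The sketch below assumes the 4-state shaping code with octal generators 7 and 5, read as $1 + D + D^2$ and $1 + D^2$, and uses $[1 + D,\; D]$ as one valid inverse syndrome former; both the bit-order convention and this particular inverse are assumptions. The codeword is driven by an arbitrary input `u` rather than by a Viterbi search.

```python
import numpy as np

G1 = np.array([1, 1, 1])   # 7 (octal) read as 1 + D + D^2 (assumed convention)
G2 = np.array([1, 0, 1])   # 5 (octal) read as 1 + D^2
INV_SYN = (np.array([1, 1]), np.array([0, 1]))   # assumed H^{-T} = [1 + D, D]

def gf2_mul(a, b):
    """Product of two binary polynomials (lowest degree first) over GF(2)."""
    return np.convolve(a, b) % 2

def gf2_add(a, b):
    """Sum (XOR) of two binary polynomials of possibly different lengths."""
    out = np.zeros(max(len(a), len(b)), dtype=int)
    out[:len(a)] ^= np.asarray(a, dtype=int)
    out[:len(b)] ^= np.asarray(b, dtype=int)
    return out

def shaped_sequence(s, u):
    """z = t XOR y_s, with t = s * H^{-T} and y_s = u * [G1, G2]."""
    t1, t2 = gf2_mul(s, INV_SYN[0]), gf2_mul(s, INV_SYN[1])
    y1, y2 = gf2_mul(u, G1), gf2_mul(u, G2)
    return gf2_add(t1, y1), gf2_add(t2, y2)

def syndrome(z1, z2):
    """s_hat = z * H^T with H = [G2, G1], so every codeword maps to zero."""
    return gf2_add(gf2_mul(z1, G2), gf2_mul(z2, G1))

s = np.array([1, 0, 1, 1, 0])          # syndrome (data) bits
u = np.array([0, 1, 1, 0, 1])          # stands in for the Viterbi choice
z1, z2 = shaped_sequence(s, u)
print(syndrome(z1, z2)[:len(s)])
```

Because the recovered syndrome does not depend on `u`, the transmitter is free to pick the codeword that minimizes power without affecting the data bits.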
Similar to the THP operation, the scaled interference sequence $\lambda S^n$ is presubtracted from the coded symbol sequence $X_{cc}^n$, and the result is modulo operated in order to limit its power. At the end, the transmitted sequence $\hat{X}_c^n$ is obtained and sent through the interference channel.
The branch metric of Eq. (10) is implemented, with which the precoder selects the region sequence with minimum average power to steer the scaled interference sequence $\lambda S^n$.
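As a minimal sketch of a metric of this kind, assume that it is the energy of the presubtracted, modulo-reduced symbol, evaluated for the four replicas of a coset point (one per shaping region). The region offsets below are a hypothetical 16-QAM layout, not the mapping of Fig. 5, and the selection is done symbol by symbol instead of along a trellis path.

```python
import numpy as np

def complex_mod(x, delta):
    """Fold real and imaginary parts into [-delta/2, delta/2)."""
    re = (np.real(x) + delta / 2) % delta - delta / 2
    im = (np.imag(x) + delta / 2) % delta - delta / 2
    return re + 1j * im

# Hypothetical region offsets (in units of d_min), one per shaping region.
REGION_OFFSETS = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j])

def branch_metrics(coset_point, lam_s, d_min=1.0):
    """Energy of MOD(x - lam*s) for each of the four region candidates."""
    delta = 4 * d_min                      # sqrt(16) * d_min for a 16-QAM
    candidates = coset_point + REGION_OFFSETS * d_min
    return np.abs(complex_mod(candidates - lam_s, delta)) ** 2

def best_region(coset_point, lam_s, d_min=1.0):
    """Index of the region whose candidate has the minimum energy."""
    return int(np.argmin(branch_metrics(coset_point, lam_s, d_min)))

print(best_region(0.5 + 0.5j, 0.0), best_region(0.5 + 0.5j, 1 + 1j))
```

In the actual encoder these per-symbol energies would be accumulated along the trellis of $C_s$ by the Viterbi algorithm, so that the selected region sequence is a valid shaping codeword.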

Transmitter design
We propose, in this section, a scheme based on the trellis-shaped DPC encoder with a slight constellation expansion combined with THP. Unlike the previous works published in [22,23], the modulo operation is always part of our transmission system, ensuring the required power limitation for the transmitter in any interference scenario. The remaining available transmitted power (after the superposition) is computed based on the link budget parameters. In order to comply with the defined SNR range for satellite communication, a typical transmission rate of $R_{cu} = 2$ bits/symbol is proposed. The trellis-shaping-based DPC is implemented with a slightly expanded constellation ($n_s = 2$) and a modulo operation at its output, as previously depicted in Fig. 4.
For the coding gain, a systematic 64-state, rate-1/2 convolutional code $C_c$, specified in octal notation by the feedforward polynomial $h_1(D) = 54$ and the feedback polynomial $h_0(D) = 161$, is assumed. Along with this, for the shaping code $C_s$, the 4-state, rate-1/2 code specified by the generators $g_{s,1}(D) = 7$ and $g_{s,2}(D) = 5$ is implemented. In other words, at the DPC transmitter, the 4-QAM coded constellation is replicated four times by the shaping operation (since $n_s = 2$), resulting in an expanded 16-QAM constellation for $X_{cc}^n$. In addition, considering the shaping operation, the proposed mapping for $X_{cc}^n$ is based on the sign bit shaping strategy described by Forney in [21]. Following the same notation as in Fig. 4, consider now the 16-QAM $X_{cc}^n$ constellation represented in Fig. 5. Each symbol is defined by the tuple $(z_1\, z_2\, y_{c,1}\, y_{c,2})$, where $z_1$ and $z_2$ are the sign bits while $y_{c,1}$ and $y_{c,2}$ are the coset bits. It should also be noted that, to keep the specified rate, which maintains the SNR range, uncoded bits are not used in this scheme (in contrast to the general scheme presented in [22]).
Thus, the shaping regions are represented by the four different colors that delimit the constellation regions, while four different markers represent the convolutional code cosets. The analogy with the Gelfand-Pinsker multicoding scheme [30], which is a key concept used in the DPC theoretical proof [18], is intuitive. Basically, we assign the transmitted "sequence of colors" as the subcodebook, determined by the shaping code $C_s$, which is indexed by the chosen codeword generated by the convolutional code $C_c$. In this sense, the described encoder can be conceptually considered as a practical implementation of the DPC encoder.

Receiver design
At the receiver, as depicted in Fig. 6, the reverse chain is implemented: first, $Y_s^n$ is multiplied by the factor $\lambda$. Before entering the DPC decoder, the signal is modulo operated again, as given by Eq. (11), whose derivation relies on a property of the modulo operation. The value of $\lambda$ that minimizes the effective noise $N_{eff} = (1 - \lambda) X_c^n + \lambda Z_s^n$ is obtained as in [18].
The decoder employs the same strategy as usual TCM schemes. Figure 7 illustrates an example trellis for the channel code $C_c$ with parallel transitions. Since each branch of the trellis corresponds to a signal subset (in this case, defined by $y_c$ and labeled by one of the four markers), the first step in decoding is to determine the best signal point within each subset (equivalently, to determine the best shaping region, represented by the different colors). This is performed by selecting the point that is closest in Euclidean distance to the received point $Y_s$. After that, the selected signal point and its squared distance metric are applied in the usual Viterbi decoder for $C_c$ in order to select the most likely coded sequence $\hat{y}_c$ [31]. At the end of the decoding, the estimated shaping bits $\hat{s}$ are obtained by $\hat{s} = \hat{z} H_s^T$.
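The receiver front end can be sketched as follows, assuming the normalized model $Y = \hat{X}_c + S + Z$ and the usual MMSE choice $\lambda = P / (P + N)$, which minimizes $(1-\lambda)^2 P + \lambda^2 N$. The names and numerical values are illustrative assumptions. The sketch also computes the residual of the identity $\mathrm{MOD}(\lambda Y) = \mathrm{MOD}(X_{cc} - (1-\lambda)\hat{X}_c + \lambda Z)$; the sign in front of the self-noise term is a matter of convention.

```python
import numpy as np

def complex_mod(x, delta):
    """Fold real and imaginary parts into [-delta/2, delta/2)."""
    re = (np.real(x) + delta / 2) % delta - delta / 2
    im = (np.imag(x) + delta / 2) % delta - delta / 2
    return re + 1j * im

def mmse_lambda(p_tx, n_var):
    """Scaling factor minimizing the effective noise power (assumed form)."""
    return p_tx / (p_tx + n_var)

def dpc_receive(y, lam, delta):
    """Scale the received signal by lam and fold it back into the modulo region."""
    return complex_mod(lam * y, delta)

rng = np.random.default_rng(0)
delta, p_tx, n_var = 4.0, 2.0, 0.5
lam = mmse_lambda(p_tx, n_var)
x_cc = np.array([0.5 + 0.5j, -1.5 + 0.5j, 1.5 - 1.5j])        # coded symbols
s = 3.0 * (rng.standard_normal(3) + 1j * rng.standard_normal(3))  # interference
z = np.sqrt(n_var / 2) * (rng.standard_normal(3) + 1j * rng.standard_normal(3))
x_hat = complex_mod(x_cc - lam * s, delta)                    # transmitter side
y = x_hat + s + z                                             # normalized channel
r = dpc_receive(y, lam, delta)
# Residual of the identity, reduced modulo delta; it should be numerically zero.
residual = complex_mod(lam * y - (x_cc - (1 - lam) * x_hat + lam * z), delta)
print(lam, np.max(np.abs(residual)))
```

The interference $S$ disappears from the decision variable `r`; what remains is the coded symbol plus the effective noise, folded into the modulo region.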
In this simple example, the trellis transition from the state $\sigma_j^n$ to the state $\sigma_j^{n+1}$ provides $(y_{c,1}\, y_{c,2})$ mapped to (0, 0), which is represented by the corresponding marker according to the assigned mapping. Thus, the decoder performs a hard decision among the four constellation symbols (0, 4, 8, 12), representing the shaping regions defined by the pair of bits $(z_1\, z_2)$. In the same way, the transition from the state $\sigma_j^n$ to $\sigma_{j+1}^{n+1}$ provides $(y_{c,1}\, y_{c,2})$ mapped to (0, 1), represented by the marker "X", which decides among the parallel transitions 1, 5, 9, or 13. It is worth noting that an error in the first step of the decoding procedure (i.e., an error in the parallel transition decision) results in a wrong decoding of the shaping region. This significantly increases the degradation caused by the modulo loss in DPC schemes, as will be further discussed in the next section. We also remark that, in order to reduce the system complexity, the state expansion in the Viterbi decoder at the receiver [32][33][34], which could be an effective strategy for eliminating the parallel transition effect, is not considered in this implementation.

CU practical encoder design for dirty paper channel
In this section, we describe our contributions, which are based on some changes to the low-complexity encoder presented in the last section. Firstly, taking into account that the CU signal is Gaussian distributed as a result of the trellis shaping operation, we present a procedure that provides the appropriate output power for $\hat{X}_c^n$. As a consequence, the SINR of the PU is properly evaluated. Subsequently, we deal with the CU link by proposing a further expansion of the $X_{cc}$ constellation, jointly with an optimized mapping design. The results are analyzed as a function of the complexity involved in the transmitter implementation.

CU transmitted power control
The power reduction of $\hat{X}_c^n$ caused by the trellis precoding technique (i.e., $E[|\hat{X}_c^n|^2] < (1 - \alpha) P_c$) directly impacts both link performances, since the value of $\alpha$ adopted for power sharing in the superposition strategy is no longer exact. It can be noted from Eq. (6) that the SINR at the primary receiver is increased. As a result, the PU presents better bit error rate (BER) performance than when operating in an AWGN channel (as further analyzed in [25]). Considering that exactly the same shaping gain generated by a shaping code $C_s$ is obtained for the multiuser precoding [22], we propose a method for controlling the CU output power such that $E[|X_c^n|^2] = P_c$ or, equivalently, $E[|\hat{X}_c^n|^2] = (1 - \alpha) P_c$. This is achieved by properly scaling the minimum distance $d_{\min}$ of the coded constellation $X_{cc}^n$. As a reference, we take the power of the baseline constellation, without considering the shaping operation, as given in [35] (Eq. (14)), where $R$ is the data rate in bits per two dimensions, without taking into account the shaping redundant bits. The scaled minimum distance is denoted $d'_{\min}$, and the shaping gain is then defined in Eq. (15). Thus, by combining Eq. (14) and Eq. (15) and rearranging the terms, we obtain the $d'_{\min}$ for which the available power after the shaping operation equals $(1 - \alpha) P_c$.
All things considered, some remarks on this proposed procedure are in order:
1. Both links are properly adjusted. Since the output power of the remaining CU transmission, after superposition, satisfies exactly $E[|\hat{X}_c^n|^2] = (1 - \alpha) P_c$, the BER performance of the PU in an interference channel is the same as in an AWGN channel. In addition, as the CU power is efficiently employed, the BER of the cognitive link is improved or, equivalently, the secondary service data rate can be increased.
2. CU modulo loss might be reduced. The performance degradation caused by modulo loss grows with (i) the selection probability of boundary symbols in the $X_{cc}^n$ constellation and (ii) the reduction of the SNR level. It is therefore worth noting that, by rescaling the minimum distance of $X_{cc}^n$ and considering the same Gaussian-distributed interference $S$, this procedure might reduce the occurrence of selected boundary points for $X_{cc}^n$, since the constellation region is further enlarged. Because of this modulo loss reduction, this technique improves the cognitive system performance for the same fixed SNR (in the next section, the investigation of a practical use case further demonstrates this concept).
3. Mapping of the $X_{cc}^n$ constellation should be optimized. In order to obtain exactly the specified output power $(1 - \alpha) P_c$, the shaping operation should be performed over a defined continuous region delimited by the modulo amplitude $\Delta$. In this sense, it is of utmost importance that the mapping be designed such that the interference-presubtracted signal $\hat{X}_c^n$ is confined within this delimited region. Otherwise, the previous procedure does not control the exact power. This specific issue, jointly considered with constellation expansion, is further discussed in the next section.
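The rescaling of the minimum distance can be sketched numerically. The sketch assumes the discrete square-QAM baseline power $P = (2^R - 1)\, d_{\min}^2 / 6$ and a shaping gain defined as the ratio of baseline power to shaped power; both are assumptions standing in for Eqs. (14) and (15), and the function names are hypothetical.

```python
import math

def baseline_power(R, d_min):
    """Average power of a square 2^R-QAM with minimum distance d_min
    (assumed discrete form; R in bits per two dimensions)."""
    return (2 ** R - 1) * d_min ** 2 / 6

def scaled_d_min(R, alpha, P_c, shaping_gain_db):
    """Minimum distance for which the shaped output power equals
    (1 - alpha) * P_c, given the shaping gain of the code in dB."""
    gamma_s = 10 ** (shaping_gain_db / 10)       # linear shaping gain
    target = (1 - alpha) * P_c                   # power left after superposition
    return math.sqrt(6 * gamma_s * target / (2 ** R - 1))

# Example: R = 2 bits/symbol, 30% of P_c = 10 relayed, 1 dB shaping gain
# (illustrative values only).
d = scaled_d_min(2, 0.3, 10.0, 1.0)
print(d, baseline_power(2, d) / 10 ** 0.1)
```

The larger the shaping gain, the more the constellation can be stretched for the same transmitted power, which is the mechanism behind remark 2 above.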

Constellation expansion
The THP precoding works as the simplest solution for multiuser interference (MUI) presubtraction. By using the modulo operation, this technique satisfies the power constraint for application of dirty paper encoder and results in almost negligible degradation in high SNR regimes [36][37][38].
However, for low and intermediate SNR regimes (for instance, below 15 dB), the degradation caused by THP losses becomes more significant, especially due to the modulo loss. Additionally, the modulo loss increases with the probability of occurrence of the boundary constellation points in $X_{cc}^n$. In order to recover part of the THP losses, some practical precoding techniques are proposed in [22]. For intermediate SNR regimes, which is our application, [22] proposes the combination of TP and PIP techniques. The recommended shaping code is 5/6, in order to expand $X_{cc}^n$ in such a way as to confine the interference. Also, as the trellis shaping is implemented at the precoder, the transmitted presubtracted signal $\hat{X}_c$ has a Gaussian distribution. It is interesting to observe that, as the modulo operation is not considered in that precoding, only the power and shaping losses affect the system performance.
As a complement to the approach presented in [22], we consider the modulo implemented at the transmitter output in our investigated schemes. The amount of interference is a function of the link budget and, in contrast to [22], different constellation expansions are considered. This assumption allows the analysis of the modulo loss impact. Also, the results can be evaluated considering the trade-off between complexity and power efficiency.
As an example of our proposed constellation expansion, and following the same notation as in Fig. 4, let us consider that the original DPC constellation is a 16-QAM. Each symbol is defined by the tuple $(z_1\, z_2\, y_{c,1}\, y_{c,2})$, where $z_1$ and $z_2$ are the sign bits while $y_{c,1}$ and $y_{c,2}$ are the coset bits (as proposed in our low-complexity design in Fig. 5). This constellation can be expanded to, for instance, a 64-QAM by adding two information bits $u_1$ and $u_2$ or two "auxiliary" bits (not information bits) $z_1^{aux}$ and $z_2^{aux}$. Each symbol is then defined by the tuple $(z_1\, z_2\, u_1\, u_2\, y_{c,1}\, y_{c,2})$ or $(z_1\, z_2\, z_1^{aux}\, z_2^{aux}\, y_{c,1}\, y_{c,2})$, respectively.
By employing the same procedure, the signal constellation of $X_{cc}^n$ can be expanded as necessary to confine the scaled interference (i.e., $|\mathrm{Re}(\lambda S^n)|$ and $|\mathrm{Im}(\lambda S^n)| < (\sqrt{M}/2)\, d_{\min}$, where $M$ is the constellation order of $X_{cc}^n$). In this way, the assignment of the boundary constellation points can be avoided and, as a consequence, the modulo loss is mitigated. Also, the original information rate is maintained when expanding with auxiliary bits, since these are not information bits. For the transmitter, this operation can be seen as an extension of the trellis shaping procedure. The Viterbi decoder of the shaping operation acts as a usual TCM decoder, where the auxiliary shaping bits $z^{aux}$ represent the parallel branch transitions. These are hard decided during each trellis section for $C_s$. Furthermore, it is still necessary to design the optimal mapping of shaping regions, according to the information rate. The next section discusses this topic.
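The confinement condition can be turned into a simple dimensioning check. The sketch below searches for the smallest square QAM order, starting from the 16-QAM baseline, whose half-width $(\sqrt{M}/2)\, d_{\min}$ exceeds the largest real or imaginary excursion of a set of scaled interference samples. The function name, the starting order, and the 1024-QAM cap are assumptions for illustration.

```python
import math

def min_confining_order(lam_s_samples, d_min, start_order=16, max_order=1024):
    """Smallest square QAM order M (multiplied by 4 at each step) such that
    |Re| and |Im| of every sample are below (sqrt(M) / 2) * d_min.
    Returns None if even max_order does not confine the samples."""
    peak = max(max(abs(v.real), abs(v.imag)) for v in lam_s_samples)
    M = start_order
    while M <= max_order:
        if peak < (math.sqrt(M) / 2) * d_min:
            return M
        M *= 4                          # two extra bits per expansion step
    return None

print(min_confining_order([2.9 + 0.3j, -1.2 + 0.8j], d_min=1.0))
```

For Gaussian interference no finite order confines every sample, so in practice the check would be applied to a chosen percentile of the interference amplitude.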

Mapping discussion
Consider the implementation of mapping I, depicted in Fig. 8. We assume a shaping code $C_s$ of rate 1/2, which provides four shaping regions, represented by the different colors. A convolutional code $C_c$ of rate 1/2 is defined, which also provides four TCM cosets. In addition, the modulo regions are represented by dashed lines, according to the $X_{cc}$ constellation expansion adopted: 16-, 64-, 256-, and 1024-QAM.
The interference point is represented by the red "X", which is presubtracted from the $X_{cc}$ constellation point selected by $C_s$ to form the transmitted symbol $\hat{X}_c$. As discussed in Section 4.1, by a proper mapping design, we confine the $\hat{X}_c$ signal to a determined continuous region of the constellation. In this way, control of the transmitted power is achieved. Two cases are analyzed in the following.
In the first case, depicted in Fig. 8a, uncoded information bits are considered in the trellis precoding. We define the mapping, according to the sign bit shaping strategy proposed in [21], as $(z_1\, z_2\, u_1\, u_2\, u_3\, u_4\, u_5\, u_6\, y_{c,1}\, y_{c,2})$. In this way, 8 information bits are transmitted per symbol, which requires a high SNR (unlike the operating range targeted in this article). Thus, there are no shaping subregions in this scheme. Each 256-QAM constellation, divided into 4 cosets with 64 constellation points inside each coset, is repeated 4 times to form the outer shaping 4-QAM. Regardless of whether the interference $S$ is confined within the 1024-QAM constellation of $X_{cc}$, the presubtracted signal $\hat{X}_c$ is, due to the modulo, always restricted to its region. Based on this assumption, we observe the following:
• The shaping loss is partially recovered. This is achieved according to the shaping gain $\gamma_s$ provided by the code $C_s$, and based on the fact that $\hat{X}_c$ is continuous, shaped, and bounded by the square region of $\Delta = 32$ (i.e., $|\mathrm{Re}(\hat{X}_c)|$ and $|\mathrm{Im}(\hat{X}_c)| < \Delta/2$, where $\Delta$ is the modulo boundary);
• The modulo loss is assumed negligible. In fact, even if some points are selected at the boundaries of $X_{cc}$, since the system operates in the high SNR regime, the modulo loss only causes degradation at very low BER;
• The power loss is close to 0. By the continuous approximation [35], we compute the power loss as $\gamma_p = 10 \log_{10}(1024/1023) \approx 0.0042$ dB for this 1024-QAM constellation.
In the second case, depicted in Fig. 8b, we transmit our rate of 2 information bits per X_c symbol. The uncoded bits are replaced by the auxiliary shaping bits, and the mapping is defined by (z_aux1 z_aux2 z_aux3 z_aux4 z_aux5 z_aux6 z_1 z_2 y^c_1 y^c_2). The shaping operation selects the closest of the 2^6 subregions (or, equivalently, points inside each shaping coset) within each of the 4 regions defined by C_s.
Considering this rate, as discussed in our proposed power control technique, the X_c signal must be continuous, Gaussian shaped, and bounded by the square region defined by a modulo boundary of 4 (i.e., |Re(X_c)| and |Im(X_c)| are both less than half the modulo boundary). For the same interference S, also represented by the red "X", the only case in which the presubtracted signal X_c is confined to our region of interest occurs when the "yellow" region is selected by C_s. Intuitively, since the other shaping regions (i.e., colors) can also be selected by C_s, it is not guaranteed that the shaping gain provided by this code is reached. As a result, the power control strategy discussed in the last section would not be effective in this case.
The following mapping, represented in Fig. 9, is optimized for scenarios where auxiliary bits are employed. In this case, the mapping is defined by (z_aux1 z_aux2 z_aux3 z_aux4 z_aux5 z_aux6 z_1 z_2 y^c_1 y^c_2). The same four shaping cosets are represented by the four different colors, and the 2^6 subregions (points inside each shaping coset) are spread over the expanded constellation.
The interference point is also represented by the red "X", which is presubtracted from the closest X_cc point to provide the transmitted signal X_c. It is worth noting that, considering the hard decision of the parallel branches for the C_s trellis, X_c is always inside the smallest dashed square, which is our continuous region of interest (modulo boundary of 4). From this design, we conclude the following:
• The shaping loss is partially recovered. The shaping gain γ_s is reached according to the code C_s. As a consequence, the power control technique is properly designed;
• The modulo loss can be totally mitigated. If the X_cc constellation is expanded enough that the interference is confined and the selection of boundary symbols is avoided, the modulo loss can be mitigated. However, we emphasize that, in some cases, a low modulo loss might be tolerated in order to reduce the transmitter complexity (this will be discussed in the next section, by analyzing the realistic scenario);
• The power loss. Considering this rate, this loss is evaluated by the approximation γ_p = 10 log10(16/15) ≈ 0.28 dB. As discussed in [22], this loss represents the relation between continuous (the DPC case) and discrete constellation transmission.
In summary, the previous discussion showed that, with a proper expansion of the DPC constellation, the degradation can be kept within 0.3 dB of the corresponding reference trellis shaping in the AWGN channel. On the other hand, the system complexity increases, which is a design drawback. It is important to emphasize that the PU maintains the same performance as in the absence of the CU operation.
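The two power-loss figures quoted for the continuous approximation can be checked numerically. The sketch below assumes the loss takes the form 10 log10(M/(M − 1)) for a region holding M constellation points, which is the form that reproduces both quoted values; the function name is illustrative only.

```python
import math


def power_loss_db(num_points: int) -> float:
    """Continuous-approximation power loss 10*log10(M / (M - 1)) in dB.

    Assumed form: it reproduces the 1024-QAM and 16-QAM values in the text.
    """
    return 10.0 * math.log10(num_points / (num_points - 1))


loss_1024 = power_loss_db(1024)  # first case: X_c spread over the 1024-QAM region
loss_16 = power_loss_db(16)      # second case: X_c confined to the 16-QAM region

print(f"1024-QAM power loss: {loss_1024:.4f} dB")
print(f"16-QAM power loss:   {loss_16:.2f} dB")
```

The smaller the region in which X_c is confined, the larger the gap between the discrete and continuous cases, which is why the 2 bits/symbol scheme pays about 0.28 dB while the high-rate scheme pays almost nothing.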
In this work, we investigate the constellation expansion in 4, 16, 64, and 256 regions, which corresponds to n_s = 2, 4, 6, 8 (where, for n_s > 2, the remaining n_s − 2 bits are auxiliary bits). It is important to point out that the minimum distance d_min, evaluated by the scaling method presented in Section 4.1, was implemented in all simulated schemes. Additionally, the impact of the modulo operation on the system is also considered.
As we employ R_cu = 2 bits/symbol, the transmitted signal should be confined within the region defined by the 16-QAM constellation (i.e., |Re(X_c)| and |Im(X_c)| < 2 d_min). With this in mind, we assume the previously discussed mapping II (see Fig. 9) for all considered schemes.
The following example, depicted in Fig. 10, clarifies the mapping strategy. In this case, the constellation is expanded from 16 to 64 regions, where the X_cc symbol is defined by (z_aux1, z_aux2, z_1, z_2, y^c_1, y^c_2). Consequently, the middle bits z_1 and z_2 are the shaping coset bits, which assign the shaping regions (represented by colors), and z_aux1 and z_aux2 select a point within each shaping coset, which defines the shaping subregions. Finally, the convolutional coded bits y^c_1 and y^c_2 select a point (represented by four distinct markers) inside each subregion.
Consider that the pair of bits (y^c_1, y^c_2) is mapped to (0, 0) by the channel trellis convolutional code C_c, represented in the figure by the corresponding marker. In this case, Fig. 11 illustrates the Viterbi decoder of the shaping operation located at the transmitter. In this example, the trellis transition from the state σ_j^n to the state σ_j^{n+1} corresponds to constellation points where z_1 = 0 and z_2 = 0 (yellow regions). Thus, the decoder performs a hard decision among the four constellation symbols (0, 16, 32, 48), representing the points within the assigned shaping coset, each defined by the pair of bits (z_aux1, z_aux2). In the same way, the transition from the state σ_j^n to σ_{j+1}^{n+1} corresponds to constellation points where z_1 = 0 and z_2 = 1 (green regions), and the decision is made among the parallel transitions (4, 20, 36, 52). We notice that, thanks to this mapping strategy, regardless of the interference S^n, the presubtracted signal X_c^n is confined within the 16-QAM region, ensuring that the power control is established. In our analyzed scenarios, when required to mitigate the modulo loss, further expansions are implemented following the same procedure.
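The symbol indices of the parallel transitions in this example can be reproduced with a short sketch. It assumes that the label (z_aux1, z_aux2, z_1, z_2, y^c_1, y^c_2) is read as a natural binary number with z_aux1 as the most significant bit; this indexing convention is an assumption, chosen because it matches the indices listed in the example. The function names are illustrative.

```python
from itertools import product


def label_to_index(z_aux1, z_aux2, z1, z2, y1, y2):
    """Read the 6-bit label as an integer, most significant bit first.

    Assumption: natural binary indexing with z_aux1 as the MSB.
    """
    index = 0
    for bit in (z_aux1, z_aux2, z1, z2, y1, y2):
        index = (index << 1) | bit
    return index


def parallel_transitions(z1, z2, y1, y2):
    """Candidate symbols for one trellis branch: all (z_aux1, z_aux2) pairs."""
    return [
        label_to_index(a1, a2, z1, z2, y1, y2)
        for a1, a2 in product((0, 1), repeat=2)
    ]


# Coded bits fixed to (0, 0) by C_c, as in the example.
yellow = parallel_transitions(z1=0, z2=0, y1=0, y2=0)
green = parallel_transitions(z1=0, z2=1, y1=0, y2=0)

print("z1 z2 = 00:", yellow)  # [0, 16, 32, 48]
print("z1 z2 = 01:", green)   # [4, 20, 36, 52]
```

Each branch of the shaping trellis fixes (z_1, z_2), so the hard decision among the four candidates is what lets the auxiliary bits follow the interference while the shaping code stays unchanged.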

Practical system analysis
This section points out some practical issues concerning the application of the presented techniques to the satellite scenario.
As a matter of system engineering, the CU payload could be designed either as a standalone system (implemented by a dedicated transmission chain and antenna) or as a shared transmitter (using the same transponder and antenna as the PU). In the latter configuration, more caution is required when both signals are fed to the same high power amplifier (HPA). In fact, this practice should be avoided, since it may induce higher non-linear distortions, particularly in terms of AM/AM and AM/PM conversions [39].
Moreover, at the receiver side (PU and CU), two scenarios could be encountered: (i) the deployment of geographically separated receiving sites for each user, which reduces the interference owing to the attenuation of the interfering paths, or (ii) the use of the same receiving station with two dedicated demodulators and decoders. In this last case, the attenuations of the interfering and direct paths are the same. This increases the interference on both links and, as a consequence, requires a higher value of the superposition factor α (consequently reducing the secondary service data rate).
Looking more closely at the techniques described, we point out that, due to the superposition, the bit rate of the secondary service might be very low with respect to the primary. However, this generates two implementation problems: (i) in the DPC presubtraction technique, the same symbol rate is assumed for both signals in order to compute Eq. (9), and (ii) in the superposition technique, the interference generated by the CU signal would appear as spikes in the PU bandwidth, which makes the usual interference model unrealistic in this case. In order to avoid both constraints, the chirp spread spectrum technique [31] could be implemented at the CU transmitter. In this way, the DPC encoder can correctly perform its operation and the CU receiver can demodulate at a more flexible transmitted data rate.
To improve the overall system performance, channel estimation could be performed at the terminal side and reported through a feedback link, for instance, according to the DVB-S2X standard [6]. With these features, the superposition factor α, which depends directly on the channel conditions, as well as λ, which depends on the SNR, can be periodically updated, changing the achievable secondary service data rate and, as a consequence, optimizing the CU performance.

Results and discussion
In order to investigate the system feasibility, we adopted a scenario where a CubeSat at an altitude of 600 km, with the same orbital parameters as [40] and using COTS parts, transmits both signals (primary and cognitive) from the same satellite antenna towards a single earth station, which is equipped with two dedicated demodulators. Consequently, the channel attenuations are the same and are denoted |h|. In this study, only the downlink is considered.

Primary service transmission analysis
The main specifications for the PU signal are an output power of 1 W [27], an operating frequency of 2200 MHz (downlink band assigned to the Earth exploration satellite service), a bit rate of 3.4 Mbps, a target BER of 10^−5, and coded QPSK modulation with FEC (R = 1/2). Table 1 presents the link budget of the PU without the addition of the secondary service.
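As a rough orientation for the dominant term of such a link budget, the sketch below computes the free-space path loss at the stated carrier frequency. It assumes the satellite is at zenith, so that the slant range equals the 600 km altitude; the link budget in Table 1 may use a longer slant range and includes further terms (antenna gains, atmospheric and demodulation losses) that are not modeled here. All names are illustrative.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def free_space_path_loss_db(distance_m: float, frequency_hz: float) -> float:
    """Free-space path loss 20*log10(4*pi*d / wavelength) in dB."""
    wavelength_m = SPEED_OF_LIGHT_M_S / frequency_hz
    return 20.0 * math.log10(4.0 * math.pi * distance_m / wavelength_m)


def watts_to_dbm(power_w: float) -> float:
    """Convert a power in watts to dBm."""
    return 10.0 * math.log10(power_w * 1000.0)


# Assumption: zenith pass, slant range equal to the orbital altitude.
fspl_zenith_db = free_space_path_loss_db(distance_m=600e3, frequency_hz=2200e6)
tx_power_dbm = watts_to_dbm(1.0)

print(f"PU transmit power:       {tx_power_dbm:.1f} dBm")
print(f"Path loss at zenith:     {fspl_zenith_db:.1f} dB")
```

Under this best-case geometry the path loss is about 155 dB; at low elevation angles the slant range, and therefore the loss, is larger.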
It is worth noting that a conservative margin for demodulation losses of 6 dB is assumed in order to cover the impairments of the communication chain. The overall link margin is about 3.5 dB, as required by the targeted BER.
The principle behind this design strategy is to use part of the power remaining in this margin to transmit the CU signal. Therefore, 900 mW were allocated to the PU transmission (which still maintains the recommended link margin of 3 dB) and 100 mW to the CU. Figure 12 presents the overlay model for this use case. The transmitted and received powers are obtained from realistic link budget parameters, computed according to Table 1. Considering all parameters, the CU performs the superposition strategy and, by Eq. (5), the factor α = 0.85 is obtained. This value guarantees a SINR of 16 dB at the PU. We highlight that, thanks to the superposition strategy and the power control design of the DPC encoder (see results in [25]), the PU maintains the same performance as in the absence of the CU interference.
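The power split described above can be followed step by step in the sketch below. It assumes that the link margin changes dB for dB with the PU transmit power and that the fraction α of the CU power supports the PU signal while the remaining 1 − α carries the CU's own signal; the variable names are illustrative and are not taken from the link budget tables.

```python
import math


def mw_to_dbm(power_mw: float) -> float:
    """Convert a power in milliwatts to dBm."""
    return 10.0 * math.log10(power_mw)


TOTAL_POWER_MW = 1000.0   # original 1 W PU specification
PU_POWER_MW = 900.0       # share kept for the primary service
CU_POWER_MW = 100.0       # share handed to the cognitive user
ALPHA = 0.85              # superposition factor from Eq. (5)
LINK_MARGIN_DB = 3.5      # overall PU margin before the split

# Assumption: the margin drops by exactly the PU power back-off.
pu_backoff_db = 10.0 * math.log10(PU_POWER_MW / TOTAL_POWER_MW)
margin_after_split_db = LINK_MARGIN_DB + pu_backoff_db

cu_relay_mw = ALPHA * CU_POWER_MW        # spent on superposing the PU signal
cu_own_mw = (1.0 - ALPHA) * CU_POWER_MW  # left for the secondary service

print(f"PU back-off:          {pu_backoff_db:.2f} dB")
print(f"Margin after split:   {margin_after_split_db:.2f} dB")
print(f"PU power:             {mw_to_dbm(PU_POWER_MW):.2f} dBm")
print(f"CU power (relay):     {cu_relay_mw:.0f} mW")
print(f"CU power (own):       {cu_own_mw:.0f} mW = {mw_to_dbm(cu_own_mw):.2f} dBm")
```

Under these assumptions the back-off costs less than 0.5 dB, so the PU margin stays just above the recommended 3 dB, and only 15 mW remain for the secondary signal itself.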

Secondary service transmission analysis
From this point, a simulation of the CU link is performed considering the whole CU channel interference (see Eq. (8)), which is composed of the PU signal and the share of the CU power used in the superposition.

CU results and analysis
The system performance is analyzed for the scenario depicted in Fig. 12 using both qualitative and quantitative approaches. The main objective is to provide a comparative study of the different expansions of X_cc^n, emphasizing the trade-off between power efficiency and complexity. Figure 13 presents the main results.
1. Qualitative analysis: In the scatter plots, the expanded constellation signal X_cc^n is shown as green "x" markers. The Gaussian distributed version of the scaled interference λS^n is superposed as red points, and the transmitted signal X_c^n is shown as blue dots. Additionally, we depict the histogram of X_cc^n (resp. Fig. 13c, d) in order to study the probability of the boundary symbols, since it determines the degradation caused by the modulo loss. In line with our previous definition and in contrast to [22] and [23], we use the modulo operation at the transmitter output, and the shaping metric takes it into account (see Eq. (10)). This design assumption ensures that the transmitted signal X_c^n is confined within the expanded constellation, independently of the channel interference power, guaranteeing the condition E[|X_c^n|^2] ≤ (1 − α)P_c, which is a DPC requirement. With this design, in clear reference to the THP scheme, three important consequences are highlighted concerning its individual losses:
• Shaping loss: When the shaping metric takes the modulo operation into account (see Eq. (10)), the shaping gain corresponding to C_s is also reached within the continuous region delimited by the modulo boundary. It is worth noting that the same shaping gain is achieved independently of the channel interference power. This follows directly from the modulo operation, from which we conclude that the shaping minimization is the same whether or not λS^n is confined within the modulo region. This can be observed in the blue dots of the scatter plot, which yield a shaping gain of 0.97 dB for all scenarios, according to the implemented shaping code C_s.
Another interesting point concerning the shaping operation is that, when the transmitted signal X_c^n is Gaussian distributed thanks to the shaping operation, the usual decoding method based on the minimum Euclidean distance is optimum [22]. This follows directly from the Gaussian distribution of the effective noise N_eff, given by Eq. (11).
• Modulo loss: Based on the previous observation, boundary points should be avoided. We observe in Fig. 13c that, for 16-QAM, the X_cc^n constellation is uniformly distributed (i.e., the so-called n-cube distribution). We also notice that the histogram of 1024-QAM, presented in Fig. 13d, shows the occurrence of boundary constellation points decreasing as further expansion is considered.
• Power loss: The degradation is the same as that evaluated for 16-QAM (in this case, 0.28 dB), as discussed in Section 4.3.
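The confinement property used in the list above can be illustrated with a small numerical experiment. The sketch below is not the trellis precoder of this paper: it applies a plain symmetric modulo to one fixed, hypothetical constellation point minus a strong random interference, and checks two things. First, the output never leaves the modulo square, whatever the interference power. Second, folding the interference into the modulo region beforehand gives the same output, which is one assumed reading of why the shaping minimization does not depend on whether λS^n is confined. The boundary value of 4 and all names are assumptions for illustration.

```python
import random


def modulo_centered(value: float, boundary: float) -> float:
    """Fold a real value into the interval [-boundary/2, boundary/2]."""
    return (value + boundary / 2.0) % boundary - boundary / 2.0


def complex_modulo(z: complex, boundary: float) -> complex:
    """Apply the symmetric modulo separately to real and imaginary parts."""
    return complex(modulo_centered(z.real, boundary),
                   modulo_centered(z.imag, boundary))


random.seed(0)
BOUNDARY = 4.0               # assumed modulo boundary (16-QAM region, unit spacing)
point = complex(0.5, -1.5)   # hypothetical constellation point

max_abs_component = 0.0
max_identity_gap = 0.0
for _ in range(10_000):
    # Interference far stronger than the size of the modulo region.
    s = complex(random.gauss(0.0, 10.0), random.gauss(0.0, 10.0))
    direct = complex_modulo(point - s, BOUNDARY)
    folded_first = complex_modulo(point - complex_modulo(s, BOUNDARY), BOUNDARY)
    max_abs_component = max(max_abs_component, abs(direct.real), abs(direct.imag))
    # Compare modulo the boundary so rounding at the edge does not matter.
    gap = complex_modulo(direct - folded_first, BOUNDARY)
    max_identity_gap = max(max_identity_gap, abs(gap))

print(f"largest |Re| or |Im| after modulo: {max_abs_component:.3f}")
print(f"largest mismatch between the two orders: {max_identity_gap:.2e}")
```

The transmitted power is therefore bounded by the size of the modulo region alone, at the price of the modulo loss incurred whenever a boundary symbol is selected.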
2. Quantitative analysis: Following this qualitative investigation, the CU BER is shown for quantitative performance evaluation. Figure 13e presents the BER curves for this scenario. We observe that the modulo loss is significantly high for the 16-QAM, 64-QAM, and 256-QAM schemes. However, when 1024-QAM is considered, the degradation only becomes significant above 5 dB of SNR. This can be attributed to the increase of λ as a function of SNR (see Eq. (13)), which results in more modulo loss degradation. When comparing the minimum and maximum X_cc^n constellation expansions (resp. 16-QAM and 1024-QAM), we observe a gain of around 4 dB at a BER of 10^−3 and 1.5 dB at a BER of 10^−5. This defines the minimum (16-QAM scheme) and maximum (1024-QAM scheme) supported bit rates. In all schemes, the transmitted signal is confined within the square region defined by a modulo boundary of 4 d_min (i.e., |Re(X_c)| and |Im(X_c)| are both less than half the modulo boundary). Thus, the transmitted power (i.e., E[|X_c^n|^2]) as well as the peak-to-average power ratio (PAPR) remain of the same order, regardless of the constellation expansion. Finally, the maintenance of the PU service is verified by confirming that the exact specified SINR_P (see Eq. (4)) is obtained at the primary receiver, thanks to the superposition strategy.

CU link budget
Using the link parameters and the CU BER curve presented in Fig. 13e, we can now evaluate the link budget, presented in Table 2. It is important to note that, owing to the superposition, only 15% of the power originally allocated to the CU is used for its own transmission. However, despite the low received power, the sensitivity of the receiver is in line with the specifications usually adopted for small satellite links (i.e., a sensitivity threshold of −118 dBm [41]). We emphasize that all conservative margins are still considered in order to guarantee the performance.
It is worth noting that the different M-QAM schemes present a peak-to-average power ratio (PAPR) of the same order of magnitude (as presented in Table 2). Also, further expansion directly improves the BER performance, even when the total interference is not confined inside the expanded constellation (as in this case). As pointed out in [24], for practical purposes, the SNR here does not account for the effective noise defined in Eq. (11). As an important result of this feasibility analysis, we obtain a minimum supported bit rate of 16 and 28 kbps (for a BER of 10^−5 and 10^−3, respectively) and a maximum of 22 and 74 kbps (for a BER of 10^−5 and 10^−3, respectively) for the secondary service. This range is suitable for most M2M applications.
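As a plausibility check, the spread between the minimum and maximum bit rates can be compared with the SNR gains reported in the quantitative analysis. The sketch below assumes that, at fixed received power and noise density, the supported bit rate scales inversely with the required SNR, so that a rate ratio translates directly into a gain in dB. This simplified relation is an assumption and is not taken from Table 2.

```python
import math

# Bit rates quoted in the text, in kbps, keyed by target BER.
MIN_RATE_KBPS = {"1e-5": 16.0, "1e-3": 28.0}   # 16-QAM scheme
MAX_RATE_KBPS = {"1e-5": 22.0, "1e-3": 74.0}   # 1024-QAM scheme

implied_gain_db = {}
for ber in MIN_RATE_KBPS:
    ratio = MAX_RATE_KBPS[ber] / MIN_RATE_KBPS[ber]
    # Assumption: the rate ratio equals the SNR gain in linear scale.
    implied_gain_db[ber] = 10.0 * math.log10(ratio)
    print(f"BER {ber}: rate ratio {ratio:.2f} -> {implied_gain_db[ber]:.2f} dB")
```

The implied values, roughly 1.4 dB at a BER of 10^−5 and 4.2 dB at 10^−3, are close to the gains of about 1.5 dB and 4 dB read from the BER curves, which suggests that the quoted rate range is consistent with those curves.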

Conclusion
The motivation behind this paper concerns the general trend to reconsider the regulation procedure for spectrum allocation, as well as to stimulate the development of techniques which enable the coexistence of different networks. In this view, the terrestrial cognitive radio techniques have also been employed in the satellite communication context.
A step forward is presented here, where the coexistence between primary and cognitive users is investigated using the overlay paradigm. In this paradigm, the cognitive user (CU) has noncausal knowledge of the message and encoding strategy of the primary user (PU). Under this assumption, the optimum CU encoding strategy consists of the superposition and dirty paper coding (DPC) techniques. Given that the PU should operate as in the absence of interference, with no performance degradation, we first propose a scheme based on the trellis shaped DPC encoder for the CU. Some design improvements are discussed in order to overcome the so-called precoding losses. Techniques are detailed for both the encoder and decoder, taking into account the satellite scenario and the trade-off between power efficiency and complexity.
Thereafter, this paper investigates the feasibility of a low data rate secondary service transmission over a primary user infrastructure. A realistic scenario is presented, and the previously discussed techniques are implemented to resolve the interference on both links. As a result, we obtained the same performance for the PU as in the absence of the CU operation (AWGN channel). Also, concerning the feasibility analysis, we obtain a minimum supported bit rate of 16 and 28 kbps (for a BER of 10^−5 and 10^−3, respectively) and a maximum of 22 and 74 kbps (for a BER of 10^−5 and 10^−3, respectively) for the secondary service, where this range varies according to the implemented scheme. We emphasize that this achievable bit rate is suitable for most M2M applications. As a drawback, we point out an increase in the output power of the satellite due to the signal correlation inherent in the superposition technique. In the described scenario, the total transmitted power is about 32 dBm instead of the previously specified 30 dBm. In this case, the transmitting antenna should be properly designed to handle this output power.
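One way to see where a total of about 32 dBm can come from is sketched below. It assumes that the part of the CU power used for superposition (α = 0.85 of 100 mW) is fully coherent with the 900 mW PU signal, so that amplitudes rather than powers add, while the CU's own signal is uncorrelated and adds in power. This coherent-sum model is an assumption made here for illustration and is not necessarily the computation used to obtain the quoted figure.

```python
import math


def mw_to_dbm(power_mw: float) -> float:
    """Convert a power in milliwatts to dBm."""
    return 10.0 * math.log10(power_mw)


PU_POWER_MW = 900.0
CU_POWER_MW = 100.0
ALPHA = 0.85

relay_mw = ALPHA * CU_POWER_MW        # CU power carrying the PU signal
own_mw = (1.0 - ALPHA) * CU_POWER_MW  # CU power carrying its own signal

# Assumption: PU and relayed signals are in phase, so amplitudes add.
coherent_mw = (math.sqrt(PU_POWER_MW) + math.sqrt(relay_mw)) ** 2
total_correlated_mw = coherent_mw + own_mw

# Reference: all three components uncorrelated, so powers add.
total_uncorrelated_mw = PU_POWER_MW + relay_mw + own_mw

print(f"uncorrelated sum: {mw_to_dbm(total_uncorrelated_mw):.1f} dBm")
print(f"correlated sum:   {mw_to_dbm(total_correlated_mw):.1f} dBm")
```

Under this model the output rises from 30 dBm to roughly 31.9 dBm, in line with the increase reported above.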
Finally, we recommend further research on the following topics: (i) an investigation of the feasibility of a secondary service transmission in a GEO multibeam satellite scenario; (ii) the development of a proof of concept by means of an SDR implementation; (iii) the analysis of the effects of satellite impairments on DPC schemes; and (iv) the control of the PAPR through the shaping operation, particularly for satellite communication.