Linear and nonlinear techniques for multibeam joint processing in satellite communications

Existing satellite communication standards, such as DVB-S2, operate under highly efficient adaptive coding and modulation schemes and have thus made significant progress in improving the spectral efficiency of digital satellite broadcast systems. However, the constantly increasing demand for broadband and interactive satellite links creates the need for novel interference mitigation techniques, striving towards Terabit throughput. In this direction, the objective of the present contribution is to investigate joint multiuser processing techniques for multibeam satellite systems. In the forward link, the performance of linear precoding is investigated with optimal nonlinear precoding (i.e., dirty paper coding) acting as the upper performance limit. To this end, the resulting power and precoder design problems are approached through optimization methods. Similarly, in the return link the concept of linear filtering (i.e., linear minimum mean square error) is studied with the optimal successive interference cancelation acting as the performance limit. The derived capacity curves for both scenarios are compared to conventional satellite systems, where beams are processed independently and interbeam interference is mitigated through a four color frequency reuse scheme, in order to quantify the potential gain of the proposed techniques.


Introduction
Current satellite systems, following the cellular paradigm, employ multiple antennas (i.e., multiple onboard antenna feeds) to divide the coverage area into small beams (spotbeams). To limit interbeam interference, these multibeam satellite communication (SatCom) systems spatially separate beams that share the same bandwidth. This multibeam architecture allows for a significant boost in capacity by reusing the available spectrum several times within the coverage area, especially in the Ka-band. As a result, the capacity of current satellite systems can well exceed 100 Gbps with state-of-the-art architectures [1]. A large number of recent satellite system procurements have clearly confirmed the trend towards multibeam satellite systems as the reference architecture for broadband systems. Examples include systems such as Wildblue-1 and Anik F2 (66 Ka-band spot beams), Kasat (82 Ka-band spot beams) and recently Viasat-1 (72 spot beams in Ka-band) for mainly fixed two-way (i.e., interactive) broadband applications, as well as the GlobalExpress system designed for a new generation of mobile services in Ka-band. Interactive services, in particular, benefit from these architectures since a finer partitioning of the coverage area allows for parallel data stream transmissions.
Despite the achievements of current SatComs, existing systems are far from the future goal of Terabit capacity. The two main obstacles towards the Terabit satellite are the interference internal to the system (i.e., intrasystem or interbeam interference) and the overwhelming number of spotbeams needed to achieve Terabit throughput. To alleviate these performance constraints, novel techniques need to be explored.
Terrestrial systems have introduced the paradigm of multicell joint processing to mitigate interference and boost system capacity. According to this paradigm, user signals received in the uplink channel by neighboring base station (BS) antennas are jointly decoded in order to mitigate intercell interference. Similarly, user signals in the downlink channel are jointly precoded before being transmitted by neighboring BS antennas for the same purpose. However, one of the practical obstacles in joint processing implementation is the need for a backhaul network that enables this form of cooperation amongst neighboring BSs.
The interference limited nature of the multibeam satellite channel is a commonality between SatCom and terrestrial systems. Also, considering the architecture of multibeam SatCom networks, a small number of ground stations is responsible for processing the transmitted and received signals that correspond to a vast coverage area. This characteristic simplifies the application of joint processing techniques. In this context, the application of multibeam joint processing in SatCom systems is investigated in the present contribution. The main purpose is to provide an overview of the performance of such techniques for the forward link (FL) and the return link (RL) in specific realistic scenarios, as well as to quantify the potential gain of such techniques by using the throughput performance of conventional frequency reuse schemes as a benchmark.
The rest of this article is structured as follows. An overview of related work is presented in Section 2. In Section 3, the capacity performance of multibeam joint processing is examined, focusing on the FL of fixed services. In Section 4, the RL of a satellite system that serves mobile users and jointly decodes all the received signals is investigated. Finally, in Section 5, the capacity performance is quantified through numerical simulations and compared to the performance of conventional systems, while Section 6 concludes the article.

Notation
Throughout the formulations of this article, $\mathcal{E}[\cdot]$, $(\cdot)^\dagger$, $(\cdot)^T$, $\odot$ and $\otimes$ denote the expectation, the conjugate transpose matrix, the transpose matrix, the Hadamard product and the Kronecker product operations, respectively. The Frobenius norm of a matrix or vector is denoted by $\|\cdot\|$. $\mathbf{I}_n$ denotes an $n \times n$ identity matrix, $\mathbb{I}_{n \times m}$ an $n \times m$ matrix of ones, $\mathbf{1}_n$ an $n \times 1$ vector of ones, $\mathbf{0}$ a zero matrix and $\mathbf{G}_{n \times m}$ an $n \times m$ Gaussian matrix.
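As a small illustration of the two matrix products named in the notation, the following sketch contrasts the elementwise Hadamard product with the block-structured Kronecker product. It uses plain Python nested lists, and the helper names `hadamard` and `kronecker` are introduced here for illustration only.

```python
def hadamard(A, B):
    """Elementwise product of two matrices of identical size."""
    return [[a * b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def kronecker(A, B):
    """Kronecker product: every entry a_ij of A is replaced by the block a_ij * B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
print(hadamard(A, B))   # same size as A and B
print(kronecker(A, B))  # 4 x 4 block matrix
```

The Hadamard product keeps the matrix dimensions, whereas the Kronecker product of an n × m and a p × q matrix has dimensions np × mq.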

Joint processing techniques
This section provides a review of related work in terms of multiuser multiple input multiple output (MU-MIMO) and multibeam processing techniques. Optimal nonlinear as well as suboptimal linear techniques have been investigated in the existing literature. In general, nonlinear techniques achieve channel capacity but the induced complexity comes with high implementation costs. Thus, reduced complexity linear techniques that provide sufficient performance can be employed.
Starting from recent advances in information and communication theory, an overview of the existing literature on multibeam processing is provided, before highlighting the contributions of this article.

Multiuser joint processing techniques
Joint processing converts the interference channels of the FL and the RL of a multi-antenna system into a MIMO broadcast channel (BC) and a MIMO multiple access channel (MAC), respectively. The state of the art on the transmitter and receiver architectures for the two channels follows.

Transmitter architectures: MIMO BC
In MU MIMO communications, the capacity of the MIMO BC can be achieved by dirty paper coding (DPC), as shown in [2]. DPC allows for the cancelation of the interference from the previously, serially encoded users, so that these cause no interference to the following users. However, the implementation complexity of DPC has led to the investigation of linear precoding techniques with reduced complexity, such as zero forcing (ZF) and regularized zero forcing (R-ZF). In these techniques, all users can be encoded in parallel with the precoding vectors. In terms of performance, ZF cancels multiuser interference and is thus suitable for the high signal to noise ratio (SNR) regime [3]. On the other hand, R-ZF techniques also take into account the noise variance, which makes them suitable for any SNR [4]. The main disadvantage of linear techniques, however, is that the number of simultaneously served single antenna users can be at most equal to the total number of transmit antennas.
More recently, several multi-cell processing methods for the downlink of terrestrial systems were devised in [5][6][7]. In particular, assuming data sharing, the authors in [5] studied the design of transmit beamforming by recasting the downlink beamforming problem into a least minimum mean-square-error estimation (MMSE) problem. However, the required signalling between the BSs is too high and global convergence is not guaranteed. Later, in [6], a distributed design in Time-Division-Duplex (TDD) systems was proposed, using only local channel state information (CSI) and demonstrating that performance close to the Pareto bound can be obtained. However, the main issue with [5,6] is that both require data sharing between the BSs. Hence, their use with limited backhaul throughput is prohibited. Finally, in [7], distributed multicell processing without data or CSI sharing was proposed, but with the requirement for moderate control signalling among BSs.

Receiver architectures: MIMO MAC
With respect to the MIMO MAC, MMSE filtering followed by successive interference cancelation (SIC) performed at the receive side is proven to be the sum-rate capacity achieving strategy [8][9][10]. The reduced-complexity linear minimum mean square error (LMMSE) receiver [11,12] aims at minimizing the square error between the transmitted and the detected signal with the use of MMSE filters. The outputs of the filters are subsequently fed into conventional single-user decoders. The main limitation of the LMMSE receiver is that the number of users that can be effectively filtered is limited by the rank of the channel matrix, namely the total number of receive antennas in the system.
Multicell joint decoding was first introduced in [13,14]. Since then, the initial results have been extended to more practical propagation environments, transmission techniques and backhaul infrastructures in an attempt to better quantify the performance gain. More specifically, it was demonstrated in [9] that fading promotes multiuser diversity, which is beneficial for the ergodic capacity performance. Following that, realistic path-loss models and user distributions were investigated in [15,16], where closed-form capacity expressions based on the cell size, path loss exponent and user spatial probability density function (p.d.f.) were provided. The beneficial effect of MIMO links was established in [17,18], where a linear scaling with the number of BS antennas was proven. However, correlation between multiple antennas has an adverse effect, as shown in [19], especially when correlation affects the BS-side. Imperfect backhaul connectivity also has a negative effect on the capacity performance, as quantified in [20]. Finally, limited or partial CSI availability will result in degraded performance, as proven in [6,21,22]. The topic of CSI will be further discussed in Section 2.4.

Joint processing in SatComs
A multibeam satellite operates over an interference limited channel, for which the optimal communication strategy in general is not yet known [8,23,24,25]. Hence, orthogonalization in the frequency and polarization domain is used to limit interbeam interference. However, the concept of multibeam joint processing can be applied and the system can benefit from reusing the full frequency band in all beams.

Multibeam joint processing in the FL
In the context of SatComs, multibeam joint processing scenarios have been studied in various settings. Specifically, the FL case has been examined in [26][27][28][29][30][31][32][33]. Various characteristics of the multibeam satellite channel were taken into account, such as beam gain [28,29,34], rain fading [30], interference matrix [29] and correlated attenuation areas [28]. Joint processing studies concerning the FL of SatCom systems usually assume fixed users. This assumption originates from the difficulties in acquiring reliable and up-to-date CSI for the FL of satellite systems. During the CSI acquisition process, the pilot signals need to be broadcast to the users and then fed back to the transmitter, thus doubling the effect of the long propagation delay of the satellite channel and rendering the acquired CSI outdated.
Consequently, the adoption of the slow fading channel of the fixed satellite services (FSS) partially alleviates this obstacle, since CSI needs to be updated less frequently.
In terms of precoding techniques, Tomlinson Harashima precoding (THP) was studied in [29,34], while linear precoding schemes such as ZF and R-ZF were evaluated in [30,34]. Furthermore, authors in [31] have investigated generic linear precoding algorithms under realistic power constraints for single and dual polarized satellite channels. The effect of flexible power constraints rising from flexible and multiport amplifiers has been evaluated in [33] and an energy efficient scheme for MMSE beamforming was proposed in [32]. Finally, authors in [28] have considered an Opportunistic Beamforming (OB) technique based on a codebook of orthonormal precoders and low-rate feedback.
In the present article, linear R-ZF and nonlinear DPC techniques are considered, while optimization methods are employed to deduce the best power allocation and precoder design, with the aim of maximizing system throughput. In contrast to the existing literature, a per-beam power constraint (i.e., individual amplifier per beam) is considered instead of the commonly assumed, less realistic sum-power constraint (i.e., total on board power can be allocated in one beam).

Multibeam joint processing in the RL
First attempts to study multibeam joint processing in the RL, hereafter referred to as multibeam joint decoding, have been carried out in [35][36][37]. The RL of a satellite system employing multibeam joint decoding was studied via simulations in [37] from a system point of view, where MMSE and optimal multiuser receivers were considered on the basis of a simplistic channel model, demonstrating a considerable improvement in both availability and throughput. The first analytic investigation of the uplink capacity of a multibeam satellite system was done in [35], where closed-form expressions were derived for the capacity of multibeam Rician channels. Asymptotic analysis methods for the eigenvalues of the channel matrix were used in [38] to determine upper bounds for the ergodic capacity and calculate the outage probability of a MIMO land mobile satellite (LMS) channel, which is represented by Rician fading with a random line-of-sight (LoS) component. Similarly, in [39] the statistics of minimum and maximum eigenvalues were derived for Rician fading with Gamma distributed LoS component. Finally, it should be noted that a multiuser decoding algorithm was presented in [36].

Practical constraints in the system design level
Despite the throughput enhancement that cooperative techniques can provide in satellite networks, as will be shown in the following, several issues arise with the adoption of these techniques in SatComs and need to be addressed.
Firstly, multibeam satellite systems with a high number of beams need to employ multiple gateways (GWs). The future of broadband SatComs is without any doubt connected with multibeam satellites. As throughput demand increases, the number of beams needs to be increased so that the same spectrum segments can be reused in spatially separated beams. Due to feeder link limitations, a single gateway cannot accommodate the total number of employed beams, which makes multiple GWs necessary to serve large multibeam systems. In the present publication, to perform multibeam joint processing for both the FL and the RL, a centralized precoder and decoder, respectively, is assumed. This theoretical assumption can be supported by a real system implementation via two approaches. One solution would be the exploitation of higher frequency bands for the feeder link (optical feeder links), assuming that such a system can be practically employed. As a result, a single gateway could serve the multibeam satellite system. Alternatively, the multiple GWs can be fully interconnected so that they all share the same information (CSI and data). The second approach is easier to implement if we consider the bandwidth capabilities of broadband cable networks. Of course, the added delay is an issue to be considered, especially in the SatCom context where delay is already a major issue. Both approaches thus support the simplifying assumption of a cooperative system that utilizes a central precoder/decoder. Additionally, although this contribution does not tackle the subject of decentralized precoding/decoding, works in the existing literature examine the performance degradation caused by decentralized precoder designs, for the case where full gateway interconnection cannot be assumed.
An example of such an approach for multibeam satellite systems can be found in [40] where the level of cooperation amongst GWs is examined and the most promising technique is shown to be partial data and CSI exchange among the interconnected GWs.
A second major issue is the payload implications of the adoption of full frequency reuse. Multibeam joint processing can be classified in the more general category of multi-user detection (MUD) techniques. Interfering users are successively decoded/precoded, thus allowing for the subtraction of the known interfering signals. This alleviation of interference enables full frequency reuse in multibeam systems, allowing for a more aggressive exploitation of the available bandwidth and thus leading to higher spectral efficiency. Nevertheless, the added on-board complexity that results from the increase of the frequency reuse in a multibeam satellite system needs to be noted. More aggressive reuse of the spectrum translates into a proportional increase in the number of amplifiers accommodated in the satellite payload. Indeed, when advancing from a specific frequency reuse scheme (e.g., four color frequency reuse) to full frequency reuse, the number of on board high-power amplifiers (HPAs) needs to be increased (e.g., four times more HPAs) since each beam will occupy the whole bandwidth of the amplifier. Currently, this proves a heavy burden for the satellite payload, hence simpler approaches need to be investigated. In addition, a fairness issue arises in the comparison of multibeam joint decoding to conventional single beam decoding, since the former requires increased power and payload mass compared to the latter.
The above noted issues will not be further addressed in the present contribution but they will be part of the authors' future work. In the following, the achievable RL throughput by the means of MMSE filtering followed by nonlinear SIC and of Linear MMSE, is calculated through simulations. The novelty of this work is the consideration of the multibeam antenna pattern over a correlated Rician channel. Additionally, lognormal shadowing is incorporated in the channel model to investigate the effect of user mobility.

SatCom standards
The second generation of the digital video broadcasting over satellite standard (DVB-S2) is the latest generation standard for SatComs enabling broadband and interactive services via satellite [41]. It has been designed for broadcasting services (standard and high definition tv), Internet and professional services such as TV contribution links and digital satellite news gathering [42]. During the formulation of DVB-S2, three main concepts were carefully considered: (a) best transmission performance approaching the Shannon limit, (b) total flexibility and (c) reasonable receiver complexity [43]. High performance and low complexity iterative decoding schemes like Low Density Parity Check codes (LDPC) along with high order Amplitude and Phase Shift Keying (APSK) modulations were adopted for efficient operation over the nonlinear satellite channel in the quasi error free region. Compared to previous standards, the second generation standard attains 20-35% capacity increase or alternatively 2-2.5 dB more robust reception for the same spectrum efficiency by virtue of the advanced waveforms. Furthermore, to facilitate the provision of interactive services, the standard features operation under Adaptive Coding and Modulation (ACM) parameters. When used for interactive services, ACM allows optimization of the transmission parameters adaptive to varying path conditions [44,45]. Hence, resources are optimally exploited, since operation under a constant fading margin according to a worst case scenario design, is no longer necessary. Moreover, DVB-RCS NG is a next generation (NG) return channel over satellite, Very Small Aperture Terminal (VSAT) standard that has recently been approved by the DVB technical module as common physical layer standard within the RCS2 context. This standard improves the existing mature DVB-RCS standard by including state of the art channel coding and highly efficient, nonlinear modulation schemes. 
Hence, the efficiency and the flexibility of the return channel operational modes is enhanced. All of these standards have been devised for large multiuser satellite two-way systems and can be adapted to the proposed multibeam processing techniques.

CSI acquisition
Channel state information is one of the most important enablers for the application of the multibeam joint processing techniques. A specific part of the existing literature addresses the importance of CSI in MIMO systems, such as [21,46,47,48,49]. In the context of SatComs, channel knowledge can be acquired at the end-user ground stations for the FL and then fed back to the gateway station (GS) for the RL. More specifically, CSI should be available at the GS so that multiuser precoding can be performed for the FL and joint decoding at the RL. Current standards use pilot sequences, either available within the standard as an optional feature (i.e., for DVB-S2) or defined within the specified requirements for the transmission burst structure of the Multi-Frequency Time Division Multiple Access (MF-TDMA) return channel (i.e., for the DVB-RCS NG). A recent study of the effect of CSI in the satellite context can be found in [50], where the satellite link performance with and without CSI is compared and a technique for estimating CSI is proposed. Consequently, the current state of the art reference transmission standards are well suited for the adaptation of the proposed multibeam satellite systems.
In more detail, CSI is acquired by broadcasting pilot signals through the FL to all terminals, which in turn measure them and feed the quantized measurements back to the GS through the RL. In most cases, FL and RL operate in different frequency bands and thus the described process yields the FL CSI. Nevertheless, it is often assumed that the two links are reciprocal (especially if they are adjacent in frequency) and as a result the measured CSI can also be used for the RL. Furthermore, the CSI acquisition process in SatComs introduces a long delay which may result in outdated CSI. This complication is especially acute for the FL, where CSI is needed before transmission in order to calculate the precoding vectors. In the RL, CSI is only needed for decoding and therefore it can be transmitted by the terminals along with their data.
Based on this discussion, in the following sections we focus on fixed terminals for the FL (slow-varying channel) and mobile terminals for the RL. In the RL case, the joint decoding techniques can be applied either for fixed or mobile satellite services. As a matter of fact, the slow fading channel would even lead to simple practical implementation since CSI is easier to acquire. However, this distinction has been made in order to point out one main difference between the forward and the RL. In the RL, the channel estimates can be sent along with the transmitted data. This introduces much less delay when compared to FL case, where the pilot signal needs to be transmitted and fed back to the GW before the precoding matrix can be calculated, leading to approximately double time delay compared to the RL case. This substantial difference in the CSI acquisition procedure, leads to the definition of the specific scenarios. Added to that, the FSS case for the RL has been also studied by [35]. Nevertheless, the proposed analysis is straightforwardly applicable to FSS, by omitting the shadowing coefficients in the definition of the channel matrix. Finally, it should also be noted that the feeder link, i.e., the link between the gateway and the satellite, is considered ideal.
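The "approximately double" delay of FL CSI acquisition can be illustrated with a rough propagation-delay estimate. The sketch below is an approximation under stated assumptions: every ground-to-satellite hop is taken equal to the geostationary altitude of 35,786 km (the true slant range is longer), processing and framing delays are ignored, and the function names are hypothetical.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0
GEO_ALTITUDE_M = 35_786e3  # assumed hop length; real slant ranges are somewhat longer


def hop_delay_s(distance_m: float = GEO_ALTITUDE_M) -> float:
    """One-way propagation delay of a single ground-satellite hop."""
    return distance_m / SPEED_OF_LIGHT_M_S


def csi_delay_s(link: str) -> float:
    """Propagation delay until CSI is available at the gateway.

    'RL': estimates travel with the data, terminal -> satellite -> gateway (2 hops).
    'FL': pilots go gateway -> satellite -> terminal and the measurements are
          fed back terminal -> satellite -> gateway (4 hops).
    """
    hops = {"RL": 2, "FL": 4}[link]
    return hops * hop_delay_s()


print(f"single hop: {hop_delay_s() * 1e3:.1f} ms")
print(f"RL CSI:     {csi_delay_s('RL') * 1e3:.1f} ms")
print(f"FL CSI:     {csi_delay_s('FL') * 1e3:.1f} ms")
```

Under these assumptions the FL CSI is roughly half a second old before precoding can start, which is consistent with restricting the FL analysis to slowly varying channels.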

Capacity performance of multibeam joint processing: FL for fixed services
The first system example scenario discussed is a Ka-band multibeam scenario for fixed SatCom and high speed applications. These scenarios can include, for example, multiuser systems such as broadband internet access systems. In such scenarios, the FL is mainly the limiting factor of the overall system dimensioning. Thus, the proposed techniques for this FL scenario, with parameters given in Table 1, will be studied in the following section. The modeling of conventional systems is also included to assist in the evaluation of the potential gain of these techniques. The considered figure of merit is the average per user achievable throughput, namely the sum throughput for all beams divided by the number of users.

Channel model
One of the main differences between SatCom and terrestrial systems lies in the inherent characteristics of the channels they operate over. The most fundamental attributes of the satellite channel are the high LoS component of the signal and the multibeam antenna radiation pattern. Additionally, satellite systems operating in frequencies over 10 GHz are prone to atmospheric attenuation. Rain fading, in particular, is the dominant factor and will be taken into account in the course of our analysis. It is modeled via the latest empirical model proposed by the International Telecommunication Union, according to which the rain fading coefficient $\xi$ follows a lognormal distribution, i.e., $\ln \xi \sim \mathcal{N}(\mu, \sigma)$, where $\mu$ and $\sigma$ depend on the location of the receiver, the frequency of operation, the polarization and the elevation angle toward the satellite. The p.d.f. of a lognormal variable $\xi$ reads as
$$p(\xi) = \frac{1}{\xi \sigma \sqrt{2\pi}} \exp\left(-\frac{(\ln \xi - \mu)^2}{2\sigma^2}\right), \quad \xi > 0. \tag{1}$$
Variables $\mu$ (dB) and $\sigma$ (dB) are the mean and standard deviation of the variable's natural logarithm, respectively.
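The lognormal rain fading model can be explored numerically. The following minimal sketch evaluates the standard lognormal p.d.f. and draws samples with the standard library; the values of `mu` and `sigma` are hypothetical placeholders, since in practice they depend on location, frequency, polarization and elevation angle.

```python
import math
import random


def lognormal_pdf(xi: float, mu: float, sigma: float) -> float:
    """p.d.f. of a variable whose natural logarithm is N(mu, sigma^2)."""
    if xi <= 0.0:
        return 0.0
    z = (math.log(xi) - mu) / sigma
    return math.exp(-0.5 * z * z) / (xi * sigma * math.sqrt(2.0 * math.pi))


def sample_rain_fading(mu: float, sigma: float, n: int, seed: int = 0) -> list:
    """Draw n independent lognormal fading coefficients (one per beam/user)."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


mu, sigma = -2.6, 1.63  # hypothetical parameters, for illustration only
samples = sample_rain_fading(mu, sigma, 20000)
mean_log = sum(math.log(s) for s in samples) / len(samples)
print(f"empirical mean of ln(xi): {mean_log:.3f} (target {mu})")
```

Drawing one independent sample per user matches the assumption that users in different beams undergo independent fading.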
The corresponding $K \times 1$ rain fading coefficients from all antenna feeds towards a single terminal antenna are given in the following vector
$$\tilde{\mathbf{h}} = \xi^{-1/2} e^{-j\phi}\, \mathbf{1}_K, \tag{2}$$
where $\phi$ denotes a uniformly distributed phase. The phases from all antenna feeds are hard to differentiate and are assumed to be identical. This is because we consider a LoS environment and the satellite antenna feed spacing is not large enough compared to the communication distance [52].
Since rain attenuation is a slow fading process that exhibits spatial correlation over tens of kilometers [53], we assume that users in different beams undergo independent fading. In other words, we assume that each correlated area [28,53] cannot extend beyond the coverage of a single beam. This is a valid assumption if we consider that beam sizes are typically in the order of hundreds of kilometers. Moreover, the common assumption of user scheduling, according to which only one user per beam is served during a specific time slot, is adopted, thus rendering the fading coefficients amongst users independent.
The link gain matrix defines the average SINR of each user and it mainly depends on the satellite antenna beam pattern and the user position. Define one user's position based on the angle $\theta$ between the beam center and the receiver location with respect to the satellite, and let $\theta_{3\mathrm{dB}}$ be the 3-dB angle of the beam. Then the beam gain is approximated by [34]:
$$b(\theta) = b_{\max} \left( \frac{J_1(u)}{2u} + 36\, \frac{J_3(u)}{u^3} \right)^2, \tag{3}$$
where $b_{\max}$ is the gain at the beam center, $u = 2.07123 \sin\theta / \sin\theta_{3\mathrm{dB}}$, and $J_1$, $J_3$ are the Bessel functions of the first kind, of order one and three respectively. The $j$-th user corresponds to an off-axis angle $\theta$ with respect to the boresight of the $i$-th beam, where $\theta_i = 0°$.
where $\lambda$ is the wavelength and $d_0 \simeq 35{,}786$ km is the satellite altitude.
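The Bessel-function beam pattern can be evaluated without external libraries. The sketch below assumes the commonly used tapered-aperture form $b(\theta) = b_{\max}\,(J_1(u)/(2u) + 36 J_3(u)/u^3)^2$ with $u = 2.07123 \sin\theta / \sin\theta_{3\mathrm{dB}}$; the peak gain `b_max`, the function names and the power-series evaluation of the Bessel functions are illustrative choices, not taken from the source.

```python
import math


def bessel_j(order: int, x: float, terms: int = 40) -> float:
    """Bessel function of the first kind via its power series.

    Accurate for the moderate arguments (|x| up to roughly 20) met in beam patterns.
    """
    half = x / 2.0
    term = half ** order / math.factorial(order)
    total = term
    for m in range(1, terms):
        term *= -(half * half) / (m * (m + order))
        total += term
    return total


def beam_gain(theta_deg: float, theta_3db_deg: float, b_max: float = 1.0) -> float:
    """Beam gain at off-axis angle theta for a beam with 3-dB angle theta_3db."""
    u = 2.07123 * math.sin(math.radians(theta_deg)) / math.sin(math.radians(theta_3db_deg))
    if abs(u) < 1e-8:
        # limit u -> 0: J1(u)/(2u) -> 1/4 and 36*J3(u)/u^3 -> 3/4
        return b_max
    taper = bessel_j(1, u) / (2.0 * u) + 36.0 * bessel_j(3, u) / u ** 3
    return b_max * taper ** 2


for theta in (0.0, 0.1, 0.2, 0.4):
    print(f"theta = {theta:.1f} deg -> normalized gain {beam_gain(theta, 0.2):.4f}")
```

The constant 2.07123 is what makes the gain drop to half of its peak exactly at the 3-dB angle, which provides a convenient sanity check for any implementation.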
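Collecting one user's beam gain coefficients from all transmit antennas into the $K \times 1$ vector $\mathbf{b}$, the overall channel for that user can be expressed as
$$\mathbf{h} = \tilde{\mathbf{h}} \odot \mathbf{b}^{1/2}, \tag{4}$$
where $\tilde{\mathbf{h}}$ is the vector of rain fading coefficients and the square root is taken elementwise.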

Problem formulation
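Let us denote the complex signal intended for user $k$ as $s_k$, with $\mathcal{E}[|s_k|^2] = 1$. Before transmission, the signal is weighted by the beamforming vector $\sqrt{p_k}\,\mathbf{w}_k$, where $\mathbf{w}_k$ is a complex vector with $\|\mathbf{w}_k\| = 1$ and $p_k$ is the transmit power for the $k$-th user signal. The total transmit signal is given by
$$\mathbf{x} = \sum_{k=1}^{K} \sqrt{p_k}\,\mathbf{w}_k s_k. \tag{5}$$
The received signal at user $k$ is
$$y_k = \mathbf{h}_k^\dagger \mathbf{x} + n_k, \tag{6}$$
where $n_k$ is the independent and identically distributed (i.i.d.) zero-mean Gaussian random noise with power density $N_0$. Then the received SINR at the $k$-th user is
$$\mathrm{SINR}_k = \frac{p_k |\mathbf{h}_k^\dagger \mathbf{w}_k|^2}{\sum_{j \neq k} p_j |\mathbf{h}_k^\dagger \mathbf{w}_j|^2 + N_0 B_u}, \tag{7}$$
where $B_u$ is the total user-link bandwidth. The achievable average per user throughput is then expressed as
$$C = \frac{B_u}{K} \sum_{k=1}^{K} \log_2\left(1 + \mathrm{SINR}_k\right). \tag{8}$$
In this section, the problem of interest is to maximize the system throughput by optimizing the precoding and power allocation subject to individual power constraints $\mathbf{P} = [P_1, P_2, \ldots, P_K]$, where $P_j$ is the constraint on beam $j$. This problem can be formulated as
$$\max_{\{\mathbf{w}_k,\, p_k\}} C \quad \text{s.t.} \quad \sum_{k=1}^{K} p_k\, \mathbf{w}_k^\dagger \mathbf{Q}_{jj} \mathbf{w}_k \le P_j, \quad j = 1, \ldots, K, \tag{9}$$
where $\mathbf{Q}_{jj}$ is a $K \times K$ matrix of zeros, except for its $(j, j)$ element, which is equal to one.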
The described throughput maximization problem is difficult to tackle and the optimal solution is unknown in the literature. Next we will separate the optimization of precoding from the power allocation, then provide simple and sub-optimal solutions for each of them.
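To make the quantities of the problem formulation concrete, the following sketch evaluates the per-user SINR, the average per-user throughput and the power radiated by each beam for given channels, precoders and powers. It assumes the usual linear-precoding forms (desired power over interference plus noise power $N_0 B_u$, and throughput $B_u \log_2(1 + \mathrm{SINR})$ averaged over users); the function names and the list-based data layout are hypothetical.

```python
import math


def inner(h, w):
    """h^dagger w for two equal-length complex vectors."""
    return sum(hi.conjugate() * wi for hi, wi in zip(h, w))


def sinr(k, H, W, p, noise_power):
    """SINR of user k; H[k] is the channel vector h_k, W[j] the unit-norm precoder w_j."""
    signal = p[k] * abs(inner(H[k], W[k])) ** 2
    interference = sum(p[j] * abs(inner(H[k], W[j])) ** 2
                       for j in range(len(p)) if j != k)
    return signal / (interference + noise_power)


def avg_user_throughput(H, W, p, n0, bandwidth_hz):
    """Sum throughput over all users divided by the number of users (bit/s)."""
    K = len(p)
    noise_power = n0 * bandwidth_hz
    return bandwidth_hz / K * sum(math.log2(1.0 + sinr(k, H, W, p, noise_power))
                                  for k in range(K))


def per_beam_power(W, p):
    """Power radiated by feed j: sum over users of p_k |w_k[j]|^2."""
    K = len(p)
    return [sum(p[k] * abs(W[k][j]) ** 2 for k in range(K)) for j in range(K)]


# Toy example: two users with mild interbeam coupling and no precoding
H = [[1 + 0j, 0.3 + 0j], [0.2j, 1 + 0j]]
W = [[1 + 0j, 0j], [0j, 1 + 0j]]
p = [2.0, 3.0]
print(avg_user_throughput(H, W, p, n0=1.0, bandwidth_hz=1.0))
print(per_beam_power(W, p))  # compare each entry against its per-beam limit
```

A per-beam constraint is satisfied when every entry returned by `per_beam_power` is below its own limit, whereas a sum-power constraint would only bound the total of the list.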

Regularized zero-forcing and sub-optimal power allocation
Zero forcing is a simple but suboptimal linear precoding strategy that mitigates multiuser interference, while its design depends only on the channel, regardless of the noise. Although it is asymptotically optimal in the high SNR regime, the drawback of ZF precoding is that the throughput does not grow linearly with $K$ [54]. R-ZF was proposed as a simple precoding technique with substantial performance. This method introduces a regularization parameter that takes into account the noise effect. Thus, the resulting throughput is proven to grow linearly with $K$ [55]. More specifically, the precoding vector $\mathbf{w}_k$ is taken from the normalized $k$-th column of
$$\mathbf{W} = \mathbf{H}^\dagger \left(\mathbf{H}\mathbf{H}^\dagger + \alpha \mathbf{I}_K\right)^{-1}, \tag{10}$$
where $\mathbf{H} = [\mathbf{h}_1, \ldots, \mathbf{h}_K]^\dagger$ is the $K \times K$ channel matrix and $\alpha$ is the regularization factor, which needs to be carefully chosen to achieve good performance. Based on large system analysis, the optimal $\alpha$ (in the statistical sense) that maximizes the SINR is given in [56]. With R-ZF precoding, the throughput maximization problem (9) reduces to a power allocation optimization problem, in which the precoding vectors are fixed and only the powers $p_k$ are optimized under the constraints of (9). Hence, by applying R-ZF, the power and precoding matrix optimization problems are separated and a solution can be found. However, although the constraints are linear, the throughput is non-convex with respect to the power vector and the optimal solution is thus hard to find. To overcome this limitation, we propose the use of simple gradient-based algorithms, such as the steepest descent algorithm, to find a locally optimal solution for the power allocation optimization problem. The average per user throughput for the R-ZF technique then follows from (8), evaluated with the R-ZF precoding vectors and the resulting power allocation.
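The construction of R-ZF precoding vectors can be sketched in a few lines. The example below assumes the textbook form $\mathbf{W} = \mathbf{H}^\dagger(\mathbf{H}\mathbf{H}^\dagger + \alpha\mathbf{I})^{-1}$ with row $k$ of `H` holding $\mathbf{h}_k^\dagger$, and uses small hand-written matrix helpers so that it runs without external libraries; all function names are hypothetical and the value of `alpha` in the usage lines is arbitrary rather than the optimal one.

```python
import math


def hermitian(A):
    """Conjugate transpose of a matrix given as nested lists."""
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices only)."""
    n = len(A)
    M = [list(map(complex, row)) + [complex(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[pivot][c]) < 1e-14:
            raise ValueError("matrix is singular to working precision")
        M[c], M[pivot] = M[pivot], M[c]
        pv = M[c][c]
        M[c] = [v / pv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]


def rzf_precoders(H, alpha):
    """Unit-norm vectors w_k: normalized columns of H^dagger (H H^dagger + alpha I)^-1."""
    K = len(H)
    Hh = hermitian(H)
    G = matmul(H, Hh)
    for i in range(K):
        G[i][i] += alpha
    W = matmul(Hh, inverse(G))
    precoders = []
    for k in range(K):
        col = [W[i][k] for i in range(K)]
        norm = math.sqrt(sum(abs(v) ** 2 for v in col))
        precoders.append([v / norm for v in col])
    return precoders


H = [[1 + 0j, 0.5 + 0j], [0.3j, 1 + 0j]]
for alpha in (0.0, 0.5):
    W = rzf_precoders(H, alpha)
    leak = abs(sum(H[0][i] * W[1][i] for i in range(2)))  # |h_0^dagger w_1|
    print(f"alpha = {alpha}: interference amplitude seen by user 0 = {leak:.3e}")
```

Setting `alpha = 0` recovers plain ZF, for which the cross terms vanish; a positive `alpha` tolerates some residual interference in exchange for a better conditioned inverse at low SNR.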

Dirty paper coding
Dirty paper coding is known to be the sum-rate capacity-achieving technique in MU MIMO downlink. Hence, it is used as an upper bound for the suboptimal, linear techniques. As a nonlinear technique, DPC is based on the idea of known interference precancelation while serially encoding user signals.
Let us now assume that $\pi_0 = \{1, 2, \ldots, K\}$ is a trivial user encoding order. Since the interference of the previously encoded users is pre-canceled, the received SINR at user $k$ is
$$\mathrm{SINR}_k^{\mathrm{DPC}} = \frac{p_k |\mathbf{h}_k^\dagger \mathbf{w}_k|^2}{\sum_{j > k} p_j |\mathbf{h}_k^\dagger \mathbf{w}_j|^2 + N_0 B_u}. \tag{14}$$
With DPC, the throughput maximization problem with individual power constraints (9) has been solved by converting it into a dual uplink with sum power constraint across users and uncertain noise, and by employing an interior-point algorithm [57]. It should be noted that the sum-rate capacity can be achieved by all user encoding orders, but the individual user rates vary according to the employed encoding order.
The achievable average per-user throughput then follows by substituting (14) into (8).
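The effect of the encoding order on the per-user SINR can be sketched as follows. The code assumes the standard DPC model in which, for the order 1, 2, ..., K, the interference from users encoded before user k is pre-canceled and only later-encoded users remain as interferers; the function name and data layout are hypothetical, and the precoders are fixed by hand rather than optimized.

```python
def dpc_sinr(k, H, W, p, noise_power):
    """SINR of user k (0-based) under DPC with encoding order 0, 1, ..., K-1.

    H[k] is the channel vector h_k and W[j] the unit-norm precoder w_j.
    Only users j > k, encoded after user k, contribute interference.
    """
    def gain(h, w):
        return abs(sum(hi.conjugate() * wi for hi, wi in zip(h, w))) ** 2

    signal = p[k] * gain(H[k], W[k])
    residual = sum(p[j] * gain(H[k], W[j]) for j in range(k + 1, len(p)))
    return signal / (residual + noise_power)


H = [[1 + 0j, 0.5 + 0j], [0.5 + 0j, 1 + 0j]]
W = [[1 + 0j, 0j], [0j, 1 + 0j]]
p = [1.0, 1.0]
print([dpc_sinr(k, H, W, p, noise_power=1.0) for k in range(2)])
```

In this toy example the last-encoded user sees no interference at all, which illustrates why individual rates depend on the encoding order even though the sum rate does not.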

Conventional frequency reuse scheme
To provide a benchmark against which the performance enhancement can be quantified, a conventional system is also studied. As already discussed in Section 1, the norm in multibeam satellite systems is the use of conventional single beam decoding. In order to achieve acceptable SINRs at the receive side, orthogonalization in the frequency domain is employed. In the present contribution, the polarization domain has not been examined for simplicity reasons. Hence, the usual case of a four color frequency reuse scheme has been assumed, where interference is alleviated by allocating different spectrum segments to adjacent beams. Despite this spatial separation, the potentially large number of beams in a multibeam satellite system makes it necessary to account for interference originating from non-neighboring co-channel beams. To this end, the conventional system throughput for each beam is calculated from the SINR of its user, with the signals of the co-channel beams treated as interference. The channel coefficients $h_j$ for the $k$-th user are given by (4), while $A_C$ is the set of users that are co-channel to the $k$-th user.

Capacity performance of multibeam joint processing: RL for mobile services
The second scenario considered involves an S-band satellite network for mobile applications. Such networks are typically bandwidth limited, and joint multibeam processing promises to increase the overall system capacity by exploiting full frequency reuse while relaxing the challenging interbeam interference requirements within the overall system concept. More aggressive reuse of the available spectrum is consequently possible through smaller beams with no requirements on co-channel isolation, leading to potentially increased overall system capacity. The RL is analyzed in this context hereafter.

Channel model
To model the LMS channel accurately, the important parameters of the actual system need to be accounted for. Mobile users, due to size limitations, are equipped with low-gain antennas and low-power amplifiers. In addition, user mobility prohibits the use of frequencies over 3 GHz, since the link budget would be compromised by the lack of orbital pointing accuracy, the increased free space losses and the high atmospheric attenuation due to rain fading. Finally, the importance of the LoS component of the received signal, an inherent characteristic of SatComs, will be accounted for by assuming Rician fading coefficients.
More specifically, in the following we consider a cluster of K spot-beams covering K user terminals, each equipped with a single antenna, under the limitation of a single transmitting user per beam during a specific channel instance (see endnote b). Hence, a MIMO MAC is realized. Subsequently, the input-output expression for the i-th beam is given by (17), where z ij is the complex channel coefficient between the i-th beam and the j-th user and n i is the additive white Gaussian noise (AWGN) measured at the receive antenna. To investigate the adverse satellite channel, the following characteristics are incorporated in the channel model: beam gain b ij , lognormal shadowing ξ j , Rician fading h ij and antenna correlation. Hence, (17) becomes (18). Shadowing ξ j depends only on the j-th user position as a result of the practical collocation of the satellite antennas. The general baseband channel model for all beams in vectorial form is given by (19), where y, x, n are K × 1 vectors. The channel matrix Z K×K is given by (20), where each row of the satellite antenna gain matrix B K×K contains the square roots of the normalized coefficients given by (3), as described in Section 3.1. The matrix H R is the channel gain matrix that consists of random i.i.d. nonzero-mean Gaussian elements and models the Rician satellite channel [35]. Due to the rank deficiencies introduced by the LoS signal components and the high receive correlation at the satellite side, (20) can reduce to (21) [35,58], where the diagonal matrix H Rd is composed of the elements of the unit rank matrix H R . Finally, random fading coefficients following a lognormal distribution have been employed to model shadowing due to user mobility. Owing to the practical collocation of the on-board antennas, possible obstructions affect all received signals equally. Subsequently, Ξ d is a diagonal matrix composed of random elements that represent shadowing due to user mobility: Ξ d = diag{ξ}, where ξ = [ξ 1 , ξ 2 ...
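One way to draw a channel realization with the structure described in this section is sketched below. It rests on several assumptions that are not taken from the article: the reduced channel is assumed to have the product form Z = B H Rd Ξ d, the Rician coefficients are normalized to unit average power and parameterized by a Rice factor in dB, and the shadowing is applied to amplitudes with a mean and standard deviation expressed in dB. The function name `rl_channel` and its parameters are hypothetical.

```python
import numpy as np

def rl_channel(B, rice_k_db, shadow_mu_db, shadow_sigma_db, rng):
    """Draw one return-link channel matrix (illustrative sketch).

    B               : (K, K) beam gain matrix, B[i, j] for beam i and user j
    rice_k_db       : Rice factor in dB (LoS power over scattered power)
    shadow_mu_db    : mean of the lognormal shadowing in dB (assumed amplitude scale)
    shadow_sigma_db : standard deviation of the shadowing in dB
    rng             : numpy random Generator
    """
    K = B.shape[1]
    kf = 10 ** (rice_k_db / 10)
    los = np.sqrt(kf / (kf + 1))                      # deterministic LoS part
    scatter = np.sqrt(1 / (2 * (kf + 1))) * (
        rng.standard_normal(K) + 1j * rng.standard_normal(K)
    )
    h = los + scatter                                 # one Rician coefficient per user
    xi_db = shadow_mu_db + shadow_sigma_db * rng.standard_normal(K)
    xi = 10 ** (xi_db / 20)                           # lognormal shadowing per user
    # Fading and shadowing depend only on the user, so they scale the columns of B
    return B @ np.diag(h) @ np.diag(xi)
```

Because the fading and shadowing matrices are diagonal and multiply B from the right, every beam sees the same fading and shadowing for a given user, which mirrors the collocation argument made above.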

Single-beam decoding
As in Section 3.2.3, the benchmark performance metric is the throughput of a conventional single beam decoding system, which for each beam is given by (22). The channel coefficients z ij are given in (18), while A i C is the set of beams that are co-channel to the i-th beam.

MMSE filtering with SIC
Conventional single beam decoding sacrifices bandwidth to cope with interference. However, the almost linear dependence of channel capacity on bandwidth motivates the study of more advanced decoding techniques that allow for the full exploitation of resources. The optimal decoding strategy when full frequency reuse is employed is proven to be SIC. Following the MMSE filtering of the strongest user, its signal is decoded and then subtracted from the aggregate signal, and so on. Hence, the second user in order will cope with less interference. The achievable capacity for this case is given by (23), where g stands for the transmit signal-to-noise ratio (SNR) and all the users are transmitting with the same power.
Such techniques have a high implementation complexity. In addition, imperfect channel estimates lead to residual cancelation errors, and practical coding schemes are imperfect, hence decoding errors can propagate to the following users. For these reasons, suboptimal solutions need to be examined as well, and their potential gains are studied in the following section.
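The decode-and-subtract procedure can be checked numerically with a short sketch. It assumes unit noise power, equal transmit SNR g for all users and perfect cancelation; the function names `sic_rates` and `sum_capacity` are hypothetical, and the log-determinant expression is the standard MIMO MAC sum capacity rather than a transcription of (23). For any decoding order, the per-user rates obtained by MMSE filtering and cancelation add up to that sum capacity.

```python
import numpy as np

def sic_rates(Z, g, order):
    """Per-user rates (bits/s/Hz) for MMSE filtering followed by SIC.

    Z     : (N, K) channel matrix, column k is the signature of user k
    g     : transmit SNR, common to all users (noise power normalized to 1)
    order : user indices in decoding order; decoded users are subtracted
    """
    n_beams, n_users = Z.shape
    order = np.asarray(order, dtype=int)
    rates = np.empty(n_users)
    for pos, k in enumerate(order):
        remaining = order[pos + 1:]                # users not yet decoded
        Zr = Z[:, remaining]
        # Interference-plus-noise covariance seen while decoding user k
        cov = np.eye(n_beams) + g * Zr @ Zr.conj().T
        zk = Z[:, k]
        sinr = g * np.real(zk.conj() @ np.linalg.solve(cov, zk))
        rates[k] = np.log2(1 + sinr)
    return rates

def sum_capacity(Z, g):
    """Sum capacity log2 det(I + g Z Z^H) of the multiple access channel."""
    n_beams = Z.shape[0]
    _, logdet = np.linalg.slogdet(np.eye(n_beams) + g * Z @ Z.conj().T)
    return logdet / np.log(2)

rng = np.random.default_rng(0)
Z = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
total_a = sic_rates(Z, 2.0, [0, 1, 2, 3]).sum()
total_b = sic_rates(Z, 2.0, [3, 1, 0, 2]).sum()
```

Both totals coincide with `sum_capacity(Z, 2.0)`, while the split among the users depends on the decoding order: the user decoded last sees no interference at all.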

Linear MMSE filtering
A more practical receiver implementation would only consider MMSE filtering of the received signals followed by single-user decoding. In this case, the linear MMSE capacity is given by (24).
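A minimal sketch of the linear MMSE rates is given below, assuming unit noise power and a common transmit SNR g. It uses the standard identity that the output SINR of the MMSE filter for user k equals 1/[(I + g Z^H Z)^{-1}]_kk - 1; the function name `lmmse_rates` is hypothetical and the expression is not a transcription of (24).

```python
import numpy as np

def lmmse_rates(Z, g):
    """Per-user rates (bits/s/Hz) for linear MMSE filtering without cancelation.

    Z : (N, K) channel matrix, column k is the signature of user k
    g : transmit SNR, common to all users (noise power normalized to 1)
    """
    n_users = Z.shape[1]
    # Error covariance of the joint MMSE estimate of the user symbols
    M = np.linalg.inv(np.eye(n_users) + g * Z.conj().T @ Z)
    sinr = 1.0 / np.real(np.diag(M)) - 1.0   # MMSE output SINR per user
    return np.log2(1 + sinr)

# With no interbeam interference the filter is lossless:
Z_diag = np.diag([1.0, 2.0])
rates = lmmse_rates(Z_diag, 3.0)             # equals log2(1 + g * |z_kk|^2)
```

When the columns of Z are not orthogonal, the sum of these rates falls below the log-determinant sum capacity, which is the gap between linear MMSE filtering and SIC discussed in this section.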

Numerical results
In this section, numerical results are provided in order to study the performance of multibeam processing for both the FL and the RL. The considered metric is the per user throughput, averaged over the channel statistics, in bits/s. Since signals from adjacent beams are no longer harmful when multibeam joint processing is in place, we consider a set of feed antennas which allow for the illumination of beams with variable overlap. In other words, we assume a number of beams with fixed centers on the Earth's surface but variable diameter. The formulas presented in Section 3 are used to calculate the spectral efficiency of each architecture for every value of the variable overlap, i.e., for every instance of the matrix containing the beam gain coefficients. The objective is to evaluate the effect of beam overlap on system throughput and to investigate whether there is an overlap point that maximizes the multibeam processing throughput.
During the simulations, a satellite system with only seven beams was considered, for reasons explained and justified hereafter. The computational complexity of the employed optimization algorithm, namely the steepest descent algorithm, grows with the number of beams since the algorithm performs channel matrix inversions and multiplications. To overcome this obstacle, a small number of beams symmetrically arranged over a cellular-like coverage area was employed. Subsequently, the achieved total system throughput has been averaged over the number of beams, providing the average per beam achievable rate. This performance metric facilitates the extension of the results to larger systems, assuming a linear dependence of the system throughput on the number of beams, which is a solid assumption for conventional systems [1]. For the proposed systems, the assumption of a linear dependence of capacity on the channel dimensions can be justified by the MIMO literature. The prelog of the channel capacity grows linearly with the rank of the channel matrix, i.e., the number of beams in our case. This fact supports the assumption that the capacity of the MIMO broadcast channel (MIMO BC) will scale approximately linearly with the number of beams. Hence, the average per beam capacity can provide a good estimate of the total capacity of a larger multibeam channel.
Additionally, if a larger system were addressed, the approach would be similar. The precoder matrix should have smaller dimensions than the full channel matrix in order to perform the optimization, since the highly directive antennas lead to very good beam isolation. This means that if the precoder matrix had dimensions equal to the number of beams (i.e., interference from all beams is taken into account), it would be a matrix with very small, decreasing entries away from the main diagonal, resulting in an ill-conditioned precoding matrix that cannot be accurately handled. The solution is to take into account the first or maybe the second tier of interfering beams by employing smaller precoding matrices.
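The tier-based reduction can be sketched as follows, under the assumption that the beam centers lie on a hexagonal lattice with unit spacing. The lattice generator `hex_centers`, the selector `tier_submatrix` and the distance rule used to define a tier are hypothetical illustrations, not the procedure used for the reported results.

```python
import numpy as np

def hex_centers(rings):
    """Beam centers on a unit-spacing hexagonal lattice (axial coordinates)."""
    pts = []
    for q in range(-rings, rings + 1):
        for r in range(-rings, rings + 1):
            if abs(q + r) <= rings:          # keep cells within `rings` of the origin
                pts.append((q + r / 2, r * np.sqrt(3) / 2))
    return np.array(pts)

def tier_submatrix(H, centers, beam, tiers=1, spacing=1.0):
    """Restrict the channel matrix to a beam and its nearest interfering tiers.

    H       : (K, K) full channel matrix
    centers : (K, 2) beam center coordinates
    beam    : index of the beam of interest
    tiers   : number of surrounding tiers to retain (1 or 2)
    Returns the retained beam indices and the reduced channel matrix.
    """
    d = np.linalg.norm(centers - centers[beam], axis=1)
    idx = np.flatnonzero(d <= tiers * spacing + 1e-9)
    return idx, H[np.ix_(idx, idx)]

centers = hex_centers(2)                                 # 19-beam layout
center_beam = int(np.argmin(np.linalg.norm(centers, axis=1)))
H = np.eye(len(centers))                                 # placeholder channel matrix
idx1, H1 = tier_submatrix(H, centers, center_beam, tiers=1)
idx2, H2 = tier_submatrix(H, centers, center_beam, tiers=2)
```

For the central beam of this layout, the first tier yields a 7 × 7 matrix (the beam plus six neighbors) and the second tier a 19 × 19 matrix, so the size of the optimization problem is set by the number of retained tiers rather than by the total number of beams.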

Forward link
For the multibeam joint precoding, a satellite FL was considered as described in Section 3.1, with detailed parameters listed in Table 1. According to the discussions of Section 2, optimization algorithms were used to solve the problem of power allocation and precoding matrix calculation in the case of the linear R-ZF. Subsequently, the suboptimal R-ZF throughput could be evaluated using (13). The capacity-achieving nonlinear DPC is evaluated using (15). Moreover, the conventional spotbeam system has the same user-link bandwidth B u and noise density, while it employs a frequency reuse scheme with factor 4 in order to mitigate interbeam interference. The achievable capacity is given by (16). The results are plotted in Figures 1 and 2 versus the variable beam overlap, normalized to the nominal 3 dB beam size.
In Figure 1, the optimized sum rate results are shown for the FL when users are uniformly located within cells.
First, it is noted that for all schemes the maximum rates are achieved when the normalized beam angle is less than the nominal one, and for both R-ZF and DPC precoding the optimal beam angles are only 30% of the nominal one. This is because users are randomly located in the cells, and it is preferable for the satellite antenna feeds to focus on a smaller beam size in order to reduce interference to neighboring cells, while users outside the cell can be jointly served by all feeds. As can be seen, when the normalized beam angle is less than the nominal one, joint beam processing using R-ZF or DPC precoding achieves more than double the rates of conventional single beam processing. Also, due to the smaller beam size, interference is not the dominant factor, and therefore the linear R-ZF precoding performs almost as well as DPC precoding. When the normalized beam size increases, beams overlap and the achievable rates decrease due to the strong interference. In this case, DPC clearly shows the advantage of nonlinear interference pre-cancelation over the linear R-ZF precoding.
In Figure 2, the optimized sum rate results are shown for the FL when users are on the cell edges, which is a worst-case scenario and results in much lower rates for all schemes. Again, substantial rate gains are achieved by multibeam joint processing using R-ZF and DPC precoding. The optimal beam sizes are the nominal one or close to it, which is reasonable given the need to cover users on the edge.

Return link
With respect to the RL scenario, a set of Monte Carlo simulations was carried out to evaluate the behavior of the proposed optimal and suboptimal schemes given in Section 2. The RL scenario follows the parameters of Table 2. The goal of the simulations is twofold. Firstly, they serve as a benchmark to measure the gain of the theoretical SIC given by (23) and of the more realistic minimum mean square error method given by (24) over the conventional four color frequency reuse scheme given by (22). Secondly, the effect of beam overlap on the performance of the system is investigated. Thus the achievable per user throughput is plotted in Figures 3 and 4 for the three different receiver implementations as the percentage of beam overlap changes. The independent variable is normalized over the nominal 3 dB beam size. For a beam size less than one, the satellite receives less than half of the maximum gain from each user, hence gaps appear between beams in the coverage area. For a beam size more than one, beams overlap and the satellite receives more useful as well as interfering signal power (see endnote c). The metric utilized is the average per user achievable throughput, expressed in Mbps.
In Figure 4, mobile users are assumed to be on the cell edge. In this worst-case scenario, results indicate that multibeam joint decoding techniques with SIC can theoretically achieve more than twofold gain over conventional techniques. More realistic receiver implementations with linear MMSE filtering still achieve twice the throughput of the four color frequency reuse scheme. Additionally, the optimal operating point is, as expected, very close to the nominal value of the beam size. In the same figure, we notice that for high beam separation (i.e., normalized beam overlap less than 0.6) linear MMSE performs the same as SIC. This is justified by the fact that when beams do not overlap, interference becomes negligible. Taking into account that LMMSE is optimal in the noise limited regime, the above observation is justifiable. Furthermore, when the receive SNR increases, interference becomes important and linear MMSE techniques prove suboptimal compared to SIC. However, they still manage to maintain a twofold gain over the conventional systems. Finally, an important observation is that the performance of conventional schemes quickly degrades as they are highly affected by interference. In contrast, the proposed schemes show higher tolerance to interference, which makes them appropriate for a real system implementation where practical restrictions prevent ideal multibeam coverage areas.
According to Figure 3, when users are randomly located within each beam, the optimal solution is to incorporate highly directive antennas that better serve users close to the beam center. Hence, optimal throughput is achieved for 0.2 of the nominal beam size. As expected, the achievable throughput is higher compared to the worst-case scenario with cell edge users. Again, more than twofold gain can be realized over conventional schemes.

Conclusions
The present article provides an overview of the application of joint processing techniques in SatComs. The presented schemes, herein referred to as multibeam joint processing techniques, have the potential of being incorporated in existing multibeam satellite systems with some modifications to the ground segment, the existing SatCom standards, the satellite payload and the capacity of the feeder link (i.e., the link between the gateway and the satellite). Both the FL and the RL of these systems have been examined under realistic link budget assumptions.
Concerning the FL, precoding along with optimal power allocation amongst beams can provide substantial gains by pre-canceling interference. Nonlinear DPC has been applied to provide an upper bound for the performance of this approach. Linear R-ZF provides a more realistic value for the potential gain of joint processing techniques. Performance results were compared to current conventional architectures so that the potential gain could be quantified. More than twofold gain is expected with the implementation of multibeam joint processing in the FL. Finally, the performance of the aforementioned schemes was examined versus an important system design parameter, namely the percentage of beam overlap, and the optimal values of beam overlap for throughput maximization have been deduced.
In the RL, nonlinear MMSE filtering followed by SIC acts as the performance upper bound for the joint decoding approach. Linear MMSE filtering is the suboptimal scheme that reflects a more realistic performance. The conventional single beam decoding scheme with frequency reuse acts as a performance benchmark. Multibeam joint processing in the RL can potentially achieve more than twofold gain over current system architectures. Again, the performance was studied versus variable beam overlap and the optimal values of this parameter have been extracted.

Endnotes
a Or alternatively: known interference precancelation.
b This assumption is accurate for existing standards such as DVB-RCS, where in every beam each user transmits during one timeslot.
c An appropriate threshold for the beam gain is -4.3 dB of the maximum beam gain to avoid gaps in the coverage area.