Energy efficiency and sum rate tradeoffs for massive MIMO systems with underlaid device-to-device communications

In this paper, we investigate the coexistence of two technologies that have been put forward for the fifth generation (5G) of cellular networks, namely, network-assisted device-to-device (D2D) communications and massive MIMO (multiple-input multiple-output). Potential benefits of both technologies are known individually, but the tradeoffs resulting from their coexistence have not been adequately addressed. To this end, we assume that D2D users reuse the downlink resources of cellular networks in an underlay fashion. In addition, multiple antennas at the BS are used in order to obtain precoding gains and simultaneously support multiple cellular users using multiuser or massive MIMO technique. Two metrics are considered, namely the average sum rate (ASR) and energy efficiency (EE). We derive tractable and directly computable expressions and study the tradeoffs between the ASR and EE as functions of the number of BS antennas, the number of cellular users and the density of D2D users within a given coverage area. Our results show that both the ASR and EE behave differently in scenarios with low and high density of D2D users, and that coexistence of underlay D2D communications and massive MIMO is mainly beneficial in low densities of D2D users.


I. INTRODUCTION
The research on future mobile broadband networks, referred to as the fifth generation (5G), has started in the past few years.In particular, stringent key performance indicators (KPIs) and tight requirements have been introduced in order to handle higher mobile data volumes, reduce latency, increase the number of connected devices and at the same time increase the energy efficiency (EE) [2], [3].The current network and infrastructure cannot cope with 5G requirements-fundamental changes are needed to handle future non-homogeneous deployments as well as new trends in user behavior such as high quality video streaming and future applications like augmented reality.5G technology is supposed to evolve existing networks and at the same time integrate new dedicated solutions to meet the KPIs [3].The new key concepts for 5G include massive MIMO (multipleinput multiple-output), ultra dense networks (UDN), device-to-device (D2D) communications, and huge number of connected devices, known as machine-type communications (MTC).The potential gains and properties of these different solutions have been studied individually, but the practical gains when they coexist and share network resources are not very clear so far.In this paper, we study the coexistence of two of these main concepts, namely massive MIMO and D2D communication.
Massive MIMO is a type of multiuser MIMO (MU-MIMO) technology where the base station (BS) uses an array with hundreds of active antennas to serve tens of users on the same time/frequency resources by coherent transmission processing [4], [5].Massive MIMO techniques are particularly known to be very spectral efficient, in the sense of delivering high sum rates for a given amount of spectrum [6].This comes at the price of deploying more transceiver hardware, but the solution is still likely to improve the energy efficiency of networks [7], [8].On the other hand, in a D2D communication, user devices can communicate directly with each other and the user plane data is not sent through the BS [9].D2D communications are considered for close proximity applications which have the potential to achieve high data rates with little amount of transmission energy, if interference is well-managed.In addition, D2D communications can be used to decrease the load of the core network.D2D users either have their own dedicated time/frequency resources (overlay approach) which in turn leads to elimination of the crosstier interference between the two types of users (i.e., cellular and D2D users), or they transmit simultaneously with cellular users in the same resource (underlay approach).
We consider two network performance metrics in this work: The average sum rate (ASR) in bit/s and the EE which is defined as the number of bits transmitted per Joule of energy consumed by the transmitted signals and the transceiver hardware.It is well-known that these metrics depend on the network infrastructure, radio interface, and underlying system assumptions [8], [10], [11].The motivation behind our work is to study how the additional degrees of freedom resulting from high number of antennas in the BS can affect the ASR and EE of a multi-tier network where a D2D tier is bypassing the BS, and how a system with massive MIMO is affected by adding a D2D tier.We focus on the downlink since majority of the payload data and network energy consumption are coupled to the downlink [10].We assume that each D2D pair is transmitting simultaneously with the BS in an underlay fashion.In addition, we assume that the communication mode of each user (i.e., D2D or cellular mode) has already been decided by higher layers.

A. Related Work
The relation between the number of BS antennas, ASR and EE in cellular networks has been studied in [7], [8], [12], [13] among others.The tradeoff between ASR and EE was described in [7] for massive MIMO systems with negligible circuit power consumption.This work was continued in [12] where radiated power and circuit power were considered.In [8], joint downlink and uplink design of a cellular network was studied in order to maximize EE for a given coverage area.The maximal EE was achieved by having a hundred BS antennas and serving tens of users in parallel, which matches well with the massive MIMO concept.Furthermore, the study [13] considered a downlink scenario in which a cellular network has been overlaid by small cells.It was shown that by increasing the number of BS antennas, the array gain allows for decreasing the radiated signal energy while maintaining the same ASR.However, the energy consumed by the transceiver chains increases.Maximizing the EE is thus a complicated problem where several counteracting factors need to be balanced.This stands in contrast to maximization of the ASR, which is relatively straightforward since the sum capacity is the fundamental upper bound.
There are only a few works in the D2D communication literature where the base stations have multiple antennas [14]- [18].In [14], uplink MU-MIMO with one D2D pair was considered.
Cellular user equipments (CUEs) were scheduled if they are not in the interference-limited zone of the D2D user.The study [15] compared different multi-antenna transmission schemes.In [16], two power control schemes were proposed for a multi-cell MIMO network.Two works that are more related to our work are [17] and [18].The former investigates the mode selection problem in the uplink of a network with potentially many antennas at the BS.The impact of the number of antennas on the quality-of-service and transmit power was studied when users need to decide their mode of operation (i.e., D2D or cellular).The latter study, [18], only employs extra antennas in the network to protect the CUEs from interference of D2D users in the uplink.
The ASR in D2D communications is mostly studied in the context of interference and radio resource management [19], [20].There are a few works that consider EE in D2D communications, but only for single antenna BSs, e.g., [21], [22], and [23], where the first one proposed a coalition formation method, the second one designed a resource allocation scheme, and the third one aimed at prolonging the battery life of user devices.
The spatial degrees of freedom offered by having multiple antennas at BSs are very useful in the design of future mobile networks, because the spatial precoding enables dense multiplexing of users while keeping the inter-user interference under control.In particular, the performance for cell edge users, which have almost equal signal-to-noise ratios (SNRs) to several BSs, can be greatly improved since only the desired signals are amplified by the transmit precoding [24]- [26].In order to model the random number of users and random user positions, we use mathematical tools from stochastic geometry [27] which are powerful in analytically quantifying certain metrics in closed-form.

B. Contributions
Our main contributions in this paper can be summarized as follows: and their locations are modeled according to a homogeneous Poisson point process (PPP) while a fixed number of CUEs are randomly distributed in the network.
• Tractable and directly computable expressions: We derive tightly approximated expressions for the coverage probability of D2D users and CUEs.These expressions are directly used to compute our main performance metrics, namely, the ASR and EE.We verify the tightness of these approximations by Monte-Carlo simulations.Furthermore, we provide analytical insights on the behavior of these metrics for both CUEs and D2D users.
To the best of our knowledge, the energy efficiency analysis for underlay D2D communications in a network with large number of BS antennas has not been carried out before.
• Performance analysis: Based on extensive simulations, we characterize the typical relation between the ASR and EE metrics in terms of the number of BS antennas, the number of CUEs, and the D2D user density for a given coverage area and study the incurred tradeoffs in two different scenarios.

II. SYSTEM MODEL
We consider a single-cell scenario where the BS is located in the center of the cell and its coverage area is a disc of radius R. The BS serves U c single-antenna CUEs which are uniformly distributed in the coverage area.These are simultaneously served in the downlink using an array of T c antennas located at the BS.It is assumed that 1 ≤ U c ≤ T c so that the precoding can be used to control the interference caused among the CUEs [28].
In addition to the CUEs, there are other single-antenna users that bypass the BS and communicate pairwise with each other using a D2D communication mode.The locations of the D2D transmitters (D2D Tx) are modeled by a homogeneous PPP Φ with density λ d in R 2 . 1   This means that the average number of D2D Tx per unit area is λ d and these users are uniformly distributed in that area.The D2D receiver (D2D Rx) is randomly located in an isotropic direction with a fixed distance away from its corresponding D2D Tx-a model that is similar to the one considered in [29].The system setup is illustrated in Fig. 1.
Let R k,j denote the distance between the j-th D2D Tx to the k-th D2D Rx.The performance analysis for D2D users is carried out for a typical D2D user, which is denoted by the index 0.The typical D2D user is an arbitrary D2D user located in the cell and its corresponding receiver is positioned in the origin.The results for a typical user show the statistical average performance of the network [27].Therefore, for any performance metric derivation, the D2D users inside the cell are considered and the ones outside the cell are only taken into account as sources of interference.Note that we neglect potential interference from other BSs and leave the multi-cell case for future work.This is because the interference from D2D transmissions is likely to be much stronger than the interference from other BSs.We assume equal power allocation for both CUEs and D2D users.Let P c denote the total transmit power of the BS, then the transmit power per CUE is Pc Uc .The transmit power of the D2D Tx is denoted by P d .Let h j ∈ C Tc×1 be the normalized channel response between the BS and the j-th CUE, for j ∈ {0, . . ., U c −1}.These channels are modeled as Rayleigh fading such that h j ∼ CN (0, I), where CN (•, •) denotes a circularly symmetric complex Gaussian distribution.Perfect instantaneous channel state information (CSI) is assumed in this work for analytic tractability, but imperfect CSI is a relevant extension.Linear downlink precoding is considered at the BS based on the zero-forcing (ZF) scheme that cancels the interference between the CUEs [28].The precoding matrix is denoted by V = [v 0 , . . ., v U c−1 ] ∈ C Tc×Uc in which each column v j is the normalized transmit precoding vector assigned to the CUE j.Let f 0,BS ∈ C Tc×1 be the channel response from the BS to D2D Rx and let it be Rayleigh fading as f 0,BS ∼ CN (0, I).Moreover, let r j ∈ C and s ∈ C Uc×1 denote the transmitted data signals intended for a D2D Rx and the CUEs, respectively.
Since each user requests different data, the transmitted signals can be modeled as zero-mean and 1 The assumption that the D2D Tx are distributed in the whole R 2 plane removes any concern about the boundary effects and makes the model more mathematically tractable.The boundary effects are local effects in which users at the network boundary experience less interference than the ones closer to the center, because they have fewer neighbors.
uncorrelated with E |r j | 2 = P d and E ||s|| 2 = P c .The fading channel response between the j-th D2D Tx and the k-th D2D Rx is denoted by g k,j ∈ C where g k,j ∼ CN (0, 1).Moreover, R 0,BS denotes the random distance between the typical D2D Rx and the BS.The pathloss is modeled as A i d −α i with i ∈ {c, d}, where index c indicates the pathloss between a user and the BS and index d gives the pathloss between any two users.A i and α i are the pathloss coefficient and exponent, respectively, where we assume α i > 2. The received signal at the typical D2D Rx is where η d is zero-mean additive white Gaussian noise with power N 0 = Ñ0 B w , Ñ0 is the power spectral density of the white Gaussian noise, and B w is the channel bandwidth.For given channel realizations, the signal-to-interference-plus-noise ratio (SINR) at the typical D2D Rx is in which both the numerator and the denominator have been normalized by A d .I BS,0 is the received interference power from the BS and I d,0 is the received interference power from other D2D users that transmit simultaneously which are defined as where Let D 0,k and e 0,k ∈ C with e 0,k ∼ CN (0, 1) be the distance and fading channel response between a typical CUE and the k-th D2D Tx, respectively, and let D 0,BS denote the distance between a typical CUE and the BS.Then, the received signal at the typical CUE is where η c is zero-mean additive white Gaussian noise with power N 0 .Then, the corresponding SINR for the typical CUE is where is the received interference power from all D2D users (normalized by A d ).

III. PERFORMANCE ANALYSIS
In this section, we first introduce the performance metrics that are considered in this paper.
Then we proceed to derive the coverage probability for both CUEs and D2D users which are needed to compute these metrics.

A. Performance Metrics
In this paper, two main performance metrics for the network are considered: the average sum rate (ASR) and energy efficiency (EE).The ASR is obtained from total rates of both D2D users and CUEs as where πR 2 λ d is the average number of D2D users in the cell and Rt with t ∈ {c, d} denotes the average rates of the CUEs and D2D users, respectively.Rt for both cellular and D2D users is computed as the successful transmission rate by where is the coverage probability when the received SINR is higher than a specified threshold β t needed for successful reception.Note that SINR t contains random channel fading and random user locations.Finding the supremum guarantees the best constant rate for the D2D users and the CUEs.If we know the coverage probability (P t cov (β t )), (10) can easily be computed by using line search for each user type independently.Moreover, (10) is easily achievable in practice since the modulation and coding is performed without requiring that every transmitter knows the interference characteristics at its receiver.
Energy efficiency is defined as the benefit-cost ratio between the ASR and the total consumed power: For the total power consumption, we consider a detailed model described in [8]: where P c + λ d πR 2 P d is the total transmission power averaged over the number of D2D users, η is the amplifier efficiency (0 < η ≤ 1), C 0 is the load independent power consumption at the BS, C 1 is the power consumption per BS antenna, C 2 is the power consumption per user device, and U c + 2λ d πR 2 is the average number of active users.
In order to calculate the ASR and EE, we need to derive the coverage probability for both cellular and D2D users.The analytic derivation of these expressions is one of the main contributions of this paper.

B. Coverage Probability of D2D Users
We first derive the expression for the coverage probability of D2D users.
Proposition 1: The approximate coverage probability for a typical D2D user is given by where is the average D2D SNR, and B(x; a, b) is the incomplete Beta function.
Proof: The proof is given in Appendix A.
The coverage probability expression in Proposition 1 allows us to compute the average data rate of a typical D2D user in (10).We note that ( 14) is actually a tight approximation and its tightness is evaluated in Section IV.From the expression in ( 14), we make several observations as listed below.
Remark 1: In the high-SNR regime for the D2D users where γd β d , the last term in (14) converges to one, i.e., exp − β d γd → 1, and we have This can also be referred to as the interference-limited regime.
Remark 2: The coverage probability of a typical D2D user is a decreasing function of the D2D density λ d .Because higher λ d results in more interference among D2D users.In particular, it can be seen that Recall that in our model, the D2D Rx is associated to the D2D Tx which is located at a fixed distance away.However, if we had assumed that the D2D Rx's association to a D2D Tx is based on, for example, the shortest distance or the maximum SINR, then the P d cov would have been unaffected by the D2D density (in the high-interference regime).Now, considering the number of BS antennas or the number of CUEs as variables, we have the following behavior of the D2D coverage probability.
Remark 3: P d cov is not affected by the number of BS antennas T c .The BS antennas are used to cancel out the interference among CUEs and they do not have any impact on D2D users' performance as long as the number of CUEs U c is constant and does not vary with the number of BS antennas T c .The coverage probability of a typical D2D user P d cov is a decreasing function of U c .However, increasing the number of CUEs have a small effect on D2D users' performance.This is due to the fact that the resulting interference from the BS to D2D users does not change significantly by increasing the number of CUEs as the transmit power of the BS is the same irrespective of the number of users and the precoding is independent of the D2D channels.Thus, a change of U c will only change the distribution of the interference but not its average.
Next we comment on how changes in the transmit powers of the BS and D2D Tx as well as the distance between D2D user pairs affect the coverage probability of D2D users.

Remark 4: P d
cov is a decreasing function of the ratio between the transmit power of the BS and of the D2D users, i.e., Pc P d , which is part of the first term in (14) and corresponds to the interference from the BS.For instance, if we fix P c and decrease P d , the coverage probability for D2D users decreases as the interference from the BS would be the dominating factor.At the same time, if we decrease P c , it would improve the coverage of D2D users.
Remark 5: P d cov is a decreasing function of the distance between D2D Tx-Rx pairs R 0,0 and the cell radius R. Increasing the cell radius with the same D2D user density reduces the effect of the interference from the BS.Also by decreasing the distance between D2D Tx-Rx pairs, it is evident that a better performance for D2D users can be obtained.
Using Proposition 1, the following corollary provides the optimal D2D user density that maximizes the D2D ASR, i.e., πR 2 λ d Rd , where Rd is given in (10).
Corollary 1: For a given SINR threshold β d , the optimal density of D2D users λ * d that maximizes the D2D ASR is Proof: Given the SINR threshold β d and using ( 9)-( 10), the D2D ASR is where P d cov (β d ) is given in (14) and depends on λ d through an exponential function.Taking the derivative of (17) with respect to λ d and setting it to zero yields the optimal D2D user density given in ( 16) that maximizes the D2D ASR.

C. Coverage Probability of Cellular Users
Next, we compute the coverage probability for CUEs.
Proposition 2: The coverage probability for a typical cellular user is given by with where ) , and Proof: The proof is given in Appendix B.
This proposition gives an expression for the coverage probability of CUEs in which there is only one random variable left.The expectation in (18) with respect to D 0,BS is intractable to derive analytically but can be computed numerically.The analytical results of Proposition 1 and Proposition 2 have been verified by Monte-Carlo simulations in Section IV.A main benefit of the analytic expressions (as compared to pure Monte-Carlo simulations with respect to all sources of randomness) is that they can be computed much more efficiently, which basically is a prerequisite for the multi-variable system analysis carried out in Section IV.
Next, we present some observations from the result in Proposition 2 as follows.
Remark 6: In the interference-limited regime where where I d,c N 0 , the coverage probability in (18) for a typical cellular user is simplified to The result obtained in Remark 6 has a lower computational complexity compared to the expression in Proposition 2 and at the same time it is a tight approximation for Proposition 2.
This can be observed from the denominator of the (7) where the term N 0 A d ≈ 0. where ) .Proof: (21) follows directly from ( 18) by setting T c − U c = 0.
Proof: Let m = T c − U c .Substituting SINR c from ( 7) into (11), we have where (a) follows from the CCDF of Step (c) is obtained from the dominated convergence theorem which allows for an interchange of limit and expectation and step (d) is due to the fact that ∞ k=0 z k k! = e z .In the results so far, we have discussed the case where there exist some D2D users as underlay to the cellular network, that is, λ d = 0, However, it is interesting to see what can be achieved without D2D users.
Corollary 4: If λ d = 0, the coverage probability for a typical cellular user is given by where Γ(•) is the Gamma function and ζ is defined in (5).
Proof: Substituting SINR c from ( 7) into (11) and setting λ d = 0, we have where (a) follows from the CCDF of Step (b) follows from taking the expectation with respect to z which is similar to the expression in (35) with the Laplace transform L z (l) = 2 αc + i and using the identity The closed-form results in Corollary 4 for λ d = 0 depends only on noise rather than interference and perhaps can result in higher ASR for CUEs.The ASR for λ d > 0 also depends on noise but its impact is much smaller.However, we note that this result is obtained for a single cell scenario.Thus, comparing Proposition 2 and Corollary 4 and evaluating the potential performance gain/loss due to introducing D2D communications would make more sense in a multi-cell scenario.
Using the results from Proposition 1 and Proposition 2, we proceed to evaluate the network performance in terms of the ASR and EE from ( 9) and ( 12), respectively.

IV. NUMERICAL RESULTS
In this section, we assess the performance of the setup in Fig. 1 in terms of ASR and EE using numerical evaluations.As we pointed out in Sec.III, many parameters affect these performance metrics.Initially, we consider the EE and the ASR as functions of three key parameters, namely, the number of BS antennas T c , the density of D2D users λ d , and the number of cellular users U c .We show the individual effect of these system parameters on the two performance metrics while other parameters such as BS transmit power P c , D2D transmit power P d , and distance between D2D Tx-Rx pair R 0,0 are fixed.Later on, we also comment on the choice of these fixed parameters.The system and simulation parameters are given in Table I.
Before we proceed to the performance evaluation, we verify the analytical results of Proposition 1 and Proposition 2 by Monte-Carlo simulations.As depicted in Fig. 2, simulation results closely follow the analytical derivations.The small gap in Fig. 2a is due to the spatial interference correlation resulting from the fact that multiple interfering streams are coming from the same location, hence, the Chi-squared distribution in ( 28) is an approximation.This is a quite standard approximation in analyzing MIMO systems [30].Moreover, in the simulations, the locations of the D2D Tx are generated in an area with radius 10R according to the PPP as opposed to our analytical assumption that they are located in the whole R 2 region.This assumption reduces the interference as compared to our analytical results and thus improves the coverage probability as can be seen in Fig. 2a.
We consider two scenarios corresponding to the number of CUEs U c in our evaluations.First, we assume that U c is chosen as a function of the number of BS antennas T c .Then, we move on to the case where we fix the number of CUEs and study the tradeoffs among other parameters.
Both scenarios are relevant in the design of massive MIMO systems.In order to speed up the numerical computations, we neglected the terms that are very small.

A. Number of CUEs as a Function of the Number of BS Antennas
In this scenario, we assume that there is a fixed ratio between the number of CUEs U c and the number of BS antennas T c .We assume this ratio to be Tc Uc = 5.Simply put, to serve one additional user, we add five more antennas at the BS since the main gains from massive MIMO come from multiplexing of many users rather than only having many antennas.respectively, in Fig. 4a, the average rates of the CUEs become higher than the case with U c = 1 user and T c = 5 antennas as expected from Corollary 3 and the multiplexing gain from having many CUEs.However, by introducing a small number of D2D users, there is a substantial probability that the interference from the D2D users reduces the CUEs' rates per link as observed in Remark 7. The reduction in these rates are not compensated in the ASR by the contribution of the D2D users' rates.Note that, as we stated in Remark 3, when U c is scaled with T c , it impacts the D2D coverage probability, but the decrease in the performance of D2D users is not significant.Furthermore, if we keep increasing λ d , even though the rate per link decreases for both CUEs and D2D users, there is a local minima after which the aggregate D2D rate over all D2D users becomes higher and the ASR increases again.The second turning point follows from the same reasoning as for the case of U c = 1 user and T c = 5 antennas, i.e., in higher D2D densities, the interference from D2D users are the limiting factor for the ASR.This effect can also be observed in Fig. 4b where the ASR performance is depicted versus different number of CUEs (and BS antennas) for two D2D densities.At the lower density, the ASR is linearly increasing with U c (and T c ), however, in the interference-limited regime (higher λ d ), increasing the number of CUEs and BS antennas do not impact the network ASR performance.
The reasoning in Fig. 4a and Fig. 4b can be well understood from Fig. 5 which explains the tradeoff between the ASR of CUEs and D2D users in the network.In the scenario in which we have T c = 70 antennas and U c = 14 users, the cellular network contributes more to the total ASR for the low D2D density regime (e.g., λ d = 10 −6 ) due to high number of CUEs and BS antennas.
In this region, the ASR gains from massive MIMO is large.By increasing λ d , the gain from massive MIMO vanishes as the interference added by the D2D users dominates and degrades the performance that was achieved by interference cancellation between CUEs.Therefore, with medium D2D user density, if there is a fixed rate constraint for CUEs, the network can still benefit (from the ASR perspective) from underlay D2D communications.However, in the high D2D density regime (e.g., λ d = 10 −4 ), the cellular ASR is too small and it is better that the cellular and D2D tiers use the overlay approach for communication instead of the underlay approach.
In Fig. 6, we show the network performance in terms of the EE as a function of the parameters λ d and U c with Tc Uc = 5.It is observed that the EE is a decreasing function of U c and T c .In contrast, there is a maximum point in the EE based on different values of λ d .To study this result further, similar to the ASR, we first plot the EE versus λ d for U c ∈ {1, 14} and T c ∈ {4, 70} in Fig. 7a.We can see that the pattern for both low and high number of BS antennas are similar to Fig. 4a.The higher EE is achieved with U c = 1 user and T c = 5 antennas as opposed to U c = 14 users and T c = 70 antennas.This is because the extra circuit power of the cellular tier Furthermore, if we plot the EE versus U c , we see a different behavior for low and high D2D densities.Fig. 7b illustrates that in the low D2D density regime (λ d = 10 −6 ), even though the ASR increases linearly, the EE almost stays the same as the number of CUEs, and correspondingly the number of BS antennas, increases.From (13), we can observe that for a fixed λ d , only the circuit power is changed by increasing U c and T c .At the same time, the circuit power dominates the the total power consumption and increases almost linearly leading to an (almost) constant EE.The network performance in terms of the EE is poor with high density of D2D users (λ d = 10 −4 ).This is due to the fact that the sum rate contributed by the CUEs is already degraded by the interference from high number of D2D users, and additionally, increasing U c (and accordingly T c ) increases the circuit power without any gain in the total ASR.Consequently, the EE decreases.Thus, massive MIMO can only improve the EE if the D2D user density is small, otherwise dedicated resources or underlaying with fewer BS antennas is beneficial.

B. Fixed Number of CUEs
In this section, we evaluate the system performance when the number of CUEs is fixed with U c = 4 users.The general trend of the network performance is the same as the case with Tc Uc = 5 in the previous section.However, there are some differences which are highlighted in Fig. 8a and Fig. 8b for the ASR and EE, respectively.As it is shown in Fig. 8a, in the low D2D user density regime (i.e., λ d = 10 −6 ) the ASR is increasing in T c , however, with a lower slope as compared to the case of Tc Uc = 5.By increasing the number of BS antennas for the fixed number of CUEs, better performance per user can be achieved, however in this case, as the number of CUEs is not high, the ASR increases with a small slope.For high D2D user density (i.e., λ d = 10 −4 ), the ASR is almost flat.Fig. 8b illustrates that when the D2D user density is low, the EE benefits from adding extra BS antennas until the sum of the circuit power consumption of all antennas dominates the performance and leads to a gradual decrease in the EE.As the figure implies, there exists an optimal number of BS antennas which is relatively small since the main massive MIMO gains come from multiplexing rather than just having many antennas.However, in high density D2D scenario, which is the interference-limited scenario, the EE decreases monotonically with T c .
Increasing the number of BS antennas in this region cannot improve the ASR significantly, as shown in Fig. 8a; at the same time the circuit power consumption increases as a result of the higher number of BS antennas, which in turn leads to decreasing network EE.
The conclusion is that the D2D user density has a very high impact on a network that employs the massive MIMO technology.In the downlink, these two technologies can only coexist in low density of D2D users with careful interference coordination.The number of CUEs should be a function of the number of BS antennas in order to benefit from high number of BS antennas in terms of the ASR and EE.Otherwise, in high density of D2D users, the D2D communication should use the overlay approach rather than the underlay, that is, dedicated time/frequency resources should be allocated to the D2D tier.

C. The Effect of Other System Parameters
So far, we have discussed the results based on constant transmit power P c , D2D transmit power P d , and distance between D2D Tx-Rx pairs R 0,0 given in Table I.Now we comment on the choice of these parameters and study their effects on the system performance.Proposition 1, Proposition 2, and Remark 4, it is evident that the coverage probability for both D2D and cellular tiers, and consequently the network ASR and EE, depend on the ratio of P d and P c .Therefore, we fix P c and vary P d .
Fig. 9a shows the ASR as function of λ d under two different power levels, i.e., P d = 6 dBm and P d = 13 dBm in a scenario where the number of CUEs U c is scaled by T c .We see that higher P d degrades the ASR at higher number of CUEs (and BS antennas) when the D2D user density is low, but has negligible impact at lower number of CUEs.The reason is that increasing P d , on the one hand, boosts the D2D user rates, and on the other hand, causes more interference to CUEs which deteriorates their rates.Consequently, at low D2D user densities and high number of CUEs and BS antennas where the cellular sum rate is the main contributer to the total ASR, the interference caused by higher D2D transmit power is the dominant factor leading to lower total ASR.However, as λ d increases, the contribution of the D2D sum rate to the total ASR increases, and thus with higher P d , the increase in the D2D sum rates compensates the decrease in CUEs sum rate and the difference in terms of the total ASR between the different power levels vanishes.When the number of CUEs is small, i.e., U c = 1 user and T c = 5 antennas, the CUE and D2D users have almost the same contributions to the ASR and increasing P d has negligible impact on the performance.users with careful interference coordination, because the massive MIMO gains vanish when the interference from the D2D tier becomes too large.The number of cellular users should be a function of the number of base station antennas in order to benefit from high number of base station antennas in terms of the average sum rate and energy efficiency.If there is a high density of D2D users, the D2D communication should use the overlay approach rather than the underlay or the network should only allow a subset of the D2D transmissions to be active at a time.

APPENDIX A PROOF OF PROPOSITION 1
The proof follows by substituting the definition of SINR d from ( 2) into (11) where we obtain Step (a) comes from the fact that |g 0,0 | 2 ∼ exp(1) and (b) follows since the noise and interference terms are mutually independent.In step (c), the Laplace transform defined as L x (s) = E x e −sx is identified.
The first Laplace transform in ( 25) is with respect to I BS,0 in (3) which is a function of two random variables, namely f H 0,BS V 2 and R 0,BS .This Laplace transform is calculated as for α c > 2, where (a) follows by introducing the notation and from the Laplace transform of the probability density function (PDF) of f H 0,BS V 2 which, by neglecting the spatial correlation, is tightly approximated by a Chi-squared distribution as 2 f H 0,BS V 2 ∼ χ 2 2Uc [30].Note that where f H 0,BS v i , i = {0, . . ., U c − 1}, are zero-mean circular symmetric complex Gaussian random variables with unit variance.Therefore, Uc−1 i=0 |f H 0,BS v i | 2 is the summation of U c i.i.d.exponential random variables which has an Erlang(U c , 1) distribution.Equivalently, the sum scaled down by σ 2 2 (i.e., multiplied by 2 σ 2 ) has a (standard) Chi-squared distribution with 2U c degrees of freedom.Hence, the PDF of f H 0,BS V 2 is From Laplace transform theory we know that L t n e −αt = n! (s+α) n+1 and with some simplifications, we obtain the result in step (a).Step (b) in (26) follows from the PDF of R 0,BS which is f R 0,BS (r) =    2r R 2 , if 0 ≤ r ≤ R, 0, otherwise, (30) as the typical D2D Rx is uniformly distributed over the cell area and the BS is located in the cell center.Step (c) in ( 26) is obtained by the change of variable for a, b > 0.
Next, we proceed to calculate the second Laplace transform in (25).This transform is with respect to I d,0 in (4) which is a function of two random variables, that is |g 0,j | 2 and R 0,j .
Therefore, we have where (a) is based on the probability generating functional (PGFL) [31], and (b) follows from the fact that G ∼ exp(1) and L e −t = 1 s+1 .

•Fig. 1 .
Fig.1.System model where a multi-antenna BS communicates in the downlink with multiple CUEs, while multiple user pairs communicate in D2D mode.The CUEs are distributed uniformly in the coverage area and the D2D users are distributed according to a PPP.The D2D users that are outside the coverage area are only considered as interferers.

Remark 7 :Corollary 2 :
The coverage probability of a typical CUE P c cov (β c ) is a decreasing function of the D2D user density λ d .From Proposition 2, only Υ(λ d , s, i) is a function of λ d which is composed of an exponential term in λ d multiplied by a polynomial term in λ d .Thus, if λ d → ∞, the exponential term which has a negative growth dominates the polynomial term and P c cov (β c ) → 0. We proceed to analyze the behavior of Proposition 2 by considering a number of special cases.If T c = U c , the coverage probability for a typical cellular user is given by

Corollary 3 :
If (T c − U c ) → ∞, the coverage probability for a typical cellular user tends to one, that is, lim (Tc−Uc)→∞ P c cov (β c ) = 1.

Fig. 3
Fig.3shows the ASR as a function of the density of D2D users λ d and the number of CUEs U c , which is scaled by T c .It is observed that increasing U c , or equivalently T c , always increases the ASR.In contrast, there is an optimal value of λ d as derived in Corollary 1 which results in the maximum ASR for all values of U c and appears approximately at λ d = 10 −4 .However, there is a difference in the shape of the ASR between the lower and higher values of U c .In order to clarify this effect, we plot the ASR versus λ d in a 2-D plot with U c ∈ {1, 14} equivalent toT c ∈ {5, 70} in Fig. 4a.As seen in Fig.4a, for U c = 1 user and T c = 5 antennas, the rate contributed from the CUEs to the sum rate is low as there is only one CUE.This rate is in a comparable level as the contribution of D2D users sum rate to the total ASR.Adding D2D users to the network (i.e., increasing λ d ), which may cause interference, will nevertheless leads to an increase in the ASR.This increase in the ASR continues until reaching a certain density that gives the maximum ASR.By further increasing λ d , the interference between D2D users reduces their coverage probability as previously observed in Remark 2. This limits the per link data rate and even a high number of D2D users cannot compensate for the D2D rate loss.At the same time, increasing λ d tremendously affects the CUEs sum rate (cf.Remark 7).Consequently, as λ d increases, the ASR decreases.By increasing the number of CUEs and BS antennas to U c = 14 users and T c = 70 antennas,

Fig. 3 .Fig. 4 .
Fig. 3. ASR [Mbit/s] as a function of the number of CUEs Uc and the D2D user density λ d for a fixed ratio Tc Uc = 5.

Fig. 6 .
Fig. 6.EE [Mbit/Joule] as a function of the number of CUEs Uc and the D2D user density λ d for a fixed ratio Tc Uc = 5.

Fig. 7 .
Fig. 7. EE [Mbit/Joule]: (a) as a function of the D2D user density λ d for a fixed ratio Tc Uc = 5 with the number of CUEs Uc ∈ {1, 14}; (b) as a function of the number of CUEs Uc with the D2D user density λ d ∈ {10 −6 , 10 −4 } for a fixed ratio Tc Uc = 5. From

Fig. 9b depicts
Fig. 9b depicts the EE as a function of λ d under the same two levels of D2D transmit power.It

Fig. 10 .
Fig. 10.Cellular ASR vs. D2D ASR [Mbit/s] for different distances between D2D Tx and D2D Rx with Uc = 4 users and Tc = 70 antennas.The curves are obtained by varying the value of λ d from 10 −6 to 10 −2 .

1
κβ d r −αc +1 → t which leads to the integral boundary y 1 κβ d R −αc +1 .Finally, (d) follows by integration by part where B(x; a, b) is the incomplete Beta function defined as B(x; a, b) = x 0 t a−1 (1 − t) b−1 dt,

SubstitutingE(− 1 )) s 2 /
SINR c from(7) into(11), we getP c cov (β c ) = Pr |h H 0 v 0 | 2 ≥ I d,c I i d,c e −sI d,c (d) = E D 0,BS e i d i ds i L I d,c (s) ,(33)where (a) follows from the CCDF of|h H 0 v 0 | 2 with 2|h H 0 v 0 | 2 ∼ χ 2 2(Tc−Uc+1) given D 0,BS and I d,c .In (b), we use Binomial expansion as I d,c + follows by taking the expectation with respect to the interference I d,c .Step (d) follows fromE I d,c I i d,c e −sI d,c = (−1) i d i ds i L I d,c (s),(35)where L I d,c (s) is obtained using similar steps as in the derivation of L I d,0 in (32):L I d,c(s) = exp − πλ d P in (33) and using the Faà di Bruno's formula for the i-th derivative of a composite function f (g(s)) with f (s) = e s and g(s) = − α d , Proposition 2 follows.