An MMSE Approach to the Secrecy Capacity of the MIMO Gaussian Wiretap Channel
EURASIP Journal on Wireless Communications and Networking volume 2009, Article number: 370970 (2009)
This paper provides a closed-form expression for the secrecy capacity of the multiple-input multiple-output (MIMO) Gaussian wiretap channel, under a power-covariance constraint. Furthermore, the paper specifies the input covariance matrix required in order to attain the capacity. The proof uses the fundamental relationship between information theory and estimation theory in the Gaussian channel, relating the derivative of the mutual information to the minimum mean-square error (MMSE). The proof provides the missing intuition regarding the existence and construction of an enhanced degraded channel that does not increase the secrecy capacity. The concept of enhancement has been used in a previous proof of the problem. Furthermore, the proof presents methods that can be used in proving other MIMO problems, using this fundamental relationship.
The information-theoretic characterization of secrecy in communication systems has attracted considerable attention in recent years. (See [1] for an exposition of progress in this area.) In this paper, we consider the general multiple-input multiple-output (MIMO) wiretap channel, presented in [2], with $t$ transmit antennas and $r$ and $e$ receive antennas at the legitimate recipient and the eavesdropper, respectively:
$$Y_r[m] = H_r X[m] + W_r[m], \qquad Y_e[m] = H_e X[m] + W_e[m], \tag{1}$$
where $H_r$ and $H_e$ are assumed to be fixed during the entire transmission and are known to all three terminals. The additive noise terms $W_r[m]$ and $W_e[m]$ are zero-mean Gaussian vector processes independent across the time index $m$. The channel input satisfies a total power constraint:
$$\frac{1}{n}\sum_{m=1}^{n} \big\| X[m] \big\|^2 \le P. \tag{2}$$
The secrecy capacity of a wiretap channel, defined by Wyner [3] as the "perfect secrecy" capacity, is the maximal rate at which the information can be decoded arbitrarily reliably by the legitimate recipient, while ensuring that it cannot be deduced at any positive rate by the eavesdropper.
For a discrete memoryless wiretap channel with transition probability $P(Y_r, Y_e \mid X)$, a single-letter expression for the secrecy capacity was obtained by Csiszár and Körner [4]:
$$C_s = \max_{P(U,X)} \big[ I(U; Y_r) - I(U; Y_e) \big], \tag{3}$$
where $U$ is an auxiliary random variable over a certain alphabet that satisfies the Markov relationship $U \to X \to (Y_r, Y_e)$. This result extends to continuous alphabet cases with power constraint (2). Thus, in order to evaluate the secrecy capacity of the MIMO Gaussian wiretap channel we need to evaluate (3) under the power constraint (2). For the degraded case, Wyner's single-letter expression of the secrecy capacity results from setting $U \equiv X$:
$$C_s = \max_{P(X)} \big[ I(X; Y_r) - I(X; Y_e) \big]. \tag{4}$$
The problem of characterizing the secrecy capacity of the MIMO Gaussian wiretap channel remained open until the work of Khisti and Wornell [5] and Oggier and Hassibi [6]. In their respective work, Khisti and Wornell [5] and Oggier and Hassibi [6] followed an indirect approach using a Sato-like argument and matrix analysis tools. In [2], Liu and Shamai propose a more information-theoretic approach using the enhancement concept, originally presented by Weingarten et al. [7], as a tool for the characterization of the MIMO Gaussian broadcast channel capacity. Liu and Shamai have shown that an enhanced degraded version attains the same secrecy capacity as does the Gaussian input distribution. From the mathematical solution in [2] it is evident that such an enhanced channel exists; however, it is not intuitive why, or how to construct such a channel.
A fundamental relationship between estimation theory and information theory for Gaussian channels was presented in [8]; in particular, it was shown that for the MIMO standard Gaussian channel,
$$Y = \sqrt{\mathsf{snr}}\, H X + N, \tag{5}$$
and regardless of the input distribution, the mutual information and the minimum mean-square error (MMSE) are related (assuming real-valued inputs/outputs) by
$$\frac{d}{d\,\mathsf{snr}}\, I\big(X; \sqrt{\mathsf{snr}}\, H X + N\big) = \frac{1}{2}\, \mathsf{E}\Big\{ \big\| H X - H\, \mathsf{E}\{X \mid \sqrt{\mathsf{snr}}\, H X + N\} \big\|^2 \Big\}, \tag{6}$$
where $\mathsf{E}\{X \mid Y\}$ stands for the conditional mean of $X$ given $Y$. This fundamental relationship and its generalizations [8, 9], referred to as the I-MMSE relations, have already been shown to be useful in several aspects of information theory: providing insightful proofs for entropy power inequalities [10], revealing the mercury/waterfilling optimal power allocation over a set of parallel Gaussian channels [11], tackling the weighted sum-MSE maximization in MIMO broadcast channels [12], illuminating extrinsic information of good codes [13], and enabling a simple proof of the monotonicity of the non-Gaussianness of independent random variables [14]. Furthermore, in [15] it has been shown that using this relationship one can provide insightful and simple proofs for multiuser single-antenna problems such as the broadcast channel and the secrecy capacity problem. Similar techniques were later used in [16] to provide the capacity region for the Gaussian multireceiver wiretap channel.
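As a sanity check of the scalar case of this relation, the sketch below assumes a unit-variance Gaussian input over a scalar channel with unit-variance noise, for which both sides are known in closed form: $I = \frac{1}{2}\ln(1+\mathsf{snr})$ nats and $\mathrm{mmse} = 1/(1+\mathsf{snr})$. The function names are illustrative and do not come from the paper; the check compares a finite-difference derivative of the mutual information with half the MMSE.

```python
import math


def mutual_information(snr: float) -> float:
    """I(X; sqrt(snr) X + N) in nats, assuming X ~ N(0, 1) and N ~ N(0, 1)."""
    return 0.5 * math.log(1.0 + snr)


def mmse(snr: float) -> float:
    """MMSE of estimating X ~ N(0, 1) from sqrt(snr) X + N, N ~ N(0, 1)."""
    return 1.0 / (1.0 + snr)


def derivative_of_mi(snr: float, h: float = 1e-6) -> float:
    """Central finite-difference estimate of dI/dsnr."""
    return (mutual_information(snr + h) - mutual_information(snr - h)) / (2.0 * h)


# Compare both sides of the real-valued I-MMSE relation at a few SNR values.
for snr in (0.1, 1.0, 10.0):
    print(snr, derivative_of_mi(snr), 0.5 * mmse(snr))
```

The two printed columns should agree up to finite-difference error; for non-Gaussian inputs the relation still holds, but neither side has such a simple closed form.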
Motivated by these successes, this paper provides an alternative proof for the secrecy capacity of the MIMO Gaussian wiretap channel using the fundamental relationship presented in [8, 9], which results in a closed-form expression for the secrecy capacity, that is, an expression that does not include optimization over the input covariance matrix, a difficult problem on its own due to the nonconvexity of the expression. Thus, another important contribution of this paper is the explicit characterization of the optimal input covariance matrix that attains the secrecy capacity. The proof presented here provides the intuition regarding the existence and construction of the enhanced degraded channel which is central in the approach of [2]. Furthermore, the methods presented here could be used to tackle other MIMO problems, using the fundamental relationships shown in [8, 9].
2. Definitions and Preliminaries
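Consider a canonical version of the MIMO Gaussian wiretap channel, as presented in [2]:
$$Y_r[m] = X[m] + W_r[m], \qquad Y_e[m] = X[m] + W_e[m], \tag{7}$$
where $X[m]$ is a real input vector of length $t$, and $W_r[m]$ and $W_e[m]$ are additive Gaussian noise vectors with zero means and covariance matrices $\Sigma_r$ and $\Sigma_e$, respectively, and are independent across the time index $m$. The noise covariance matrices $\Sigma_r$ and $\Sigma_e$ are assumed to be positive definite. The channel input satisfies a power-covariance constraint:
$$\frac{1}{n}\sum_{m=1}^{n} X[m]\, X[m]^T \preceq S, \tag{8}$$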
where $S$ is a positive semidefinite matrix of size $t \times t$, and "$\preceq$" denotes "less than or equal to" in the positive semidefinite partial ordering between real symmetric matrices. Note that (8) is a rather general constraint that subsumes constraints that can be described by a compact set of input covariance matrices. For example, assuming $C_s(S)$ is the secrecy capacity under a covariance constraint (8), we have the following:
$$C_s(P) = \max_{S \succeq 0,\ \operatorname{tr}(S) \le P} C_s(S), \qquad C_s(P_1, \ldots, P_t) = \max_{S \succeq 0,\ S_{ii} \le P_i,\ i = 1, \ldots, t} C_s(S), \tag{9}$$
where $C_s(P)$ is the secrecy capacity under a total power constraint (2), and $C_s(P_1, \ldots, P_t)$ is the secrecy capacity under a per-antenna power constraint. As shown in [2, 7], characterizing the secrecy capacity of the general MIMO Gaussian wiretap channel (1) can be reduced to characterizing the secrecy capacity of the canonical version (7). For full details the reader is referred to [17, Theorem 3].
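We first give a few central definitions and relationships that will be used in the sequel. We begin with the following definition:
$$\mathbf{E} \triangleq \mathsf{E}\Big\{ \big(X - \mathsf{E}\{X \mid Y\}\big)\big(X - \mathsf{E}\{X \mid Y\}\big)^T \Big\}, \tag{10}$$
that is, $\mathbf{E}$ is the covariance matrix of the estimation error vector, known as the MMSE matrix. For the specific case in which the input to the channel is Gaussian with covariance matrix $K_x$, we define
$$\mathbf{E}_G \triangleq K_x - K_x (K_x + \Sigma)^{-1} K_x, \tag{11}$$
where $\Sigma$ is the covariance matrix of the additive Gaussian noise, $N$. That is, $\mathbf{E}_G$ is the error covariance matrix of the joint Gaussian estimator.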
The fundamental relationship between information theory and estimation theory in the Gaussian channel gave rise to a variety of other relationships [8, 9]. In our proof, we will use the following relationship, given by Palomar and Verdú in [9]:
where $\Sigma$ is the covariance matrix of the additive Gaussian noise, $N$.
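For a Gaussian input these quantities are available in closed form, which allows a numerical check of the gradient relationship. The sketch below assumes real-valued signals and natural logarithms, in which case the gradient of $I(X; X+N)$ with respect to the noise covariance $\Sigma$ is $-\frac{1}{2}\Sigma^{-1}\mathbf{E}\,\Sigma^{-1}$; the factor $\frac{1}{2}$ reflects the real-valued assumption and may be absorbed differently under other conventions. All function names and test matrices are hypothetical.

```python
import numpy as np


def gaussian_mi(K, Sigma):
    """I(X; X + N) in nats for X ~ N(0, K) and N ~ N(0, Sigma)."""
    n = K.shape[0]
    return 0.5 * np.linalg.slogdet(np.eye(n) + K @ np.linalg.inv(Sigma))[1]


def gaussian_mmse_matrix(K, Sigma):
    """Error covariance of the (linear) MMSE estimator for a Gaussian input."""
    return K - K @ np.linalg.inv(K + Sigma) @ K


def directional_derivative(K, Sigma, D, h=1e-6):
    """Finite-difference derivative of the mutual information along direction D."""
    return (gaussian_mi(K, Sigma + h * D) - gaussian_mi(K, Sigma - h * D)) / (2.0 * h)


K = np.array([[2.0, 0.3], [0.3, 1.0]])        # input covariance (assumed example)
Sigma = np.array([[1.0, 0.2], [0.2, 0.5]])    # noise covariance (assumed example)
D = np.array([[0.4, -0.1], [-0.1, 0.7]])      # symmetric perturbation direction

E_G = gaussian_mmse_matrix(K, Sigma)
Sigma_inv = np.linalg.inv(Sigma)
gradient = -0.5 * Sigma_inv @ E_G @ Sigma_inv  # assumed real-valued form of the gradient

numeric = directional_derivative(K, Sigma, D)
analytic = np.trace(gradient @ D)              # inner product <gradient, D>
print(numeric, analytic)
```

The gradient is negative semidefinite, which matches the intuition that adding noise can only reduce the mutual information.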
Our first observation regarding the relationship given in (12) is detailed in the following lemma.
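Lemma 1. For any two symmetric positive semidefinite matrices $\Sigma_1$ and $\Sigma_2$, such that $0 \preceq \Sigma_1 \preceq \Sigma_2$, and positive semidefinite matrix $A(\Sigma)$, the integral $\int_{\Sigma_1}^{\Sigma_2} A(\Sigma) \cdot d\Sigma$ is nonnegative (where the integral is taken along any path from $\Sigma_1$ to $\Sigma_2$).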
The proof of the lemma is given in Appendix A.
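One way to see why the lemma is plausible is to evaluate the integral along the straight-line path. This is only an illustrative special case, not the general argument of Appendix A; it assumes that the endpoints are denoted $\Sigma_1 \preceq \Sigma_2$, that $A(\Sigma)$ is the positive semidefinite integrand, and that the inner product is the trace inner product.

```latex
\begin{align*}
\Sigma(\tau) &= \Sigma_1 + \tau\,\Delta, \qquad
  \Delta = \Sigma_2 - \Sigma_1 \succeq 0, \qquad \tau \in [0, 1], \\
\int_{\Sigma_1}^{\Sigma_2} A(\Sigma) \cdot d\Sigma
  &= \int_0^1 \operatorname{tr}\!\big( A(\Sigma(\tau))\, \Delta \big)\, d\tau \\
  &= \int_0^1 \operatorname{tr}\!\big( \Delta^{1/2} A(\Sigma(\tau))\, \Delta^{1/2} \big)\, d\tau
  \;\ge\; 0 .
\end{align*}
```

The last step uses the fact that $\Delta^{1/2} A\, \Delta^{1/2}$ is positive semidefinite whenever $A$ and $\Delta$ are, so its trace is nonnegative for every $\tau$.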
3. The Degraded MIMO Gaussian Wiretap Channel
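We first consider the degraded MIMO Gaussian wiretap channel, that is, $\Sigma_r \preceq \Sigma_e$.
Theorem 1. The secrecy capacity of the degraded MIMO Gaussian wiretap channel (7), $\Sigma_r \preceq \Sigma_e$, under the power-covariance constraint (8) is
$$C_s = \frac{1}{2}\log\det\big(I + S\,\Sigma_r^{-1}\big) - \frac{1}{2}\log\det\big(I + S\,\Sigma_e^{-1}\big). \tag{13}$$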
Using (12) the difference to be maximized, according to Wyner's single-letter expression (4), can be written as
This is due to the independence of the line integral (A.3) of the path in any open connected set in which the gradient is continuous [18].
The error covariance matrix of any optimal estimator is upper bounded (in the positive semidefinite partial ordering between real symmetric matrices) by the error covariance matrix of the joint Gaussian estimator, $\mathbf{E}_G$, defined in (11), for the same input covariance. Formally, $\mathbf{E} \preceq \mathbf{E}_G$, and thus one can express $\mathbf{E}$ as follows: $\mathbf{E} = \mathbf{E}_G - \mathbf{E}_0$, where $\mathbf{E}_0$ is some positive semidefinite matrix.
Due to this representation of we can express the mutual information difference, given in (14), in the following manner:
where the last inequality is due to Lemma 1 and the fact that $\Sigma_r \preceq \Sigma_e$. Equality in (15) is attained when $X$ is Gaussian. Thus, we obtain the following expression:
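The degraded-case result can be illustrated numerically. The sketch below assumes the Gaussian-input rate $\frac{1}{2}\log\det(I + K\Sigma_r^{-1}) - \frac{1}{2}\log\det(I + K\Sigma_e^{-1})$ with $\Sigma_r \preceq \Sigma_e$, and checks on randomly drawn covariances $0 \preceq K \preceq S$ that none of them beats the choice $K = S$. The matrices and helper names are made up for the example, and a random search is of course not a proof.

```python
import numpy as np


def half_logdet(M):
    """0.5 * log det(M) in nats."""
    return 0.5 * np.linalg.slogdet(M)[1]


def gaussian_secrecy_rate(K, Sigma_r, Sigma_e):
    """Gaussian-input rate difference between legitimate receiver and eavesdropper."""
    I = np.eye(K.shape[0])
    return (half_logdet(I + K @ np.linalg.inv(Sigma_r))
            - half_logdet(I + K @ np.linalg.inv(Sigma_e)))


def random_covariance_below(S, rng):
    """Draw K = S^{1/2} Q S^{1/2} with 0 <= Q <= I, so that 0 <= K <= S."""
    w, V = np.linalg.eigh(S)
    S_half = (V * np.sqrt(w)) @ V.T
    U, _ = np.linalg.qr(rng.standard_normal(S.shape))
    Q = (U * rng.uniform(0.0, 1.0, size=S.shape[0])) @ U.T
    return S_half @ Q @ S_half


S = np.array([[2.0, 0.5], [0.5, 1.0]])                   # covariance constraint
Sigma_r = np.array([[1.0, 0.2], [0.2, 1.0]])             # legitimate receiver noise
Sigma_e = Sigma_r + np.array([[0.5, 0.1], [0.1, 0.3]])   # PSD increment: degraded case

rate_at_S = gaussian_secrecy_rate(S, Sigma_r, Sigma_e)
rng = np.random.default_rng(0)
rates = [gaussian_secrecy_rate(random_covariance_below(S, rng), Sigma_r, Sigma_e)
         for _ in range(200)]
print(rate_at_S, max(rates))
```

In the degraded case the rate is monotone in $K$ with respect to the positive semidefinite ordering, which is why using the full covariance $S$ is optimal here; this monotonicity is lost in the general case treated next.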
4. The General MIMO Gaussian Wiretap Channel
In considering the general case, we first note that one can apply the generalized eigenvalue decomposition [19] to the following two symmetric positive definite matrices:
That is, there exists an invertible general eigenvector matrix, , such that
where is a positive definite diagonal matrix. Without loss of generality, we assume that there are () elements of larger than 1:
Hence, we can write as
where , and . Since the matrix is positive definite, the problem of calculating the generalized eigenvalues and the matrix is reduced to a standard eigenvalue problem . Choosing the eigenvectors of the standard eigenvalue problem to be orthonormal, and the requirement on the order of the eigenvalues, leads to an invertible matrix , which is -orthonormal. Using these definitions we turn to the main theorem of this paper.
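The reduction to a standard eigenvalue problem described above can be sketched as follows. The code assumes that the two symmetric positive definite matrices are $A = I + S^{1/2}\Sigma_r^{-1}S^{1/2}$ and $B = I + S^{1/2}\Sigma_e^{-1}S^{1/2}$, and that the eigenvector matrix $C$ is normalized so that $C^T B C = I$ and $C^T A C$ is diagonal; these forms, the helper names, and the example matrices are assumptions made for illustration.

```python
import numpy as np


def sqrtm_psd(M):
    """Symmetric square root of a symmetric positive semidefinite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T


def generalized_eig(A, B):
    """Return (lam, C) with C.T @ A @ C = diag(lam) and C.T @ B @ C = I.

    B is factored as L L^T, which turns the generalized problem into the
    standard symmetric problem for L^{-1} A L^{-T}. Eigenvalues are sorted
    in decreasing order, so those larger than 1 come first.
    """
    L_inv = np.linalg.inv(np.linalg.cholesky(B))
    lam, U = np.linalg.eigh(L_inv @ A @ L_inv.T)   # orthonormal eigenvectors
    order = np.argsort(lam)[::-1]
    lam, U = lam[order], U[:, order]
    return lam, L_inv.T @ U


S = np.array([[2.0, 0.5], [0.5, 1.0]])   # covariance constraint (assumed example)
Sigma_r = np.diag([0.5, 2.0])            # legitimate receiver noise
Sigma_e = np.diag([1.0, 1.0])            # eavesdropper noise

S_half = sqrtm_psd(S)
A = np.eye(2) + S_half @ np.linalg.inv(Sigma_r) @ S_half
B = np.eye(2) + S_half @ np.linalg.inv(Sigma_e) @ S_half

lam, C = generalized_eig(A, B)
b = int(np.sum(lam > 1.0))               # number of generalized eigenvalues above 1
print(lam, b)
```

In this example neither receiver dominates the other, so one generalized eigenvalue lies above 1 and one below, giving `b = 1`.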
Theorem 2. The secrecy capacity of the MIMO Gaussian wiretap channel (7), under the power-covariance constraint (8), is
where, using the invertible matrix defined in (18) one defines,
and letting where is the submatrix and is the submatrix, one defines,
Following [7, Lemma 2], we may assume that $S$ is (strictly) positive definite. We divide the proof into two parts: the converse part, that is, constructing an upper bound, and the achievability part, that is, showing that the upper bound is attainable.
Our goal is to evaluate the secrecy capacity expression (3). Due to the Markov relationship, , the difference to be maximized can be written as
We use the I-MMSE relationship (12) on each of the two differences in (24):
where , and
where . Thus, putting the two together, (24) becomes
We define, , and obtain
That is, is the error covariance of the optimal estimation of from , and as such it is positive semidefinite. It is easily verified that , defined in (22), satisfies both , and . The integral in (27) can be upper bounded using this fact and Lemma 1:
Equality will be attained when the second integral equals zero. Using the upper bound in (29) we present two possible proofs that result in the upper bound given in (30). The more information-theoretic proof is given in the sequel, while the second, more estimation-theoretic proof is relegated to Appendix B.
The upper bound given in (29) can be viewed as the secrecy capacity of a MIMO Gaussian model, similar to the model given in (7), but with noise covariance matrices and and outputs and respectively. Furthermore, this is a degraded model, and it is well known that the general solution given by Csiszár and Körner [4] reduces to the solution given by Wyner [3] by setting $U \equiv X$. Thus, (29) becomes
where the third inequality is according to (15), and the last two transitions are due to Theorem 1, (16). This completes the converse part of the proof.
We now show that the upper bound given in (30) is attainable when is Gaussian with covariance matrix , as defined in (23). The proof is constructed from the next three lemmas. We first prove that is a legitimate covariance matrix, that is, it complies with the input covariance constraint (8).
Lemma 2. The matrix defined in (23) complies with the power-covariance constraint (8), that is,
The proof of Lemma 2 is given in Appendix C. In the next two Lemmas we show that attains the upper bound given in (30).
Lemma 3. The following equality holds:
Proof of Lemma 3.
We first calculate the expression in the left hand side (assuming ), which is the upper bound in (30):
where we have used the generalized eigenvalue decomposition (18) and the definition of (22). From (18) we note that,
Using (34) we can derive the following relationship (full details are given in Appendix D):
And similarly we can derive
Thus, we have
which is the result attained in (33). This concludes the proof of Lemma 3.
Lemma 4. The following equality holds:
Proof of Lemma 4.
Due to the generalized eigenvalue decomposition (18) we have,
Using similar steps as the ones used to obtain (35) we can show that,
Thus, concluding the proof of Lemma 4.
Putting all the above together we have that
where the first equality is due to Lemma 3, and the second equality is due to Lemma 4. Thus, the upper bound given in (30) is attainable using the Gaussian distribution over , and , defined in (23). This concludes the proof of Theorem 2.
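The closed-form result and its attainability can be checked numerically. The sketch below is built on assumptions rather than on the exact expressions of the theorem: it takes the capacity to be $\frac{1}{2}\sum \log \lambda_i$ over the generalized eigenvalues $\lambda_i > 1$ of the pair $\big(I + S^{1/2}\Sigma_r^{-1}S^{1/2},\ I + S^{1/2}\Sigma_e^{-1}S^{1/2}\big)$, and the candidate input covariance to be $S^{1/2} P S^{1/2}$, where $P$ is the orthogonal projection onto the span of the eigenvector columns $C_1$ associated with those eigenvalues. Helper names and matrices are hypothetical.

```python
import numpy as np


def sqrtm_psd(M):
    """Symmetric square root of a symmetric positive semidefinite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T


def half_logdet(M):
    """0.5 * log det(M) in nats."""
    return 0.5 * np.linalg.slogdet(M)[1]


def gaussian_secrecy_rate(K, Sigma_r, Sigma_e):
    """Rate difference achieved by a Gaussian input with covariance K."""
    I = np.eye(K.shape[0])
    return (half_logdet(I + K @ np.linalg.inv(Sigma_r))
            - half_logdet(I + K @ np.linalg.inv(Sigma_e)))


def closed_form_and_covariance(S, Sigma_r, Sigma_e):
    """Return the assumed closed-form capacity and the candidate covariance."""
    n = S.shape[0]
    S_half = sqrtm_psd(S)
    A = np.eye(n) + S_half @ np.linalg.inv(Sigma_r) @ S_half
    B = np.eye(n) + S_half @ np.linalg.inv(Sigma_e) @ S_half
    L_inv = np.linalg.inv(np.linalg.cholesky(B))
    lam, U = np.linalg.eigh(L_inv @ A @ L_inv.T)
    C = L_inv.T @ U                        # C.T @ B @ C = I, C.T @ A @ C = diag(lam)
    keep = lam > 1.0                       # subchannels favoring the legitimate receiver
    capacity = 0.5 * float(np.sum(np.log(lam[keep])))
    C1 = C[:, keep]
    if C1.shape[1] == 0:
        P = np.zeros((n, n))
    else:
        P = C1 @ np.linalg.solve(C1.T @ C1, C1.T)   # projection onto span(C1)
    return capacity, S_half @ P @ S_half


S = np.array([[2.0, 0.5], [0.5, 1.0]])
Sigma_r = np.diag([0.5, 2.0])
Sigma_e = np.diag([1.0, 1.0])

capacity, K_opt = closed_form_and_covariance(S, Sigma_r, Sigma_e)
achieved = gaussian_secrecy_rate(K_opt, Sigma_r, Sigma_e)
rate_full_S = gaussian_secrecy_rate(S, Sigma_r, Sigma_e)
print(capacity, achieved, rate_full_S)
```

In this example the candidate covariance reaches the closed-form value while transmitting with the full covariance $S$ does not, which illustrates why directions in which the eavesdropper has the advantage are left unused.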
5. Discussion and Remarks
The alternative proof we have presented here uses the enhancement concept, also used in the proof of Liu and Shamai [2], in a more concrete manner. We have constructed a specific enhanced degraded model. The constructed model is the "tightest" enhancement possible in the sense that under the specified transformation, the matrix is the "smallest" possible positive definite matrix, that is, both and .
The specific enhancement results in a closed-form expression for the secrecy capacity, using . Furthermore, Theorem 2 shows that instead of we can maximize the secrecy capacity by taking an input covariance matrix that "disregards" subchannels for which the eavesdropper has an advantage over the legitimate recipient (or is equivalent to the legitimate recipient). Mathematically, this allows us to switch back from to , and thus to show that , explicitly defined, is the optimal input covariance matrix. Intuitively, is the optimal input covariance for the legitimate receiver, since under the transformation, , it is for the sub-channels for which the legitimate receiver has an advantage and zero otherwise.
The enhancement concept was used in addition to the I-MMSE approach in order to attain the upper bound in (30). The primary usage of these two concepts came together in (29), where we derived an initial upper bound. We have shown that the upper bound is attainable when is Gaussian with covariance matrix . Thus, under these conditions the second integral in (29) should be zero, that is,
where the second transition is due to the choice , the third is due to the choice of a Gaussian distribution for with covariance matrix , and the last equality is due to Lemma 4.
References
[1] Liang Y, Poor HV, Shamai (Shitz) S: Information theoretic security. Foundations and Trends in Communications and Information Theory 2008, 5(4-5):355-580.
[2] Liu T, Shamai (Shitz) S: A note on secrecy capacity of the multi-antenna wiretap channel. IEEE Transactions on Information Theory 2009, 55(6):2547-2553.
[3] Wyner AD: The wire-tap channel. Bell System Technical Journal 1975, 54(8):1355-1387.
[4] Csiszár I, Körner J: Broadcast channels with confidential messages. IEEE Transactions on Information Theory 1978, 24(3):339-348. 10.1109/TIT.1978.1055892
[5] Khisti A, Wornell G: The MIMOME channel. Proceedings of the 45th Annual Allerton Conference on Communication, Control and Computing, September 2007, Monticello, Ill, USA
[6] Oggier F, Hassibi B: The secrecy capacity of the MIMO wiretap channel. Proceedings of IEEE International Symposium on Information Theory (ISIT '08), July 2008, Toronto, Canada 524-528.
[7] Weingarten H, Steinberg Y, Shamai (Shitz) S: The capacity region of the Gaussian multiple-input multiple-output broadcast channel. IEEE Transactions on Information Theory 2006, 52(9):3936-3964.
[8] Guo D, Shamai (Shitz) S, Verdú S: Mutual information and minimum mean-square error in Gaussian channels. IEEE Transactions on Information Theory 2005, 51(4):1261-1282. 10.1109/TIT.2005.844072
[9] Palomar DP, Verdú S: Gradient of mutual information in linear vector Gaussian channels. IEEE Transactions on Information Theory 2006, 52(1):141-154.
[10] Guo D, Shamai (Shitz) S, Verdú S: Proof of entropy power inequalities via MMSE. Proceedings of IEEE International Symposium on Information Theory (ISIT '06), July 2006, Seattle, Wash, USA 1011-1015.
[11] Lozano A, Tulino AM, Verdú S: Optimum power allocation for parallel Gaussian channels with arbitrary input distributions. IEEE Transactions on Information Theory 2006, 52(7):3033-3051.
[12] Christensen S, Agarwal R, Carvalho E, Cioffi J: Weighted sum-rate maximization using weighted MMSE for MIMO-BC beamforming design. IEEE Transactions on Wireless Communications 2008, 7(12):4792-4799.
[13] Peleg M, Sanderovich A, Shamai (Shitz) S: On extrinsic information of good binary codes operating over Gaussian channels. European Transactions on Telecommunications 2007, 18(2):133-139. 10.1002/ett.1130
[14] Tulino AM, Verdú S: Monotonic decrease of the non-Gaussianness of the sum of independent random variables: a simple proof. IEEE Transactions on Information Theory 2006, 52(9):4295-4297.
[15] Guo D, Shamai (Shitz) S, Verdú S: Estimation in Gaussian noise: properties of the minimum mean-square error. Proceedings of IEEE International Symposium on Information Theory (ISIT '08), July 2008, Toronto, Canada
[16] Ekrem E, Ulukus S: Secrecy capacity region of the Gaussian multi-receiver wiretap channel. Proceedings of IEEE International Symposium on Information Theory (ISIT '09), June-July 2009, Seoul, Korea
[17] Liu R, Liu T, Poor HV, Shamai (Shitz) S: Multiple-input multiple-output Gaussian broadcast channels with confidential messages. Submitted to IEEE Transactions on Information Theory and in Proceedings of IEEE International Symposium on Information Theory (ISIT '09), June-July 2009, Seoul, Korea
[18] Apostol TM: Calculus, Multi-Variable Calculus and Linear Algebra, with Applications to Differential Equations and Probability. 2nd edition. Wiley, New York, NY, USA; 1969.
[19] Strang G: Linear Algebra and Its Applications. Wellesley-Cambridge Press, Wellesley, Mass, USA; 1998.
[20] Horn RA, Johnson CR: Matrix Analysis. Cambridge University Press, Cambridge, UK; 1985.
This work has been supported by the Binational Science Foundation (BSF), the FP7 Network of Excellence in Wireless Communications NEWCOM++, and the U.S. National Science Foundation under Grants CNS-06-25637 and CCF-07-28208.
A. Proof of Lemma 1
The inner product between matrices and is defined as
and the Schur product between matrices and is defined as
For a function with gradient the line integral (type II)  is given by
Thus in our case, where are matrices, and the integral over a path from to is equivalent to the following line integral:
Since the Schur product preserves the positive definite/semidefinite quality [20, 7.5.3], it is easy to see that when , both are symmetric, and since is a positive semidefinite matrix for all , the integral is always nonnegative.
B. Second Proof of Theorem 2
The error covariance matrix of the optimal estimator can be written as , where both and are positive semidefinite, and is the error covariance matrix of the optimal linear estimator of from . Using this in (29), we have
where the last inequality is again due to Lemma 1. Equality will be attained when , that is, when .
We denote . The optimal linear estimator has the following form:
where is the covariance matrix of , and are the cross-covariance matrices of and , and is the covariance matrix of . We can easily calculate and (assuming zero mean):
Regarding we can claim the following:
where equality, , is attained when the estimation error is zero, that is, when . Since this can only be achieved when or ; however since the Markov property, , must be preserved, we conclude that in order to achieve equality.
We have , where is a positive semidefinite matrix, and the linear estimator is
Substituting this into the integral in (B.1) we have
where the second inequality is due to Lemma 1, and the last inequality is due to Theorem 1, (16). The resulting upper bound equals the one given in (30). The rest of the proof follows via similar steps to those in the proof given in Section 4.
C. Proof of Lemma 2
Since the sub-matrix is positive semidefinite it is evident that . Thus, it remains to show that . Since is invertible, in order to prove , it is enough to show that
We notice that,
Using blockwise inversion  we have
where denotes and
due to the positive definite quality of and the Schur Complement Lemma [20]. Hence,
D. Deriving Equation (35)
Bustin, R., Liu, R., Poor, H.V. et al. An MMSE Approach to the Secrecy Capacity of the MIMO Gaussian Wiretap Channel. J Wireless Com Network 2009, 370970 (2009). https://doi.org/10.1155/2009/370970
- Broadcast Channel
- Fundamental Relationship
- Gaussian Channel
- Secrecy Capacity
- Wiretap Channel