An MMSE Approach to the Secrecy Capacity of the MIMO Gaussian Wiretap Channel

Bustin, Ronit; Liu, Ruoheng; Poor, H. Vincent; Shamai (Shitz), Shlomo

doi:10.1155/2009/370970

Research Article
Open access
Published: 27 July 2009


EURASIP Journal on Wireless Communications and Networking volume 2009, Article number: 370970 (2009)

Abstract

This paper provides a closed-form expression for the secrecy capacity of the multiple-input multiple-output (MIMO) Gaussian wiretap channel, under a power-covariance constraint. Furthermore, the paper specifies the input covariance matrix required in order to attain the capacity. The proof uses the fundamental relationship between information theory and estimation theory in the Gaussian channel, relating the derivative of the mutual information to the minimum mean-square error (MMSE). The proof provides the missing intuition regarding the existence and construction of an enhanced degraded channel that does not increase the secrecy capacity. The concept of enhancement has been used in a previous proof of the problem. Furthermore, the proof presents methods that can be used in proving other MIMO problems, using this fundamental relationship.

1. Introduction

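The information-theoretic characterization of secrecy in communication systems has attracted considerable attention in recent years. (See [1] for an exposition of progress in this area.) In this paper, we consider the general multiple-input multiple-output (MIMO) wiretap channel, presented in [2], with $t$ transmit antennas and $r$ and $e$ receive antennas at the legitimate recipient and the eavesdropper, respectively:

$$\mathbf{Y}_r[m] = \mathbf{H}_r\mathbf{X}[m] + \mathbf{W}_r[m], \qquad \mathbf{Y}_e[m] = \mathbf{H}_e\mathbf{X}[m] + \mathbf{W}_e[m], \tag{1}$$

where $\mathbf{H}_r \in \mathbb{R}^{r \times t}$ and $\mathbf{H}_e \in \mathbb{R}^{e \times t}$ are assumed to be fixed during the entire transmission and are known to all three terminals. The additive noise terms $\mathbf{W}_r[m]$ and $\mathbf{W}_e[m]$ are zero-mean Gaussian vector processes independent across the time index $m$. The channel input satisfies a total power constraint:

$$\frac{1}{n}\sum_{m=1}^{n}\left\|\mathbf{X}[m]\right\|^{2} \le P. \tag{2}$$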

The secrecy capacity of a wiretap channel, defined by Wyner [3] as the "perfect secrecy" capacity, is the maximal rate at which information can be decoded arbitrarily reliably by the legitimate recipient, while ensuring that it cannot be deduced at any positive rate by the eavesdropper.

For a discrete memoryless wiretap channel with transition probability $P(Y_r, Y_e \mid X)$, a single-letter expression for the secrecy capacity was obtained by Csiszár and Körner [4]:

$$C_s = \max_{P(U,X)} \left\{ I(U;Y_r) - I(U;Y_e) \right\}, \tag{3}$$

where $U$ is an auxiliary random variable over a certain alphabet that satisfies the Markov relationship $U - X - (Y_r, Y_e)$. This result extends to continuous alphabet cases with power constraint (2). Thus, in order to evaluate the secrecy capacity of the MIMO Gaussian wiretap channel we need to evaluate (3) under the power constraint (2). For the degraded case, Wyner's single-letter expression of the secrecy capacity results from setting $U \equiv X$ [3]:

$$C_s = \max_{P(X)} \left\{ I(X;Y_r) - I(X;Y_e) \right\}. \tag{4}$$

The problem of characterizing the secrecy capacity of the MIMO Gaussian wiretap channel remained open until the work of Khisti and Wornell [5] and Oggier and Hassibi [6]. In their respective work, Khisti and Wornell [5] and Oggier and Hassibi [6] followed an indirect approach using a Sato-like argument and matrix analysis tools. In [2] Liu and Shamai propose a more information-theoretic approach using the enhancement concept, originally presented by Weingarten et al. [7], as a tool for the characterization of the MIMO Gaussian broadcast channel capacity. Liu and Shamai have shown that an enhanced degraded version attains the same secrecy capacity as does the Gaussian input distribution. From the mathematical solution in [2] it is evident that such an enhanced channel exists; however it is not intuitive why, or how to construct such a channel.

A fundamental relationship between estimation theory and information theory for Gaussian channels was presented in [8]; in particular, it was shown that for the MIMO standard Gaussian channel,

$$\mathbf{Y} = \sqrt{\mathsf{snr}}\,\mathbf{H}\mathbf{X} + \mathbf{N}, \tag{5}$$

and regardless of the input distribution, the mutual information and the minimum mean-square error (MMSE) are related (assuming real-valued inputs/outputs) by

$$\frac{d}{d\,\mathsf{snr}}\, I\left(\mathbf{X};\sqrt{\mathsf{snr}}\,\mathbf{H}\mathbf{X}+\mathbf{N}\right) = \frac{1}{2}\,\mathsf{E}\left\{\left\|\mathbf{H}\mathbf{X} - \mathbf{H}\,\mathsf{E}\left\{\mathbf{X} \mid \sqrt{\mathsf{snr}}\,\mathbf{H}\mathbf{X}+\mathbf{N}\right\}\right\|^{2}\right\}, \tag{6}$$

where $\mathsf{E}\{\mathbf{X} \mid \mathbf{Y}\}$ stands for the conditional mean of $\mathbf{X}$ given $\mathbf{Y}$. This fundamental relationship and its generalizations [8, 9], referred to as the I-MMSE relations, have already been shown to be useful in several aspects of information theory: providing insightful proofs for entropy power inequalities [10], revealing the mercury/waterfilling optimal power allocation over a set of parallel Gaussian channels [11], tackling the weighted sum-MSE maximization in MIMO broadcast channels [12], illuminating extrinsic information of good codes [13], and enabling a simple proof of the monotonicity of the non-Gaussianness of independent random variables [14]. Furthermore, in [15] it has been shown that using this relationship one can provide insightful and simple proofs for multiuser single antenna problems such as the broadcast channel and the secrecy capacity problem. Similar techniques were later used in [16] to provide the capacity region for the Gaussian multireceiver wiretap channel.
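As a quick sanity check of the I-MMSE relation, the sketch below treats the scalar case with a standard Gaussian input, where both sides are known in closed form: the mutual information is $\tfrac12\log(1+\mathsf{snr})$ nats and the MMSE is $1/(1+\mathsf{snr})$. The function names and the finite-difference step are illustrative choices, not part of the paper.

```python
import math

def mutual_information(snr: float) -> float:
    """I(X; sqrt(snr) X + N) in nats for X ~ N(0, 1) and N ~ N(0, 1)."""
    return 0.5 * math.log1p(snr)

def mmse(snr: float) -> float:
    """MMSE of estimating X ~ N(0, 1) from sqrt(snr) X + N."""
    return 1.0 / (1.0 + snr)

def derivative(f, x: float, h: float = 1e-6) -> float:
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Compare dI/dsnr with mmse/2 at a few signal-to-noise ratios.
for snr in (0.1, 1.0, 10.0):
    print(snr, derivative(mutual_information, snr), 0.5 * mmse(snr))
```

The two printed columns should agree up to finite-difference error; for non-Gaussian inputs the same relation holds, but the MMSE generally has no closed form.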

Motivated by these successes, this paper provides an alternative proof for the secrecy capacity of the MIMO Gaussian wiretap channel using the fundamental relationship presented in [8, 9], which results in a closed-form expression for the secrecy capacity, that is, an expression that does not include optimization over the input covariance matrix, a difficult problem on its own due to the nonconvexity of the expression [5]. Thus, another important contribution of this paper is the explicit characterization of the optimal input covariance matrix that attains the secrecy capacity. The proof presented here provides the intuition regarding the existence and construction of the enhanced degraded channel which is central in the approach of [2]. Furthermore, the methods presented here could be used to tackle other MIMO problems, using the fundamental relationships shown in [8, 9].

2. Definitions and Preliminaries

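Consider a canonical version of the MIMO Gaussian wiretap channel, as presented in [2]:

$$\mathbf{Y}_r[m] = \mathbf{X}[m] + \mathbf{W}_r[m], \qquad \mathbf{Y}_e[m] = \mathbf{X}[m] + \mathbf{W}_e[m], \tag{7}$$

where $\mathbf{X}[m]$ is a real input vector of length $t$, and $\mathbf{W}_r[m]$ and $\mathbf{W}_e[m]$ are additive Gaussian noise vectors with zero means and covariance matrices $\mathbf{K}_r$ and $\mathbf{K}_e$, respectively, and are independent across the time index $m$. The noise covariance matrices $\mathbf{K}_r$ and $\mathbf{K}_e$ are assumed to be positive definite. The channel input satisfies a power-covariance constraint:

$$\frac{1}{n}\sum_{m=1}^{n}\mathbf{X}[m]\,\mathbf{X}[m]^{T} \preceq \mathbf{S}, \tag{8}$$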

where $\mathbf{S}$ is a positive semidefinite matrix of size $t \times t$, and "$\preceq$" denotes "less than or equal to" in the positive semidefinite partial ordering between real symmetric matrices. Note that (8) is a rather general constraint that subsumes constraints that can be described by a compact set of input covariance matrices [7]. For example, assuming $C_s(\mathbf{S})$ is the secrecy capacity under the covariance constraint (8), we have according to [7] the following:

$$C_s^{P}(P) = \max_{\mathbf{S} \succeq 0,\ \operatorname{tr}(\mathbf{S}) \le P} C_s(\mathbf{S}), \qquad C_s^{PA}(P_1,\ldots,P_t) = \max_{\mathbf{S} \succeq 0,\ \mathbf{S}_{ii} \le P_i,\ i=1,\ldots,t} C_s(\mathbf{S}), \tag{9}$$

where $C_s^{P}(P)$ is the secrecy capacity under a total power constraint (2), and $C_s^{PA}(P_1,\ldots,P_t)$ is the secrecy capacity under a per-antenna power constraint. As shown in [2, 7], characterizing the secrecy capacity of the general MIMO Gaussian wiretap channel (1) can be reduced to characterizing the secrecy capacity of the canonical version (7). For full details the reader is referred to [7] and [17, Theorem 3].

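We first give a few central definitions and relationships that will be used in the sequel. We begin with the following definition:

$$\mathbf{E} = \mathsf{E}\left\{\left(\mathbf{X} - \mathsf{E}\{\mathbf{X} \mid \mathbf{Y}\}\right)\left(\mathbf{X} - \mathsf{E}\{\mathbf{X} \mid \mathbf{Y}\}\right)^{T}\right\}, \tag{10}$$

that is, $\mathbf{E}$ is the covariance matrix of the estimation error vector, known as the MMSE matrix. For the specific case in which the input to the channel is Gaussian with covariance matrix $\mathbf{K}_x$, we define

$$\mathbf{E}_G = \mathbf{K}_x - \mathbf{K}_x\left(\mathbf{K}_x + \mathbf{K}\right)^{-1}\mathbf{K}_x, \tag{11}$$

where $\mathbf{K}$ is the covariance matrix of the additive Gaussian noise, $\mathbf{N}$. That is, $\mathbf{E}_G$ is the error covariance matrix of the joint Gaussian estimator.

The fundamental relationship between information theory and estimation theory in the Gaussian channel gave rise to a variety of other relationships [8, 9]. In our proof, we will use the following relationship, given by Palomar and Verdú in [9]:

$$\nabla_{\mathbf{K}}\, I\left(\mathbf{X};\mathbf{X}+\mathbf{N}\right) = -\frac{1}{2}\,\mathbf{K}^{-1}\mathbf{E}\,\mathbf{K}^{-1}, \tag{12}$$

where $\mathbf{K}$ is the covariance matrix of the additive Gaussian noise, $\mathbf{N}$ (the factor $\tfrac12$ corresponds to real-valued signals with mutual information measured in nats).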
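The gradient relation can be checked numerically for a Gaussian input, where the mutual information has a closed form. The sketch below assumes real-valued signals and nats, so that the gradient with respect to the noise covariance is $-\tfrac12\mathbf{K}^{-1}\mathbf{E}_G\mathbf{K}^{-1}$ with $\mathbf{E}_G = \mathbf{K}_x - \mathbf{K}_x(\mathbf{K}_x+\mathbf{K})^{-1}\mathbf{K}_x$; the function names and the example matrices are hypothetical.

```python
import numpy as np

def gaussian_mmse_matrix(Kx, K):
    """E_G = Kx - Kx (Kx + K)^{-1} Kx for a Gaussian input in Gaussian noise."""
    return Kx - Kx @ np.linalg.solve(Kx + K, Kx)

def gaussian_mutual_information(Kx, K):
    """I(X; X + N) in nats for X ~ N(0, Kx) and N ~ N(0, K)."""
    _, logdet_sum = np.linalg.slogdet(Kx + K)
    _, logdet_noise = np.linalg.slogdet(K)
    return 0.5 * (logdet_sum - logdet_noise)

def numerical_gradient(Kx, K, h=1e-6):
    """Entry-wise central differences of the mutual information w.r.t. K."""
    t = K.shape[0]
    grad = np.zeros((t, t))
    for i in range(t):
        for j in range(t):
            D = np.zeros((t, t))
            D[i, j] = h
            grad[i, j] = (gaussian_mutual_information(Kx, K + D)
                          - gaussian_mutual_information(Kx, K - D)) / (2.0 * h)
    return grad

Kx = np.array([[2.0, 0.5], [0.5, 1.0]])   # input covariance (example values)
K = np.array([[1.0, 0.2], [0.2, 0.5]])    # noise covariance (example values)

K_inv = np.linalg.inv(K)
analytic = -0.5 * K_inv @ gaussian_mmse_matrix(Kx, K) @ K_inv
numeric = numerical_gradient(Kx, K)
print(np.max(np.abs(analytic - numeric)))
```

The printed value is the largest entry-wise discrepancy between the analytic and finite-difference gradients, which should be at the level of the finite-difference error.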

Our first observation regarding the relationship given in (12) is detailed in the following lemma.

Lemma 1.

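For any two symmetric positive semidefinite matrices $\mathbf{K}_1$ and $\mathbf{K}_2$ such that $\mathbf{0} \preceq \mathbf{K}_1 \preceq \mathbf{K}_2$, and any positive semidefinite matrix $\mathbf{A}$, the integral $\int_{\mathbf{K}_1}^{\mathbf{K}_2} \mathbf{K}^{-1}\mathbf{A}\,\mathbf{K}^{-1} \cdot d\mathbf{K}$ is nonnegative (where the integral is taken over any path from $\mathbf{K}_1$ to $\mathbf{K}_2$).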

The proof of the lemma is given in Appendix A.
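A small numerical illustration of the lemma, under simplifying assumptions that are ours rather than the paper's: the positive semidefinite matrix $\mathbf{A}$ is held constant, the path is the straight line from $\mathbf{K}_1$ to $\mathbf{K}_2$, and the matrix inner product is taken to be the trace inner product. For constant $\mathbf{A}$ the integral then has the closed form $\operatorname{tr}(\mathbf{A}\mathbf{K}_1^{-1}) - \operatorname{tr}(\mathbf{A}\mathbf{K}_2^{-1})$, which the sketch uses as a cross-check.

```python
import numpy as np

def line_integral(K1, K2, A, steps=2000):
    """Midpoint-rule value of the integral of <K^{-1} A K^{-1}, dK>
    along the straight path K(s) = K1 + s (K2 - K1), 0 <= s <= 1."""
    D = K2 - K1
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) / steps
        K_inv = np.linalg.inv(K1 + s * D)
        total += np.trace(K_inv @ A @ K_inv @ D) / steps
    return total

K1 = np.array([[2.0, 0.5], [0.5, 1.0]])
K2 = K1 + np.array([[1.0, 0.2], [0.2, 0.5]])   # K2 - K1 is positive definite
A = np.array([[1.0, 0.3], [0.3, 2.0]])         # positive definite, held constant

value = line_integral(K1, K2, A)
closed_form = np.trace(A @ np.linalg.inv(K1)) - np.trace(A @ np.linalg.inv(K2))
print(value, closed_form)
```

In the proofs the matrix inside the integral depends on $\mathbf{K}$, so this constant-$\mathbf{A}$ check only illustrates the sign argument and does not replace the proof in Appendix A.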

3. The Degraded MIMO Gaussian Wiretap Channel

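We first consider the degraded MIMO Gaussian wiretap channel, that is, $\mathbf{K}_r \preceq \mathbf{K}_e$.

Theorem 1.

The secrecy capacity of the degraded MIMO Gaussian wiretap channel (7), $\mathbf{K}_r \preceq \mathbf{K}_e$, under the power-covariance constraint (8) is

$$C_s = \frac{1}{2}\log\det\left(\mathbf{I} + \mathbf{S}\mathbf{K}_r^{-1}\right) - \frac{1}{2}\log\det\left(\mathbf{I} + \mathbf{S}\mathbf{K}_e^{-1}\right). \tag{13}$$

Proof.

Using (12) the difference to be maximized, according to Wyner's single-letter expression (4), can be written as

$$I(\mathbf{X};\mathbf{Y}_r) - I(\mathbf{X};\mathbf{Y}_e) = I(\mathbf{X};\mathbf{X}+\mathbf{W}_r) - I(\mathbf{X};\mathbf{X}+\mathbf{W}_e) = \frac{1}{2}\int_{\mathbf{K}_r}^{\mathbf{K}_e}\mathbf{K}^{-1}\mathbf{E}\,\mathbf{K}^{-1}\cdot d\mathbf{K}. \tag{14}$$

This is due to the independence of the line integral (A.3) on the path in any open connected set in which the gradient is continuous [18].

The error covariance matrix of any optimal estimator is upper bounded (in the positive semidefinite partial ordering between real symmetric matrices) by the error covariance matrix of the joint Gaussian estimator, $\mathbf{E}_G$, defined in (11), for the same input covariance. Formally, $\mathbf{E} \preceq \mathbf{E}_G$, and thus one can express $\mathbf{E}$ as follows: $\mathbf{E} = \mathbf{E}_G - \mathbf{E}_0$, where $\mathbf{E}_0$ is some positive semidefinite matrix.

Due to this representation of $\mathbf{E}$ we can express the mutual information difference, given in (14), in the following manner:

$$I(\mathbf{X};\mathbf{Y}_r) - I(\mathbf{X};\mathbf{Y}_e) = \frac{1}{2}\int_{\mathbf{K}_r}^{\mathbf{K}_e}\mathbf{K}^{-1}\left(\mathbf{E}_G - \mathbf{E}_0\right)\mathbf{K}^{-1}\cdot d\mathbf{K} \le \frac{1}{2}\int_{\mathbf{K}_r}^{\mathbf{K}_e}\mathbf{K}^{-1}\mathbf{E}_G\,\mathbf{K}^{-1}\cdot d\mathbf{K}, \tag{15}$$

where the last inequality is due to Lemma 1 and the fact that $\mathbf{K}_r \preceq \mathbf{K}_e$. Equality in (15) is attained when $\mathbf{X}$ is Gaussian. Thus, we obtain the following expression:

$$C_s = \max_{\mathbf{0} \preceq \mathbf{K}_x \preceq \mathbf{S}} \frac{1}{2}\int_{\mathbf{K}_r}^{\mathbf{K}_e}\mathbf{K}^{-1}\mathbf{E}_G\,\mathbf{K}^{-1}\cdot d\mathbf{K} = \frac{1}{2}\log\det\left(\mathbf{I} + \mathbf{S}\mathbf{K}_r^{-1}\right) - \frac{1}{2}\log\det\left(\mathbf{I} + \mathbf{S}\mathbf{K}_e^{-1}\right). \tag{16}$$

The maximum is attained at $\mathbf{K}_x = \mathbf{S}$ because $\mathbf{E}_G$ is nondecreasing in $\mathbf{K}_x$ in the positive semidefinite ordering, so by Lemma 1 the integral is nondecreasing in $\mathbf{K}_x$ as well.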
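The degraded-channel result is a difference of two log-determinants, $\tfrac12\log\det(\mathbf{I}+\mathbf{S}\mathbf{K}_r^{-1}) - \tfrac12\log\det(\mathbf{I}+\mathbf{S}\mathbf{K}_e^{-1})$, which is straightforward to evaluate. A minimal sketch, assuming rates in nats and hypothetical helper names; the degradedness check $\mathbf{K}_r \preceq \mathbf{K}_e$ is done through the smallest eigenvalue of $\mathbf{K}_e - \mathbf{K}_r$.

```python
import numpy as np

def is_psd(M, tol=1e-9):
    """True if the symmetric part of M is positive semidefinite (up to tol)."""
    return bool(np.linalg.eigvalsh(0.5 * (M + M.T)).min() >= -tol)

def degraded_secrecy_capacity(S, Kr, Ke):
    """Secrecy capacity in nats of the degraded channel (Kr <= Ke)
    under the covariance constraint S."""
    if not is_psd(Ke - Kr):
        raise ValueError("channel is not degraded: Ke - Kr must be PSD")
    eye = np.eye(S.shape[0])
    _, logdet_r = np.linalg.slogdet(eye + S @ np.linalg.inv(Kr))
    _, logdet_e = np.linalg.slogdet(eye + S @ np.linalg.inv(Ke))
    return 0.5 * (logdet_r - logdet_e)

# Scalar example: power 3, legitimate noise 1, eavesdropper noise 3.
scalar = degraded_secrecy_capacity(np.array([[3.0]]),
                                   np.array([[1.0]]),
                                   np.array([[3.0]]))

# 2x2 example with a noisier eavesdropper on both antennas.
S = np.array([[2.0, 0.5], [0.5, 1.0]])
Kr = np.array([[1.0, 0.2], [0.2, 0.5]])
Ke = Kr + np.array([[0.5, 0.0], [0.0, 1.0]])
print(scalar, degraded_secrecy_capacity(S, Kr, Ke))
```

In the scalar example the value reduces to $\tfrac12\log(4/2) = \tfrac12\log 2$ nats.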

4. The General MIMO Gaussian Wiretap Channel

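In considering the general case, we first note that one can apply the generalized eigenvalue decomposition [19] to the following two symmetric positive definite matrices:

$$\mathbf{I} + \mathbf{S}^{1/2}\mathbf{K}_r^{-1}\mathbf{S}^{1/2}, \qquad \mathbf{I} + \mathbf{S}^{1/2}\mathbf{K}_e^{-1}\mathbf{S}^{1/2}. \tag{17}$$

That is, there exists an invertible generalized eigenvector matrix, $\mathbf{C}$, such that

$$\mathbf{C}^{T}\left[\mathbf{I} + \mathbf{S}^{1/2}\mathbf{K}_e^{-1}\mathbf{S}^{1/2}\right]\mathbf{C} = \mathbf{I}, \qquad \mathbf{C}^{T}\left[\mathbf{I} + \mathbf{S}^{1/2}\mathbf{K}_r^{-1}\mathbf{S}^{1/2}\right]\mathbf{C} = \mathbf{\Lambda}_r, \tag{18}$$

where $\mathbf{\Lambda}_r = \operatorname{diag}\{\lambda_{1,r},\ldots,\lambda_{t,r}\}$ is a positive definite diagonal matrix. Without loss of generality, we assume that there are $b$ ($0 \le b \le t$) elements of $\mathbf{\Lambda}_r$ larger than 1:

$$\lambda_{1,r} \ge \cdots \ge \lambda_{b,r} > 1 \ge \lambda_{b+1,r} \ge \cdots \ge \lambda_{t,r}. \tag{19}$$

Hence, we can write $\mathbf{\Lambda}_r$ as

$$\mathbf{\Lambda}_r = \begin{bmatrix} \mathbf{\Lambda}_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{\Lambda}_2 \end{bmatrix}, \tag{20}$$

where $\mathbf{\Lambda}_1 = \operatorname{diag}\{\lambda_{1,r},\ldots,\lambda_{b,r}\}$, and $\mathbf{\Lambda}_2 = \operatorname{diag}\{\lambda_{b+1,r},\ldots,\lambda_{t,r}\}$. Since the matrix $\mathbf{I} + \mathbf{S}^{1/2}\mathbf{K}_e^{-1}\mathbf{S}^{1/2}$ is positive definite, the problem of calculating the generalized eigenvalues and the matrix $\mathbf{C}$ is reduced to a standard eigenvalue problem [19]. Choosing the eigenvectors of the standard eigenvalue problem to be orthonormal, together with the requirement on the order of the eigenvalues, leads to an invertible matrix $\mathbf{C}$, which is $\left[\mathbf{I} + \mathbf{S}^{1/2}\mathbf{K}_e^{-1}\mathbf{S}^{1/2}\right]$-orthonormal. Using these definitions we turn to the main theorem of this paper.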
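The reduction of the generalized eigenvalue problem to a standard one can be sketched as follows. The sketch assumes the matrix pair is $\mathbf{A}_r = \mathbf{I} + \mathbf{S}^{1/2}\mathbf{K}_r^{-1}\mathbf{S}^{1/2}$ and $\mathbf{A}_e = \mathbf{I} + \mathbf{S}^{1/2}\mathbf{K}_e^{-1}\mathbf{S}^{1/2}$, and uses a Cholesky factor of $\mathbf{A}_e$ for the whitening step; that particular factorization and all names below are our choices, not something the paper prescribes.

```python
import numpy as np

def sym_sqrt(S):
    """Symmetric square root of a positive semidefinite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def generalized_eig(S, Kr, Ke):
    """Return (lam, C, A_r, A_e) with C^T A_e C = I and C^T A_r C = diag(lam),
    eigenvalues sorted in decreasing order."""
    t = S.shape[0]
    R = sym_sqrt(S)
    A_r = np.eye(t) + R @ np.linalg.inv(Kr) @ R
    A_e = np.eye(t) + R @ np.linalg.inv(Ke) @ R
    L_inv = np.linalg.inv(np.linalg.cholesky(A_e))   # A_e = L L^T
    M = L_inv @ A_r @ L_inv.T                        # standard symmetric problem
    lam, V = np.linalg.eigh(0.5 * (M + M.T))         # orthonormal eigenvectors
    order = np.argsort(lam)[::-1]
    lam, V = lam[order], V[:, order]
    C = L_inv.T @ V                                  # A_e-orthonormal columns
    return lam, C, A_r, A_e

S = np.array([[2.0, 0.5], [0.5, 1.0]])
Kr = np.array([[1.0, 0.2], [0.2, 0.5]])
Ke = np.array([[0.8, -0.1], [-0.1, 1.5]])
lam, C, A_r, A_e = generalized_eig(S, Kr, Ke)
b = int(np.sum(lam > 1.0))   # number of generalized eigenvalues larger than 1
print(lam, b)
```

If SciPy is available, `scipy.linalg.eigh(A_r, A_e)` solves the same symmetric-definite generalized problem directly, with eigenvalues returned in ascending order.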

Theorem 2.

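The secrecy capacity of the MIMO Gaussian wiretap channel (7), under the power-covariance constraint (8), is

$$C_s = \frac{1}{2}\log\det\left(\mathbf{I} + \mathbf{S}\mathbf{K}_o^{-1}\right) - \frac{1}{2}\log\det\left(\mathbf{I} + \mathbf{S}\mathbf{K}_e^{-1}\right) = \frac{1}{2}\log\det\left(\mathbf{I} + \mathbf{K}_x^{*}\mathbf{K}_r^{-1}\right) - \frac{1}{2}\log\det\left(\mathbf{I} + \mathbf{K}_x^{*}\mathbf{K}_e^{-1}\right), \tag{21}$$

where, using the invertible matrix $\mathbf{C}$ defined in (18), one defines

$$\mathbf{K}_o = \mathbf{S}^{1/2}\left[\mathbf{C}^{-T}\begin{bmatrix} \mathbf{\Lambda}_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{I}_{t-b} \end{bmatrix}\mathbf{C}^{-1} - \mathbf{I}\right]^{-1}\mathbf{S}^{1/2}, \tag{22}$$

and letting $\mathbf{C} = [\mathbf{C}_1\ \mathbf{C}_2]$, where $\mathbf{C}_1$ is the $t \times b$ submatrix and $\mathbf{C}_2$ is the $t \times (t-b)$ submatrix, one defines

$$\mathbf{K}_x^{*} = \mathbf{S}^{1/2}\,\mathbf{C}\begin{bmatrix} \left(\mathbf{C}_1^{T}\mathbf{C}_1\right)^{-1} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix}\mathbf{C}^{T}\,\mathbf{S}^{1/2}. \tag{23}$$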
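The closed form can be explored numerically. The sketch below rests on two assumptions that should be read as ours: the capacity equals half the sum of the logarithms of the generalized eigenvalues larger than 1 (the form given by Liu and Shamai [2]), and the capacity-attaining covariance is $\mathbf{S}^{1/2}\mathbf{P}_1\mathbf{S}^{1/2}$ with $\mathbf{P}_1 = \mathbf{C}_1(\mathbf{C}_1^{T}\mathbf{C}_1)^{-1}\mathbf{C}_1^{T}$, the projection onto the span of the generalized eigenvectors with eigenvalues larger than 1. All function names are hypothetical.

```python
import numpy as np

def sym_sqrt(S):
    """Symmetric square root of a positive semidefinite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def mimo_secrecy_capacity(S, Kr, Ke):
    """Return (capacity in nats, candidate optimal input covariance)."""
    t = S.shape[0]
    R = sym_sqrt(S)
    A_r = np.eye(t) + R @ np.linalg.inv(Kr) @ R
    A_e = np.eye(t) + R @ np.linalg.inv(Ke) @ R
    # Generalized eigenproblem via Cholesky whitening of A_e = L L^T.
    L_inv = np.linalg.inv(np.linalg.cholesky(A_e))
    M = L_inv @ A_r @ L_inv.T
    lam, V = np.linalg.eigh(0.5 * (M + M.T))
    order = np.argsort(lam)[::-1]
    lam, V = lam[order], V[:, order]
    C = L_inv.T @ V
    b = int(np.sum(lam > 1.0))               # subchannels favouring the receiver
    capacity = 0.5 * float(np.sum(np.log(lam[:b])))
    C1 = C[:, :b]
    # Projection onto span(C1); the zero matrix when b == 0.
    P = C1 @ np.linalg.solve(C1.T @ C1, C1.T) if b > 0 else np.zeros((t, t))
    return capacity, R @ P @ R

def secrecy_rate(Kx, Kr, Ke):
    """Gaussian-input secrecy rate (nats) for input covariance Kx."""
    eye = np.eye(Kx.shape[0])
    _, logdet_r = np.linalg.slogdet(eye + Kx @ np.linalg.inv(Kr))
    _, logdet_e = np.linalg.slogdet(eye + Kx @ np.linalg.inv(Ke))
    return 0.5 * (logdet_r - logdet_e)

# Two parallel subchannels: the receiver is better on the first one only.
S = np.eye(2)
Kr = np.diag([0.5, 2.0])
Ke = np.eye(2)
capacity, Kx_star = mimo_secrecy_capacity(S, Kr, Ke)
print(capacity, secrecy_rate(Kx_star, Kr, Ke))
print(Kx_star)
```

In this diagonal example all of the allowed power on the first subchannel is used and the second subchannel, where the eavesdropper has the advantage, is switched off, giving $\tfrac12\log(3/2)$ nats.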

Proof.

Following [7, Lemma 2], we may assume that $\mathbf{S}$ is (strictly) positive definite. We divide the proof into two parts: the converse part, that is, constructing an upper bound, and the achievability part, that is, showing that the upper bound is attainable.

(a)
Converse

Our goal is to evaluate the secrecy capacity expression (3). Due to the Markov relationship, , the difference to be maximized can be written as

(24)

We use the I-MMSE relationship (12) on each of the two differences in (24):

(25)

where , and

(26)

where . Thus, putting the two together, (24) becomes

(27)

We define, , and obtain

(28)

That is, is the error covariance of the optimal estimation of from , and as such it is positive semidefinite. It is easily verified that , defined in (22), satisfies both , and . The integral in (27) can be upper bounded using this fact and Lemma 1:

(29)

Equality will be attained when the second integral equals zero. Using the upper bound in (29) we present two possible proofs that result in the upper bound given in (30). The more information-theoretic proof is given in the sequel, while the second, the more estimation-theoretic proof, is relegated to Appendix B.

The upper bound given in (29) can be viewed as the secrecy capacity of a MIMO Gaussian model, similar to the model given in (7), but with noise covariance matrices and and outputs and respectively. Furthermore, this is a degraded model, and it is well known that the general solution given by Csiszár and Körner [4] reduces to the solution given by Wyner [3] by setting . Thus, (29) becomes

(30)

where the third inequality is according to (15), and the last two transitions are due to Theorem 1, (16). This completes the converse part of the proof.

(b)
Achievability

We now show that the upper bound given in (30) is attainable when is Gaussian with covariance matrix , as defined in (23). The proof is constructed from the next three lemmas. We first prove that is a legitimate covariance matrix, that is, it complies with the input covariance constraint (8).

Lemma 2.

The matrix defined in (23) complies with the power-covariance constraint (8), that is,

(31)

The proof of Lemma 2 is given in Appendix C. In the next two Lemmas we show that attains the upper bound given in (30).

Lemma 3.

The following equality holds:

(32)

Proof of Lemma 3.

We first calculate the expression in the left hand side (assuming ), which is the upper bound in (30):

(33)

where we have used the generalized eigenvalue decomposition (18) and the definition of (22). From (18) we note that,

(34)

Using (34) we can derive the following relationship (full details are given in Appendix D):

(35)

And similarly we can derive

(36)

Thus, we have

(37)

which is the result attained in (33). This concludes the proof of Lemma 3.

Lemma 4.

The following equality holds:

(38)

Proof of Lemma 4.

Due to the generalized eigenvalue decomposition (18) we have,

(39)

Using similar steps as the ones used to obtain (35) we can show that,

(40)

This concludes the proof of Lemma 4.

Putting all the above together we have that

(41)

where the first equality is due to Lemma 3, and the second equality is due to Lemma 4. Thus, the upper bound given in (30) is attainable using the Gaussian distribution over , and , defined in (23). This concludes the proof of Theorem 2.

5. Discussion and Remarks

The alternative proof we have presented here uses the enhancement concept, also used in the proof of Liu and Shamai [2], in a more concrete manner. We have constructed a specific enhanced degraded model. The constructed model is the "tightest" enhancement possible in the sense that under the specified transformation, the matrix is the "smallest" possible positive definite matrix, that is, both and .
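The enhancement can be made concrete numerically. The sketch below builds one enhanced noise covariance $\mathbf{K}_o$ consistent with the properties described above, namely $\mathbf{K}_o \preceq \mathbf{K}_r$ and $\mathbf{K}_o \preceq \mathbf{K}_e$, by replacing every generalized eigenvalue not exceeding 1 with 1, that is, $\mathbf{I} + \mathbf{S}^{1/2}\mathbf{K}_o^{-1}\mathbf{S}^{1/2} = \mathbf{C}^{-T}\operatorname{diag}(\mathbf{\Lambda}_1, \mathbf{I})\,\mathbf{C}^{-1}$. This construction, the requirement that $\mathbf{S}$ be strictly positive definite, and all names are assumptions of the sketch.

```python
import numpy as np

def sym_sqrt(S):
    """Symmetric square root of a positive semidefinite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def is_psd(M, tol=1e-9):
    """True if the symmetric part of M is positive semidefinite (up to tol)."""
    return bool(np.linalg.eigvalsh(0.5 * (M + M.T)).min() >= -tol)

def enhanced_noise_covariance(S, Kr, Ke):
    """Return (Ko, lam): enhanced legitimate-receiver noise covariance and
    the generalized eigenvalues. S is assumed strictly positive definite."""
    t = S.shape[0]
    R = sym_sqrt(S)
    A_r = np.eye(t) + R @ np.linalg.inv(Kr) @ R
    A_e = np.eye(t) + R @ np.linalg.inv(Ke) @ R
    L_inv = np.linalg.inv(np.linalg.cholesky(A_e))
    M = L_inv @ A_r @ L_inv.T
    lam, V = np.linalg.eigh(0.5 * (M + M.T))
    C_inv = np.linalg.inv(L_inv.T @ V)
    # Keep eigenvalues above 1, lift the remaining ones to exactly 1.
    A_o = C_inv.T @ np.diag(np.maximum(lam, 1.0)) @ C_inv
    Ko = R @ np.linalg.inv(A_o - np.eye(t)) @ R
    return Ko, lam

def degraded_capacity(S, K_main, K_eve):
    """Degraded-channel secrecy capacity in nats."""
    eye = np.eye(S.shape[0])
    _, ld_main = np.linalg.slogdet(eye + S @ np.linalg.inv(K_main))
    _, ld_eve = np.linalg.slogdet(eye + S @ np.linalg.inv(K_eve))
    return 0.5 * (ld_main - ld_eve)

S = np.array([[2.0, 0.5], [0.5, 1.0]])
Kr = np.array([[1.0, 0.2], [0.2, 0.5]])
Ke = np.array([[0.8, -0.1], [-0.1, 1.5]])
Ko, lam = enhanced_noise_covariance(S, Kr, Ke)
bound = degraded_capacity(S, Ko, Ke)
print(is_psd(Kr - Ko), is_psd(Ke - Ko), bound)
```

Under these assumptions the degraded-channel capacity of the enhanced pair coincides with half the sum of the logarithms of the generalized eigenvalues larger than 1, which is the sense in which the enhancement is tight.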

The specific enhancement results in a closed-form expression for the secrecy capacity, using . Furthermore, Theorem 2 shows that instead of we can maximize the secrecy capacity by taking an input covariance matrix that "disregards" subchannels for which the eavesdropper has an advantage over the legitimate recipient (or is equivalent to the legitimate recipient). Mathematically, this allows us to switch back from to , and thus to show that , explicitly defined, is the optimal input covariance matrix. Intuitively, is the optimal input covariance for the legitimate receiver, since under the transformation, , it is for the sub-channels for which the legitimate receiver has an advantage and zero otherwise.

The enhancement concept was used in addition to the I-MMSE approach in order to attain the upper bound in (30). The primary usage of these two concepts came together in (29), where we derived an initial upper bound. We have shown that the upper bound is attainable when is Gaussian with covariance matrix . Thus, under these conditions the second integral in (29) should be zero, that is,

(42)

where the second transition is due to the choice , the third is due to the choice of a Gaussian distribution for with covariance matrix , and the last equality is due to Lemma 4.

References

[1] Liang Y, Poor HV, Shamai (Shitz) S: Information theoretic security. Foundations and Trends in Communications and Information Theory 2008, 5(4-5):355-580.
[2] Liu T, Shamai (Shitz) S: A note on secrecy capacity of the multi-antenna wiretap channel. IEEE Transactions on Information Theory 2009, 55(6):2547-2553.
[3] Wyner AD: The wire-tap channel. Bell System Technical Journal 1975, 54(8):1355-1387.
[4] Csiszár I, Körner J: Broadcast channels with confidential messages. IEEE Transactions on Information Theory 1978, 24(3):339-348. 10.1109/TIT.1978.1055892
[5] Khisti A, Wornell G: The MIMOME channel. Proceedings of the 45th Annual Allerton Conference on Communication, Control and Computing, September 2007, Monticello, Ill, USA.
[6] Oggier F, Hassibi B: The secrecy capacity of the MIMO wiretap channel. Proceedings of IEEE International Symposium on Information Theory (ISIT '08), July 2008, Toronto, Canada 524-528.
[7] Weingarten H, Steinberg Y, Shamai (Shitz) S: The capacity region of the Gaussian multiple-input multiple-output broadcast channel. IEEE Transactions on Information Theory 2006, 52(9):3936-3964.
[8] Guo D, Shamai (Shitz) S, Verdú S: Mutual information and minimum mean-square error in Gaussian channels. IEEE Transactions on Information Theory 2005, 51(4):1261-1282. 10.1109/TIT.2005.844072
[9] Palomar DP, Verdú S: Gradient of mutual information in linear vector Gaussian channels. IEEE Transactions on Information Theory 2006, 52(1):141-154.
[10] Guo D, Shamai (Shitz) S, Verdú S: Proof of entropy power inequalities via MMSE. Proceedings of IEEE International Symposium on Information Theory (ISIT '06), July 2006, Seattle, Wash, USA 1011-1015.
[11] Lozano A, Tulino AM, Verdú S: Optimum power allocation for parallel Gaussian channels with arbitrary input distributions. IEEE Transactions on Information Theory 2006, 52(7):3033-3051.
[12] Christensen S, Agarwal R, Carvalho E, Cioffi J: Weighted sum-rate maximization using weighted MMSE for MIMO-BC beamforming design. IEEE Transactions on Wireless Communications 2008, 7(12):4792-4799.
[13] Peleg M, Sanderovich A, Shamai (Shitz) S: On extrinsic information of good binary codes operating over Gaussian channels. European Transactions on Telecommunications 2007, 18(2):133-139. 10.1002/ett.1130
[14] Tulino AM, Verdú S: Monotonic decrease of the non-Gaussianness of the sum of independent random variables: a simple proof. IEEE Transactions on Information Theory 2006, 52(9):4295-4297.
[15] Guo D, Shamai (Shitz) S, Verdú S: Estimation in Gaussian noise: properties of the minimum mean-square error. Proceedings of IEEE International Symposium on Information Theory (ISIT '08), July 2008, Toronto, Canada.
[16] Ekrem E, Ulukus S: Secrecy capacity region of the Gaussian multi-receiver wiretap channel. Proceedings of IEEE International Symposium on Information Theory (ISIT '09), June-July 2009, Seoul, Korea.
[17] Liu R, Liu T, Poor HV, Shamai (Shitz) S: Multiple-input multiple-output Gaussian broadcast channels with confidential messages. Submitted to IEEE Transactions on Information Theory and in Proceedings of IEEE International Symposium on Information Theory (ISIT '09), Seoul, Korea, June-July 2009.
[18] Apostol TM: Calculus, Multi-Variable Calculus and Linear Algebra, with Applications to Differential Equations and Probability. 2nd edition. Wiley, New York, NY, USA; 1969.
[19] Strang G: Linear Algebra and Its Applications. Wellesley-Cambridge Press, Wellesley, Mass, USA; 1998.
[20] Horn RA, Johnson CR: Matrix Analysis. Cambridge University Press, Cambridge, UK; 1985.

Acknowledgments

This work has been supported by the Binational Science Foundation (BSF), the FP7 Network of Excellence in Wireless Communications NEWCOM++, and the U.S. National Science Foundation under Grants CNS-06-25637 and CCF-07-28208.

Author information

Authors and Affiliations

Department of Electrical Engineering, Technion-Israel Institute of Technology, Technion City, Haifa, 32000, Israel
Ronit Bustin & Shlomo Shamai (Shitz)
Department of Electrical Engineering, Princeton University, Princeton, NJ, 08544, USA
Ruoheng Liu & H. Vincent Poor

Corresponding author

Correspondence to Ronit Bustin.

Appendices

A. Proof of Lemma 1

The inner product between matrices and is defined as

(A1)

and the Schur product between matrices and is defined as

(A2)

For a function with gradient the line integral (type II) [18] is given by

(A3)

Thus in our case, where are matrices, and the integral over a path from to is equivalent to the following line integral:

(A4)

Since the Schur product preserves the positive definite/semidefinite quality [20, 7.5.3], it is easy to see that when , both are symmetric, and since is a positive semidefinite matrix for all , the integral is always nonnegative.
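The sign argument can be illustrated with two small matrices. The sketch assumes the matrix inner product is the sum of the entries of the Schur (entry-wise) product, which coincides with $\operatorname{tr}(\mathbf{A}^{T}\mathbf{B})$; the example matrices and names are illustrative only.

```python
import numpy as np

def inner_product(A, B):
    """Sum of the entries of the Schur (Hadamard) product of A and B."""
    return float(np.sum(A * B))

A = np.array([[2.0, 0.5], [0.5, 1.0]])     # positive definite
B = np.array([[1.0, -0.3], [-0.3, 0.4]])   # positive definite

schur = A * B                               # entry-wise product
print(inner_product(A, B), np.trace(A.T @ B))
print(np.linalg.eigvalsh(schur))            # eigenvalues of the Schur product
```

Because the Schur product of two positive semidefinite matrices is positive semidefinite, the quadratic form with the all-ones vector, which is exactly the inner product above, cannot be negative.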

B. Second Proof of Theorem 2

The error covariance matrix of the optimal estimator can be written as , where both and are positive semidefinite, and is the error covariance matrix of the optimal linear estimator of from . Using this in (29), we have

(B1)

where the last inequality is again due to Lemma 1. Equality will be attained when , that is, when .

We denote . The optimal linear estimator has the following form:

(B2)

where is the covariance matrix of , and are the cross-covariance matrices of and , and is the covariance matrix of . We can easily calculate and (assuming zero mean):

(B3)

Regarding we can claim the following:

(B4)

thus,

(B5)

where equality, , is attained when the estimation error is zero, that is, when . Since this can only be achieved when or ; however since the Markov property, , must be preserved, we conclude that in order to achieve equality.

We have , where is a positive semidefinite matrix, and the linear estimator is

(B6)

Substituting this into the integral in (B.1) we have

(B7)

where the second inequality is due to Lemma 1, and the last inequality is due to Theorem 1, (16). The resulting upper bound equals the one given in (30). The rest of the proof follows via similar steps to those in the proof given in Section 4.

C. Proof of Lemma 2

Since the sub-matrix is positive semidefinite it is evident that . Thus, it remains to show that . Since is invertible, in order to prove , it is enough to show that

(C1)

We notice that,

(C2)

Using blockwise inversion [20] we have

(C3)

where denotes and

(C4)

due to the positive definite quality of and the Schur Complement Lemma [20]. Hence,

(C5)

D. Deriving Equation (35)

(D1)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Bustin, R., Liu, R., Poor, H.V. et al. An MMSE Approach to the Secrecy Capacity of the MIMO Gaussian Wiretap Channel. J Wireless Com Network 2009, 370970 (2009). https://doi.org/10.1155/2009/370970

Received: 26 November 2008
Revised: 15 March 2009
Accepted: 21 June 2009
Published: 27 July 2009
DOI: https://doi.org/10.1155/2009/370970
