
On Gaussian covert communication in continuous time

  • The Correction to this article has been published in EURASIP Journal on Wireless Communications and Networking 2020 2020:29


The paper studies covert communication over a continuous-time Gaussian channel. The covertness condition requires that the channel output must statistically resemble pure noise. When the additive Gaussian noise is “white” over the bandwidth of interest, a formal coding theorem is proven, extending earlier results on covert Gaussian communication in discrete time. This coding theorem is applied to study scenarios where the input bandwidth can be infinite and where positive or even infinite per-second rates may be achievable.


Covert communication, or communication with low probability of detection [14], refers to scenarios where the transmitter and the receiver must keep a warden from discovering the fact that they are using the channel to communicate. On an additive white Gaussian noise (AWGN) channel, this means that the warden’s observation should be statistically close to pure noise. It was first shown in [1] that the AWGN channel obeys the so-called square-root law for covert communication: the number of information nats that can be communicated covertly over the channel can only grow proportionally to the square root of the total number of channel uses. The exact scaling law, when covertness is measured in terms of relative entropy, was determined in [3]. Similar results have been obtained for the binary symmetric channel [2] and general discrete memoryless channels [3, 4]. A number of further works have extended these results in several directions. Among them, some also consider total variation distance in place of relative entropy as the measure for covertness [5, 6].

A discrete-time AWGN channel is commonly used to model a continuous-time communication channel with a bandwidth constraint on its input waveform, corrupted by Gaussian noise that is “white” with respect to that bandwidth. Using the sampling theorem, such a continuous-time channel with bandwidth W Hz over the time interval [0,T] is approximately equivalent to 2WT uses of a discrete-time AWGN channel [7]. Hence, one can roughly say (as in, e.g., a brief remark in [1]) that the number of nats that can be covertly communicated over this continuous-time channel is proportional to \(\sqrt{WT}\).

In this paper, we first provide a rigorous mathematical framework to study covert communication over the continuous-time channel with AWGN over the bandwidth of interest. Formal treatment of continuous-time Gaussian channels is nontrivial because, in short, no nonzero signal can be both band-limited and time-limited (see, e.g., [8], Theorem 6.8.2). Indeed, Shannon’s capacity formula for the band-limited Gaussian channel [7] called for several follow-up works to acquire a clear physical meaning; see [9, 10] and references therein. In this paper, we adopt a model proposed by Wyner [9] where the input is required to be “strictly band-limited and approximately time-limited” and introduce a covertness constraint to that model.

There are some important technicalities in formulating the continuous-time model. In particular, we find it important to let the warden observe the output waveform over the entire real line. Indeed, even when the transmitted signal is strictly time-limited, one should not assume that the warden can only observe the channel output within the time period that the input occupies. We demonstrate this by constructing a slightly different model and proving a pathological result under that model.

Under the proposed framework, we prove that the maximum number of nats that can be covertly communicated over T seconds and bandwidth W Hz is indeed proportional to \(\sqrt{WT}\). Additionally, we show that binary phase shift keying (BPSK) is optimal in the sense that it achieves the dominant term in this maximum number of nats. The latter is related to the fact that BPSK can achieve the capacity per unit cost on the AWGN channel [11], i.e., BPSK is asymptotically optimal in the limit where the signal-to-noise ratio goes to zero.

Using the continuous-time result, we then investigate the regime where W is infinite or grows to infinity together with T. Our intention is to gain engineering insights into scenarios where information is transmitted over a large bandwidth and a relatively short time, such as in “spread-spectrum” communication [12]. We prove that, if W is infinite or grows fast enough compared to T, then covert communication can have positive rates in nats per second. This is not surprising, since we already argued that information throughput grows like \(\sqrt{WT}\). Additionally, we show that, if the available bandwidth grows fast enough, then, under the same average-power constraint on the input, the covert communication capacity is the same as the capacity without a covertness constraint. Note, however, that traditionally, the capacity of the “infinite-bandwidth” AWGN channel is computed by bringing W to infinity after letting T→∞.

Our framework only applies to the case where the Gaussian noise is white over the bandwidth of interest. When the noise is colored, we are not able to prove a rigorous coding theorem. Instead, we use known formulas to calculate relative entropy and mutual information [13], which lead us to some conjectures.

The infinite-bandwidth results are related to our recent work [14], which shows that the continuous-time Poisson channel with neither bandwidth nor peak-power constraint permits transmission of infinitely many information nats per second. However, a close look at the two works reveals fundamental differences between the two channels. For example, in the Poisson case, a constraint on the average input power has no effect on covert communication, whereas in the Gaussian case, it generally does.

The recent work [15] also treats covert communication over a continuous-time Gaussian channel. It adopts a different model from the current paper: the transmitted signal is required to be strictly time-limited while satisfying some “spectral mask” constraints in the frequency domain. Under these constraints, the authors of [15] show that \(\sqrt{WT}\) growth of covert information is achievable using raised-cosine modulation. However, they do not prove any continuous-time converse result under that model.

The rest of this paper is arranged as follows. After introducing some notation, we formulate and solve the continuous-time covert communication problem in Section 2. We then study the infinite-bandwidth scenario in Section 3. We propose a slightly different model and show its deficiency in Section 4. The paper is concluded with some discussion in Section 5.

Some notation

We use uppercase letters like X to denote random variables, and corresponding lowercase letters like x to denote their realizations. We write a random real function on the interval [a,b] as X(t), t∈[a,b]; sometimes we also use the shorthand notation \(X_{a}^{b}\), where a and b might be replaced with −∞ or ∞, respectively. When the domain of the function is clear from context, we may further shorten it as X(·). To denote the realization of a random function, we replace X in the above by x. A vector \((X_{i},X_{i+1},\ldots,X_{j})\) is written as \(X_{i}^{j}\), where j may be replaced by ∞.

We slightly deviate from standard notation to express the relative entropy in terms of two random variables (as opposed to in terms of their distributions). For example,

$${\kern60pt}\!\mathscr{D}\left(\left.Y_{-\infty}^{\infty} \right\| Z_{-\infty}^{\infty}\right)$$

denotes the relative entropy between the distributions of Y(t), t∈(−∞,∞), and Z(t), t∈(−∞,∞), respectively. Mutual information between two continuous-time random processes or continuous-valued random variables is written, for example, as

$${\kern60pt}\!\text{\usefont{U}{eur}{m}{n}\selectfont {I}}(X_{-\infty}^{\infty};Y_{-\infty}^{\infty}).$$

For definition of relative entropy and mutual information for general probability distributions, we refer to [13].

We use W to denote the bandwidth that the input signal can employ, and T to denote the total time of communication. We shall often study the limit where the product WT→∞. Here, W and T can be functions of each other, or be such that W is fixed while T→∞, or vice versa. Further, we use δ to denote the covertness parameter, which can be a positive constant, or a positive function of WT satisfying

$$\begin{array}{*{20}l}{\kern60pt} {\lim}_{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\delta = \infty, \end{array} $$
$$\begin{array}{*{20}l} {\lim}_{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \frac{\delta}{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}} = 0. {\kern4.9pt} \end{array} $$

We use o(a) to denote a “higher-order term” than a, in the sense that the ratio o(a)/a approaches zero in the limit where a↓0; this limit usually coincides with WT→∞.

Finally, information is measured in nats, and log denotes the natural logarithm.

A formal continuous-time treatment

Consider the continuous-time channel described by

$${\kern45pt} Y(t) = {{X(t)}} + Z(t),\quad t\in\mathbb{R}, $$

where X(·) is the (potentially random) channel input sent by the transmitter, which we require to be square-integrable: with probability one,

$${\kern60pt} \int_{-\infty}^{\infty} {{X}}(t)^{2} \mathrm{d} t < \infty; $$

Y(·) is the channel output observed by both the intended receiver and the warden; and Z(·) is the additive noise to the channel, which we assume to be generated randomly according to a zero-mean stationary Gaussian process and independent of X(·).

A codebook is specified by an encoder, which is a mapping from a message m taken from the message set \(\mathcal {{M}}\) to an input waveform x(·), and a decoder, which is a mapping from an output waveform y(·) to the decoded message \(\hat {m}\in \mathcal {{M}}\). We allow the transmitter and the receiver to use a random codebook: the distribution according to which the codebook is drawn is known to the warden, but the realization of the codebook is not.

We require that the input waveform be “strictly band-limited to [ −W,W] and approximately time-limited to [ 0,T]” in the sense of Wyner [9]. Formally, for every message \(m\in \mathcal {{M}}\):

  1. The Fourier transform of x(·), which is given by

    $$ f\mapsto \int_{-\infty}^{\infty} x(t) e^{-\mathrm{i} 2 \pi f t} \mathrm{d} t, \quad f\in\mathbb{R}, $$

    must equal zero for all f∉[−W,W];

  2. The ratio

    $$ \frac{\int_{0}^{T} |x(t)|^{2} \mathrm{d} t}{\int_{-\infty}^{\infty} |x(t)|^{2} \mathrm{d} t} \ge 1-\eta $$

    for some η∈(0,1);

  3. The receiver maps y(t), t∈(−∞,∞), to a decoded message; and

  4. For some given δ>0, the following covertness condition must be satisfied:

    $${\kern45pt} \!\mathscr{D}\left(\left.Y_{-\infty}^{\infty}\right\|Z_{-\infty}^{\infty}\right) \le \delta. $$

Let M(W,T,ε,η,δ) be the largest possible value of \(|\mathcal {{M}}|\) such that the above conditions are satisfied and that the average probability of a decoding error for a uniformly chosen message is less than or equal to ε.

In the rest of this section, we assume that Z(·) has power spectral density (PSD) N(·) that is constant within the bandwidth of interest:

$${\kern55pt} \text{\usefont{U}{eur}{m}{n}\selectfont {N}}(\,f) = \frac{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}{2},\quad |\,f|\le \text{\usefont{U}{eur}{m}{n}\selectfont {W}}. $$

Remark 1

In some parts of this paper, e.g., the next theorem, W is allowed to grow to infinity. There is no single noise process that satisfies (8) for every finite W, because a random process having a PSD that is constant over the entire real line does not exist. In cases where W→∞, our setting should be understood in such a way that the noise process “adapts” itself to our choice of input bandwidth W. Although this formulation (that the noise process adapts to the input signal) has no physical meaning, it serves as a mathematically valid route to study the limit where W→∞, providing engineering insights into wideband scenarios.

The next theorem extends Theorem 5 of [3] to the continuous-time setting. Furthermore, it shows that BPSK is optimal up to the dominant term in total throughput. Here, by BPSK, we mean a signaling scheme where the symbols take values in the set {−a,a} for some constant a. It is clear from the proof in Section 2.4 that BPSK is also optimal in achieving the discrete-time result ([3], Theorem 5).

Theorem 1

If, for every finite W that we consider, the additive Gaussian noise has PSD satisfying (8), then, irrespective of the value of N0,

$${\kern15pt} {\lim}_{\eta\downarrow 0} {\lim}_{\epsilon\downarrow 0} {\lim}_{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \frac{\log \text{\usefont{U}{eur}{m}{n}\selectfont {M}}(\text{\usefont{U}{eur}{m}{n}\selectfont {W}},\text{\usefont{U}{eur}{m}{n}\selectfont {T}},\epsilon,\eta,\delta)}{\sqrt{2 \text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\delta}} = 1. $$

Furthermore, the limit (9) can be achieved by modulating a set of waveforms with BPSK.


Proof. See Sections 2.3 and 2.4. □

The next corollary follows immediately from Theorem 1.

Corollary 1

If W is a constant that does not depend on T, then

$${\kern10pt} {\lim}_{\eta\downarrow 0} {\lim}_{\epsilon\downarrow 0} {\lim}_{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \frac{\log \text{\usefont{U}{eur}{m}{n}\selectfont {M}}(\text{\usefont{U}{eur}{m}{n}\selectfont {W}},\text{\usefont{U}{eur}{m}{n}\selectfont {T}},\epsilon,\eta,\delta)}{\sqrt{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\delta}} = \sqrt{2\text{\usefont{U}{eur}{m}{n}\selectfont {W}}}. $$

Hence, not surprisingly, when the available bandwidth is fixed, the amount of information that can be covertly communicated over the continuous-time Gaussian channel is approximately proportional to \(\sqrt{T\delta}\).
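To make the scaling concrete, the leading term of Theorem 1 can be evaluated directly. The following is only a sketch: the function name `covert_nats_first_order` and the numerical values are illustrative assumptions, and the formula gives the dominant term of log M only, ignoring all lower-order terms and the limits in ε and η.

```python
import math

def covert_nats_first_order(bandwidth_hz: float, duration_s: float, delta: float) -> float:
    """Dominant term sqrt(2*W*T*delta) of log M from Theorem 1, in nats.

    This is an asymptotic first-order approximation, not an exact count.
    """
    if bandwidth_hz <= 0 or duration_s <= 0 or delta <= 0:
        raise ValueError("W, T and delta must all be positive")
    return math.sqrt(2.0 * bandwidth_hz * duration_s * delta)

# Hypothetical numbers: W = 1 MHz, delta = 0.1, T = 1 s versus T = 4 s.
base = covert_nats_first_order(1e6, 1.0, 0.1)
longer = covert_nats_first_order(1e6, 4.0, 0.1)
print(base, longer, longer / base)
```

With W fixed, quadrupling T only doubles the leading term, which is the square-root behavior stated in Corollary 1.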

In the rest of this section, we first review some mathematical tools (Section 2.1), which will allow us to reduce the continuous-time channel to a discrete-time one (Section 2.2). We then prove the converse (Section 2.3) and direct (Section 2.4) parts of Theorem 1. Some intermediate steps in the proof will be borrowed from [3].

Prolate spheroidal wave functions

We provide a very brief introduction to the prolate spheroidal wave functions (PSWFs), which are powerful tools in formal treatment of continuous-time Gaussian channels; for more details, we refer the reader to [9, 10, 16, 17] and references therein.

Given any W,T>0, there exists a countably infinite set of positive real numbers {λi} satisfying

$${\kern60pt} 1>\lambda_{1}>\lambda_{2}>\cdots $$

and a corresponding set of real functions \(\{{\psi }_{i}\colon \mathbb {R}\to \mathbb {R}\}_{i=1}^{\infty }\) such that the following properties are satisfied:

  • Each ψi(·) is band-limited to W Hz. Further, the functions {ψi(·)} are orthonormal on \(\mathbb {R}\), and complete in the space of square-integrable functions that are band-limited to W Hz;

  • The restrictions of {ψi(·)} to the time interval [ 0,T] are orthogonal:

    $$ \int_{0}^{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}} \psi_{i}(t) \psi_{j}(t) \mathrm{d} t = \begin{cases} \lambda_{i},&i=j,\\0,&i\neq j.\end{cases} $$

    Further, the restrictions of {ψi} to [ 0,T] are complete in the space of square-integrable functions on [ 0,T].

It was shown by Slepian [17] that the coefficients {λi} above satisfy the following: for any α∈(0,1) (not dependent on W and T), as WT→∞,

$$\begin{array}{*{20}l}{\kern60pt}\lambda_{2(1-\alpha) \text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}} \to 1 \end{array} $$
$$\begin{array}{*{20}l} \lambda_{2(1+\alpha) \text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}} \to 0.{\kern-2.3pt} \end{array} $$

Finally, for a zero-mean stationary Gaussian process Z(·) with PSD

$${\kern45pt} \text{\usefont{U}{eur}{m}{n}\selectfont {N}}(f) = \left\{\begin{array}{ll} \frac{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}{2}, &\quad |f|\le \text{\usefont{U}{eur}{m}{n}\selectfont {W}} \\ 0, &\quad |f|> \text{\usefont{U}{eur}{m}{n}\selectfont {W}}, \end{array}\right. $$

we have the following Karhunen-Loève expansion in terms of the above PSWFs:

$${\kern40pt} Z(t) = {\sum_{i=1}^{\infty}} \;{\hat{Z}_{i}} \psi_{i}(t),\quad t\in[\!0,{\text{\usefont{U}{eur}{m}{n}\selectfont{T}}}], $$

where \(\{\hat {Z}_{i}\}\) are independent and identically distributed (IID) Gaussian random variables of mean zero and variance N0/2.

Reduction to discrete time

Recall Condition 1 that the input signal x(t), \(t\in \mathbb {R}\), must be band-limited to W Hz. Since the set {ψi(·)} is complete in the space of square-integrable functions that are band-limited to W Hz, we can write X(·) as

$${\kern45pt} X(t) = \sum_{i=1}^{\infty} \hat{X}_{i} \psi_{i}(t),\quad t\in\mathbb{R} $$

for an infinite sequence of random variables \(\hat {X}_{1},\hat {X}_{2},\ldots \). Furthermore, the output Y(·) can be passed through an ideal low-pass filter of cut-off frequency W. Doing so does not affect X(·), but will change the PSD of Z(·) to the one given by (15), so that the resulting noise process will satisfy the expansion (16). We can then decompose the continuous-time channel into an infinite sequence of parallel discrete-time channels:

$${\kern45pt} \hat{Y}_{i} = \hat{X}_{i} + \hat{Z}_{i},\quad i =1,2,\ldots, $$

where \(\hat {Z}_{1},\hat {Z}_{2},\ldots \) are IID zero-mean Gaussian of variance N0/2. One can see that the above reduction is optimal for both the receiver (i.e., decoding) and for the warden (i.e., detection of communication). We can hence base both the converse and the direct parts of the proof of Theorem 1 on the channel (18).
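The parallel-channel model (18) is easy to simulate. The sketch below is an illustration under assumptions, not part of the proof: the helper name `simulate_parallel_channels`, the truncation to finitely many coefficients, and the chosen values of N0 and the sample size are all hypothetical.

```python
import random

def simulate_parallel_channels(x_hat, n0, rng):
    """Return y_hat with y_hat[i] = x_hat[i] + z_hat[i].

    The noise terms z_hat[i] are IID zero-mean Gaussian with variance n0/2,
    as in the discrete-time reduction. Only finitely many channels are simulated.
    """
    sigma = (n0 / 2.0) ** 0.5
    return [x + rng.gauss(0.0, sigma) for x in x_hat]

rng = random.Random(0)          # fixed seed so the run is reproducible
n0 = 2.0                        # assumed noise level, so the variance n0/2 equals 1
x_hat = [0.0] * 20000           # silent transmitter: the output is pure noise
y_hat = simulate_parallel_channels(x_hat, n0, rng)
sample_var = sum(y * y for y in y_hat) / len(y_hat)
print(sample_var)               # should be close to n0/2
```

When the transmitter is silent, the empirical second moment of the outputs is close to N0/2; covertness asks that the output statistics stay close to this baseline even when a codeword is sent.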

Converse part of Theorem 1

Consider a random code for the message set {1,…,M} that satisfies all aforementioned conditions. By a standard argument using Fano’s inequality and the chain rule of mutual information, we have

$${\kern35pt} (1-\epsilon) \log \text{\usefont{U}{eur}{m}{n}\selectfont {M}} - 1 \le \sum_{i=1}^{\infty} \!\text{\usefont{U}{eur}{m}{n}\selectfont \;{I}}(\hat{X}_{i} ; \hat{Y}_{i}). $$

Further, noting that, under a second-moment constraint, the Gaussian input maximizes mutual information over a Gaussian channel [18], we have, for every i,

$${\kern25pt} \!\text{\usefont{U}{eur}{m}{n}\selectfont {I}}(\hat{X}_{i};\hat{Y}_{i}) \le \frac{1}{2} \log \left(1+ \frac{2\rho_{i}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}\right) \le \frac{\rho_{i}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}, $$

where we defined

$${\kern35pt} \rho_{i} \triangleq \textsf{E}\left[|\hat{X}_{i}|{~}^{2}\right],\quad i=1,2,\ldots. $$

(The expectation is computed over the possibly random codebook and a uniformly chosen message.) Combining (19) and (20) yields

$${\kern35pt} (1-\epsilon) \log \text{\usefont{U}{eur}{m}{n}\selectfont {M}} - 1 \le \frac{1}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}} \sum_{i=1}^{\infty} \rho_{i}. $$
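As a numerical sanity check of the two bounds in (20), one can compare the Gaussian-input mutual information with its linear upper bound. This is a minimal sketch; the function name `gaussian_mi_bound` and the test powers are assumptions made for illustration.

```python
import math

def gaussian_mi_bound(rho: float, n0: float):
    """Return (0.5*log(1 + 2*rho/n0), rho/n0), the two right-hand sides of (20), in nats."""
    return 0.5 * math.log1p(2.0 * rho / n0), rho / n0

# Hypothetical per-channel powers rho with n0 = 1.
pairs = [gaussian_mi_bound(rho, 1.0) for rho in (1e-4, 1e-2, 1.0)]
for tight, loose in pairs:
    print(tight, loose, tight / loose)
```

The ratio of the two quantities approaches one as the power tends to zero, which is why the linear bound loses nothing in the low-power regime relevant to covert communication.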

To bound the right-hand side of (22), consider the covertness requirement. Recall that the processing on \(Y_{-\infty }^{\infty }\) described in Section 2.2 maps it to \(\hat {Y}_{1}^{\infty }\). Clearly, it also maps \(Z_{-\infty }^{\infty }\) to \(\hat {Z}_{1}^{\infty }\). Also recall that this reduction is optimal for the warden. Hence

$$\begin{array}{*{20}l}{\kern25pt} \!\mathscr{D}\left(\left.Y_{-\infty}^{\infty}\right\|Z_{-\infty}^{\infty}\right) = \!\mathscr{D} \left(\left. \hat{Y}_{1}^{\infty} \right\| \hat{Z}_{1}^{\infty} \right). \end{array} $$

It then follows by Condition 4 that

$$\begin{array}{*{20}l}{\kern15pt} \delta & \ge \!\mathscr{D} \left(\left. \hat{Y}_{1}^{\infty} \right\| \hat{Z}_{1}^{\infty} \right) \ge \sum_{i=1}^{\infty} \!\mathscr{D}\left(\left. \hat{Y}_{i}\,\right\| \hat{Z}_{i}\right), \end{array} $$

where the second inequality follows by the same steps as Eq. (13) of [3]. For a fixed second moment, the relative entropy on the right-hand side above is minimized by \(\hat{X}_{i}\) being zero-mean Gaussian ([3], Eq. (74)):

$${\kern15pt} \!\mathscr{D}\left(\left. \hat{Y}_{i}\,\right\| \hat{Z}_{i}\right) \ge \frac{\rho_{i}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}} - \frac{1}{2} \log \left(1+\frac{2\rho_{i}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}\right). $$
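The right-hand side of (25) is the relative entropy between two zero-mean Gaussians of variances N0/2+ρ and N0/2. A short sketch, assuming the hypothetical helper name `gaussian_kl_lower_bound`, shows that it behaves like ρ²/N0² for small ρ:

```python
import math

def gaussian_kl_lower_bound(rho: float, n0: float) -> float:
    """Evaluate rho/n0 - 0.5*log(1 + 2*rho/n0), the right-hand side of (25), in nats."""
    x = 2.0 * rho / n0
    # log1p keeps the subtraction accurate when x is small.
    return 0.5 * (x - math.log1p(x))

small = gaussian_kl_lower_bound(1e-3, 1.0)   # hypothetical values rho = 1e-3, n0 = 1
quadratic = (1e-3 / 1.0) ** 2                # leading term rho^2 / n0^2
print(small, quadratic)
```

The quadratic dependence on ρ is the source of the square-root law: a covertness budget δ spread over n channels permits a per-channel power of order \(\sqrt{\delta/n}\).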

Fix any α∈(0,1). We combine (24) and (25), drop all summands with i>⌊2(1+α)WT⌋ (note that they are nonnegative), and use the convexity of the right-hand side of (25) in \(\rho_{i}\) to obtain

$$\begin{array}{*{20}l} \delta &\ge \sum_{i=1}^{\lfloor 2(1+\alpha)WT \rfloor} \left(\frac{\rho_{i}}{N_{0}} - \frac{1}{2} \log \left(1+\frac{2\rho_{i}}{N_{0}}\right)\right) \\ &\ge \lfloor 2(1+\alpha)WT \rfloor \left(\frac{\bar{\rho}}{N_{0}} - \frac{1}{2} \log \left(1+\frac{2\bar{\rho}}{N_{0}}\right) \right), \end{array} $$

where we defined

$${\kern35pt} \bar{\rho} \triangleq \frac{1}{\lfloor 2(1+\alpha)\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}} \rfloor} \sum_{i=1}^{\lfloor 2(1+\alpha)\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}} \rfloor} \rho_{i}. $$

Recall (2), which together with (26) implies that \(\bar{\rho}\) must tend to zero as WT→∞. Furthermore, by applying the Taylor expansion

$${\kern25pt} \log \left(1+\frac{2\bar\rho}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}\right) = \frac{2\bar\rho}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}- \frac{2\bar\rho^{2}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}^{2}} + o (\bar\rho^{2}) $$

to (26), we see that \(\bar {\rho }\) must satisfy

$${\kern15pt} \bar{\rho} \le \sqrt{\frac{\delta}{\lfloor 2(1+\alpha)\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}} \rfloor}}\cdot \text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0} + o \left(\sqrt{\frac{\delta}{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}}}\right). $$
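The bound (29) can be checked numerically by solving the exact constraint (26) for the largest admissible average power and comparing it with the leading term \(N_{0}\sqrt{\delta/n}\), where n stands for ⌊2(1+α)WT⌋. The sketch below uses bisection; the function name `max_average_power` and the parameter values are assumptions for illustration.

```python
import math

def max_average_power(n_channels: int, delta: float, n0: float) -> float:
    """Largest rho_bar with n*(rho/n0 - 0.5*log(1 + 2*rho/n0)) <= delta, found by bisection."""
    def constraint(rho: float) -> float:
        x = 2.0 * rho / n0
        return n_channels * 0.5 * (x - math.log1p(x))

    lo, hi = 0.0, n0
    while constraint(hi) < delta:      # enlarge the bracket if needed
        hi *= 2.0
    for _ in range(200):               # constraint() is increasing in rho
        mid = 0.5 * (lo + hi)
        if constraint(mid) <= delta:
            lo = mid
        else:
            hi = mid
    return lo

n, delta, n0 = 1_000_000, 0.1, 1.0     # hypothetical values
exact = max_average_power(n, delta, n0)
leading = n0 * math.sqrt(delta / n)    # first term on the right-hand side of (29)
print(exact, leading)
```

For large n the two values agree up to a small relative correction, which is what the o(·) term in (29) accounts for.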

Next consider Condition 2. Since the functions ψi(·) have unit energy and satisfy (12), Condition 2 requires

$${\kern60pt} \frac{\sum_{i=1}^{\infty} \lambda_{i} \rho_{i}}{\sum_{i=1}^{\infty} \rho_{i}} \ge 1-\eta. $$

Further using (14), we have the following requirement:

$$\begin{array}{*{20}l}{\kern25pt} 1-\eta & \le \varliminf_{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \frac{\sum_{i=1}^{\lfloor 2(1+\alpha)\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}} \rfloor} \rho_{i}}{\sum_{i=1}^{\infty} \rho_{i}} \\ & = \varliminf_{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \frac{\lfloor 2(1+\alpha)\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}} \rfloor \bar{\rho}}{\sum_{i=1}^{\infty} \rho_{i}}. \end{array} $$

Together with (22), this implies

$$\begin{array}{*{20}l} & (1-\epsilon) \varlimsup_{WT\to\infty} \frac{\log M}{\sqrt{2WT\delta}}\\ & \quad \le \frac{1}{N_{0}} \cdot \varlimsup_{WT\to\infty} \frac{\sum_{i=1}^{\infty} \rho_{i}}{\sqrt{2WT\delta}}\\ & \quad = \frac{1}{N_{0}} \cdot \varlimsup_{WT\to\infty} \frac{\sum_{i=1}^{\infty} \rho_{i}}{\lfloor 2(1+\alpha)WT\rfloor \bar{\rho}} \cdot \frac{\lfloor 2(1+\alpha)WT \rfloor \bar{\rho}}{\sqrt{2WT\delta}}\\ & \quad \le \frac{1}{(1-\eta)N_{0}} \varlimsup_{WT\to\infty} \frac{\lfloor 2(1+\alpha)WT \rfloor \bar{\rho}}{\sqrt{2WT\delta}}\\ & \quad = \frac{1+\alpha}{(1-\eta)N_{0}} \varlimsup_{WT\to\infty} \sqrt{\frac{2WT}{\delta}} \cdot \bar{\rho} \\ & \quad \le \frac{1+\alpha}{1-\eta} \lim_{WT\to\infty} \sqrt{\frac{2WT}{\delta}}\sqrt{\frac{\delta}{\lfloor 2(1+\alpha)WT\rfloor}} \\ & \quad = \frac{\sqrt{1+\alpha}}{1-\eta}, \end{array} $$

where the second-to-last line follows by (29). Letting ε, η, and α go to zero in the above yields the desired converse result.

Direct part of Theorem 1

Fix some α,γ∈(0,1), both of which will be chosen to be close to zero later. We use the first ⌊2(1−α)WT⌋ channels in (18) to communicate and discard the remaining channels. On these channels, we generate a random codebook by picking every entry in every codeword independently and equally likely to be a or −a, where

$${\kern20pt} a \triangleq (1-\gamma) \sqrt{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}} \cdot \left(\frac{\delta}{\lfloor 2(1-\alpha)\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}} \rfloor}\right)^{\frac{1}{4}}. $$

(Every \(\hat{X}_{i}\) with i>⌊2(1−α)WT⌋ is chosen to be zero with probability one.) The bandwidth constraint is obviously satisfied. The time-limit constraint is also satisfied for any η>0 when WT is sufficiently large, by virtue of (13). For covertness, we have

$$\begin{array}{*{20}l} \!\mathscr{D}(Y_{-\infty}^{\infty}\|Z_{-\infty}^{\infty}) & = \!\mathscr{D}\left(\left. \hat{Y}_{1}^{\lfloor 2(1-\alpha)\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}} \rfloor} \right\| \hat{Z}_{1}^{\lfloor 2(1-\alpha)\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}} \rfloor} \right) \\ & = \sum_{i=1}^{\lfloor 2(1-\alpha)\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}} \rfloor} \!\mathscr{D}(\hat{Y}_{i} \| \hat{Z}_{i}) \\ & = \lfloor 2 (1-\alpha) \text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}} \rfloor \cdot \!\mathscr{D}(\hat{Y}\| \hat{Z}), \end{array} $$

where we dropped the subscript in the last line, as all pairs \((\hat{X}_{i},\hat{Y}_{i})\), i=1,…,⌊2(1−α)WT⌋, have the same joint distribution. Note that, by our choice of \(\hat{X}\), the random variable \(\hat{Y}\) has the following probability density function (PDF):

$$ f(y) = \frac{1}{\sqrt{\pi N_{0}}} \left(\frac{1}{2}e^{-\frac{(y-a)^{2}}{N_{0}}} + \frac{1}{2} e^{-\frac{(y+a)^{2}}{N_{0}}}\right),\quad y\in\mathbb{R}. $$

The PDF of \(\hat{Z}\) is that of a zero-mean Gaussian of variance \(N_{0}/2\):

$$ g(z) = \frac{1}{\sqrt{\pi N_{0}}} e^{-\frac{z^{2}}{N_{0}}},\quad z\in\mathbb{R}. $$

We hence have

$$\begin{array}{*{20}l} \mathscr{D}(\hat{Y}\| \hat{Z}) & = \int_{-\infty}^{\infty} f(y) \log \frac{\frac{1}{2}e^{-\frac{(y-a)^{2}}{N_{0}}} + \frac{1}{2} e^{-\frac{(y+a)^{2}}{N_{0}}}}{e^{-\frac{y^{2}}{N_{0}}}} \,\mathrm{d} y\\ & = -\frac{a^{2}}{N_{0}} + \int_{-\infty}^{\infty} f(y) \log\left(\frac{1}{2}e^{\frac{2ay}{N_{0}}} + \frac{1}{2} e^{-\frac{2ay}{N_{0}}}\right) \mathrm{d} y. \end{array} $$

For the integrand above, we have the following upper bound:

$${} \log\left(\frac{1}{2} e^{\frac{2ay}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}} + \frac{1}{2} e^{-\frac{2ay}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}}\right) \le \frac{2a^{2}y^{2}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}^{2}} - \frac{4a^{4}y^{4}}{3\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}^{4}} + \frac{64 a^{6}y^{6}}{45 \text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}^{6}}. $$

Further note that, by (35), the second moment of \(\hat {Y}\) is \(\left (a^{2}+\frac {\text {\usefont {U}{eur}{m}{n}\selectfont {N}}_{0}}{2}\right)\), the fourth moment is \(\left (a^{4}+3a^{2}\text {\usefont {U}{eur}{m}{n}\selectfont {N}}_{0} + \frac {3\text {\usefont {U}{eur}{m}{n}\selectfont {N}}_{0}^{2}}{4}\right)\), and the sixth moment is finite. We can thus continue (37) as

$$\begin{array}{*{20}l}{} \!\mathscr{D}(\hat{Y}\| \hat{Z}) & \le - \frac{a^{2}}{\text{\usefont{U}{eur}{m}{n}\selectfont{N}}_{0}} + \frac{2a^{2}}{\text{\usefont{U}{eur}{m}{n}\selectfont{N}}_{0}^{2}} \left(a^2+\frac{\text{\usefont{U}{eur}{m}{n}\selectfont{N}}_{0}}{2}\right) \\ & \quad{} - \frac{4a^{4}}{3\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}^{4}} \left(a^4+3a^{2}\text{\usefont{U}{eur}{m}{n}\selectfont{N}}_{0} + \frac{3\text{\usefont{U}{eur}{m}{n}\selectfont{N}}_{0}^{2}}{4}\right) + o\left(a^{4}\right) \\* & = \frac{a^{4}}{\text{\usefont{U}{eur}{m}{n}\selectfont{N}}_{0}^{2}} + o\left(a^{4}\right) \\* & = (1-\gamma)^{4} \cdot \frac{\delta}{\lfloor 2 (1-\alpha) \text{\usefont{U}{eur}{m}{n}\selectfont{W}}\text{\usefont{U}{eur}{m}{n}\selectfont{T}} \rfloor} + o\left(\frac{\delta}{\text{\usefont{U}{eur}{m}{n}\selectfont{W}}\text{\usefont{U}{eur}{m}{n}\selectfont{T}}}\right). \end{array} $$

Combining (34) and (39), we get

$${\kern25pt} \!\mathscr{D}(Y_{-\infty}^{\infty} \| Z_{-\infty}^{\infty}) \le (1-\gamma)^{4} \delta + o(\delta), $$

which is smaller than δ for large enough WT.
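The covertness calculation for BPSK can be verified numerically. The sketch below computes the amplitude a from (33), integrates the per-channel relative entropy with Simpson's rule, and compares it with the leading term a⁴/N0². Everything here is an illustrative assumption rather than the paper's procedure: the function names, the integration range and step count, and the parameter values (n stands for ⌊2(1−α)WT⌋). The density of the output is normalized as a Gaussian mixture with component variance N0/2.

```python
import math

def bpsk_amplitude(n_channels: int, delta: float, n0: float, gamma: float) -> float:
    """Amplitude a = (1-gamma) * sqrt(n0) * (delta/n)^(1/4), as in (33)."""
    return (1.0 - gamma) * math.sqrt(n0) * (delta / n_channels) ** 0.25

def bpsk_kl_per_channel(a: float, n0: float, half_width: float = 12.0, steps: int = 20000) -> float:
    """Numerically integrate f(y) * log(f(y)/g(y)) over a truncated range (Simpson's rule)."""
    sigma = math.sqrt(n0 / 2.0)
    lim = a + half_width * sigma          # tails beyond this are negligible
    h = 2.0 * lim / steps                 # steps must be even for Simpson's rule

    def integrand(y: float) -> float:
        t = abs(2.0 * a * y / n0)
        # log(f/g) = log cosh(2ay/n0) - a^2/n0, with a numerically stable log cosh
        log_ratio = t + math.log1p(math.exp(-2.0 * t)) - math.log(2.0) - a * a / n0
        f = 0.5 * (math.exp(-(y - a) ** 2 / n0) + math.exp(-(y + a) ** 2 / n0))
        return f / math.sqrt(math.pi * n0) * log_ratio

    total = integrand(-lim) + integrand(lim)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(-lim + k * h)
    return total * h / 3.0

n_channels, delta, n0, gamma = 10_000, 0.1, 1.0, 0.1   # hypothetical values
a = bpsk_amplitude(n_channels, delta, n0, gamma)
kl = bpsk_kl_per_channel(a, n0)
total_kl = n_channels * kl
print(kl, a ** 4 / n0 ** 2, total_kl)
```

For these values the per-channel relative entropy is within a few percent of a⁴/N0², and the total over all channels stays below δ, in line with (39) and (40).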

We now analyze how large M can be while the decoder still decodes correctly with high probability. To this end, like [3], we use an information-spectrum result [19, 20], which guarantees that a sequence of codes can have vanishing error probability provided

$$\begin{array}{*{20}l} &{\lim}_{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}} \to\infty} \frac{\log \text{\usefont{U}{eur}{m}{n}\selectfont {M}}}{\sqrt{2\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\delta}}\\ & \qquad\le p\,\text{-}\liminf_{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \frac{1}{\sqrt{2\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\delta}} \log \frac{p(\hat{Y}_{1}^n|\hat{X}_{1}^n)}{p(\hat{Y}_{1}^n)}\\ &\qquad= p\,\text{-}\liminf_{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \frac{\lfloor 2 (1-\alpha) \text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}} \rfloor}{\sqrt{2\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\delta}} \log\frac{p(\hat{Y}|\hat{X})}{p(\hat{Y})}, \end{array} $$

where \(p\,\text {-}\liminf \) denotes the limit inferior in probability, namely, the largest number such that the probability that the random variable under consideration exceeds it tends to one in the limit, and where, with a slight abuse of notation, p(·) and p(·|·) denote the PDF and conditional PDF of the corresponding random variables or random vectors. Recall that \(p(\hat {Y})=f(\hat {Y})\), where f(·) is given by (35), while given \(\hat {X}=x\), \(\hat {Y}\) is Gaussian with mean x and variance \(\frac {\text {\usefont {U}{eur}{m}{n}\selectfont {N}}_{0}}{2}\). Also recall that x equals either a or −a. We hence have

$$\begin{array}{*{20}l}{\kern25pt} \frac{p(\hat{Y}|\hat{X})}{p(\hat{Y})} & = \frac{ e^{-\frac{(\hat{Y}-\hat{X})^{2}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}}}{ \frac{1}{2} e^{-\frac{(\hat{Y}-a)^{2}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}} + \frac{1}{2} e^{-\frac{(\hat{Y}+a)^{2}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}}}\\ & = \frac{e^{\frac{2\hat{X}\hat{Y}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}}}{\frac{1}{2} e^{\frac{2a\hat{Y}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}} + \frac{1}{2}e^{-\frac{2a\hat{Y}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}}}. \end{array} $$

Using the bound


$${\kern25pt} \log \left(\frac{1}{2} e^{\frac{2a\hat{Y}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}} + \frac{1}{2}e^{-\frac{2a\hat{Y}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}}\right) \le \frac{2a^{2}\hat{Y}^{2}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}^{2}}, $$

we obtain

$$\begin{array}{*{20}l}{\kern35pt} \log\frac{p(\hat{Y}|\hat{X})}{p(\hat{Y})} & \ge \frac{2}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}} {\hat{X}\hat{Y}} - \frac{2a^{2}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}^{2}} \hat{Y}^2. \end{array} $$

Noting that

$$\begin{array}{*{20}l}{\kern55pt} \textsf{E}\left[\hat{X}\hat{Y}\right] & = a^{2} \end{array} $$
$$\begin{array}{*{20}l} \textsf{E}\left[\hat{Y}^{2}\right] & = a^{2} + \frac{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}{2}, \end{array} $$

we can compute the expected value of the right-hand side of (44) to be

$${\kern25pt} \textsf{E}\left[\frac{2}{\text{\usefont{U}{eur}{m}{n}\selectfont{N}}_0} {\hat{X}\hat{Y}} - \frac{2a^{2}}{\text{\usefont{U}{eur}{m}{n}\selectfont{N}}_{0}^{2}} \hat{Y}^{2}\right] = \frac{a^{2}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}} - \frac{2a^{4}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}^{2}}. $$

Similarly, the variance can be shown to be

$${\kern15pt} \textsf{Var}\left[{\frac{2}{\text{\usefont{U}{eur}{m}{n}\selectfont{N}}_0} {\hat{X}\hat{Y}} - \frac{2a^{2}}{\text{\usefont{U}{eur}{m}{n}\selectfont{N}}_{0}^{2}} \hat{Y}^{2}}\right] = \frac{2a^{2}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}} + o\left(a^{2}\right). $$

Using (44), (47), (48), and Chebyshev’s inequality, we obtain

$$\begin{array}{*{20}l} &p\,\text{-}\liminf_{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \frac{\lfloor 2 (1-\alpha) \text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}} \rfloor}{\sqrt{2\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\delta}} \log\frac{p(\hat{Y}|\hat{X})}{p(\hat{Y})}\\ \quad &\quad \ge p\,\text{-}\liminf_{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \frac{\lfloor 2 (1-\alpha) \text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}} \rfloor}{\sqrt{2\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\delta}} \left(\frac{2}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}} {\hat{X}\hat{Y}} - \frac{2a^{2}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}^{2}} \hat{Y}^{2} \right) \\ &\quad = {\lim}_{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \frac{\lfloor 2 (1-\alpha) \text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}} \rfloor}{\sqrt{2\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\delta}} \cdot \frac{a^{2}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}\\* &\quad = {(1-\gamma)^{2}}{\sqrt{1-\alpha}}. \end{array} $$

Recalling (41), the proof is completed when we bring both γ and α to zero.
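The limit in (49) can also be illustrated numerically. The sketch below assumes, consistently with the last two lines of (39), that the BPSK amplitude satisfies \(a^{2} = (1-\gamma)^{2} N_{0} \sqrt{\delta/n}\) with \(n = \lfloor 2(1-\alpha)WT \rfloor\). The function name and the parameter values are hypothetical.

```python
import math

def normalized_covert_nats(wt, delta, alpha, gamma, n0=1.0):
    """n * E[lower bound on information density] / sqrt(2 WT delta).

    Assumes a^2 = (1 - gamma)^2 * n0 * sqrt(delta / n), which is the choice
    implied by (39); this is an assumption of the sketch.
    """
    n = math.floor(2.0 * (1.0 - alpha) * wt)
    a2 = (1.0 - gamma) ** 2 * n0 * math.sqrt(delta / n)
    # Expected value from (47): a^2/N0 - 2 a^4/N0^2
    mean_per_use = a2 / n0 - 2.0 * a2 ** 2 / n0 ** 2
    return n * mean_per_use / math.sqrt(2.0 * wt * delta)

alpha, gamma, delta = 0.01, 0.01, 0.1
target = (1.0 - gamma) ** 2 * math.sqrt(1.0 - alpha)
for wt in (1e2, 1e4, 1e6, 1e8):
    value = normalized_covert_nats(wt, delta, alpha, gamma)
    print(f"WT={wt:.0e}: normalized nats={value:.5f}, limit={target:.5f}")
```

As WT grows, the normalized value should approach \((1-\gamma)^{2}\sqrt{1-\alpha}\), which tends to one when γ and α are brought to zero.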

Exploring infinite bandwidth

In this section, we study cases where the available bandwidth grows without bound. Section 3.1 considers the scenario where the additive white Gaussian noise also has unlimited bandwidth. To this end, we assume the noise PSD is constant over a finite bandwidth W and then let W grow to infinity, either together with T or with T held fixed; recall Remark 1. Section 3.2 considers the case where the noise PSD is constant within a certain bandwidth and zero elsewhere (while no bandwidth limit is imposed on the input). Both these sections directly use Theorem 1 to obtain the desired results. Finally, Section 3.3 considers colored noise, where our analysis is less rigorous and does not lead to an explicit coding theorem; we state our findings there as “conjectures”.

White noise with unbounded bandwidth

Corollary 2

Assume that the noise process satisfies (8) for every finite W that we consider. If the limit

$${\kern65pt} {\lim}_{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \frac{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\delta}{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}} = c $$

exists, where c may equal infinity, then the per-second covert communication capacity is

$$ {\lim}_{\eta\downarrow 0}{\lim}_{\epsilon\downarrow 0} {\lim}_{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \frac{\log \text{\usefont{U}{eur}{m}{n}\selectfont {M}}(\text{\usefont{U}{eur}{m}{n}\selectfont {W}},\text{\usefont{U}{eur}{m}{n}\selectfont {T}},\epsilon,\eta,\delta)}{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}} = \sqrt{2c}. $$

We note the following. If W grows large more slowly than T, then the per-second capacity for covert communication is zero for any finite δ, as in the finite-bandwidth case. If W grows large linearly with T, then a positive covert communication rate is achievable as long as δ is positive and bounded away from zero. If W grows large faster than T, i.e., if W/T→∞ as WT→∞, then there exists some δ that tends to zero slowly enough when WT→∞, for which the largest per-second covert communication rate is infinity.
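The three regimes can be made concrete with a small sketch. It evaluates \(\sqrt{2W\delta/T}\), the finite-(W,T) quantity whose limit is \(\sqrt{2c}\) in Corollary 2, for three hypothetical growth laws of W. The function name, the value of δ, and the growth laws are illustrative assumptions.

```python
import math

def per_second_rate(w, t, delta):
    """sqrt(2 W delta / T), in nats per second (finite-W,T proxy for sqrt(2c))."""
    return math.sqrt(2.0 * w * delta / t)

delta = 0.1  # held fixed here; Corollary 2 also allows delta to vary with WT
for t in (1e2, 1e4, 1e6):
    slow = per_second_rate(math.sqrt(t), t, delta)   # W = sqrt(T): sublinear
    linear = per_second_rate(3.0 * t, t, delta)      # W = 3T: linear
    fast = per_second_rate(t ** 2, t, delta)         # W = T^2: superlinear
    print(f"T={t:.0e}: slow={slow:.4f}, linear={linear:.4f}, fast={fast:.1f}")
```

With these choices, the first column shrinks toward zero, the second stays constant, and the third grows without bound as T increases.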

Next, consider an average-power constraint P that is imposed on the input: every input signal must satisfy

$${\kern65pt} \frac{1}{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}} \cdot \int_{-\infty}^{\infty} x(t)^{2} \mathrm{d} t \le \text{\usefont{U}{eur}{m}{n}\selectfont {P}}. $$

As the next theorem shows, when W grows sufficiently fast, the covert communication capacity equals the capacity of the infinite-bandwidth Gaussian channel under the same power constraint. In other words, the covertness requirement has no effect on capacity in this case.

Theorem 2

If

$${\kern65pt} \frac{\text{\usefont{U}{eur}{m}{n}\selectfont {P}}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}} \le \sqrt{2c}, $$

where c is given in (50), then

$${\kern10pt} {\lim}_{\eta\downarrow 0}{\lim}_{\epsilon\downarrow 0} {\lim}_{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \frac{\log \text{\usefont{U}{eur}{m}{n}\selectfont {M}}(\text{\usefont{U}{eur}{m}{n}\selectfont {W}},\text{\usefont{U}{eur}{m}{n}\selectfont {T}},\epsilon,\eta,\delta)}{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}} = \frac{\text{\usefont{U}{eur}{m}{n}\selectfont {P}}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}. $$


Proof

The converse holds because the per-second capacity cannot exceed the right-hand side of (54) even without a covertness constraint, and even when the power constraint (52) is imposed only on the average over all codewords rather than on every codeword; see [9].

The achievability proof is similar to that of Theorem 1 given in Section 2.4, but, instead of (33), we choose

$${\kern65pt} a \triangleq \sqrt{\frac{\text{\usefont{U}{eur}{m}{n}\selectfont {P}}}{2\text{\usefont{U}{eur}{m}{n}\selectfont {W}}}}. $$

The power constraint is satisfied: every x(·) satisfies

$${\kern20pt} \int_{-\infty}^{\infty} x(t)^{2} \mathrm{d} t = \lfloor 2(1-\alpha)\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\rfloor \cdot a^{2} \le \text{\usefont{U}{eur}{m}{n}\selectfont {P}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}. $$

For covertness, instead of (39), we now have

$${\kern35pt} \!\mathscr{D}(\hat{Y}\| \hat{Z}) \le \frac{\text{\usefont{U}{eur}{m}{n}\selectfont {P}}^{2}}{4\text{\usefont{U}{eur}{m}{n}\selectfont {W}}^{2} \text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}^{2}} + o(\text{\usefont{U}{eur}{m}{n}\selectfont {W}}^{-2}), $$

so (40) becomes

$$\begin{array}{*{20}l} \!\mathscr{D}(Y_{-\infty}^{\infty}\|Z_{-\infty}^{\infty}) & \le \frac{\lfloor 2(1-\alpha)\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\rfloor \text{\usefont{U}{eur}{m}{n}\selectfont {P}}^{2}}{4\text{\usefont{U}{eur}{m}{n}\selectfont {W}}^{2}\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}^{2}} + o \left(\frac{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}}{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}}\right)\\* & \le (1-\alpha) \frac{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}}{2\text{\usefont{U}{eur}{m}{n}\selectfont {W}}}\cdot \frac{\text{\usefont{U}{eur}{m}{n}\selectfont {P}}^{2}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}^{2}} + o \left(\frac{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}}{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}}\right). \end{array} $$

The right-hand side of (58) is less than δ when WT is large enough, by (50) and (53). Thus, we conclude that the covertness condition is satisfied for large enough WT.

Using the information-spectrum method, we know that a sequence of codes can have vanishing probability of decoding error if

$$\begin{array}{*{20}l}{} {\lim}_{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \frac{\log \text{\usefont{U}{eur}{m}{n}\selectfont {M}}}{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}} & \le p\,\text{-} \liminf_{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \frac{1}{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}} \log \frac{p(\hat{Y}_{1}^n|\hat{X}_{1}^n)}{p(\hat{Y}_{1}^n)}\\* & = p\,\text{-} \liminf_{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \frac{\lfloor 2(1-\alpha)\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\rfloor}{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}} \log \frac{p(\hat{Y}|\hat{X})}{p(\hat{Y})}.\\* \ \end{array} $$

In place of (49), we now have

$$ p\,\text{-} \liminf_{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \frac{\lfloor 2(1-\alpha)\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\rfloor}{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}} \log \frac{p(\hat{Y}|\hat{X})}{p(\hat{Y})} \ge (1-\alpha) \frac{\text{\usefont{U}{eur}{m}{n}\selectfont {P}}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}. $$

The proof is completed by letting α↓0. □
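The three steps of this achievability argument (energy, covertness, rate) can be checked for concrete numbers. The following sketch uses only the leading-order expressions from (55) to (60) and from (47). The function name and all parameter values are assumptions chosen so that \(P/N_{0} \le \sqrt{2c}\) holds with W = 10T and δ = 0.1.

```python
import math

def theorem2_quantities(w, t, p, n0, alpha):
    """Leading-order energy, divergence and rate for a^2 = P / (2W)."""
    n = math.floor(2.0 * (1.0 - alpha) * w * t)
    a2 = p / (2.0 * w)                      # squared amplitude, as in (55)
    energy = n * a2                         # left-hand side of (56)
    divergence = n * a2 ** 2 / n0 ** 2      # leading term of (58)
    # n times the mean in (47), divided by T: approximates the rate bound (60)
    rate = n * (a2 / n0 - 2.0 * a2 ** 2 / n0 ** 2) / t
    return energy, divergence, rate

p, n0, alpha, delta = 1.0, 1.0, 0.01, 0.1
t = 1e3
w = 10.0 * t  # hypothetical growth law, so that W * delta / T = 1
energy, divergence, rate = theorem2_quantities(w, t, p, n0, alpha)
print(f"energy={energy:.1f} (limit PT={p * t:.1f})")
print(f"divergence~{divergence:.4f} (delta={delta})")
print(f"rate~{rate:.4f} nats/s, (1-alpha)P/N0={(1 - alpha) * p / n0:.4f}")
```

In this example the energy stays below PT, the leading divergence term stays below δ, and the rate is close to \((1-\alpha)P/N_{0}\).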

White noise with finite bandwidth

In this section, we assume that W=∞, i.e., there is no bandwidth constraint at all on the input (as opposed to letting W→∞ together with T as in Section 3.1). But we assume that the additive Gaussian noise is band-limited and has PSD

$${\kern50pt} \text{\usefont{U}{eur}{m}{n}\selectfont {N}}(f) =\left\{\begin{array}{ll} \frac{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}{2}, & |f| \le \text{\usefont{U}{eur}{m}{n}\selectfont {W}}_{0} \\ 0, & |f|>\text{\usefont{U}{eur}{m}{n}\selectfont {W}}_{0},\end{array}\right. $$

where W0 is some constant that depends on neither T nor δ. Not surprisingly, the input bandwidth will be effectively limited by the noise bandwidth W0.

Corollary 3

If the additive Gaussian noise has PSD given by (61), then

$$ {\lim}_{\eta\downarrow 0}{\lim}_{\epsilon\downarrow 0} {\lim}_{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \frac{\log \text{\usefont{U}{eur}{m}{n}\selectfont {M}}(\infty,\text{\usefont{U}{eur}{m}{n}\selectfont {T}},\epsilon,\eta,\delta)}{\sqrt{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\delta}} = \sqrt{2\text{\usefont{U}{eur}{m}{n}\selectfont {W}}_{0}}. $$


Proof

If the input signal has nonzero energy outside the frequency range [−W0,W0], then \(\mathscr {D}(Y_{-\infty }^{\infty }\|Z_{-\infty }^{\infty })\) will be infinity, violating the covertness constraint for any finite δ; this can be seen either using Theorem 10.5.1 of [13] or by noting that Z(·) has an orthonormal expansion in the PSWFs for W0 and T, but X(·) and Y(·) do not. Hence, the input frequency must be restricted to [−W0,W0]. The claim then follows by replacing W with W0 in Theorem 1. □

Remark 2

In the context of Corollary 3, one could restrict the receiver to only observing y(t), t∈[0,T], without affecting (62). This is because the receiver can perform the reduction to discrete time as in Section 2.2 merely using its observation on [0,T], thanks to the orthogonality property (12). (This would not be possible if the noise were not band-limited, because the receiver would then need to first use an ideal low-pass filter, which cannot operate on the finite interval [0,T].)

Note that, in this slightly different setting, we still allow the warden to observe the entire real line. As we shall see in Section 4, restricting the warden’s observation to [ 0,T] will result in serious artifacts.

Colored noise

We shall not study colored noise with the same rigor as we studied white noise, due to technical difficulties. Instead, we turn to formulas for mutual information and relative entropy of continuous-time Gaussian processes in terms of PSD [13]. These formulas provide useful engineering insights, but are not sufficient to prove rigorous coding theorems like Theorem 1. Our findings below therefore remain conjectures.

We first consider the scenario where the additive noise occupies infinite bandwidth, but has finite energy. (Unlike white noise as considered in Section 3.1, colored noise can have infinite bandwidth.) Formally, let the stationary Gaussian noise Z(t), \(t\in \mathbb {R}\), have PSD N(f), \(f\in \mathbb {R}\), that is positive for all \(f\in \mathbb {R}\) and symmetric around f=0, and satisfies

$${\kern65pt} \int_{-\infty}^{\infty} \text{\usefont{U}{eur}{m}{n}\selectfont {N}}(f) \ \mathrm{d} f < \infty. $$

We choose the input signal X(t), \(t\in \mathbb {R}\), to be generated from a stationary Gaussian process with PSD

$${\kern35pt} \text{\usefont{U}{eur}{m}{n}\selectfont {S}}(f) = \left\{\begin{array}{ll} \beta \text{\usefont{U}{eur}{m}{n}\selectfont {N}}(f),& f\in\,[\!-\text{\usefont{U}{eur}{m}{n}\selectfont{W}},\text{\usefont{U}{eur}{m}{n}\selectfont{W}}]\\ 0,&\text{otherwise,}\end{array}\right. $$

where β and W will be specified later. (Recall that there is no bandwidth constraint on the input, so W can be arbitrarily large.) We then have (Theorem 10.5.1 of [13])

$$\begin{array}{*{20}l} &{\!\mathscr{D}(Y_{-\infty}^{\infty} \| Z_{-\infty}^{\infty})}\\ \qquad & = \text{\usefont{U}{eur}{m}{n}\selectfont {T}}\cdot \frac{1}{2}\int_{-\text{\usefont{U}{eur}{m}{n}\selectfont {W}}}^{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}} \left(\frac{\text{\usefont{U}{eur}{m}{n}\selectfont {S}}(f)}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}(f)} - \log \left(1+\frac{\text{\usefont{U}{eur}{m}{n}\selectfont {S}}(f)}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}(f)}\right)\right)\!\mathrm{d} f \\ & \le \text{\usefont{U}{eur}{m}{n}\selectfont {T}} \cdot \frac{1}{4} \int_{-\text{\usefont{U}{eur}{m}{n}\selectfont {W}}}^{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}} \left(\frac{\text{\usefont{U}{eur}{m}{n}\selectfont {S}}(f)}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}(f)}\right)^{2} \!\mathrm{d} f \\ & = \frac{\beta^{2} \text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}}{2}. \end{array} $$

Hence, the covertness condition is satisfied if we choose

$$\begin{array}{@{}rcl@{}} \beta = \sqrt{\frac{\delta}{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}}}. \end{array} $$

With this choice, we can compute the mutual information using ([13], Theorem 10.3.1):

$$\begin{array}{*{20}l} \!\text{\usefont{U}{eur}{m}{n}\selectfont {I}}(X_{-\infty}^{\infty}; Y_{-\infty}^{\infty}) & = \text{\usefont{U}{eur}{m}{n}\selectfont {T}} \cdot \int_{-\text{\usefont{U}{eur}{m}{n}\selectfont {W}}}^{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}} \frac{1}{2} \log \left(1+\frac{\text{\usefont{U}{eur}{m}{n}\selectfont {S}}(f)}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}(f)}\right) \!\mathrm{d} f\\* & = \text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}} \log \left(1+ \beta\right)\\* & \approx \sqrt{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\delta}. \end{array} $$

Since W can be chosen arbitrarily large, we are allowed to choose it as a function of T such that

$${\kern65pt} {\lim}_{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\to\infty} \frac{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}\delta}{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}} = \infty, $$

in which case the per-second mutual information, given by (67) divided by T, will grow to infinity as T grows large. Also note that the power in the above-chosen input signal is given by

$${\kern35pt} \beta\cdot \int_{-\text{\usefont{U}{eur}{m}{n}\selectfont {W}}}^{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}} \text{\usefont{U}{eur}{m}{n}\selectfont {N}}(f)\ \mathrm{d} f \le \beta\cdot \int_{-\infty}^{\infty} \text{\usefont{U}{eur}{m}{n}\selectfont {N}}(f)\ \mathrm{d} f. $$

For any δ (which may be a function of T), one can choose W to grow fast enough so that β as defined in (66) tends to zero as T→∞. This will ensure that (69) tends to zero as T grows large, namely, that the input power vanishes as T grows large.

To summarize, we make the following conjecture.

Conjecture 1

If the Gaussian noise process has PSD N(f) that is positive on the entire real line, then the per-second covert communication capacity without bandwidth constraint on the input is infinity. Furthermore, this holds irrespective of whether an average-power constraint is imposed on the input or not.

We next consider the case where the noise is band-limited:

$${\kern35pt} \text{\usefont{U}{eur}{m}{n}\selectfont {N}}(f) > 0 \quad \Longleftrightarrow\quad |f|\le \text{\usefont{U}{eur}{m}{n}\selectfont {W}}_{0}, $$

where W0 is a constant that does not depend on T. Assume that the input follows some stationary Gaussian process. Note that, if the input PSD S(f) is positive on any interval where N(f)=0, then \(\mathscr {D}(Y_{-\infty }^{\infty }\| Z_{-\infty }^{\infty })\) will be infinity. Hence, the input process must also be limited to the frequencies in [−W0,W0]. Let

$$ \lambda(f) \triangleq \text{\usefont{U}{eur}{m}{n}\selectfont {S}}(f)/\text{\usefont{U}{eur}{m}{n}\selectfont {N}}(f), \qquad f\in[-\text{\usefont{U}{eur}{m}{n}\selectfont{W}}_{0},\text{\usefont{U}{eur}{m}{n}\selectfont{W}}_{0}]. $$

By the covertness condition (7) and by Theorem 10.5.1 of [13], we require

$$\begin{array}{*{20}l}{\kern15pt} \delta & \ge \!\mathscr{D}(Y_{-\infty}^{\infty} \| Z_{-\infty}^{\infty}) \\ & = \text{\usefont{U}{eur}{m}{n}\selectfont {T}} \cdot \frac{1}{2} \int_{-\text{\usefont{U}{eur}{m}{n}\selectfont {W}}_{0}}^{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}_{0}} \left(\lambda(f) - \log (1+\lambda(f))\right) \!\mathrm{d} f. \end{array} $$

The integrand is convex in λ(f), so, by Jensen's inequality, we obtain

$${\kern55pt} \bar{\lambda} - \log (1+\bar{\lambda}) \le \frac{\delta}{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}_{0}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}} $$

where


$${\kern45pt} \bar{\lambda} \triangleq \frac{1}{2\text{\usefont{U}{eur}{m}{n}\selectfont {W}}_{0}} \int_{-\text{\usefont{U}{eur}{m}{n}\selectfont {W}}_{0}}^{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}_{0}} \lambda(f)\mathrm{d} f. $$

From (73) we obtain that, for large T,

$${\kern65pt} \bar{\lambda} \lesssim \sqrt{\frac{2\delta}{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}_{0}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}}}. $$

This implies the following (approximate) upper bound on the mutual information:

$$\begin{array}{*{20}l} \!\text{\usefont{U}{eur}{m}{n}\selectfont {I}}(X_{-\infty}^{\infty};Y_{-\infty}^{\infty}) & = \text{\usefont{U}{eur}{m}{n}\selectfont {T}}\cdot \int_{-\text{\usefont{U}{eur}{m}{n}\selectfont {W}}_{0}}^{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}_{0}} \frac{1}{2}\log(1+\lambda(f)) \mathrm{d} f \\ & \le \text{\usefont{U}{eur}{m}{n}\selectfont {W}}_{0}\text{\usefont{U}{eur}{m}{n}\selectfont {T}} \log(1+\bar{\lambda}) \\* & \lesssim \sqrt{2 \text{\usefont{U}{eur}{m}{n}\selectfont {W}}_{0}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}\delta}, \end{array} $$

where the first inequality follows because the integrand is concave in λ(f). This expression is the same as in the case where N(f) is constant for f∈[−W0,W0]. Also note that, if we choose

$${\kern15pt} \lambda(f) ={\bar{\lambda}} \approx {\sqrt{\frac{2\delta}{\text{\usefont{U}{eur}{m}{n}\selectfont {W}}_{0}\text{\usefont{U}{eur}{m}{n}\selectfont {T}}}}},\quad f\in\,[\!-{\text{\usefont{U}{eur}{m}{n}\selectfont{W}}_{0}},{\text{\usefont{U}{eur}{m}{n}\selectfont{W}}_{0}}], $$

then we obtain approximate equality in both (72) and (76). We hence make the following conjecture.

Conjecture 2

Corollary 3 holds whenever the Gaussian noise Z(t), \(t\in \mathbb {R}\), has PSD that is positive within [ −W0,W0] and zero elsewhere.
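The approximations behind (75) and (76) can be examined numerically. The sketch below solves the constraint (73) with equality by bisection and compares the solution with \(\sqrt{2\delta/(W_{0}T)}\), and the resulting value of \(W_{0}T\log(1+\bar{\lambda})\) with \(\sqrt{2W_{0}T\delta}\). The function name, the bisection routine, and the parameter values are illustrative assumptions.

```python
import math

def solve_lambda_bar(x, iterations=200):
    """Largest lam >= 0 with lam - log(1 + lam) <= x, found by bisection."""
    lo, hi = 0.0, 1.0
    # Enlarge the bracket until the constraint is violated at hi
    while hi - math.log1p(hi) < x:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid - math.log1p(mid) <= x:
            lo = mid
        else:
            hi = mid
    return lo

w0, delta = 1.0, 0.1  # hypothetical noise bandwidth and covertness level
for t in (1e2, 1e4, 1e6):
    lam = solve_lambda_bar(delta / (w0 * t))
    lam_approx = math.sqrt(2.0 * delta / (w0 * t))
    info = w0 * t * math.log1p(lam)
    info_approx = math.sqrt(2.0 * w0 * t * delta)
    print(f"T={t:.0e}: lambda ratio={lam / lam_approx:.5f}, info ratio={info / info_approx:.5f}")
```

Both ratios should tend to one as T grows, which is the sense in which the bounds hold for large T.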

Deficiency of a time-limited model

If there is no bandwidth limit on the input signal x(·), as is the case in Corollary 3, then x(·) can be made to be strictly time-limited to an interval [ 0,T]. One naturally wonders whether it is possible to formulate a time-limited covert communication model, where all parties have access only to the interval [ 0,T]. In this section, we show that, at least in the way we construct it below, such a model is invalid, because it leads to serious artifacts.

Our time-limited model is characterized as follows.

  1. The transmitter maps a message to x(t), t∈[0,T].

  2. The receiver maps y(t), t∈[0,T], to a decoded message.

  3. The covertness constraint is

    $${\kern65pt} \!\mathscr{D}(Y_{0}^{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}} \| Z_{0}^{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}}) \le \delta. $$

Let \(\bar {\text {\usefont {U}{eur}{m}{n}\selectfont {M}}} (\text {\usefont {U}{eur}{m}{n}\selectfont {T}},\epsilon,\delta)\) denote the largest possible cardinality of the message set such that the above conditions are satisfied and that the average probability of a decoding error is at most ε.

Let us now assume that the additive noise Z(·) has PSD given by (61). Under the model in Section 2, we have shown that the maximum amount of information that can be communicated in T seconds is proportional to \(\sqrt {\text {\usefont {U}{eur}{m}{n}\selectfont {T}}\delta }\); see Corollary 3. Under the new time-limited model, however, one can communicate an arbitrarily large amount of information over any fixed period of time.

Theorem 3

Let Z(t), \(t\in \mathbb {R}\), have PSD given by (61). Under the above model, for any positive T, ε, and δ,

$${\kern65pt} \bar{\text{\usefont{U}{eur}{m}{n}\selectfont {M}}} (\text{\usefont{U}{eur}{m}{n}\selectfont {T}},\epsilon,\delta) = \infty. $$


Proof

Only the direct part needs proof. To this end, for a positive integer k, we generate \(k^{3}\) IID Gaussian random variables \(\{X_{i}\}\) of mean zero and variance \(k^{-2}\). Let

$${\kern30pt} X(t) = \left\{\begin{array}{ll}\sum_{i=1}^{k^{3}} X_{i} \psi_{i}(t),& t\in[\!0,\text{\usefont{U}{eur}{m}{n}\selectfont{T}}],\\0, & \text{otherwise,}\end{array}\right. $$

where {ψi(·)} are the PSWFs for W0 and T; see Section 2.1. Clearly, X(t) is strictly time-limited to [0,T]. By (12), the channel can be reduced, for both the warden and the receiver, to \(k^{3}\) parallel Gaussian channels

$$\begin{array}{@{}rcl@{}} Y_{i} = X_{i}+Z_{i},\quad i\in\{1,\ldots,k^{3}\}. \end{array} $$

For covertness, we have the following bound:

$$\mathscr{D}(Y_{0}^{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}} \| Z_{0}^{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}}) = \frac{k^{3}}{2} \left(\frac{2k^{-2}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}} - \log \left(1+\frac{2k^{-2}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}\right) \right) \le \frac{1}{k\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}^{2}}. $$

Hence, when we let k→∞, the covertness condition (78) will be met for any positive constant δ. The input-output mutual information can be calculated to be

$${\kern25pt} \!\text{\usefont{U}{eur}{m}{n}\selectfont {I}}(X_{0}^{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}};Y_{0}^{\text{\usefont{U}{eur}{m}{n}\selectfont {T}}}) = \frac{k^{3}}{2} \log \left(1+ \frac{2k^{-2}}{\text{\usefont{U}{eur}{m}{n}\selectfont {N}}_{0}}\right), $$

which grows to infinity when k→∞. As in Section 2.4, one can use information-spectrum methods to show that the amount of information that can be communicated with arbitrarily small probability of error indeed grows to infinity with k; details are omitted. □
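The opposite trends of the divergence and the mutual information in this construction can be seen numerically. The sketch below assumes that each of the \(k^{3}\) parallel channels has noise variance \(N_{0}/2\), as implied by the signal-to-noise ratio \(2k^{-2}/N_{0}\) above, and uses the standard closed-form relative entropy and mutual information for zero-mean Gaussians. The function name and the choice \(N_{0}=1\) are hypothetical.

```python
import math

def time_limited_scheme(k, n0):
    """Total divergence and mutual information (nats) over k^3 channels.

    Assumes per-channel noise variance n0/2 and input variance k^{-2}.
    """
    noise_var = n0 / 2.0
    snr = k ** -2 / noise_var                 # equals 2 k^{-2} / N0
    # KL divergence between N(0, noise_var * (1 + snr)) and N(0, noise_var)
    d_per = 0.5 * (snr - math.log1p(snr))
    # Mutual information of a Gaussian channel with this SNR
    i_per = 0.5 * math.log1p(snr)
    n = k ** 3
    return n * d_per, n * i_per

n0 = 1.0
for k in (10, 100, 1000):
    d, i = time_limited_scheme(k, n0)
    print(f"k={k}: divergence={d:.3e} nats, mutual information={i:.1f} nats")
```

Under these assumptions the divergence decays roughly like 1/k while the mutual information grows roughly like k, which is the artifact described in Theorem 3.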

We provide some intuitive explanation for this artifact. A time-limited warden cannot employ an ideal low-pass filter, because the impulse response of such a filter occupies the entire real line. Hence, the time-limited warden cannot fully exploit the fact that the noise is band-limited. It is perhaps also interesting to understand this from the perspective of memory. Because the additive noise Z(·) has memory (due to its finite bandwidth), its values on (−∞,0) and (T,∞) can provide information about its values within [0,T], helping the warden detect communication. For example, consider a communication scheme where X(0)≠0 with a nonzero probability (which would be the case for the scheme used in the proof of Theorem 3). A time-unlimited warden will see a discontinuity in Y(·) at t=0, from which it can immediately determine that communication is taking place. This is not possible for a warden that is limited to [0,T].


We provided a rigorous mathematical framework for studying covert communication over the continuous-time Gaussian channel. We then used this framework to study the scenario where the input bandwidth can be infinitely large. We showed that, roughly speaking, over an AWGN channel where the transmitter can employ unbounded bandwidth, covert communication has the same per-second capacity as standard, non-covert communication.

We pointed out that one must be careful when formulating the continuous-time model. In particular, we believe that the model should allow the warden to observe not only the time window when communication might take place, but also before and after that time window. Essentially, this is because the channel has memory. The same issue would also arise in discrete-time channels with memory, unless one assumes, for example, that the channel behaves independently before, during, and after the communication window, as in [21, 22].

It remains to prove Conjectures 1 and 2 on colored noise in a fashion similar to our treatment of white noise. Doing so seems to require additional tools in functional analysis.

Change history

  • 30 January 2020

    Following publication of the original article [1], the authors flagged that the article had published with an error in one of the equations (Eq. 46), in addition to a few minor formatting errors (concerning spacing and parentheses).


  1. Optimality of BPSK in the discrete-time case was first observed by the author in 2015. That result has circulated among some researchers as private communication.



AWGN: Additive white Gaussian noise

BPSK: Binary phase shift keying

PDF: Probability density function

PSD: Power spectral density

PSWF: Prolate spheroidal wave function


  1. 1

    B. A. Bash, D. Goekel, D. Towsley, Limits of reliable communication with low probability of detection on AWGN channels. IEEE Sel. J. Areas Commun.31(9), 1921–1930 (2013).

    Article  Google Scholar 

  2. 2

    P. H. Che, M. Bakshi, S. Jaggi, Reliable deniable communication: hiding messages in noise, (2013).

  3. 3

    L. Wang, G. W. Wornell, L. Zheng, Fundamental limits of communication with low probability of detection. IEEE Trans. Inform. Theory. 62(6), 3493–3503 (2016).

    MathSciNet  Article  Google Scholar 

  4. 4

    M. Bloch, Covert communication over noisy channels: a resolvability perspective. IEEE Trans. Inform. Theory. 62(5), 2334–2354 (2016).

    MathSciNet  Article  Google Scholar 

  5. 5

    M. Tahmasbi, M. Bloch, First- and second-order asymptotics in covert communication. IEEE Trans. Inform. Theory. 65(4), 2190–2212 (2019).

    MathSciNet  Article  Google Scholar 

  6. 6

    Q Zhang, M Bakshi, S Jaggi, Computationally efficient covert communication. Subm. to IEEE Trans. Inform. Theory (2018). arXiv:1607.02014v2.

  7. 7

    C. E Shannon, A mathematical theory of communication. Bell Syst. Tech. J.27:, 379–423 and 623–656 (1948).

    MathSciNet  Article  Google Scholar 

  8. 8

    A. Lapidoth, A Foundation in Digital Communication, Second ed. (Cambridge University Press, 2017).

  9. 9

    A. D. Wyner, Capacity of the band-limited Gaussian channel. Bell Syst. Tech. J.45:, 359–395 (1966).

    Article  Google Scholar 

  10. 10

    R. G. Gallager, Information Theory and Reliable Communication (Wiley, 1968).

  11. 11

    S. Verdú, On channel capacity per unit cost. IEEE Trans. Inform. Theory. 36:, 1019–1030 (1990).

    MathSciNet  Article  Google Scholar 

  12. 12

    M. Simon, J. Omura, R. Scholtz, B. Levitt, Spread Spectrum Communications Handbook (McGraw-Hill, 1994).

  13. 13

    M. S. Pinsker, Information and Information Stability of Random Variables and Processes (San Francisco, Holden-Day, 1964).

    Google Scholar 

  14. 14

    L. Wang, The continuous-time Poisson channel has infinite covert communication capacity, (2018).

  15. 15

    Q. Zhang, M. R. Bloch, M. Bakshi, S. Jaggi, Undetectable radios: covert communication under spectral mask constraints, (2019).

  16. 16

    D. Slepian, H. Landau, H. Pollak, Prolate spheroidal wave functions, Fourier analysis and uncertainty – I & II. Bell Syst. Tech. J.40:, 43–84 (1961).

    MathSciNet  Article  Google Scholar 

  17. 17

    D. Slepian, Some asymptotic expansions of prolate spheroidal wave functions. Math. J. Phys.44:, 99–140 (1965).

    MathSciNet  Article  Google Scholar 

  18. 18

    T. M. Cover, J. A. Thomas, Elements of Information Theory, second ed. (Wiley, New York, 2006).

  19. 19

    S. Verdú, T. S. Han, A general formula for channel capacity. IEEE Trans. Inform. Theory. 40(4), 1147–1157 (1994).

  20. 20

    T. S. Han, Information Spectrum Methods in Information Theory, (2003).

  21. 21

    P. H. Che, M. Bakshi, C. Chan, S. Jaggi, Reliable deniable communication with channel uncertainty, (2014).

  22. 22

    T. Sobers, B. Bash, S. Guha, D. Towsley, D. Goeckel, Covert communication in the presence of an uninformed jammer. IEEE Trans. Wirel. Comm. 16(9), 6193–6206 (2017).


The author declares no funding for this research.

Author information




LW is responsible for the research and the writing of the manuscript. The author read and approved the final manuscript.

Corresponding author

Correspondence to Ligong Wang.

Ethics declarations

Competing interests

The author declares that he has no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original version of this article was revised to correct an error with equation (46).

Rights and permissions

Corrected publication 2020. Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Wang, L. On Gaussian covert communication in continuous time. J Wireless Com Network 2019, 283 (2019).


  • Covert communication
  • Low probability of detection
  • Gaussian channel
  • Continuous time
  • Waveform channel
  • Prolate spheroidal wave functions