  • Research
  • Open Access

The capacity of a class of state-dependent relay channel with orthogonal components and side information at the source and the relay

EURASIP Journal on Wireless Communications and Networking 2014, 2014:59

https://doi.org/10.1186/1687-1499-2014-59

  • Received: 1 February 2013
  • Accepted: 24 March 2014
  • Published:

Abstract

In this paper, a class of state-dependent relay channels with orthogonal channels from the source to the relay and from the source and the relay to the destination is studied. The two orthogonal channels are corrupted by two independent channel states S_R and S_D, respectively, both of which are known to the source and the relay. Lower bounds on the capacity are established for the channel with either non-causal or causal channel state information. Further, we show that the lower bound with non-causal channel state information is tight if the receiver output Y is a deterministic function of the relay input X_r, the channel state S_D, and one of the source inputs X_D, i.e., Y = f(X_D, X_r, S_D), and the relay output Y_r is controlled only by the source input X_R and the channel state S_R, i.e., the channel from the source to the relay is governed by the conditional probability distribution P_{Y_r|X_R,S_R}. The capacity for this class of semi-deterministic orthogonal relay channels is characterized exactly. The results are then extended to the Gaussian case, modeling the channel states as additive Gaussian interferences. The capacity is characterized when the channel state information is known non-causally. When the channel state information is known only causally, however, the capacity cannot be characterized in general; in this case, a lower bound on the capacity is established, and the capacity is characterized when the power of the relay is sufficiently large. Numerical examples for the causal case illustrate the impact of the channel state and the role of the relay both in transmitting information and in cleaning the channel state.

Keywords

  • State-dependent relay channel with orthogonal components
  • Non-causal channel state information
  • Causal channel state information
  • Dirty paper coding

1 Introduction

We consider a state-dependent relay channel with orthogonal components as shown in Figure 1. The channel from the source to the relay and the channel from the source and the relay to the destination are assumed orthogonal. The source wants to send a message W to the destination with the help of the relay in n channel uses. Through the memoryless probability law P_{Y_r|X_R,X_r,S_R} P_{Y|X_D,X_r,S_D}, the channel output Y_r^n at the relay is controlled by the source input X_R^n, the relay input X_r^n, and the channel state S_R^n, while the channel output Y^n at the destination is controlled by the source input X_D^n, the relay input X_r^n, and the channel state S_D^n. The state sequences S_R^n = (S_{R,1}, S_{R,2}, …, S_{R,n}) and S_D^n = (S_{D,1}, S_{D,2}, …, S_{D,n}) are independent and identically distributed (i.i.d.) with S_{R,i} ~ Q_{S_R}(s_{R,i}) and S_{D,i} ~ Q_{S_D}(s_{D,i}), respectively. We assume S_R and S_D are independent. The channel state information about S_R and S_D is known to the source and the relay either causally (that is, only S_R^i and S_D^i are known before transmission i takes place) or non-causally (that is, the entire sequences S_R^n and S_D^n are known before communication commences). The destination estimates the message sent by the source from its received channel output Y^n. In this paper, we study the capacity of this model.
Figure 1

Orthogonal relay channel with state information available at both the source and the relay.

1.1 Background

In many communication models, the communicating parties typically have some knowledge of the channel or attempt to learn about it. State-dependent channels have attracted wide attention in recent years [1]. Shannon first considered a single-user channel in which the channel state information was causally known to the transmitter [2] and characterized its capacity. Gel'fand and Pinsker [3] derived a method to determine the capacity of a channel whose state information is non-causally known to the transmitter; this method was later called the Gel'fand-Pinsker (GP) coding scheme. In [4], Costa studied a Gaussian channel with additive white Gaussian noise (AWGN) and additive Gaussian interference known non-causally to the transmitter; he demonstrated that with dirty paper coding (DPC), the capacity is the same as if no interference existed in the channel.
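Costa's result can be checked numerically in the scalar case: with input power P and noise variance N, DPC achieves (1/2) log2(1 + P/N) bits per channel use regardless of the interference power Q, whereas an uninformed transmitter that treats the interference as noise achieves only (1/2) log2(1 + P/(N + Q)). A minimal illustrative sketch (function names and parameter values are ours, not from the paper):

```python
import math

def awgn_capacity(P, N):
    # Interference-free AWGN capacity: 0.5 * log2(1 + P/N) bits/channel use.
    return 0.5 * math.log2(1 + P / N)

def dpc_rate(P, N, Q):
    # Costa's dirty paper coding: with the interference known non-causally
    # at the transmitter, its power Q drops out of the rate entirely.
    return awgn_capacity(P, N)

def interference_as_noise_rate(P, N, Q):
    # Baseline: an uninformed transmitter sees the interference as extra noise.
    return 0.5 * math.log2(1 + P / (N + Q))

P, N, Q = 10.0, 1.0, 5.0
r_dpc = dpc_rate(P, N, Q)                    # equals awgn_capacity(P, N)
r_tin = interference_as_noise_rate(P, N, Q)  # strictly smaller for Q > 0
```

The gap r_dpc − r_tin grows with the interference power Q, which is exactly why the informed-encoder models studied below are of interest.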

Extensions to multiple-user channels were made by Gel'fand and Pinsker in [5], where it was shown that interference cancellation is possible in the Gaussian broadcast channel (BC) and the Gaussian multiple-access channel (MAC). In multiple-user state-dependent channels, the channel state information may be known to all of the users or only to some of them. In [6], Sigurjonsson and Kim characterized the capacity of a degraded broadcast channel and of a physically degraded relay channel with channel state information causally known to the transmitters. Inner bounds for the two-user BC with non-causal side information at the transmitter were derived in [7] by extending Marton's achievable scheme to state-dependent channels. In [8], Steinberg derived inner and outer bounds for a degraded BC with non-causal side information and characterized the capacity region when the side information is provided to the transmitter causally. In [9], information-theoretic performance limits were derived for three classes of two-user state-dependent discrete memoryless BCs with non-causal side information at the encoder.

The state-dependent two-user MAC with state information non-causally known to one of the encoders was considered in [10] and [11]. For the MAC with asymmetric channel state information at the transmitters and full channel state information at the decoder, a single-letter capacity region was characterized when the channel state available at one of the encoders is a subset of the channel state available at the other [12]; for the general case, however, only inner and outer bounds were derived. It is not easy to characterize the explicit capacity region of general state-dependent MACs even when the channel state information is known to all transmitters. Capacity regions have been characterized only in some special cases, e.g., the Gaussian MAC with additive interference known to both encoders [13]. Capacity regions are also known in some cases where cooperation between the transmitters is allowed: in [14], the explicit capacity region was characterized for the MAC with one informed encoder transmitting a common message and a private message while the uninformed encoder transmits only the common message, and in [15], the capacity region was derived for a two-user dirty paper Gaussian MAC with conferencing encoders.

Relay channels capture both MAC and BC characteristics. State-dependent relay channels were studied in [16–21]. Zaidi et al. [16] studied the relay channel with non-causal channel state information known only to the relay; lower and upper bounds were derived via a coding scheme at the relay that combined codeword splitting, Gel'fand-Pinsker binning, and decode-and-forward (DF) relaying. When the channel state information is known only at the source node, lower and upper bounds were obtained in [17–20]. In [17], the coding scheme for the lower bound used rate splitting at the source, partial decode-and-forward (PDF) relaying, and a GP-like binning scheme. To derive a lower bound on the capacity, [18] proposed two achievable schemes: (i) state description, by which the source describes the channel state to the relay and the destination, and (ii) analog input description, by which the source first computes the appropriate input the relay would send had the relay known the channel state and then transmits this input to the relay. With the same achievable schemes as in [18], the authors of [19] obtained two corresponding lower bounds for the state-dependent relay channel with orthogonal components and channel state information known non-causally to the source. A similar orthogonal relay channel corrupted by an interference known non-causally to the source was considered in [20], where several transmission strategies were proposed under the assumption that the interference has structure. Akhbari et al. [21] considered the state-dependent relay channel in three different cases: the channel state information is known non-causally to only the relay, only the source, or both the source and the relay. Lower bounds on the capacity were established using GP coding and compress-and-forward (CF) relaying for all three cases.

1.2 Motivation

State-dependent channels with state information available at the encoders can be used to model many systems, such as information embedding [22–24], which encodes a message into a host signal, computer memories with defective cells [25], communication systems with cognitive radios, etc. Among these examples, we are most interested in communication systems with cognitive radios. To improve spectrum efficiency in wireless systems, secondary users capable of acquiring some knowledge about the primary communication are introduced into an existing primary communication system [26]. Having obtained this knowledge, the secondary users can adapt their coding schemes to mitigate the interference caused by the primary communication. In such models, the channel state can be viewed as the signal of the primary communication and the informed encoders as cognitive users [11].

For the state-dependent relay channel with orthogonal components considered in this paper, the channel states SR and SD are viewed as the signals of corresponding primary communication; the source and the relay are viewed as the secondary users which are capable of acquiring the channel state information. Thus, the model studied in this paper can be viewed as a secondary relay communication with cognitive source and cognitive relay. We are interested in studying the capacity of this model.

However, it is difficult to characterize the explicit capacity of relay channels even when the channel is state-independent. The capacity of the state-independent relay channel has been characterized only for some special channels, e.g., the physically degraded/reversely degraded relay channel [27], a class of deterministic relay channels [28], and a class of relay channels with orthogonal components [29]. To the best of our knowledge, explicit capacity results for state-dependent relay channels with channel state information known to some or all of the transmitters have been derived mainly in two cases: (i) physically degraded relay channels with state information causally known to both the source and the relay and (ii) Gaussian physically degraded relay channels with channel state information non-causally known to the source and the relay. For relay channels corrupted by a channel state, however, the explicit capacity has not yet been characterized, even when the channel state has structure known to the source. In this paper, we seek capacity results for the state-dependent relay channel with orthogonal components.

1.3 Main contributions and organization of the paper

We investigate a state-dependent relay channel with orthogonal components, where the source communicates with the relay through a channel (say channel 1) orthogonal to another channel (say channel 2) through which the source and the relay communicate with the destination. We assume that channel 1 and channel 2 are affected by two independent channel states SR and SD, respectively. The channel state information about SR and SD is known to both the source and the relay non-causally or causally. In this setup, the main results of this paper are summarized as follows:
  1. A lower bound on the capacity of the channel is established when the channel state information is known to the source and the relay non-causally. The achievability is based on superposition coding at the source, PDF relaying at the relay, and cooperative GP coding at the source and the relay.
  2. When the channel state information is known to the source and the relay causally, an achievable rate is derived in a similar way as in the non-causal case, except that the auxiliary random variables U and U_r are independent of the channel states S_R and S_D.
  3. We show that the exact capacity of the channel with non-causal channel state information at the source and the relay can be characterized if the receiver output Y is a deterministic function of the relay input X_r, the channel state S_D, and one of the source inputs X_D, i.e., Y = f(X_D, X_r, S_D), and the relay output Y_r is controlled only by the source input X_R and the channel state S_R, i.e., the channel from the source to the relay is governed by the conditional probability distribution P_{Y_r|X_R,S_R}.
  4. Explicit capacity is also characterized for the Gaussian orthogonal relay channel with additive Gaussian interferences known non-causally to the source and the relay.
  5. For the Gaussian orthogonal relay channel with additive interferences known causally to the source and the relay, the capacity is derived when the power of the relay is sufficiently large.

The rest of the paper is organized as follows. In Section 2, we present the system model, definitions, and notation used throughout the paper. Section 3 establishes single-letter lower bounds on the capacity of the discrete memoryless state-dependent orthogonal relay channel with channel state information known to the source and the relay either non-causally or causally. In Section 4, we show that when the channel state information is known non-causally, Y = f(X_D, X_r, S_D), and the channel from the source to the relay is governed by the conditional probability distribution P_{Y_r|X_R,S_R}, the lower bound derived in Section 3 is tight; thus, the capacity is characterized exactly. In Section 5, the results are extended to the Gaussian case. In Section 6, numerical results illustrate the impact of the additive interferences and the role of the relay in transmitting information and in cleaning the interference. Section 7 concludes the paper.

2 Notations and problem setup

Throughout this paper, random variables are denoted by capital letters and their deterministic realizations by lower-case letters. Vectors are denoted by boldface letters. The shorthand x_i^j abbreviates (x_i, x_{i+1}, …, x_j), x^i abbreviates (x_1, x_2, …, x_i), and x_i denotes the i-th element of x^n, where 1 ≤ i ≤ j ≤ n. The probability law of a random variable X is denoted by P_X, and the conditional probability distribution of Y given X by P_{Y|X}. The alphabet of a scalar random variable X is designated by the corresponding calligraphic letter 𝒳. The cardinality of a set 𝒥 is denoted by |𝒥|. T_ϵ^n(X) denotes the set of strongly ϵ-typical sequences x^n ∈ 𝒳^n, while A_ϵ^n(X) denotes the set of weakly ϵ-typical sequences x^n ∈ 𝒳^n, where ϵ > 0. E(·) denotes expectation; I(·;·) denotes the mutual information between two random variables. 𝒩(0, σ²) denotes a Gaussian distribution with zero mean and variance σ².

As shown in Figure 1, we consider the state-dependent relay channel with orthogonal components denoted by P_{Y,Y_r|X_R,X_D,X_r,S_D,S_R}, where Y ∈ 𝒴 and Y_r ∈ 𝒴_r are the channel outputs at the destination and the relay, respectively. X_R ∈ 𝒳_R and X_D ∈ 𝒳_D are the orthogonal channel inputs from the source, while X_r ∈ 𝒳_r is the channel input from the relay. S_R ∈ 𝒮_R and S_D ∈ 𝒮_D denote the random channel states that corrupt channel 1 and channel 2, respectively. The channel states S_{R,i} and S_{D,i} at time instant i are independently drawn from the distributions Q_{S_R} and Q_{S_D}, respectively. The channel state information S_R and S_D is known to both the source and the relay, non-causally or causally.

The message W is uniformly distributed over the set 𝒲 = {1, 2, …, M}. The source transmits a message W to the destination with the help of the relay in n channel uses. Let X_R^n = (X_{R,1}, …, X_{R,n}), X_D^n = (X_{D,1}, …, X_{D,n}), and X_r^n = (X_{r,1}, …, X_{r,n}) be the channel inputs of the source and the relay, respectively. The relay channel is said to be memoryless and to have orthogonal components if

P(y_r^n, y^n | x_R^n, x_D^n, x_r^n, s_D^n, s_R^n) = ∏_{i=1}^{n} P(y_{r,i} | x_{r,i}, x_{R,i}, s_{R,i}) P(y_i | x_{r,i}, x_{D,i}, s_{D,i})
(1)
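The factorization (1) can be made concrete with a toy binary example. The per-symbol kernels below are invented purely for illustration; the sketch forms the product in (1) and checks that it defines a valid distribution over the output sequences:

```python
from itertools import product

def p_relay(y_r, x_r, x_R, s_R):
    # Channel 1 (illustrative): a binary symmetric channel from x_R to y_r
    # whose crossover probability depends on the state s_R.
    eps = 0.1 if s_R == 0 else 0.3
    return 1 - eps if y_r == x_R else eps

def p_dest(y, x_r, x_D, s_D):
    # Channel 2 (illustrative): y equals x_D XOR x_r XOR s_D,
    # flipped with probability 0.05.
    eps = 0.05
    return 1 - eps if y == (x_D ^ x_r ^ s_D) else eps

def sequence_prob(y_r_seq, y_seq, x_R, x_D, x_r, s_D, s_R):
    # Memoryless law (1): the probability of the output sequences is the
    # product of the per-symbol transition probabilities.
    p = 1.0
    for i in range(len(y_seq)):
        p *= p_relay(y_r_seq[i], x_r[i], x_R[i], s_R[i])
        p *= p_dest(y_seq[i], x_r[i], x_D[i], s_D[i])
    return p

# Sanity check: for fixed inputs and states, the probabilities of all
# output-sequence pairs (y_r^n, y^n) sum to 1.
n = 2
x_R, x_D, x_r = [0, 1], [1, 0], [0, 0]
s_D, s_R = [1, 0], [0, 1]
total = sum(sequence_prob(yr, y, x_R, x_D, x_r, s_D, s_R)
            for yr in product([0, 1], repeat=n)
            for y in product([0, 1], repeat=n))
```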
A (M, n) code for the state-dependent relay channel with channel state information non-causally known to the source and the relay consists of an encoding function at the source

φ_n : {1, 2, …, M} × 𝒮_D^n × 𝒮_R^n → 𝒳_D^n × 𝒳_R^n
(2)

a sequence of encoding functions at the relay

φ_{r,i} : 𝒴_r^{i−1} × 𝒮_D^n × 𝒮_R^n → 𝒳_r
(3)

for i = 1, 2, …, n, and a decoding function at the destination

ϕ_n : 𝒴^n → {1, 2, …, M}.

The information rate R is defined as R = (1/n) log₂ M bits per transmission.

An (ϵ_n, n, R) code for the state-dependent relay channel with orthogonal components and non-causal state information is a code having average probability of error smaller than ϵ_n, i.e.,

Pr{W ≠ ϕ_n(Y^n)} ≤ ϵ_n

The rate R is said to be achievable if there exists a sequence of (ϵ_n, n, R) codes with lim_{n→∞} ϵ_n = 0. The capacity of the channel is defined as the supremum of the set of achievable rates.

The definition of an (ϵ_n, n, R) code for the state-dependent relay channel with orthogonal components and causal channel state information at the source and the relay is similar to that for the non-causal case, except that the encoder consists of sequences of maps {φ_i}_{i=1}^n and {φ_{r,i}}_{i=1}^n, where i is the time index. Thus, the encoder mappings in (2) and (3) are replaced by

φ_i : {1, 2, …, M} × 𝒮_D^i × 𝒮_R^i → 𝒳_D × 𝒳_R
(4)

φ_{r,i} : 𝒴_r^{i−1} × 𝒮_D^i × 𝒮_R^i → 𝒳_r,
(5)

respectively, where i = 1, 2, …, n. The definitions of achievable rate and capacity remain the same as in the non-causal case.

3 Discrete memoryless case

In this section, it is assumed that the alphabets X D , X R , X r , Y r , Y , S D , and S R are finite. Lower bounds on the capacity of the channel with non-causal channel state information or causal channel state information are established, respectively. In the proofs of the lower bounds in the discrete memoryless case, strong typicality is used.

3.1 Non-causal channel state information

The following theorem provides a lower bound on the capacity of the state-dependent orthogonal relay channel with channel state information non-causally known to the source and the relay.

Theorem 1 For the orthogonal relay channel with channel state information non-causally known to both the source and the relay, the following rate is achievable:

R ≤ max min{ I(X_R; Y_r | U_r, X_r, S_R) + I(U; Y | U_r) − I(U; S_D | U_r),  I(U, U_r; Y) − I(U, U_r; S_D) },
(6)

where the maximization is over all measures on 𝒮_D × 𝒮_R × 𝒳_r × 𝒰_r × 𝒳_R × 𝒳_D × 𝒴_r × 𝒴 of the form

P(s_R, s_D, x_r, u_r, u, x_R, x_D, y_r, y) = Q(s_R) Q(s_D) P(u_r | s_D) P(x_r | u_r, s_R, s_D) P(x_R | u_r, s_R) P(u, x_D | u_r, s_D) P(y_r | x_R, x_r, s_R) P(y | x_D, x_r, s_D)
(7)

U_r ∈ 𝒰_r and U ∈ 𝒰 are auxiliary random variables with

|𝒰_r| ≤ |𝒮_D||𝒮_R||𝒳_D||𝒳_r| + 1
(8)

|𝒰| ≤ |𝒮_D||𝒮_R||𝒳_r||𝒳_D| (|𝒮_D||𝒮_R||𝒳_r||𝒳_D| + 1) + 1
(9)

Remark 1 Since the source and the relay know the channel state information non-causally, with PDF relaying they can transmit the messages to the destination cooperatively using GP coding, namely cooperative GP coding. The source communicates with the relay by treating s_R^n as a time-sharing sequence, which is possible for the same reason: the channel state information is known to both the source and the relay non-causally.

3.1.1 Outline of the proof of Theorem 1

We now give a description of the coding scheme to derive the lower bound in Theorem 1. Detailed error analysis of Theorem 1 is given in Appendix 1. The achievable scheme is based on the combination of superposition coding at the source, PDF relaying at the relay, and cooperative GP coding at the source and the relay.

The message W is divided into two parts, W_D ∈ [1, 2^{nR_D}] and W_R ∈ [1, 2^{nR_R}]. Consider B + 1 blocks, each of n symbols. Let s_D^n(k) and s_R^n(k) be the state sequences in block k, k = 1, 2, …, B + 1. A sequence of B messages w(k) ∈ [1, 2^{nR}], with w(k) = (w_D(k), w_R(k)), w_D(k) ∈ [1, 2^{nR_D}], w_R(k) ∈ [1, 2^{nR_R}], and R = R_D + R_R, is sent over the channel in n(B + 1) transmissions. During each of the first B blocks, the source encodes w_D(k) ∈ [1, 2^{nR_D}] and sends it over the channel. Since both the source and the relay know the state sequence s_R^n(k), the source encodes w_R(k) ∈ [1, 2^{nR_R}] by treating s_R^n(k) as a time-sharing sequence [30]. The message w_R(k) is expressed as a unique set {m_{s_R}(k) : s_R ∈ 𝒮_R} of |𝒮_R| sub-messages. For each s_R ∈ 𝒮_R, the sub-message m_{s_R}(k) is associated with a codeword x_R^n(m_{s_R}(k)) from a corresponding sub-codebook C_{s_R}. The set of codewords {x_R^n(m_{s_R}(k)) : s_R ∈ 𝒮_R} is sent over the channel, multiplexed according to the state sequence s_R^n(k). The relay demultiplexes the received sequence y_r^n(k) into sub-sequences according to the state sequence s_R^n(k) and decodes each sub-message m_{s_R}(k); consequently, w_R(k) is decoded at the relay. The coding scheme is illustrated in Figure 2. With PDF relaying, the relay re-encodes w_R(k) and sends it to the destination cooperatively with the source. In the last block B + 1, no new message is sent, and w(B + 1) = (w_D(B + 1), w_R(B + 1)) = (1, 1). The average information rate R·B/(B + 1) of the message over the B + 1 blocks approaches R as B → ∞.
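The multiplexing device described above is easy to picture in code. The sketch below (alphabets and codewords invented for illustration) shows only the mux/demux logic, with no channel noise: one sub-codeword per state value, a symbol popped from the matching FIFO buffer at each time i, and the relay sorting its received symbols back into per-state sub-sequences:

```python
def multiplex(sub_codewords, state_seq):
    # sub_codewords: dict mapping each state value s_R to a list of symbols
    # (one codeword per sub-codebook C_{s_R}), used as a FIFO buffer.
    # At time i the transmitted symbol is popped from the buffer
    # selected by state_seq[i].
    buffers = {s: list(cw) for s, cw in sub_codewords.items()}
    return [buffers[s].pop(0) for s in state_seq]

def demultiplex(received_seq, state_seq):
    # The relay, knowing s_R^n, sorts the received symbols back into
    # per-state sub-sequences before decoding each sub-message.
    out = {s: [] for s in set(state_seq)}
    for y, s in zip(received_seq, state_seq):
        out[s].append(y)
    return out

# Noiseless round trip: demultiplexing the multiplexed stream recovers
# each sub-codeword prefix, in order.
subs = {0: ['a1', 'a2', 'a3'], 1: ['b1', 'b2']}
state = [0, 1, 0, 1, 0]
tx = multiplex(subs, state)
rx = demultiplex(tx, state)
```

In the actual scheme the per-state decoding then runs a joint-typicality test on each sub-sequence, which is why each sub-rate R_{s_R} is scaled by the fraction p(s_R) of channel uses spent in state s_R.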
Figure 2

Multiplexed coding and decoding at the source and the relay [30].

3.1.2 Codebook generation

Fix a measure P(s_R, s_D, x_r, u_r, u, x_R, x_D) of the form (7).
  1. Generate 2^{n(R_R + R_{r,s})} i.i.d. codewords u_r^n(w̃_R, j_r), indexed by w̃_R = 1, 2, …, 2^{nR_R}, j_r = 1, 2, …, 2^{nR_{r,s}}, each with i.i.d. components drawn according to P_{U_r}.
  2. For each u_r^n(w̃_R, j_r), generate 2^{n(R_D + R_{d,s})} i.i.d. codewords u^n(w_D, j_d | w̃_R, j_r), indexed by w_D = 1, 2, …, 2^{nR_D}, j_d = 1, 2, …, 2^{nR_{d,s}}, each with i.i.d. components drawn according to P_{U|U_r}.
  3. For each u_r^n(w̃_R, j_r) and each s_R ∈ 𝒮_R, randomly and independently generate 2^{nR_{s_R}} sequences x_R^n(m_{s_R} | s_R, w̃_R, j_r), indexed by m_{s_R} ∈ [1, 2^{nR_{s_R}}], each with i.i.d. components drawn according to P_{X_R|U_r,S_R}. These sequences constitute the sub-codebook C_{s_R}, s_R ∈ 𝒮_R; there are |𝒮_R| such sub-codebooks for each u_r^n(w̃_R, j_r). Set R_R = Σ_{s_R ∈ 𝒮_R} R_{s_R}.

3.1.3 Encoding

We pick up the story in block k. Let w(k) = (w_D(k), w_R(k)) ∈ {1, 2, …, 2^{nR}}, where w_D(k) ∈ {1, 2, …, 2^{nR_D}} and w_R(k) ∈ {1, 2, …, 2^{nR_R}}, be the new message to be sent from the source node at the beginning of block k. The encoding at the beginning of block k is as follows.
  1. The relay knows w_R(k − 1) (this will be justified below) and searches for the smallest j_r(k) ∈ {1, 2, …, 2^{nR_{r,s}}} such that u_r^n(w_R(k − 1), j_r(k)) is jointly typical with s_D^n(k). If no such j_r(k) exists, an error is declared and j_r(k) is set to 1. By the covering lemma [31], this error probability tends to 0 as n approaches infinity if R_{r,s} satisfies

R_{r,s} ≥ I(U_r; S_D)
(10)

Given u_r^n(w_R(k − 1), j_r(k)), s_R^n(k), and s_D^n(k), the relay sends a vector x_r^n(k) with i.i.d. components drawn according to the marginal P_{X_r|U_r,S_D,S_R}.
  2. The source also knows w_R(k − 1) and s_D^n(k), and thereby knows u_r^n(w_R(k − 1), j_r(k)). The source then searches for j_d(k) ∈ {1, 2, …, 2^{nR_{d,s}}} such that u^n(w_D(k), j_d(k) | w_R(k − 1), j_r(k)) is jointly typical with s_D^n(k) given u_r^n(w_R(k − 1), j_r(k)). If no such j_d(k) exists, an error is declared and j_d(k) is set to 1. By the covering lemma [31], this error probability tends to 0 as n approaches infinity if R_{d,s} satisfies

R_{d,s} ≥ I(U; S_D | U_r)
(11)

Given u^n(w_D(k), j_d(k) | w_R(k − 1), j_r(k)), u_r^n(w_R(k − 1), j_r(k)), and s_D^n(k), the source sends a vector x_D^n(k) with i.i.d. components drawn according to the marginal P_{X_D|U_r,U,S_D}.
  3. Meanwhile, to send the message w_R(k) ∈ [1, 2^{nR_R}], the source expresses it as a unique set of sub-messages {m_{s_R}(k) : s_R ∈ 𝒮_R}. Knowing the codeword u_r^n(w_R(k − 1), j_r(k)), the source considers the set of codewords {x_R^n(m_{s_R}(k) | s_R, w_R(k − 1), j_r(k)) : s_R ∈ 𝒮_R} and stores each codeword in a first-in-first-out (FIFO) buffer of length n. A multiplexer chooses a symbol at each transmission time i ∈ [1, n] from one of the FIFO buffers according to the state s_{R,i}(k), and the chosen symbol is transmitted.

3.1.4 Decoding

At the end of block k, the relay and the destination observe y_r^n(k) and y^n(k), respectively.
  1. Having successfully decoded w_R(k − 1) in block k − 1 and knowing u_r^n(w_R(k − 1), j_r(k)), x_r^n(k), and s_R^n(k), the relay estimates ŵ_R(k) from y_r^n(k). According to the state sequence s_R^n(k), the relay demultiplexes y_r^n(k), u_r^n(w_R(k − 1), j_r(k)), and x_r^n(k) into sub-sequences {y_{r,s_R}^{n_{s_R}(k)}(k) : s_R ∈ 𝒮_R}, {u_{r,s_R}^{n_{s_R}(k)}(w_R(k − 1), j_r(k)) : s_R ∈ 𝒮_R}, and {x_{r,s_R}^{n_{s_R}(k)}(k) : s_R ∈ 𝒮_R}, respectively, where Σ_{s_R ∈ 𝒮_R} n_{s_R}(k) = n. Assuming s_R^n(k) ∈ T_ϵ^n(S_R), and thus n_{s_R}(k) ≥ n(1 − ϵ)p(s_R) for all s_R ∈ 𝒮_R, the relay finds for each s_R ∈ 𝒮_R a unique m̂_{s_R}(k) such that the codeword sub-sequence x_R^{n(1−ϵ)p(s_R)}(m̂_{s_R}(k) | s_R, w_R(k − 1), j_r(k)) is jointly typical with y_{r,s_R}^{n(1−ϵ)p(s_R)}(k), given u_{r,s_R}^{n(1−ϵ)p(s_R)}(w_R(k − 1), j_r(k)) and x_{r,s_R}^{n(1−ϵ)p(s_R)}(k). By the law of large numbers (LLN) and the packing lemma [31], the error probability of each decoding step approaches 0 as n → ∞ if R_{s_R} ≤ p(s_R) I(X_R; Y_r | U_r, X_r, S_R = s_R). Therefore, the total probability of error in decoding ŵ_R(k) approaches 0 for sufficiently large n if the following condition is satisfied:

R_R = Σ_{s_R ∈ 𝒮_R} R_{s_R} ≤ I(X_R; Y_r | U_r, X_r, S_R)
(12)

  2. Observing y^n(k), the destination finds a pair (ŵ_R(k − 1), ŵ_D(k)) such that

(u^n(ŵ_D(k), ĵ_d(k) | ŵ_R(k − 1), ĵ_r(k)), u_r^n(ŵ_R(k − 1), ĵ_r(k)), y^n(k)) ∈ T_ϵ^n(U, U_r, Y)

for some ĵ_d(k) ∈ {1, 2, …, 2^{nR_{d,s}}} and ĵ_r(k) ∈ {1, 2, …, 2^{nR_{r,s}}}. If there is no such pair, or it is not unique, an error is declared. By the packing lemma [31], it can be shown that for sufficiently large n, decoding is correct with high probability if

R_D + R_{d,s} ≤ I(U; Y | U_r)
R_D + R_{d,s} + R_R + R_{r,s} ≤ I(U, U_r; Y)
(13)

Combining (10) to (13), w(k − 1) = (w_D(k − 1), w_R(k − 1)) is decoded correctly with high probability at the end of block k if

R ≤ I(X_R; Y_r | U_r, X_r, S_R) + I(U; Y | U_r) − I(U; S_D | U_r)
R ≤ I(U, U_r; Y) − I(U, U_r; S_D)
(14)

The detailed analysis of error probability is given in Appendix 1.
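The elimination of the auxiliary rates R_{r,s} and R_{d,s} behind (14) can be spelled out, taking them at the minimum values allowed by (10) and (11):

```latex
% First bound of (14): combine (11), (12), and the first inequality of (13)
R = R_D + R_R \le \bigl[ I(U;Y|U_r) - R_{d,s} \bigr] + I(X_R;Y_r|U_r,X_r,S_R)
            \le I(X_R;Y_r|U_r,X_r,S_R) + I(U;Y|U_r) - I(U;S_D|U_r).
% Second bound of (14): combine (10), (11), and the second inequality of (13)
R = R_D + R_R \le I(U,U_r;Y) - R_{d,s} - R_{r,s}
            \le I(U,U_r;Y) - I(U;S_D|U_r) - I(U_r;S_D)
             =  I(U,U_r;Y) - I(U,U_r;S_D).
```

The last step is the chain rule I(U, U_r; S_D) = I(U_r; S_D) + I(U; S_D | U_r).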

3.2 Causal channel state information

In many practical communication systems, the state sequences are not known to the encoders in advance. For the case in which the channel state information is provided to the source and the relay causally, the capacity is lower bounded as in the following theorem.

Theorem 2 The capacity of the orthogonal relay channel with channel state information causally known to both the source and the relay is lower bounded by

C_CS ≥ max min{ I(X_R; Y_r | U_r, X_r, S_R) + I(U; Y | U_r),  I(U, U_r; Y) },
(15)

where the maximization is over all distributions of the form p(s_D) p(s_R) p(u_r) p(x_R | u_r, s_R) p(u | u_r, s_D) with x_r = f_r(u_r, s_D, s_R) and x_D = f_D(u, s_D), and U_r ∈ 𝒰_r and U ∈ 𝒰 are auxiliary random variables with

|𝒰_r| ≤ |𝒮_D||𝒮_R||𝒳_D||𝒳_r| + 1
(16)

|𝒰| ≤ |𝒮_D||𝒮_R||𝒳_r||𝒳_D| (|𝒮_D||𝒮_R||𝒳_r||𝒳_D| + 1) + 1
(17)

Remark 2 The achievable rate in Theorem 2 is obtained by specializing the expression in Theorem 1 to the case where the auxiliary random variables U and U_r are independent of S_D and S_R. This is similar to the relation between the capacity of the state-dependent channel with causal channel state information introduced by Shannon [2] and its non-causal counterpart, the Gel'fand-Pinsker channel [3].

Proof The achievability proof proceeds in a similar way as in the non-causal channel state information case, except that the auxiliary random variables U and U_r are independent of the channel states S_D and S_R, and the channel inputs of the source and the relay are restricted to the mappings x_D = f_D(u, s_D) and x_r = f_r(u_r, s_D, s_R), respectively, where f_D(·) and f_r(·) are deterministic functions. The details are omitted for brevity.
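The functional restriction x_D = f_D(u, s_D), x_r = f_r(u_r, s_D, s_R) is the classical Shannon-strategy device for causal state information: the encoder effectively codes over the alphabet of maps from states to channel inputs rather than over the inputs themselves. For binary alphabets that map alphabet has only |𝒳|^|𝒮| = 4 letters, as the sketch below enumerates (alphabets invented for illustration):

```python
from itertools import product

def shannon_strategies(state_alphabet, input_alphabet):
    # Enumerate all maps f: S -> X; each map is one "Shannon strategy"
    # letter. There are |X| ** |S| of them.
    return [dict(zip(state_alphabet, values))
            for values in product(input_alphabet, repeat=len(state_alphabet))]

S = [0, 1]   # binary state alphabet (illustrative)
X = [0, 1]   # binary channel-input alphabet (illustrative)
strategies = shannon_strategies(S, X)
# The 4 strategies are the constant-0 map, the identity, the NOT map,
# and the constant-1 map. The channel input at time i is obtained by
# evaluating the selected strategy at the current state:
x = strategies[2][1]   # input chosen by the NOT strategy when the state is 1
```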

4 Semi-deterministic orthogonal relay channel with non-causal channel state information

In this section, we show that the lower bound derived in Theorem 1 is tight for a class of semi-deterministic orthogonal relay channels in which the output Y at the destination is a deterministic function of X_D, X_r, and S_D, i.e., Y = f(X_D, X_r, S_D), and the output Y_r at the relay is controlled only by X_R and S_R, i.e., the channel from the source to the relay is governed by the conditional distribution P_{Y_r|X_R,S_R}. This assumption is reasonable in many cases; e.g., when the two orthogonal channels use two different frequency bands, the received signal Y_r at the relay is not affected by the relay's own input X_r. The channel can be expressed as

P(y_r, y | x_R, x_D, x_r, s_D, s_R) = P(y_r | x_R, s_R) · 1{y = f(x_D, x_r, s_D)}
(18)

where f(·) is a deterministic function and 1{·} denotes the indicator function. The channel state information on S_R and S_D is known to both the source and the relay non-causally. The capacity of this class of semi-deterministic orthogonal relay channels is characterized in the following theorem.

Theorem 3 The capacity of the channel (18) with channel state information known non-causally to the source and the relay is

C = max min{ I(X_R; Y_r | S_R) + H(Y | U_r, S_D),  H(Y) − I(U_r, Y; S_D) },
(19)

where the maximization is over all measures on 𝒮_D × 𝒮_R × 𝒳_r × 𝒰_r × 𝒳_R × 𝒳_D × 𝒴_r × 𝒴 of the form

P(s_D, s_R, x_r, u_r, x_R, x_D, y_r, y) = Q(s_D) Q(s_R) P(x_R | s_R) P(u_r, x_r, x_D | s_D) P(y_r | x_R, s_R) 1{y = f(x_D, x_r, s_D)}
(20)

U_r ∈ 𝒰_r is an auxiliary random variable with

|𝒰_r| ≤ |𝒮_D||𝒮_R||𝒳_D||𝒳_r| + 1
(21)

and 1{·} denotes the indicator function.

Proof The achievability follows from Theorem 1. First note that the joint distribution of (20) can also be written as
P s D , s R , x r , u r , x R , x D , y r , y = Q s D Q ( s R ) P ( x R | s R ) P ( u r , y | s D ) P ( x r , x D | s D , u r , y ) × P ( y r | x R , s R )
(22)
with additional requirement that
y = f x D , x r , s D .
(23)
Note that, when P U r , Y , S D u r , y , s D is fixed, all the items on the right-hand side (RHS) of (19) are fixed except for I(XR; Yr|SR), which is independent of P X r , X D | S D , U r , Y x r , x D | s D , u r , y . Therefore, the maximization over all joint distributions of the form (20) can be replaced by the maximization only over those distributions, where xr and xD are two deterministic functions of (sD, ur, y), i.e., of the form
$$P(s_D, s_R, x_r, u_r, x_R, x_D, y_r, y) = Q(s_D) Q(s_R) P(x_R \mid s_R) P(u_r, y \mid s_D)\,\mathbf{1}\{x_r = g_r(u_r, s_D)\}\,\mathbf{1}\{x_D = g_d(y, u_r, s_D)\}\, P(y_r \mid x_R, s_R) \qquad (24)$$

for some mappings $g_r: (u_r, s_D) \mapsto x_r$ and $g_d: (y, u_r, s_D) \mapsto x_D$, subject to (23). Thus, we only have to prove the achievability of the rates satisfying (19) for some distribution of the form (24).

The achievability follows directly from Theorem 1 by taking $U = Y$ (since $Y = f(X_D, X_r, S_D)$), letting $X_R$ be independent of $(U_r, X_r)$ (since $Y_r$ is determined only by $X_R$ and $S_R$), and setting $x_r = g_r(u_r, s_D)$, $x_D = g_d(y, u_r, s_D)$. Note that with these choices of the random variables, once we choose the stochastic kernels $P_{X_R \mid S_R}$ and $P_{U_r, Y \mid S_D}$ and the two deterministic mappings $g_r: (u_r, s_D) \mapsto x_r$ and $g_d: (y, u_r, s_D) \mapsto x_D$, combined with $Q_{S_D} Q_{S_R}$ and the channel law, the joint distribution (24) satisfying (23) is fully determined.

The proof of the converse is as follows.

Consider an $(\epsilon_n, n, R)$ code with an average error probability $P_e^{(n)} \le \epsilon_n$. By Fano's inequality, we have
$$H(W \mid Y^n) \le nR P_e^{(n)} + 1 = n\delta_n \qquad (25)$$
where $\delta_n \to 0$ as $n \to \infty$. Thus,
$$nR = H(W) \le I(W; Y^n) + n\delta_n \qquad (26)$$
Defining the auxiliary random variable $\bar{U}_{r,i} = (Y^{i-1}, S_{D,i+1}^n)$, we have
$$\begin{aligned}
I(W; Y^n) &\le I(W; Y^n, Y_r^n) \le I(W; Y^n, Y_r^n \mid S_D^n, S_R^n) \\
&= \sum_i I(W; Y_i, Y_{r,i} \mid Y^{i-1}, Y_r^{i-1}, S_D^n, S_R^n) \\
&= \sum_i I(W; Y_{r,i} \mid Y^{i-1}, Y_r^{i-1}, S_D^n, S_R^n) + \sum_i I(W; Y_i \mid Y^{i-1}, Y_r^{i-1}, Y_{r,i}, S_D^n, S_R^n),
\end{aligned} \qquad (27)$$

where the second inequality follows from the fact that $S_D^n$ and $S_R^n$ are independent of $W$.

Calculate the two terms in (27) separately as follows:
$$\begin{aligned}
\sum_i I(W; Y_{r,i} \mid Y^{i-1}, Y_r^{i-1}, S_D^n, S_R^n)
&= \sum_i H(Y_{r,i} \mid Y^{i-1}, Y_r^{i-1}, S_D^n, S_R^n) - H(Y_{r,i} \mid Y^{i-1}, Y_r^{i-1}, S_D^n, S_R^n, W) \\
&\overset{(a)}{=} \sum_i H(Y_{r,i} \mid Y^{i-1}, Y_r^{i-1}, S_D^n, S_R^n) - H(Y_{r,i} \mid Y^{i-1}, Y_r^{i-1}, S_D^n, S_R^n, W, X_{R,i}) \\
&\overset{(b)}{\le} \sum_i H(Y_{r,i} \mid S_{R,i}) - H(Y_{r,i} \mid S_{R,i}, X_{R,i}) = \sum_i I(X_{R,i}; Y_{r,i} \mid S_{R,i}),
\end{aligned} \qquad (28)$$
where (a) holds since $X_{R,i}$ is a function of $(W, S_D^n, S_R^n)$; (b) follows from the fact that conditioning reduces entropy and the Markov chain $(Y^{i-1}, Y_r^{i-1}, S_D^n, S_R^{i-1}, S_{R,i+1}^n, W) \leftrightarrow (X_{R,i}, S_{R,i}) \leftrightarrow Y_{r,i}$.
$$\begin{aligned}
\sum_i I(W; Y_i \mid Y^{i-1}, Y_r^{i-1}, Y_{r,i}, S_D^n, S_R^n)
&= \sum_i H(Y_i \mid Y^{i-1}, Y_r^{i-1}, Y_{r,i}, S_D^n, S_R^n) - H(Y_i \mid Y^{i-1}, Y_r^{i-1}, Y_{r,i}, S_D^n, S_R^n, W) \\
&= \sum_i H(Y_i \mid Y^{i-1}, Y_r^{i-1}, Y_{r,i}, S_D^n, S_R^n) - H(Y_i \mid Y^{i-1}, Y_r^{i-1}, Y_{r,i}, S_D^n, S_R^n, W, X_{D,i}, X_{r,i}) \\
&\overset{(a)}{=} \sum_i H(Y_i \mid Y^{i-1}, Y_r^{i-1}, Y_{r,i}, S_D^n, S_R^n) \\
&\overset{(b)}{\le} \sum_i H(Y_i \mid Y^{i-1}, S_{D,i+1}^n, S_{D,i}) = \sum_i H(Y_i \mid \bar{U}_{r,i}, S_{D,i}),
\end{aligned} \qquad (29)$$

where the second equality holds since $X_{D,i}$ is a function of $(W, S_D^n, S_R^n)$ and $X_{r,i}$ is a function of $(Y_r^{i-1}, S_D^n, S_R^n)$; (a) holds since $Y_i = f(X_{D,i}, X_{r,i}, S_{D,i})$, so the second entropy term vanishes; (b) follows from the fact that conditioning reduces entropy.

Combining (26) to (29), we have
$$R \le \frac{1}{n} \sum_i \left[ I(X_{R,i}; Y_{r,i} \mid S_{R,i}) + H(Y_i \mid \bar{U}_{r,i}, S_{D,i}) \right] + \delta_n \qquad (30)$$
The bound corresponding to the second term in (19) is obtained by bounding $I(W; Y^n)$ as follows:
$$\begin{aligned}
I(W; Y^n) &= \sum_i I(W; Y_i \mid Y^{i-1}) \le \sum_i I(W, Y^{i-1}; Y_i) \\
&= \sum_i I(W, Y^{i-1}, S_{D,i+1}^n; Y_i) - I(S_{D,i+1}^n; Y_i \mid W, Y^{i-1}) \\
&\overset{(a)}{=} \sum_i I(W, Y^{i-1}, S_{D,i+1}^n; Y_i) - I(Y^{i-1}; S_{D,i} \mid W, S_{D,i+1}^n) \\
&\overset{(b)}{=} \sum_i H(Y_i) - H(Y_i \mid W, Y^{i-1}, S_{D,i+1}^n) - I(W, Y^{i-1}, S_{D,i+1}^n; S_{D,i}) \\
&= \sum_i H(Y_i) - H(Y_i \mid W, Y^{i-1}, S_{D,i+1}^n) - \sum_i \left[ I(W, Y^{i-1}, S_{D,i+1}^n, Y_i; S_{D,i}) - I(Y_i; S_{D,i} \mid W, Y^{i-1}, S_{D,i+1}^n) \right] \\
&\overset{(c)}{\le} \sum_i H(Y_i) - I(W, Y^{i-1}, S_{D,i+1}^n, Y_i; S_{D,i}) \\
&\le \sum_i H(Y_i) - I(Y^{i-1}, S_{D,i+1}^n, Y_i; S_{D,i}) = \sum_i H(Y_i) - I(\bar{U}_{r,i}, Y_i; S_{D,i}),
\end{aligned} \qquad (31)$$

where (a) holds due to Csiszár and Körner's sum identity; (b) follows since $S_{D,i}$ is independent of $(W, S_{D,i+1}^n)$; and (c) follows from the fact that $H(Y_i \mid W, Y^{i-1}, S_{D,i+1}^n) \ge I(Y_i; S_{D,i} \mid W, Y^{i-1}, S_{D,i+1}^n)$.
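Since the Csiszár-Körner sum identity in step (a) is an exact identity for any joint distribution, it can be checked numerically. The sketch below verifies, for a random joint pmf over $(X^3, Y^3)$ with binary alphabets, that $\sum_i I(Y_{i+1}^n; X_i \mid X^{i-1}) = \sum_i I(X^{i-1}; Y_i \mid Y_{i+1}^n)$; the pmf and alphabet sizes are illustrative.

```python
import itertools
import math
import random

def H(p):
    """Entropy (bits) of a pmf given as a dict value -> probability."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)

def marginal(p, idx):
    """Marginalize a joint pmf (keyed by tuples) onto the coordinates in idx."""
    m = {}
    for k, v in p.items():
        key = tuple(k[i] for i in idx)
        m[key] = m.get(key, 0.0) + v
    return m

def MI(p, a, b, c=()):
    """Conditional mutual information I(A; B | C) = H(A,C)+H(B,C)-H(A,B,C)-H(C)."""
    return (H(marginal(p, a + c)) + H(marginal(p, b + c))
            - H(marginal(p, a + b + c)) - H(marginal(p, c)))

# Random joint pmf over (X1, X2, X3, Y1, Y2, Y3), coordinates 0..5.
random.seed(0)
keys = list(itertools.product((0, 1), repeat=6))
w = [random.random() for _ in keys]
s = sum(w)
p = {k: v / s for k, v in zip(keys, w)}

# Csiszar sum identity for n = 3 (terms with empty conditioning sets are 0):
lhs = MI(p, (4, 5), (0,)) + MI(p, (5,), (1,), (0,))   # sum_i I(Y_{i+1}^n; X_i | X^{i-1})
rhs = MI(p, (0,), (4,), (5,)) + MI(p, (0, 1), (5,))   # sum_i I(X^{i-1}; Y_i | Y_{i+1}^n)
```

The two sides agree up to floating-point error for any joint pmf, which is exactly why step (a) holds with no assumptions on the code.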

By (26) and (31),
$$R \le \frac{1}{n} \sum_i \left[ H(Y_i) - I(\bar{U}_{r,i}, Y_i; S_{D,i}) \right] + \delta_n \qquad (32)$$
From the above, we have
$$R \le \frac{1}{n} \sum_i \left[ I(X_{R,i}; Y_{r,i} \mid S_{R,i}) + H(Y_i \mid \bar{U}_{r,i}, S_{D,i}) \right] + \delta_n, \qquad R \le \frac{1}{n} \sum_i \left[ H(Y_i) - I(\bar{U}_{r,i}, Y_i; S_{D,i}) \right] + \delta_n \qquad (33)$$
Introduce a time-sharing random variable $T$, uniformly distributed over $\{1, 2, \ldots, n\}$, and denote the collection of random variables
$$(X_R, X_r, Y_r, Y, \bar{U}_r, S_D, S_R) = (X_{R,T}, X_{r,T}, Y_{r,T}, Y_T, \bar{U}_{r,T}, S_{D,T}, S_{R,T}).$$
Considering the first bound in (33), we have
$$\begin{aligned}
\frac{1}{n} \sum_i \left[ I(X_{R,i}; Y_{r,i} \mid S_{R,i}) + H(Y_i \mid \bar{U}_{r,i}, S_{D,i}) \right]
&= I(X_R; Y_r \mid S_R, T) + H(Y \mid \bar{U}_r, S_D, T) \\
&= H(Y_r \mid S_R, T) - H(Y_r \mid X_R, S_R, T) + H(Y \mid \bar{U}_r, S_D, T) \\
&\le I(X_R; Y_r \mid S_R) + H(Y \mid \bar{U}_r, S_D, T),
\end{aligned} \qquad (34)$$

where the last step follows from the fact that $T$ is independent of all the other variables and the Markov chain $T \leftrightarrow (X_R, S_R) \leftrightarrow Y_r$.

Similarly, considering the second bound in (33), we have
$$\frac{1}{n} \sum_i \left[ H(Y_i) - I(\bar{U}_{r,i}, Y_i; S_{D,i}) \right] = H(Y \mid T) - I(\bar{U}_r, Y; S_D \mid T) \le H(Y) - I(\bar{U}_r, T, Y; S_D) + I(T; S_D) = H(Y) - I(\bar{U}_r, T, Y; S_D) \qquad (35)$$
Defining $U_r = (\bar{U}_r, T)$, we get
$$R \le I(X_R; Y_r \mid S_R) + H(Y \mid U_r, S_D) + \delta_n, \qquad R \le H(Y) - I(U_r, Y; S_D) + \delta_n \qquad (36)$$

Therefore, for a given sequence of $(\epsilon_n, n, R)$ codes with $\epsilon_n \to 0$ as $n \to \infty$, there exists a measure of the form $P_{S_D, S_R, X_r, X_R, X_D} = Q_{S_D} Q_{S_R} P_{X_r \mid S_D, S_R} P_{X_R, X_D \mid X_r, S_D, S_R}$ such that the rate $R$ essentially satisfies (19).

Considering the facts that $I(X_R; Y_r \mid S_R)$ is determined by the joint distribution $P_{X_R, S_R, Y_r}$ and that the other terms on the RHS of (19) are independent of $P_{X_R, S_R, Y_r}$, the maximum in (19) taken over all joint probability mass functions $P_{S_D, S_R, X_r, U_r, X_R, X_D, Y_r, Y}$ is equivalent to that taken over all joint probability mass functions of the form
$$P(s_D, s_R, x_r, u_r, x_R, x_D, y_r, y) = Q(s_D) Q(s_R) P(x_R \mid s_R) P(u_r, x_r, x_D \mid s_D) P(y_r \mid x_R, s_R)\,\mathbf{1}\{y = f(x_D, x_r, s_D)\}.$$

The bound on the cardinality of $\mathcal{U}_r$ can be proven in a similar way as in Theorem 1 and is omitted here for brevity.

This concludes the proof.

5 Memoryless Gaussian case

In this section, we study a state-dependent Gaussian relay channel with orthogonal components in which the channel states and the noise are additive and Gaussian. As shown in Figure 3, we consider the state-dependent Gaussian orthogonal relay channel, where channel 1 (dashed line) uses a frequency band different from that used by channel 2 (solid line). The two orthogonal channels, channel 1 and channel 2, are corrupted by two independent additive Gaussian interferences $S_R$ and $S_D$, respectively, which are known to the source and the relay. The channel can be described as
$$Y_r = X_R + S_R + Z_r \qquad (37)$$
$$Y = X_D + X_r + S_D + Z_d \qquad (38)$$
where $Y_r$ and $Y$ are the channel outputs at the relay and the destination, respectively; $(X_R, X_D)$ and $X_r$ are the channel inputs from the source and the relay, with the average power constraints $E[X_R^2] + E[X_D^2] \le P$ and $E[X_r^2] \le \gamma P$. The additive interferences $S_R, S_D$ and the noises $Z_r, Z_d$ are assumed to be zero-mean i.i.d. Gaussian with $E[S_R^2] = Q_R$, $E[S_D^2] = Q_D$, and $E[Z_r^2] = E[Z_d^2] = N$. Further, we assume that $S_R$, $S_D$, $Z_r$, and $Z_d$ are mutually independent. As in the discrete memoryless case, we discuss the capacity of the channel when the additive interference sequences are known to the source and the relay non-causally and causally, respectively.
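A Monte Carlo sketch of the channel model (37)-(38) follows; all numeric values (powers, interference and noise variances, the power split $\beta$) are illustrative assumptions, and Gaussian codebook inputs are drawn only to exercise the model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
P, gamma, Q_R, Q_D, N = 10.0, 1.0, 4.0, 4.0, 1.0
beta = 0.5                                          # fraction of source power on X_D

X_R = rng.normal(0.0, np.sqrt((1 - beta) * P), n)   # source input on channel 1
X_D = rng.normal(0.0, np.sqrt(beta * P), n)         # source input on channel 2
X_r = rng.normal(0.0, np.sqrt(gamma * P), n)        # relay input
S_R = rng.normal(0.0, np.sqrt(Q_R), n)              # interference on channel 1
S_D = rng.normal(0.0, np.sqrt(Q_D), n)              # interference on channel 2
Z_r = rng.normal(0.0, np.sqrt(N), n)
Z_d = rng.normal(0.0, np.sqrt(N), n)

Y_r = X_R + S_R + Z_r        # relay observation, Eq. (37)
Y = X_D + X_r + S_D + Z_d    # destination observation, Eq. (38)
```

With independent inputs, the empirical output powers match the sums of the component variances, which is the bookkeeping used throughout Section 5.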
Figure 3

Gaussian orthogonal relay channel with channel state known at the source and the relay.

5.1 Channel state information non-causally known to the source and the relay

For the channel shown in Figure 3, when the channel state information is known non-causally to the source and the relay, the capacity can be achieved using cooperative DPC and is characterized in the following theorem.

Theorem 4 The capacity of the Gaussian orthogonal relay channel with the channel state information non-causally known to both the source and the relay is given by
$$C(P, \gamma P) = \max_{0 \le \beta, \rho \le 1} \min\left\{ \mathcal{C}\!\left(\frac{\bar{\beta} P}{N}\right) + \mathcal{C}\!\left(\frac{\beta (1 - \rho^2) P}{N}\right),\; \mathcal{C}\!\left(\frac{(\beta + \gamma + 2\rho\sqrt{\beta\gamma}) P}{N}\right) \right\}, \qquad (39)$$

where $\mathcal{C}(x) = \frac{1}{2}\log_2(1 + x)$ and $\bar{\beta} = 1 - \beta$.
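The max-min in (39) is easy to evaluate numerically. The following is a grid-search sketch of the expression (grid resolution and parameter values are illustrative, not from the paper):

```python
import numpy as np

def C(x):
    """C(x) = (1/2) log2(1 + x), the Gaussian capacity function."""
    return 0.5 * np.log2(1.0 + x)

def capacity_noncausal(P, gamma, N, grid=401):
    """Grid-search evaluation of the max-min expression (39)."""
    beta = np.linspace(0.0, 1.0, grid)[:, None]   # source power split
    rho = np.linspace(0.0, 1.0, grid)[None, :]    # source-relay correlation
    term1 = C((1 - beta) * P / N) + C(beta * (1 - rho**2) * P / N)
    term2 = C((beta + gamma + 2 * rho * np.sqrt(beta * gamma)) * P / N)
    return float(np.max(np.minimum(term1, term2)))

cap = capacity_noncausal(P=10.0, gamma=1.0, N=1.0)
```

As sanity checks, the value is at least the direct-link capacity $\mathcal{C}(P/N)$ (take $\beta = 1$, $\rho = 0$) and at most $\max_\beta [\mathcal{C}(\bar\beta P/N) + \mathcal{C}(\beta P/N)] = 2\,\mathcal{C}(P/2N)$, and it is non-decreasing in the relay power ratio $\gamma$.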

Remark 3 As in many other dirty paper channels with channel state information known non-causally at the encoders, with dirty paper coding the capacity of the channel considered here is the same as that of the state-independent relay channel with orthogonal components. In fact, (39) also characterizes the capacity of the state-independent Gaussian orthogonal relay channel. Therefore, whether the channel state information is known to the source and the relay causally or non-causally, (39) serves as an upper bound on the capacity of the channel shown in Figure 3.

Proof We only need to prove the achievability of (39), since the expression in (39) characterizes the capacity of the state-independent orthogonal relay channel [29], which obviously serves as an upper bound for the channel considered in this paper.

For the channel given by (37) and (38), we evaluate the achievable rate in (6) with the jointly Gaussian random variables $U$, $U_r$, $S_R$, $S_D$, $X_R$, $X_D$, and $X_r$ chosen as
$$U_0 = X_{D,0} + \alpha(1 - \alpha_r) S_D \qquad (40)$$
$$U_r = \left(1 + \rho\sqrt{\beta/\gamma}\right) X_r + \alpha_r S_D \qquad (41)$$
$$U = U_0 + \frac{\rho\sqrt{\beta/\gamma}}{1 + \rho\sqrt{\beta/\gamma}}\, U_r \qquad (42)$$
$$X_D = X_{D,0} + \rho\sqrt{\beta/\gamma}\, X_r, \qquad (43)$$

where $E[X_D^2] = \beta P$, $E[X_R^2] = \bar{\beta} P$, $E[X_r^2] = \gamma P$, $E[X_r X_D] = \rho\sqrt{\beta\gamma}\, P$, and $X_{D,0} \sim \mathcal{N}(0, (1 - \rho^2)\beta P)$ is independent of $X_r$. The parameter $\beta$ is the fraction of the source power allocated to $X_D$, while $\bar{\beta} = 1 - \beta$ is the fraction allocated to $X_R$. The parameter $\rho$ is the correlation coefficient between $X_r$ and $X_D$. With the above definitions, the computation of the achievable rate in (39) is straightforward and is omitted for brevity.

However, the calculation above is somewhat algebraic. Proceeding similarly to Costa's dirty paper coding, we extend the result in Theorem 1 for the discrete memoryless (DM) case to memoryless channels with discrete time and continuous alphabets by standard arguments [32]. An alternative proof is outlined in Appendix 2.

5.2 Channel state information known at the source and the relay causally

When the channel state information is known to the source and the relay causally, the capacity is not characterized in general. The following theorem gives a lower bound on the capacity.

Theorem 5 For the Gaussian orthogonal relay channel with the channel state information causally known to the source and the relay, the following rate is achievable:
$$R(P, \gamma P) \ge \max_{\substack{0 \le \beta \le 1 \\ -1 \le \rho_{d,s}, \rho_{r,s}, \rho_{d,r} \le 1}} \min\left\{ \mathcal{C}\!\left(\frac{\bar{\beta} P}{N}\right) + \mathcal{C}\!\left(\frac{(1 - \rho_{d,s}^2)(1 - \rho_{d,r}^2)\beta P}{\left(\sqrt{Q_D} + \rho_{d,s}\sqrt{\beta P} + \rho_{r,s}\sqrt{\gamma P}\right)^2 + N}\right),\; \mathcal{C}\!\left(\frac{(1 - \rho_{d,s}^2)\beta P + (1 - \rho_{r,s}^2)\gamma P + 2\rho_{d,r}\sqrt{(1 - \rho_{d,s}^2)(1 - \rho_{r,s}^2)\beta\gamma}\, P}{\left(\sqrt{Q_D} + \rho_{d,s}\sqrt{\beta P} + \rho_{r,s}\sqrt{\gamma P}\right)^2 + N}\right) \right\} \qquad (44)$$

where $\mathcal{C}(x) = \frac{1}{2}\log_2(1 + x)$ and $\bar{\beta} = 1 - \beta$.
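The lower bound (44) can likewise be evaluated by a grid search. The sketch below assumes the reconstruction above, in which the denominator is the residual interference-plus-noise power $(\sqrt{Q_D} + \rho_{d,s}\sqrt{\beta P} + \rho_{r,s}\sqrt{\gamma P})^2 + N$; grid resolution and parameter values are illustrative.

```python
import numpy as np

def C(x):
    return 0.5 * np.log2(1.0 + np.maximum(x, 0.0))

def rate_causal(P, gamma, Q_D, N, g=21):
    """Grid-search evaluation of the achievable rate (44)."""
    beta = np.linspace(0.0, 1.0, g).reshape(-1, 1, 1, 1)
    r_ds = np.linspace(-1.0, 1.0, g).reshape(1, -1, 1, 1)
    r_rs = np.linspace(-1.0, 1.0, g).reshape(1, 1, -1, 1)
    r_dr = np.linspace(-1.0, 1.0, g).reshape(1, 1, 1, -1)
    # Residual interference-plus-noise power after partial state cleaning.
    denom = (np.sqrt(Q_D) + r_ds * np.sqrt(beta * P)
             + r_rs * np.sqrt(gamma * P))**2 + N
    t1 = C((1 - beta) * P / N) + C((1 - r_ds**2) * (1 - r_dr**2) * beta * P / denom)
    t2 = C(((1 - r_ds**2) * beta * P + (1 - r_rs**2) * gamma * P
            + 2 * r_dr * np.sqrt((1 - r_ds**2) * (1 - r_rs**2) * beta * gamma) * P)
           / denom)
    return float(np.max(np.minimum(t1, t2)))

# With relay power above the Theorem 6 threshold, the rate should approach
# max_beta [C(beta_bar*P/N) + C(beta*P/N)] = log2(6) for P = 10, N = 1.
r = rate_causal(P=10.0, gamma=4.0, Q_D=4.0, N=1.0)
```

At $\gamma = 4$ the condition (45) holds for $P = 10$, $N = 1$, $Q_D = 4$, so by Theorem 6 the grid value should sit just below $\log_2 6 \approx 2.585$.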

Remark 4 Since the interference $S_R$ is additive and known to both the source and the relay, the relay can remove $S_R$ completely before decoding the message from the source. Hence, the interference $S_R$ does not affect the achievable rate.

Remark 5 The source and the relay expend the parts $\rho_{d,s}^2 \beta P$ and $\rho_{r,s}^2 \gamma P$ of their power, respectively, to clean $S_D$ from the channel and use the remaining power for cooperative information transmission. This differs from many other dirty paper channels with non-causal channel state information at the transmitters, where the channel states can be completely cleaned by choosing appropriate auxiliary random variables, e.g., by dirty paper coding. If $Q_D = 0$, the entire power of the source and the relay is used for information transmission, i.e., $\rho_{r,s} = \rho_{d,s} = 0$. This reduces to the capacity of the state-independent relay channel with orthogonal components shown in [29], since $S_R$ does not affect the achievable rate.

Proof The result in Theorem 2 for the discrete memoryless case can be extended to memoryless channels with discrete time and continuous alphabets using standard techniques [32]. The proof follows by evaluating the lower bound of Theorem 2 with the following jointly Gaussian input distribution. Fix $0 \le \beta \le 1$, $-1 \le \rho_{d,s}, \rho_{r,s}, \rho_{d,r} \le 1$, and $\bar{\beta} = 1 - \beta$. Let $X_R \sim \mathcal{N}(0, \bar{\beta} P)$, $U_r \sim \mathcal{N}(0, (1 - \rho_{r,s}^2)\gamma P)$, and $U' \sim \mathcal{N}(0, (1 - \rho_{d,s}^2)(1 - \rho_{d,r}^2)\beta P)$, where $U_r$ and $U'$ are independent. Let
$$U = \rho_{d,r} \sqrt{\frac{(1 - \rho_{d,s}^2)\beta P}{(1 - \rho_{r,s}^2)\gamma P}}\, U_r + U'.$$
We define $X_r = U_r + \rho_{r,s}\sqrt{\gamma P / Q_D}\, S_D$ and $X_D = U + \rho_{d,s}\sqrt{\beta P / Q_D}\, S_D$. With these definitions, it is easily verified that $U \sim \mathcal{N}(0, (1 - \rho_{d,s}^2)\beta P)$, $X_r \sim \mathcal{N}(0, \gamma P)$, and $X_D \sim \mathcal{N}(0, \beta P)$. Note that $U$, $U_r$, and $U'$ are independent of $S_D$. From these definitions, it is evident that $E[X_R^2] + E[X_D^2] \le P$ and $E[X_r^2] \le \gamma P$. Straightforward algebra shows that evaluating the lower bound in Theorem 2 with these choices yields the lower bound in Theorem 5; the computational details are omitted for brevity.
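The variance and independence claims in the construction above can be checked by Monte Carlo simulation; the parameter values below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400_000
P, gamma, Q_D, beta = 10.0, 2.0, 4.0, 0.6
r_ds, r_rs, r_dr = 0.3, -0.4, 0.5

S_D = rng.normal(0.0, np.sqrt(Q_D), n)
U_r = rng.normal(0.0, np.sqrt((1 - r_rs**2) * gamma * P), n)
U_p = rng.normal(0.0, np.sqrt((1 - r_ds**2) * (1 - r_dr**2) * beta * P), n)  # U'
U = r_dr * np.sqrt((1 - r_ds**2) * beta * P
                   / ((1 - r_rs**2) * gamma * P)) * U_r + U_p
X_r = U_r + r_rs * np.sqrt(gamma * P / Q_D) * S_D   # relay input
X_D = U + r_ds * np.sqrt(beta * P / Q_D) * S_D      # source input on channel 2
```

Empirically, $\mathrm{Var}(U) \approx (1 - \rho_{d,s}^2)\beta P$, $\mathrm{Var}(X_r) \approx \gamma P$, $\mathrm{Var}(X_D) \approx \beta P$, and $U$ is uncorrelated with $S_D$, matching the claims in the proof.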

We next characterize the capacity of the state-dependent Gaussian orthogonal relay channel with causal channel state information when the power of the relay is sufficiently large. As shown in Theorem 5, a part of the relay's power is used to clean the interference SD. When the power of the relay is sufficiently large, the interference SD can be cleaned completely and the capacity of the channel can be determined as shown in the following theorem.

Theorem 6 For the Gaussian orthogonal relay channel with the additive interference sequences known causally at the source and the relay, when the power of the relay satisfies
$$\gamma P \ge \left( \frac{P}{4N} + \frac{N}{4P} + \frac{Q_D}{P} + \frac{1}{2} \right) P, \qquad (45)$$
the capacity can be characterized as
$$C(P, \gamma P) = \max_{0 \le \beta \le 1} \left[ \mathcal{C}\!\left(\frac{\bar{\beta} P}{N}\right) + \mathcal{C}\!\left(\frac{\beta P}{N}\right) \right] \qquad (46)$$

Remark 6 When the power of the relay is sufficiently large, the interference $S_D$ is completely cleaned by the relay using part of its power, and the remaining power is large enough that the relay-destination link does not constrain the achievable rate. The message sent from the source is then split into two parts: one part is sent directly to the destination through the point-to-point source-destination channel, and the other is sent through the two-hop source-relay-destination channel with DF relaying. The two parts are sent independently, and the rate is the sum of the rates of the source-destination channel and the two-hop source-relay-destination channel (the latter constrained by the source-relay link).

Proof Define $\rho = (\rho_{d,s}, \rho_{d,r}, \rho_{r,s})$. We denote the two terms on the RHS of (44) as
$$R_1(\beta, \rho) = \mathcal{C}\!\left(\frac{\bar{\beta} P}{N}\right) + \mathcal{C}\!\left(\frac{(1 - \rho_{d,s}^2)(1 - \rho_{d,r}^2)\beta P}{\left(\sqrt{Q_D} + \rho_{d,s}\sqrt{\beta P} + \rho_{r,s}\sqrt{\gamma P}\right)^2 + N}\right) \qquad (47)$$
$$R_2(\beta, \rho) = \mathcal{C}\!\left(\frac{(1 - \rho_{d,s}^2)\beta P + (1 - \rho_{r,s}^2)\gamma P + 2\rho_{d,r}\sqrt{(1 - \rho_{d,s}^2)(1 - \rho_{r,s}^2)\beta\gamma}\, P}{\left(\sqrt{Q_D} + \rho_{d,s}\sqrt{\beta P} + \rho_{r,s}\sqrt{\gamma P}\right)^2 + N}\right) \qquad (48)$$

Let $R(\beta, \rho) = \min\{R_1(\beta, \rho), R_2(\beta, \rho)\}$. Then $R(P, \gamma P) \ge \max_{0 \le \beta \le 1,\, -1 \le \rho_{d,s}, \rho_{r,s}, \rho_{d,r} \le 1} R(\beta, \rho)$ is achievable.

It is easy to verify that if $\gamma P \ge Q_D$, for any fixed $\beta$, $R_1(\beta, \rho)$ is maximized at $\rho = \rho_1^* = \left(0, 0, -\sqrt{Q_D / \gamma P}\right)$. Denote the maximum of $R_1(\beta, \rho)$ by $R_1^*(\beta)$. Therefore, we have
$$R_1^*(\beta) = R_1(\beta, \rho_1^*) = \mathcal{C}\!\left(\frac{\bar{\beta} P}{N}\right) + \mathcal{C}\!\left(\frac{\beta P}{N}\right) \qquad (49)$$
$$R_2(\beta, \rho_1^*) = \mathcal{C}\!\left(\frac{\beta P + \gamma P - Q_D}{N}\right) \qquad (50)$$
Next, we will show the condition under which $R_2(\beta, \rho_1^*)$ is always larger than $R_1^*(\beta)$ for any $\beta$. Let
$$1 + \frac{\beta P + \gamma P - Q_D}{N} \ge \left(1 + \frac{\bar{\beta} P}{N}\right)\left(1 + \frac{\beta P}{N}\right) \qquad (51)$$
The inequality in (51) is equivalent to
$$P^2 \beta^2 - (P^2 - PN)\beta + \gamma PN - Q_D N - PN \ge 0 \qquad (52)$$
It is easy to show that if
$$\gamma \ge \frac{P}{4N} + \frac{N}{4P} + \frac{Q_D}{P} + \frac{1}{2}, \qquad (53)$$
the inequality in (52) holds for any $\beta$. Thus, if $\gamma \ge \frac{P}{4N} + \frac{N}{4P} + \frac{Q_D}{P} + \frac{1}{2}$, the following inequality is always satisfied for any $\beta$:
$$R_2(\beta, \rho_1^*) \ge R_1^*(\beta) \qquad (54)$$
For any $\beta$, we have
$$R(\beta, \rho_1^*) = \min\{R_1(\beta, \rho_1^*), R_2(\beta, \rho_1^*)\} = R_1^*(\beta) \qquad (55)$$
Therefore,
$$R(P, \gamma P) \ge \max_{0 \le \beta \le 1} R_1^*(\beta) = \max_{0 \le \beta \le 1} \left[ \mathcal{C}\!\left(\frac{\bar{\beta} P}{N}\right) + \mathcal{C}\!\left(\frac{\beta P}{N}\right) \right] \qquad (56)$$

is achievable.

As mentioned in Remark 3, (39) serves as an upper bound on the capacity of the channel considered here. The converse proof follows by showing that (46) matches the upper bound in (39) when the condition in (45) is satisfied. We denote the two terms on the RHS of (39) as
$$C_1(\beta, \rho) = \mathcal{C}\!\left(\frac{\bar{\beta} P}{N}\right) + \mathcal{C}\!\left(\frac{\beta(1 - \rho^2) P}{N}\right) \qquad (57)$$
$$C_2(\beta, \rho) = \mathcal{C}\!\left(\frac{(\beta + \gamma + 2\rho\sqrt{\beta\gamma}) P}{N}\right) \qquad (58)$$
Let $C(\beta, \rho) = \min\{C_1(\beta, \rho), C_2(\beta, \rho)\}$. Similar to the steps from (47) to (55), it is easy to prove that for any $\beta$, if $\gamma \ge \frac{P}{4N} + \frac{N}{4P} + \frac{1}{2}$,
$$C(\beta, 0) = \min\{C_1(\beta, 0), C_2(\beta, 0)\} = C_1(\beta, 0) = \mathcal{C}\!\left(\frac{\bar{\beta} P}{N}\right) + \mathcal{C}\!\left(\frac{\beta P}{N}\right) \qquad (59)$$

Next, we have to prove that for any $\beta$, under the condition $\gamma \ge \frac{P}{4N} + \frac{N}{4P} + \frac{1}{2}$, $C(\beta, \rho)$ is maximized at $\rho = 0$. Denote the maximum of $C(\beta, \rho)$ by $C^*(\beta)$, i.e., $C^*(\beta) = \max_{-1 \le \rho \le 1} C(\beta, \rho)$. This can be proven by contradiction.

Assume that $C(\beta, \rho)$ is maximized at some $\rho = \rho'$ with $\rho' \ne 0$ and $C(\beta, \rho') > C(\beta, 0)$. By (59), we get
$$C(\beta, \rho') > C(\beta, 0) = C_1(\beta, 0) \qquad (60)$$
However, we have
$$C(\beta, \rho') = \min\{C_1(\beta, \rho'), C_2(\beta, \rho')\} \le C_1(\beta, \rho') \qquad (61)$$
From (57), it is easy to verify that for any $\beta$, $C_1(\beta, \rho)$ is maximized at $\rho = 0$. Thus, (60) and (61) are contradictory. This proves that for any $\beta$, $C^*(\beta) = C_1(\beta, 0) = \mathcal{C}(\bar{\beta} P / N) + \mathcal{C}(\beta P / N)$. Thus, the maximization problem in (39) is equivalent to
$$C(P, \gamma P) = \max_{0 \le \beta \le 1} \left[ \mathcal{C}\!\left(\frac{\bar{\beta} P}{N}\right) + \mathcal{C}\!\left(\frac{\beta P}{N}\right) \right] \qquad (62)$$

This completes the proof.

6 Numerical examples

In this section, we provide some numerical examples for the achievable rate in Theorem 5. With these examples, we will show the impact of the channel state and the role of the relay in information transmission and in cleaning the channel state.

For $\gamma = 1$, Figure 4 compares the capacity of the state-independent ($Q_R = Q_D = 0$) relay channel with orthogonal components and the achievable rate derived in Theorem 5. Clearly, the larger the power of the additive interference, the more power of the source and the relay is used to clean the interference, which results in a lower achievable rate. As the power $P$ of the source and the relay increases, a larger amount of interference can be cleaned, leaving more power for information transmission. Consequently, the achievable rate approaches the capacity of the state-independent relay channel with orthogonal components as $P$ increases. This can also be verified from (44): if $P \gg Q_D$, the impact of the additive interference $S_D$ is negligible relative to $P$, and the maximization problem in (44) approximates that in (39) with $\rho_{r,s} \to 0$ and $\rho_{d,s} \to 0$.
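The trend in Figure 4 can be reproduced numerically by comparing grid-search evaluations of (39) and (44) at a low and a high source power; the grid resolutions and parameter values below are illustrative.

```python
import numpy as np

def C(x):
    return 0.5 * np.log2(1.0 + np.maximum(x, 0.0))

def cap_39(P, gamma, N, g=201):
    """Grid-search evaluation of the non-causal capacity (39)."""
    beta = np.linspace(0, 1, g)[:, None]
    rho = np.linspace(0, 1, g)[None, :]
    t1 = C((1 - beta) * P / N) + C(beta * (1 - rho**2) * P / N)
    t2 = C((beta + gamma + 2 * rho * np.sqrt(beta * gamma)) * P / N)
    return float(np.max(np.minimum(t1, t2)))

def rate_44(P, gamma, Q_D, N, g=31):
    """Grid-search evaluation of the causal achievable rate (44)."""
    beta = np.linspace(0, 1, g).reshape(-1, 1, 1, 1)
    r_ds = np.linspace(-1, 1, g).reshape(1, -1, 1, 1)
    r_rs = np.linspace(-1, 1, g).reshape(1, 1, -1, 1)
    r_dr = np.linspace(-1, 1, g).reshape(1, 1, 1, -1)
    den = (np.sqrt(Q_D) + r_ds * np.sqrt(beta * P)
           + r_rs * np.sqrt(gamma * P))**2 + N
    t1 = C((1 - beta) * P / N) + C((1 - r_ds**2) * (1 - r_dr**2) * beta * P / den)
    t2 = C(((1 - r_ds**2) * beta * P + (1 - r_rs**2) * gamma * P
            + 2 * r_dr * np.sqrt((1 - r_ds**2) * (1 - r_rs**2) * beta * gamma) * P)
           / den)
    return float(np.max(np.minimum(t1, t2)))

# Gap between the upper bound and the causal rate at low and high P, Q_D = 4.
gap_lo = cap_39(10.0, 1.0, 1.0) - rate_44(10.0, 1.0, 4.0, 1.0)
gap_hi = cap_39(1000.0, 1.0, 1.0) - rate_44(1000.0, 1.0, 4.0, 1.0)
```

As expected from the discussion above, the gap is noticeable at $P = 10$ and nearly vanishes at $P = 1000$, where $P \gg Q_D$.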
Figure 4

Achievable rates vs. SNR under different power values of the interference.

For $P/N = 30$, Figure 5 shows the role of the relay in cleaning the channel state. As the power of the relay increases, the achievable rate increases. In particular, when the power of the relay is large enough that the channel state can be cleaned completely and the relay-destination link does not become the bottleneck for the achievable rate, the achievable rate matches the upper bound, as proven in Theorem 6. Figure 5 vividly illustrates this result.
Figure 5

Achievable rates vs. γ under different power values of the interference.

7 Conclusions

In this paper, we consider a state-dependent relay channel with orthogonal channels from the source to the relay and from the source and the relay to the destination. The orthogonal channels are corrupted by two independent channel states, and the channel state information is known to both the source and the relay either non-causally or causally. In the non-causal case, a lower bound on the capacity of the channel is established using superposition coding at the source, PDF relaying at the relay, and cooperative GP coding at the source and the relay. We further show that if the output of the destination $Y$ is a deterministic function of the relay input $X_r$, the channel state $S_D$, and one of the source inputs $X_D$, i.e., $Y = f(X_D, X_r, S_D)$, and the relay output $Y_r$ is controlled only by the source input $X_R$ and the channel state $S_R$, then the lower bound is tight and the capacity is characterized exactly. For the causal case, a lower bound on the capacity is also derived. The expression for the achievable rate in the causal case can be interpreted as a special case of that in the non-causal case, in which the auxiliary random variables $U$ and $U_r$ are independent of $S_R$ and $S_D$. This parallels the relation between the capacity expression for the state-dependent channel with causal channel state information introduced by Shannon [2] and its non-causal counterpart, the Gel'fand-Pinsker channel [3].

Further, we investigate the Gaussian state-dependent relay channel with orthogonal components, modeling the channel states as additive Gaussian interferences. The capacity is characterized when the additive interference sequences are known non-causally; the expression coincides with the capacity of the state-independent relay channel with orthogonal components, an observation similar to the results for multiple-user state-dependent channels in [6]. When the state information is known causally, however, the capacity is not characterized in general. In this case, an achievable rate is derived with carefully chosen auxiliary random variables, and it is shown that when the power of the relay is sufficiently large, the capacity can be characterized exactly. Finally, two numerical examples illustrate the impact of the channel state and the role of the relay in information transmission and in cleaning the state. The simulation results show that the larger the power of the additive interference, the more power of the source and the relay is spent cleaning the interference, resulting in a lower achievable rate. However, as the power $P$ increases, the impact of the interference becomes negligible when $P \gg Q_D$, and the achievable rate approaches the capacity of the state-independent relay channel with orthogonal components. The simulation results also illustrate that when the power of the relay satisfies $\gamma P \ge \left(\frac{P}{4N} + \frac{N}{4P} + \frac{Q_D}{P} + \frac{1}{2}\right) P$, the capacity of the channel can be characterized.

Appendices

Appendix 1

Proof of Theorem 1: Analysis of probability of error

The average probability of error is given by
$$P_e \le \sum_{s_D^n \notin T_\epsilon^n(S_D)} \sum_{s_R^n \notin T_\epsilon^n(S_R)} \Pr(s_D^n) \Pr(s_R^n) + \sum_{s_D^n \in T_\epsilon^n(S_D)} \sum_{s_R^n \in T_\epsilon^n(S_R)} \Pr(s_D^n) \Pr(s_R^n) \Pr(\text{error} \mid s_D^n, s_R^n) \qquad (63)$$
By the AEP, the first term $\Pr\{s_D^n \notin T_\epsilon^n(S_D)\} \Pr\{s_R^n \notin T_\epsilon^n(S_R)\}$ on the RHS of (63) goes to 0 as $n \to \infty$, so it suffices to upper bound the second term on the RHS of (63). We now examine the probabilities of the error events associated with the encoding and decoding steps. The error event is contained in the union of the following events: $E_{1k}$ and $E_{2k}$ correspond to the encoding steps in block $k$; $E_{3 s_R k}$ and $E_{4 s_R k}$ correspond to decoding the message $\hat{m}_{s_R}(k)$ at the relay in block $k$ given $S_R = s_R$; and $E_{5k}$ and $E_{6k}$ correspond to decoding $w_D(k)$ at the destination in block $k$. The probability of error $\Pr(\text{error} \mid s_D^n, s_R^n)$ is upper bounded as
$$\Pr(\text{error} \mid s_D^n, s_R^n) \le \Pr(E_{1k}) + \Pr(E_{2k}) + \sum_{s_R \in \mathcal{S}_R} P(S_R = s_R) \Pr(E_{3 s_R k} \mid E_{1k}^c, E_{2k}^c) + \sum_{s_R \in \mathcal{S}_R} P(S_R = s_R) \Pr(E_{4 s_R k} \mid E_{1k}^c, E_{2k}^c, E_{3 s_R k}^c) + \Pr(E_{5k} \mid E_{1k}^c, E_{2k}^c) + \Pr(E_{6k} \mid E_{1k}^c, E_{2k}^c, E_{5k}^c),$$

where $E_{mk}^c$ ($m = 1, 2, 3s_R, 5$) denotes the complement of the corresponding event $E_{mk}$.

Let $E_{1k}$ be the event that there is no sequence $u_r^n(w_R(k-1), j_r(k))$ jointly typical with $s_D^n(k)$, i.e.,
$$E_{1k} = \left\{ \nexists\, j_r(k) \in \{1, 2, \ldots, 2^{nR_{r,s}}\} \text{ s.t. } \left(u_r^n(w_R(k-1), j_r(k)), s_D^n(k)\right) \in T_\epsilon^n(U_r, S_D) \right\}$$
For $u_r^n(w_R(k-1), j_r(k))$ and $s_D^n(k)$ generated independently with i.i.d. components according to $P_{U_r}$ and $Q_{S_D}$, respectively, the probability that a given $u_r^n(w_R(k-1), j_r(k))$ is jointly typical with $s_D^n(k)$ is greater than $(1-\epsilon)\, 2^{-n(I(U_r; S_D) + \delta(\epsilon))}$ for $n$ sufficiently large. There are $2^{nR_{r,s}}$ such $u_r^n$ sequences in each bin. Therefore, the probability of the event $E_{1k}$ is bounded by
$$\Pr(E_{1k}) \le \left(1 - (1-\epsilon)\, 2^{-n(I(U_r; S_D) + \delta(\epsilon))}\right)^{2^{nR_{r,s}}} \qquad (64)$$

Taking the logarithm of both sides of (64) and using the inequality $\ln(x) \le x - 1$, we have $\ln \Pr(E_{1k}) \le -(1-\epsilon)\, 2^{n(R_{r,s} - I(U_r; S_D) - \delta(\epsilon))}$. Thus, if $R_{r,s} > I(U_r; S_D) + \delta(\epsilon)$, $\Pr(E_{1k}) \to 0$ as $n \to \infty$, where $\delta(\epsilon) \to 0$ as $\epsilon \to 0$.
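The double-exponential decay implied by the $\ln(x) \le x - 1$ step is easy to see numerically. The sketch below compares the right-hand side of (64) with the bound $\exp\!\left(-(1-\epsilon)\, 2^{n(R_{r,s} - I - \delta)}\right)$; all numeric values ($I$, $\delta$, $\epsilon$, $R_{r,s}$) are illustrative.

```python
import math

I_UrSD, delta, eps, R_rs = 0.5, 0.05, 0.1, 0.6   # illustrative, with R_rs > I + delta

def pr_E1_direct(n):
    """The right-hand side of (64)."""
    return (1.0 - (1.0 - eps) * 2.0 ** (-n * (I_UrSD + delta))) ** (2.0 ** (n * R_rs))

def pr_E1_upper(n):
    """Looser bound obtained via ln(x) <= x - 1: exp(-(1-eps) 2^{n(R_rs - I - delta)})."""
    return math.exp(-(1.0 - eps) * 2.0 ** (n * (R_rs - I_UrSD - delta)))
```

Since $\ln(1 - y) \le -y$, the direct expression never exceeds the exponential bound, and both vanish rapidly as $n$ grows whenever $R_{r,s} > I(U_r; S_D) + \delta(\epsilon)$.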

Let $E_{2k}$ be the event that there is no sequence $u^n(w_D(k), j_d(k) \mid w_R(k-1), j_r(k))$ jointly typical with $s_D^n(k)$, given $u_r^n(w_R(k-1), j_r(k))$, i.e.,
$$E_{2k} = \left\{ \nexists\, j_d(k) \in \{1, 2, \ldots, 2^{nR_{d,s}}\} \text{ s.t. } \left(u^n(w_D(k), j_d(k) \mid w_R(k-1), j_r(k)), u_r^n(w_R(k-1), j_r(k)), s_D^n(k)\right) \in T_\epsilon^n(U, U_r, S_D) \right\}$$

Similar to the analysis of the probability of the event $E_{1k}$, if $R_{d,s} > I(U; S_D \mid U_r) + \delta(\epsilon)$, then $\Pr(E_{2k}) \to 0$ as $n \to \infty$.

For each $s_R \in \mathcal{S}_R$, let $E_{3 s_R k}$ be the event that $x_R^{n(1-\epsilon)p(s_R)}(m_{s_R}(k) \mid s_R, w_R(k-1), j_r(k))$ is not jointly typical with $y_{r,s_R}^{n(1-\epsilon)p(s_R)}(k)$, given $u_{r,s_R}^{n(1-\epsilon)p(s_R)}(w_R(k-1), j_r(k))$, $x_{r,s_R}^{n(1-\epsilon)p(s_R)}(k)$, and $S_R = s_R$, i.e.,
$$E_{3 s_R k} = \left\{ \left(x_R^{n(1-\epsilon)p(s_R)}(m_{s_R}(k) \mid s_R, w_R(k-1), j_r(k)),\, u_{r,s_R}^{n(1-\epsilon)p(s_R)}(w_R(k-1), j_r(k)),\, x_{r,s_R}^{n(1-\epsilon)p(s_R)}(k),\, y_{r,s_R}^{n(1-\epsilon)p(s_R)}(k)\right) \notin T_\epsilon^n(X_R, U_r, X_r, S_R, Y_r) \text{ given } S_R = s_R \right\}$$

By the LLN, for all $s_R \in \mathcal{S}_R$, $\Pr(E_{3 s_R k} \mid E_{1k}^c, E_{2k}^c) \to 0$ as $n \to \infty$. Consequently, $\sum_{s_R \in \mathcal{S}_R} P(S_R = s_R) \Pr(E_{3 s_R k} \mid E_{1k}^c, E_{2k}^c) \to 0$ as $n \to \infty$.

For each $s_R \in \mathcal{S}_R$, let $E_{4 s_R k}$ be the event that $x_R^{n(1-\epsilon)p(s_R)}(\hat{m}_{s_R}(k) \mid s_R, w_R(k-1), j_r(k))$ is jointly typical with $y_{r,s_R}^{n(1-\epsilon)p(s_R)}(k)$, given $u_{r,s_R}^{n(1-\epsilon)p(s_R)}(w_R(k-1), j_r(k))$, $x_{r,s_R}^{n(1-\epsilon)p(s_R)}(k)$, and $S_R = s_R$, for some $\hat{m}_{s_R}(k) \ne m_{s_R}(k)$, i.e.,
$$E_{4 s_R k} = \left\{ \exists\, \hat{m}_{s_R}(k) \in \{1, 2, \ldots, 2^{nR_{s_R}}\},\ \hat{m}_{s_R}(k) \ne m_{s_R}(k), \text{ s.t. } \left(x_R^{n(1-\epsilon)p(s_R)}(\hat{m}_{s_R}(k) \mid s_R, w_R(k-1), j_r(k)),\, u_{r,s_R}^{n(1-\epsilon)p(s_R)}(w_R(k-1), j_r(k)),\, x_{r,s_R}^{n(1-\epsilon)p(s_R)}(k),\, y_{r,s_R}^{n(1-\epsilon)p(s_R)}(k)\right) \in T_\epsilon^n(X_R, U_r, X_r, S_R, Y_r) \text{ given } S_R = s_R \right\}$$
Conditioned on the events $E_{1k}^c$, $E_{2k}^c$, and $E_{3 s_R k}^c$, for all $s_R \in \mathcal{S}_R$, by the joint typicality lemma [31], the probability that $\left(x_R^{n(1-\epsilon)p(s_R)}(\hat{m}_{s_R}(k) \mid s_R, w_R(k-1), j_r(k)),\, u_{r,s_R}^{n(1-\epsilon)p(s_R)}(w_R(k-1), j_r(k)),\, x_{r,s_R}^{n(1-\epsilon)p(s_R)}(k),\, y_{r,s_R}^{n(1-\epsilon)p(s_R)}(k)\right) \in T_\epsilon^{n(1-\epsilon)p(s_R)}(X_R, U_r, X_r, S_R, Y_r)$ given $u_{r,s_R}^{n(1-\epsilon)p(s_R)}(w_R(k-1), j_r(k))$, $x_{r,s_R}^{n(1-\epsilon)p(s_R)}(k)$, and $S_R = s_R$, for a given $\hat{m}_{s_R}(k) \ne m_{s_R}(k)$, is less than $2^{-n(1-\epsilon)p(s_R)\left(I(X_R; Y_r \mid U_r, X_r, S_R = s_R) - \delta(\epsilon)\right)}$ for sufficiently large $n$. There are $2^{nR_{s_R}}$ (exactly $2^{nR_{s_R}} - 1$) such $x_R^{n(1-\epsilon)p(s_R)}$ sequences. Thus, the conditional probability of the event $E_{4 s_R k}$ given $E_{1k}^c$, $E_{2k}^c$, and $E_{3 s_R k}^c$ is upper bounded by
$$\Pr(E_{4 s_R k} \mid E_{1k}^c, E_{2k}^c, E_{3 s_R k}^c) \le 2^{-n\left[(1-\epsilon) p(s_R)\left(I(X_R; Y_r \mid U_r, X_r, S_R = s_R) - \delta(\epsilon)\right) - R_{s_R}\right]} \qquad (65)$$
From (65), $\Pr(E_{4 s_R k} \mid E_{1k}^c, E_{2k}^c, E_{3 s_R k}^c) \to 0$ as $n \to \infty$ if
$$R_{s_R} < (1-\epsilon)\, p(s_R) \left( I(X_R; Y_r \mid U_r, X_r, S_R = s_R) - \delta(\epsilon) \right)$$
Since $R_R = \sum_{s_R \in \mathcal{S}_R} R_{s_R}$, $\sum_{s_R \in \mathcal{S}_R} P(S_R = s_R) \Pr(E_{4 s_R k} \mid E_{1k}^c, E_{2k}^c, E_{3 s_R k}^c) \to 0$ as $n \to \infty$ if
$$R_R < (1-\epsilon) \left( I(X_R; Y_r \mid U_r, X_r, S_R) - \delta(\epsilon) \right),$$

where δ(ϵ) → 0 as ϵ → 0.

Let $E_{5k}$ be the event that $u^n(w_D(k), j_d(k) \mid w_R(k-1), j_r(k))$, $u_r^n(w_R(k-1), j_r(k))$, and $y^n(k)$ are not jointly typical, i.e.,
$$E_{5k} = \left\{ \left(u^n(w_D(k), j_d(k) \mid w_R(k-1), j_r(k)),\, u_r^n(w_R(k-1), j_r(k)),\, y^n(k)\right) \notin T_\epsilon^n(U, U_r, Y) \right\}$$

Conditioned on the events $E_{1k}^c$ and $E_{2k}^c$, we have $\Pr(E_{5k} \mid E_{1k}^c, E_{2k}^c) \to 0$ as $n \to \infty$ by the Markov lemma.

Let $E_{6k}$ be the event that $u^n(\hat{w}_D(k), \hat{j}_d(k) \mid \hat{w}_R(k-1), \hat{j}_r(k))$ and $u_r^n(\hat{w}_R(k-1), \hat{j}_r(k))$ are jointly typical with $y^n(k)$ for some $(\hat{w}_D(k), \hat{w}_R(k-1)) \in \{1, 2, \ldots, 2^{nR_D}\} \times \{1, 2, \ldots, 2^{nR_R}\}$, $\hat{j}_d(k) \in \{1, 2, \ldots, 2^{nR_{d,s}}\}$, and $\hat{j}_r(k) \in \{1, 2, \ldots, 2^{nR_{r,s}}\}$, with $(\hat{w}_D(k), \hat{w}_R(k-1)) \ne (w_D(k), w_R(k-1))$, i.e.,
$$E_{6k} = \left\{ \exists\, (\hat{w}_D(k), \hat{w}_R(k-1)) \ne (w_D(k), w_R(k-1)),\ \hat{j}_d(k),\ \hat{j}_r(k) \text{ s.t. } \left(u^n(\hat{w}_D(k), \hat{j}_d(k) \mid \hat{w}_R(k-1), \hat{j}_r(k)),\, u_r^n(\hat{w}_R(k-1), \hat{j}_r(k)),\, y^n(k)\right) \in T_\epsilon^n(U, U_r, Y) \right\}$$
We split the event $E_{6k}$ into three disjoint parts: first, $\hat{w}_D(k) = w_D(k)$ and $\hat{w}_R(k-1) \ne w_R(k-1)$; second, $\hat{w}_D(k) \ne w_D(k)$ and $\hat{w}_R(k-1) = w_R(k-1)$; third, $\hat{w}_D(k) \ne w_D(k)$ and $\hat{w}_R(k-1) \ne w_R(k-1)$, i.e.,
$$E_{6k\_1} = \left\{ \exists\, \hat{w}_R(k-1) \in \{1, 2, \ldots, 2^{nR_R}\},\ \hat{j}_r(k) \in \{1, 2, \ldots, 2^{nR_{r,s}}\} \text{ s.t. } \hat{w}_R(k-1) \ne w_R(k-1),\ \left(u^n(w_D(k), j_d(k) \mid \hat{w}_R(k-1), \hat{j}_r(k)),\, u_r^n(\hat{w}_R(k-1), \hat{j}_r(k)),\, y^n(k)\right) \in T_\epsilon^n(U, U_r, Y) \right\}$$
E 6 k _ 2 = w ^ D k 1 , 2 , , 2 n R D , j ^ d k 1 , 2 , , 2 n R d , s , s . t . w ^ D k w D ( k ) , u n