  • Research
  • Open Access

The capacity of a class of state-dependent relay channel with orthogonal components and side information at the source and the relay

EURASIP Journal on Wireless Communications and Networking 2014, 2014:59

https://doi.org/10.1186/1687-1499-2014-59

  • Received: 1 February 2013
  • Accepted: 24 March 2014
  • Published:

Abstract

In this paper, a class of state-dependent relay channels with orthogonal channels from the source to the relay and from the source and the relay to the destination is studied. The two orthogonal channels are corrupted by two independent channel states S_R and S_D, respectively, both of which are known to the source and the relay. Lower bounds on the capacity are established for the channel with either non-causal or causal channel state information. Further, we show that the lower bound with non-causal channel state information is tight if the receiver output Y is a deterministic function of the relay input X_r, the channel state S_D, and one of the source inputs X_D, i.e., Y = f(X_D, X_r, S_D), and the relay output Y_r is controlled only by the source input X_R and the channel state S_R, i.e., the channel from the source to the relay is governed by the conditional probability distribution P_{Y_r|X_R,S_R}. The capacity for this class of semi-deterministic orthogonal relay channels is characterized exactly. The results are then extended to the Gaussian case, modeling the channel states as additive Gaussian interferences. The capacity is characterized when the channel state information is known non-causally. When the channel state information is known only causally, however, the capacity cannot be characterized in general; in this case, a lower bound on the capacity is established, and the capacity is characterized when the power of the relay is sufficiently large. Numerical examples for the causal case illustrate the impact of the channel state and the role of the relay both in transmitting information and in cleaning the channel state.

Keywords

  • State-dependent relay channel with orthogonal components
  • Non-causal channel state information
  • Causal channel state information
  • Dirty paper coding

1 Introduction

We consider a state-dependent relay channel with orthogonal components as shown in Figure 1. The channel from the source to the relay and the channel from the source and the relay to the destination are assumed orthogonal. The source wants to send a message W to the destination with the help of the relay in n channel uses. Through the memoryless probability law P_{Y_r|X_R,X_r,S_R} P_{Y|X_D,X_r,S_D}, the channel output Y_r^n at the relay is controlled by the source input X_R^n, the relay input X_r^n, and the channel state S_R^n, while the channel output Y^n at the destination is controlled by the source input X_D^n, the relay input X_r^n, and the channel state S_D^n. The state sequences S_R^n = (S_{R,1}, S_{R,2}, …, S_{R,n}) and S_D^n = (S_{D,1}, S_{D,2}, …, S_{D,n}) are independent and identically distributed (i.i.d.) with S_{R,i} ~ Q_{S_R}(s_{R,i}) and S_{D,i} ~ Q_{S_D}(s_{D,i}), respectively. We assume S_R and S_D are independent. The channel state information about S_R and S_D is known to the source and the relay either causally (that is, only S_R^i and S_D^i are known before transmission i takes place) or non-causally (that is, the entire sequences S_R^n and S_D^n are known before communication commences). The destination estimates the message sent by the source from its received channel output Y^n. In this paper, we study the capacity of this model.
Figure 1

Orthogonal relay channel with state information available at both the source and the relay.

1.1 Background

In many communication models, the communicating parties typically have some knowledge of the channel or attempt to learn about it. State-dependent channels have attracted wide attention in recent years [1]. Shannon first considered a single-user channel in which the channel state information was causally known to the transmitter [2] and characterized its capacity. Gel'fand and Pinsker [3] derived a method to determine the capacity of a channel whose state information is non-causally known to the transmitter; this method was later called the Gel'fand-Pinsker (GP) coding scheme. In [4], Costa studied a Gaussian channel with additive white Gaussian noise (AWGN) and additive Gaussian interference known non-causally to the transmitter; he demonstrated that with dirty paper coding (DPC), the capacity is the same as if no interference existed in the channel.
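Costa's result can be checked numerically in the scalar case: with input power P and noise variance N, DPC achieves (1/2) log2(1 + P/N) bits per channel use regardless of the interference power Q, whereas an uninformed transmitter that treats the interference as noise achieves only (1/2) log2(1 + P/(N + Q)). A minimal illustrative sketch (function names and parameter values are ours, not from the paper):

```python
import math

def awgn_capacity(P, N):
    # Interference-free AWGN capacity: 0.5 * log2(1 + P/N) bits/channel use.
    return 0.5 * math.log2(1 + P / N)

def dpc_rate(P, N, Q):
    # Costa's dirty paper coding: with the interference known non-causally
    # at the transmitter, its power Q drops out of the rate entirely.
    return awgn_capacity(P, N)

def interference_as_noise_rate(P, N, Q):
    # Baseline: an uninformed transmitter sees the interference as extra noise.
    return 0.5 * math.log2(1 + P / (N + Q))

P, N, Q = 10.0, 1.0, 5.0
r_dpc = dpc_rate(P, N, Q)                    # equals awgn_capacity(P, N)
r_tin = interference_as_noise_rate(P, N, Q)  # strictly smaller for Q > 0
```

The gap r_dpc − r_tin grows with the interference power Q, which is exactly why the informed-encoder models studied below are of interest.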

Extensions to multiple-user channels were made by Gel'fand and Pinsker in [5], where it was shown that interference cancellation is possible in the Gaussian broadcast channel (BC) and the Gaussian multiple-access channel (MAC). In multiple-user state-dependent channels, the channel state information may be known to all of the users or only to some of them. In [6], Sigurjonsson and Kim characterized the capacity of a degraded broadcast channel and of a physically degraded relay channel with channel state information causally known to the transmitters. Inner bounds for the two-user BC with non-causal side information at the transmitter were derived in [7] by extending Marton's achievable scheme to state-dependent channels. In [8], Steinberg derived inner and outer bounds for a degraded BC with non-causal side information and characterized the capacity region when the side information is provided to the transmitter causally. In [9], information-theoretic performance limits were derived for three classes of two-user state-dependent discrete memoryless BCs with non-causal side information at the encoder.

The state-dependent two-user MAC with state information non-causally known to one of the encoders was considered in [10] and [11]. For the MAC with asymmetric channel state information at the transmitters and full channel state information at the decoder, a single-letter capacity region was characterized when the channel state available at one of the encoders is a subset of the channel state available at the other [12]; for the general case, however, only inner and outer bounds were derived. It is not easy to characterize the explicit capacity region of general state-dependent MACs even when the channel state information is known to all transmitters. Capacity regions have been characterized only in some special cases, e.g., the Gaussian MAC with additive interference known to both encoders [13]. Capacity regions are also known in some cases where cooperation between the transmitters is allowed: in [14], the explicit capacity region was characterized for the MAC with one informed encoder transmitting a common message and a private message while the uninformed encoder transmits only the common message, and in [15], the capacity region was derived for a two-user dirty paper Gaussian MAC with conferencing encoders.

Relay channels capture both MAC and BC characteristics. State-dependent relay channels were studied in [16–21]. Zaidi et al. [16] studied the relay channel with non-causal channel state information known only to the relay; lower and upper bounds were derived via a coding scheme at the relay that combined codeword splitting, Gel'fand-Pinsker binning, and decode-and-forward (DF) relaying. When the channel state information is known only at the source node, lower and upper bounds were obtained in [17–20]. In [17], the coding scheme for the lower bound used rate splitting at the source, partial decode-and-forward (PDF) relaying, and a GP-like binning scheme. To derive a lower bound on the capacity, [18] proposed two achievable schemes: (i) state description, by which the source describes the channel state to the relay and the destination, and (ii) analog input description, by which the source first computes the appropriate input the relay would send had the relay known the channel state and then transmits this input to the relay. With the same achievable schemes as in [18], the authors of [19] obtained two corresponding lower bounds for the state-dependent relay channel with orthogonal components and channel state information known non-causally to the source. A similar orthogonal relay channel corrupted by an interference known non-causally to the source was considered in [20], where several transmission strategies were proposed under the assumption that the interference has structure. Akhbari et al. [21] considered the state-dependent relay channel in three different cases: the channel state information is known non-causally to only the relay, only the source, or both the source and the relay. Lower bounds on the capacity were established using GP coding and compress-and-forward (CF) relaying for all three cases.

1.2 Motivation

State-dependent channels with state information available at the encoders can be used to model many systems, such as information embedding [22–24], which encodes a message into a host signal, computer memories with defective cells [25], communication systems with cognitive radios, etc. Among these examples, we are most interested in communication systems with cognitive radios. To improve spectrum efficiency in wireless systems, secondary users capable of acquiring some knowledge about the primary communication are introduced into an existing primary communication system [26]. Having obtained this knowledge, the secondary users can adapt their coding schemes to mitigate the interference caused by the primary communication. In such models, the channel state can be viewed as the signal of the primary communication and the informed encoders as cognitive users [11].

For the state-dependent relay channel with orthogonal components considered in this paper, the channel states SR and SD are viewed as the signals of corresponding primary communication; the source and the relay are viewed as the secondary users which are capable of acquiring the channel state information. Thus, the model studied in this paper can be viewed as a secondary relay communication with cognitive source and cognitive relay. We are interested in studying the capacity of this model.

However, it is difficult to characterize the explicit capacity of relay channels even when the channel is state-independent. The capacity of the state-independent relay channel has been characterized only for some special channels, e.g., the physically degraded/reversely degraded relay channel [27], a class of deterministic relay channels [28], and a class of relay channels with orthogonal components [29]. To the best of our knowledge, explicit capacity results for state-dependent relay channels with channel state information known to some or all of the transmitters have been derived mainly in two cases: (i) physically degraded relay channels with state information causally known to both the source and the relay and (ii) Gaussian physically degraded relay channels with channel state information non-causally known to the source and the relay. For relay channels corrupted by a channel state, however, the explicit capacity has not yet been characterized, even when the channel state has structure known to the source. In this paper, we seek capacity results for the state-dependent relay channel with orthogonal components.

1.3 Main contributions and organization of the paper

We investigate a state-dependent relay channel with orthogonal components, where the source communicates with the relay through a channel (say channel 1) orthogonal to another channel (say channel 2) through which the source and the relay communicate with the destination. We assume that channel 1 and channel 2 are affected by two independent channel states SR and SD, respectively. The channel state information about SR and SD is known to both the source and the relay non-causally or causally. In this setup, the main results of this paper are summarized as follows:
  1. A lower bound on the capacity of the channel is established when the channel state information is known to the source and the relay non-causally. The achievability is based on superposition coding at the source, PDF relaying at the relay, and cooperative GP coding at the source and the relay.
  2. When the channel state information is known to the source and the relay causally, an achievable rate is derived in a similar way as in the non-causal case, except that the auxiliary random variables U and U_r are independent of the channel states S_R and S_D.
  3. We show that the exact capacity of the channel with non-causal channel state information at the source and the relay can be characterized if the receiver output Y is a deterministic function of the relay input X_r, the channel state S_D, and one of the source inputs X_D, i.e., Y = f(X_D, X_r, S_D), and the relay output Y_r is controlled only by the source input X_R and the channel state S_R, i.e., the channel from the source to the relay is governed by the conditional probability distribution P_{Y_r|X_R,S_R}.
  4. Explicit capacity is also characterized for the Gaussian orthogonal relay channel with additive Gaussian interferences known non-causally to the source and the relay.
  5. For the Gaussian orthogonal relay channel with additive interferences known causally to the source and the relay, the capacity is derived when the power of the relay is sufficiently large.

The rest of the paper is organized as follows. In Section 2, we present the system model, definitions, and notation used throughout the paper. Section 3 establishes single-letter lower bounds on the capacity of the discrete memoryless state-dependent orthogonal relay channel with channel state information known to the source and the relay either non-causally or causally. In Section 4, we show that when the channel state information is known non-causally, Y = f(X_D, X_r, S_D), and the channel from the source to the relay is governed by the conditional probability distribution P_{Y_r|X_R,S_R}, the lower bound derived in Section 3 is tight; thus, the capacity is characterized exactly. In Section 5, the results are extended to the Gaussian case. In Section 6, numerical results illustrate the impact of the additive interferences and the role of the relay in transmitting information and in cleaning the interference. Section 7 concludes the paper.

2 Notations and problem setup

Throughout this paper, random variables are denoted by capital letters and their deterministic realizations by lower-case letters. Vectors are denoted by boldface letters. The shorthand x_i^j abbreviates (x_i, x_{i+1}, …, x_j), x^i abbreviates (x_1, x_2, …, x_i), and x_i denotes the i-th element of x^n, where 1 ≤ i ≤ j ≤ n. The probability law of a random variable X is denoted by P_X, and the conditional probability distribution of Y given X by P_{Y|X}. The alphabet of a scalar random variable X is designated by the corresponding calligraphic letter 𝒳. The cardinality of a set 𝒥 is denoted by |𝒥|. T_ϵ^n(X) denotes the set of strongly ϵ-typical sequences x^n ∈ 𝒳^n, while A_ϵ^n(X) denotes the set of weakly ϵ-typical sequences x^n ∈ 𝒳^n, where ϵ > 0. E(·) denotes expectation; I(·;·) denotes the mutual information between two random variables. 𝒩(0, σ²) denotes a Gaussian distribution with zero mean and variance σ².

As shown in Figure 1, we consider the state-dependent relay channel with orthogonal components denoted by P_{Y,Y_r|X_R,X_D,X_r,S_D,S_R}, where Y ∈ 𝒴 and Y_r ∈ 𝒴_r are the channel outputs at the destination and the relay, respectively. X_R ∈ 𝒳_R and X_D ∈ 𝒳_D are the orthogonal channel inputs from the source, while X_r ∈ 𝒳_r is the channel input from the relay. S_R ∈ 𝒮_R and S_D ∈ 𝒮_D denote the random channel states that corrupt channel 1 and channel 2, respectively. The channel states S_{R,i} and S_{D,i} at time instant i are independently drawn from the distributions Q_{S_R} and Q_{S_D}, respectively. The channel state information S_R and S_D is known to both the source and the relay, non-causally or causally.

The message W is uniformly distributed over the set 𝒲 = {1, 2, …, M}. The source transmits a message W to the destination with the help of the relay in n channel uses. Let X_R^n = (X_{R,1}, …, X_{R,n}), X_D^n = (X_{D,1}, …, X_{D,n}), and X_r^n = (X_{r,1}, …, X_{r,n}) be the channel inputs of the source and the relay, respectively. The relay channel is said to be memoryless and to have orthogonal components if

P(y_r^n, y^n | x_R^n, x_D^n, x_r^n, s_D^n, s_R^n) = ∏_{i=1}^{n} P(y_{r,i} | x_{r,i}, x_{R,i}, s_{R,i}) P(y_i | x_{r,i}, x_{D,i}, s_{D,i})
(1)
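The factorization (1) can be made concrete with a toy binary example. The per-symbol kernels below are invented purely for illustration; the sketch forms the product in (1) and checks that it defines a valid distribution over the output sequences:

```python
from itertools import product

def p_relay(y_r, x_r, x_R, s_R):
    # Channel 1 (illustrative): a binary symmetric channel from x_R to y_r
    # whose crossover probability depends on the state s_R.
    eps = 0.1 if s_R == 0 else 0.3
    return 1 - eps if y_r == x_R else eps

def p_dest(y, x_r, x_D, s_D):
    # Channel 2 (illustrative): y equals x_D XOR x_r XOR s_D,
    # flipped with probability 0.05.
    eps = 0.05
    return 1 - eps if y == (x_D ^ x_r ^ s_D) else eps

def sequence_prob(y_r_seq, y_seq, x_R, x_D, x_r, s_D, s_R):
    # Memoryless law (1): the probability of the output sequences is the
    # product of the per-symbol transition probabilities.
    p = 1.0
    for i in range(len(y_seq)):
        p *= p_relay(y_r_seq[i], x_r[i], x_R[i], s_R[i])
        p *= p_dest(y_seq[i], x_r[i], x_D[i], s_D[i])
    return p

# Sanity check: for fixed inputs and states, the probabilities of all
# output-sequence pairs (y_r^n, y^n) sum to 1.
n = 2
x_R, x_D, x_r = [0, 1], [1, 0], [0, 0]
s_D, s_R = [1, 0], [0, 1]
total = sum(sequence_prob(yr, y, x_R, x_D, x_r, s_D, s_R)
            for yr in product([0, 1], repeat=n)
            for y in product([0, 1], repeat=n))
```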
A (M, n) code for the state-dependent relay channel with channel state information non-causally known to the source and the relay consists of an encoding function at the source

φ_n : {1, 2, …, M} × 𝒮_D^n × 𝒮_R^n → 𝒳_D^n × 𝒳_R^n
(2)

a sequence of encoding functions at the relay

φ_{r,i} : 𝒴_r^{i−1} × 𝒮_D^n × 𝒮_R^n → 𝒳_r
(3)

for i = 1, 2, …, n, and a decoding function at the destination

ϕ_n : 𝒴^n → {1, 2, …, M}.

The information rate R is defined as R = (1/n) log₂ M bits per transmission.

An (ϵ_n, n, R) code for the state-dependent relay channel with orthogonal components and non-causal state information is a code having average probability of error smaller than ϵ_n, i.e.,

Pr{W ≠ ϕ_n(Y^n)} ≤ ϵ_n

The rate R is said to be achievable if there exists a sequence of (ϵ_n, n, R) codes with lim_{n→∞} ϵ_n = 0. The capacity of the channel is defined as the supremum of the set of achievable rates.

The definition of an (ϵ_n, n, R) code for the state-dependent relay channel with orthogonal components and causal channel state information at the source and the relay is similar to that for the non-causal case, except that the encoder consists of sequences of maps {φ_i}_{i=1}^n and {φ_{r,i}}_{i=1}^n, where i is the time index. Thus, the encoder mappings in (2) and (3) are replaced by

φ_i : {1, 2, …, M} × 𝒮_D^i × 𝒮_R^i → 𝒳_D × 𝒳_R
(4)

φ_{r,i} : 𝒴_r^{i−1} × 𝒮_D^i × 𝒮_R^i → 𝒳_r,
(5)

respectively, where i = 1, 2, …, n. The definitions of achievable rate and capacity remain the same as in the non-causal case.

3 Discrete memoryless case

In this section, it is assumed that the alphabets X D , X R , X r , Y r , Y , S D , and S R are finite. Lower bounds on the capacity of the channel with non-causal channel state information or causal channel state information are established, respectively. In the proofs of the lower bounds in the discrete memoryless case, strong typicality is used.

3.1 Non-causal channel state information

The following theorem provides a lower bound on the capacity of the state-dependent orthogonal relay channel with channel state information non-causally known to the source and the relay.

Theorem 1 For the orthogonal relay channel with channel state information non-causally known to both the source and the relay, the following rate is achievable:

R ≤ max min{ I(X_R; Y_r | U_r, X_r, S_R) + I(U; Y | U_r) − I(U; S_D | U_r),  I(U, U_r; Y) − I(U, U_r; S_D) },
(6)

where the maximization is over all measures on 𝒮_D × 𝒮_R × 𝒳_r × 𝒰_r × 𝒳_R × 𝒳_D × 𝒴_r × 𝒴 of the form

P(s_R, s_D, x_r, u_r, u, x_R, x_D, y_r, y) = Q(s_R) Q(s_D) P(u_r | s_D) P(x_r | u_r, s_R, s_D) P(x_R | u_r, s_R) P(u, x_D | u_r, s_D) P(y_r | x_R, x_r, s_R) P(y | x_D, x_r, s_D)
(7)

U_r ∈ 𝒰_r and U ∈ 𝒰 are auxiliary random variables with

|𝒰_r| ≤ |𝒮_D||𝒮_R||𝒳_D||𝒳_r| + 1
(8)

|𝒰| ≤ |𝒮_D||𝒮_R||𝒳_r||𝒳_D| (|𝒮_D||𝒮_R||𝒳_r||𝒳_D| + 1) + 1
(9)

Remark 1 Since the source and the relay know the channel state information non-causally, with PDF relaying they can transmit the messages to the destination cooperatively using GP coding, namely cooperative GP coding. The source communicates with the relay by treating s_R^n as a time-sharing sequence, which is possible for the same reason: the channel state information is known to both the source and the relay non-causally.

3.1.1 Outline of the proof of Theorem 1

We now give a description of the coding scheme to derive the lower bound in Theorem 1. Detailed error analysis of Theorem 1 is given in Appendix 1. The achievable scheme is based on the combination of superposition coding at the source, PDF relaying at the relay, and cooperative GP coding at the source and the relay.

The message W is divided into two parts, W_D ∈ [1, 2^{nR_D}] and W_R ∈ [1, 2^{nR_R}]. Consider B + 1 blocks, each of n symbols. Let s_D^n(k) and s_R^n(k) be the state sequences in block k, k = 1, 2, …, B + 1. A sequence of B messages w(k) ∈ [1, 2^{nR}], with w(k) = (w_D(k), w_R(k)), w_D(k) ∈ [1, 2^{nR_D}], w_R(k) ∈ [1, 2^{nR_R}], and R = R_D + R_R, is sent over the channel in n(B + 1) transmissions. During each of the first B blocks, the source encodes w_D(k) ∈ [1, 2^{nR_D}] and sends it over the channel. Since both the source and the relay know the state sequence s_R^n(k), the source encodes w_R(k) ∈ [1, 2^{nR_R}] by treating s_R^n(k) as a time-sharing sequence [30]. The message w_R(k) is expressed as a unique set {m_{s_R}(k) : s_R ∈ 𝒮_R} of |𝒮_R| sub-messages. For each s_R ∈ 𝒮_R, the sub-message m_{s_R}(k) is associated with a codeword x_R^n(m_{s_R}(k)) from a corresponding sub-codebook C_{s_R}. The set of codewords {x_R^n(m_{s_R}(k)) : s_R ∈ 𝒮_R} is sent over the channel, multiplexed according to the state sequence s_R^n(k). The relay demultiplexes the received sequence y_r^n(k) into sub-sequences according to the state sequence s_R^n(k) and decodes each sub-message m_{s_R}(k); consequently, w_R(k) is decoded at the relay. The coding scheme is illustrated in Figure 2. With PDF relaying, the relay re-encodes w_R(k) and sends it to the destination cooperatively with the source. In the last block B + 1, no new message is sent, and w(B + 1) = (w_D(B + 1), w_R(B + 1)) = (1, 1). The average information rate R·B/(B + 1) of the message over the B + 1 blocks approaches R as B → ∞.
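The multiplexing device described above is easy to picture in code. The sketch below (alphabets and codewords invented for illustration) shows only the mux/demux logic, with no channel noise: one sub-codeword per state value, a symbol popped from the matching FIFO buffer at each time i, and the relay sorting its received symbols back into per-state sub-sequences:

```python
def multiplex(sub_codewords, state_seq):
    # sub_codewords: dict mapping each state value s_R to a list of symbols
    # (one codeword per sub-codebook C_{s_R}), used as a FIFO buffer.
    # At time i the transmitted symbol is popped from the buffer
    # selected by state_seq[i].
    buffers = {s: list(cw) for s, cw in sub_codewords.items()}
    return [buffers[s].pop(0) for s in state_seq]

def demultiplex(received_seq, state_seq):
    # The relay, knowing s_R^n, sorts the received symbols back into
    # per-state sub-sequences before decoding each sub-message.
    out = {s: [] for s in set(state_seq)}
    for y, s in zip(received_seq, state_seq):
        out[s].append(y)
    return out

# Noiseless round trip: demultiplexing the multiplexed stream recovers
# each sub-codeword prefix, in order.
subs = {0: ['a1', 'a2', 'a3'], 1: ['b1', 'b2']}
state = [0, 1, 0, 1, 0]
tx = multiplex(subs, state)
rx = demultiplex(tx, state)
```

In the actual scheme the per-state decoding then runs a joint-typicality test on each sub-sequence, which is why each sub-rate R_{s_R} is scaled by the fraction p(s_R) of channel uses spent in state s_R.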
Figure 2

Multiplexed coding and decoding at the source and the relay [30].

3.1.2 Codebook generation

Fix a measure P(s_R, s_D, x_r, u_r, u, x_R, x_D) of the form (7).
  1. Generate 2^{n(R_R + R_{r,s})} i.i.d. codewords u_r^n(w̃_R, j_r), indexed by w̃_R = 1, 2, …, 2^{nR_R}, j_r = 1, 2, …, 2^{nR_{r,s}}, each with i.i.d. components drawn according to P_{U_r}.
  2. For each u_r^n(w̃_R, j_r), generate 2^{n(R_D + R_{d,s})} i.i.d. codewords u^n(w_D, j_d | w̃_R, j_r), indexed by w_D = 1, 2, …, 2^{nR_D}, j_d = 1, 2, …, 2^{nR_{d,s}}, each with i.i.d. components drawn according to P_{U|U_r}.
  3. For each u_r^n(w̃_R, j_r) and each s_R ∈ 𝒮_R, randomly and independently generate 2^{nR_{s_R}} sequences x_R^n(m_{s_R} | s_R, w̃_R, j_r), indexed by m_{s_R} ∈ [1, 2^{nR_{s_R}}], each with i.i.d. components drawn according to P_{X_R|U_r,S_R}. These sequences constitute the sub-codebook C_{s_R}, s_R ∈ 𝒮_R; there are |𝒮_R| such sub-codebooks for each u_r^n(w̃_R, j_r). Set R_R = Σ_{s_R ∈ 𝒮_R} R_{s_R}.

3.1.3 Encoding

We pick up the story in block k. Let w(k) = (w_D(k), w_R(k)) ∈ {1, 2, …, 2^{nR}}, where w_D(k) ∈ {1, 2, …, 2^{nR_D}} and w_R(k) ∈ {1, 2, …, 2^{nR_R}}, be the new message to be sent from the source node at the beginning of block k. The encoding at the beginning of block k is as follows.
  1. The relay knows w_R(k − 1) (this will be justified below) and searches for the smallest j_r(k) ∈ {1, 2, …, 2^{nR_{r,s}}} such that u_r^n(w_R(k − 1), j_r(k)) is jointly typical with s_D^n(k). If no such j_r(k) exists, an error is declared and j_r(k) is set to 1. By the covering lemma [31], this error probability tends to 0 as n approaches infinity if R_{r,s} satisfies

R_{r,s} ≥ I(U_r; S_D)
(10)

Given u_r^n(w_R(k − 1), j_r(k)), s_R^n(k), and s_D^n(k), the relay sends a vector x_r^n(k) with i.i.d. components drawn according to the marginal P_{X_r|U_r,S_D,S_R}.
  2. The source also knows w_R(k − 1) and s_D^n(k), and thereby knows u_r^n(w_R(k − 1), j_r(k)). The source then searches for j_d(k) ∈ {1, 2, …, 2^{nR_{d,s}}} such that u^n(w_D(k), j_d(k) | w_R(k − 1), j_r(k)) is jointly typical with s_D^n(k) given u_r^n(w_R(k − 1), j_r(k)). If no such j_d(k) exists, an error is declared and j_d(k) is set to 1. By the covering lemma [31], this error probability tends to 0 as n approaches infinity if R_{d,s} satisfies

R_{d,s} ≥ I(U; S_D | U_r)
(11)

Given u^n(w_D(k), j_d(k) | w_R(k − 1), j_r(k)), u_r^n(w_R(k − 1), j_r(k)), and s_D^n(k), the source sends a vector x_D^n(k) with i.i.d. components drawn according to the marginal P_{X_D|U_r,U,S_D}.
  3. Meanwhile, to send the message w_R(k) ∈ [1, 2^{nR_R}], the source expresses it as a unique set of sub-messages {m_{s_R}(k) : s_R ∈ 𝒮_R}. Knowing the codeword u_r^n(w_R(k − 1), j_r(k)), the source considers the set of codewords {x_R^n(m_{s_R}(k) | s_R, w_R(k − 1), j_r(k)) : s_R ∈ 𝒮_R} and stores each codeword in a first-in-first-out (FIFO) buffer of length n. A multiplexer chooses a symbol at each transmission time i ∈ [1, n] from one of the FIFO buffers according to the state s_{R,i}(k), and the chosen symbol is transmitted.

3.1.4 Decoding

At the end of block k, the relay and the destination observe y_r^n(k) and y^n(k), respectively.
  1. Having successfully decoded w_R(k − 1) in block k − 1 and knowing u_r^n(w_R(k − 1), j_r(k)), x_r^n(k), and s_R^n(k), the relay estimates ŵ_R(k) from y_r^n(k). According to the state sequence s_R^n(k), the relay demultiplexes y_r^n(k), u_r^n(w_R(k − 1), j_r(k)), and x_r^n(k) into sub-sequences {y_{r,s_R}^{n_{s_R}(k)}(k) : s_R ∈ 𝒮_R}, {u_{r,s_R}^{n_{s_R}(k)}(w_R(k − 1), j_r(k)) : s_R ∈ 𝒮_R}, and {x_{r,s_R}^{n_{s_R}(k)}(k) : s_R ∈ 𝒮_R}, respectively, where Σ_{s_R ∈ 𝒮_R} n_{s_R}(k) = n. Assuming s_R^n(k) ∈ T_ϵ^n(S_R), and thus n_{s_R}(k) ≥ n(1 − ϵ)p(s_R) for all s_R ∈ 𝒮_R, the relay finds for each s_R ∈ 𝒮_R a unique m̂_{s_R}(k) such that the codeword sub-sequence x_R^{n(1−ϵ)p(s_R)}(m̂_{s_R}(k) | s_R, w_R(k − 1), j_r(k)) is jointly typical with y_{r,s_R}^{n(1−ϵ)p(s_R)}(k), given u_{r,s_R}^{n(1−ϵ)p(s_R)}(w_R(k − 1), j_r(k)) and x_{r,s_R}^{n(1−ϵ)p(s_R)}(k). By the law of large numbers (LLN) and the packing lemma [31], the error probability of each decoding step approaches 0 as n → ∞ if R_{s_R} ≤ p(s_R) I(X_R; Y_r | U_r, X_r, S_R = s_R). Therefore, the total probability of error in decoding ŵ_R(k) approaches 0 for sufficiently large n if the following condition is satisfied:

R_R = Σ_{s_R ∈ 𝒮_R} R_{s_R} ≤ I(X_R; Y_r | U_r, X_r, S_R)
(12)

  2. Observing y^n(k), the destination finds a pair (ŵ_R(k − 1), ŵ_D(k)) such that

(u^n(ŵ_D(k), ĵ_d(k) | ŵ_R(k − 1), ĵ_r(k)), u_r^n(ŵ_R(k − 1), ĵ_r(k)), y^n(k)) ∈ T_ϵ^n(U, U_r, Y)

for some ĵ_d(k) ∈ {1, 2, …, 2^{nR_{d,s}}} and ĵ_r(k) ∈ {1, 2, …, 2^{nR_{r,s}}}. If there is no such pair, or it is not unique, an error is declared. By the packing lemma [31], it can be shown that for sufficiently large n, decoding is correct with high probability if

R_D + R_{d,s} ≤ I(U; Y | U_r)
R_D + R_{d,s} + R_R + R_{r,s} ≤ I(U, U_r; Y)
(13)

Combining (10) to (13), w(k − 1) = (w_D(k − 1), w_R(k − 1)) is decoded correctly with high probability at the end of block k if

R ≤ I(X_R; Y_r | U_r, X_r, S_R) + I(U; Y | U_r) − I(U; S_D | U_r)
R ≤ I(U, U_r; Y) − I(U, U_r; S_D)
(14)

The detailed analysis of error probability is given in Appendix 1.
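The elimination of the auxiliary rates R_{r,s} and R_{d,s} behind (14) can be spelled out, taking them at the minimum values allowed by (10) and (11):

```latex
% First bound of (14): combine (11), (12), and the first inequality of (13)
R = R_D + R_R \le \bigl[ I(U;Y|U_r) - R_{d,s} \bigr] + I(X_R;Y_r|U_r,X_r,S_R)
            \le I(X_R;Y_r|U_r,X_r,S_R) + I(U;Y|U_r) - I(U;S_D|U_r).
% Second bound of (14): combine (10), (11), and the second inequality of (13)
R = R_D + R_R \le I(U,U_r;Y) - R_{d,s} - R_{r,s}
            \le I(U,U_r;Y) - I(U;S_D|U_r) - I(U_r;S_D)
             =  I(U,U_r;Y) - I(U,U_r;S_D).
```

The last step is the chain rule I(U, U_r; S_D) = I(U_r; S_D) + I(U; S_D | U_r).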

3.2 Causal channel state information

In many practical communication systems, the state sequences are not known to the encoders in advance. For the case in which the channel state information is provided to the source and the relay causally, the capacity is lower bounded as in the following theorem.

Theorem 2 The capacity of the orthogonal relay channel with channel state information causally known to both the source and the relay is lower bounded by

C_CS ≥ max min{ I(X_R; Y_r | U_r, X_r, S_R) + I(U; Y | U_r),  I(U, U_r; Y) },
(15)

where the maximization is over all distributions of the form p(s_D) p(s_R) p(u_r) p(x_R | u_r, s_R) p(u | u_r, s_D) with x_r = f_r(u_r, s_D, s_R) and x_D = f_D(u, s_D), and U_r ∈ 𝒰_r and U ∈ 𝒰 are auxiliary random variables with

|𝒰_r| ≤ |𝒮_D||𝒮_R||𝒳_D||𝒳_r| + 1
(16)

|𝒰| ≤ |𝒮_D||𝒮_R||𝒳_r||𝒳_D| (|𝒮_D||𝒮_R||𝒳_r||𝒳_D| + 1) + 1
(17)

Remark 2 The achievable rate in Theorem 2 is obtained by specializing the expression in Theorem 1 to the case where the auxiliary random variables U and U_r are independent of S_D and S_R. This is similar to the relation between the capacity of the state-dependent channel with causal channel state information introduced by Shannon [2] and its non-causal counterpart, the Gel'fand-Pinsker channel [3].

Proof The achievability proof proceeds in a similar way as in the non-causal channel state information case, except that the auxiliary random variables U and U_r are independent of the channel states S_D and S_R, and the channel inputs of the source and the relay are restricted to the mappings x_D = f_D(u, s_D) and x_r = f_r(u_r, s_D, s_R), respectively, where f_D(·) and f_r(·) are deterministic functions. The details are omitted for brevity.
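The functional restriction x_D = f_D(u, s_D), x_r = f_r(u_r, s_D, s_R) is the classical Shannon-strategy device for causal state information: the encoder effectively codes over the alphabet of maps from states to channel inputs rather than over the inputs themselves. For binary alphabets that map alphabet has only |𝒳|^|𝒮| = 4 letters, as the sketch below enumerates (alphabets invented for illustration):

```python
from itertools import product

def shannon_strategies(state_alphabet, input_alphabet):
    # Enumerate all maps f: S -> X; each map is one "Shannon strategy"
    # letter. There are |X| ** |S| of them.
    return [dict(zip(state_alphabet, values))
            for values in product(input_alphabet, repeat=len(state_alphabet))]

S = [0, 1]   # binary state alphabet (illustrative)
X = [0, 1]   # binary channel-input alphabet (illustrative)
strategies = shannon_strategies(S, X)
# The 4 strategies are the constant-0 map, the identity, the NOT map,
# and the constant-1 map. The channel input at time i is obtained by
# evaluating the selected strategy at the current state:
x = strategies[2][1]   # input chosen by the NOT strategy when the state is 1
```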

4 Semi-deterministic orthogonal relay channel with non-causal channel state information

In this section, we show that the lower bound derived in Theorem 1 is tight for a class of semi-deterministic orthogonal relay channels in which the output Y at the destination is a deterministic function of X_D, X_r, and S_D, i.e., Y = f(X_D, X_r, S_D), and the output Y_r at the relay is controlled only by X_R and S_R, i.e., the channel from the source to the relay is governed by the conditional distribution P_{Y_r|X_R,S_R}. This assumption is reasonable in many cases; e.g., when the two orthogonal channels use two different frequency bands, the received signal Y_r at the relay is not affected by the relay's own input X_r. The channel can be expressed as

P(y_r, y | x_R, x_D, x_r, s_D, s_R) = P(y_r | x_R, s_R) · 1{y = f(x_D, x_r, s_D)}
(18)

where f(·) is a deterministic function and 1{·} denotes the indicator function. The channel state information on S_R and S_D is known to both the source and the relay non-causally. The capacity of this class of semi-deterministic orthogonal relay channels is characterized in the following theorem.

Theorem 3 The capacity of the channel (18) with channel state information known non-causally to the source and the relay is

C = max min{ I(X_R; Y_r | S_R) + H(Y | U_r, S_D),  H(Y) − I(U_r, Y; S_D) },
(19)

where the maximization is over all measures on 𝒮_D × 𝒮_R × 𝒳_r × 𝒰_r × 𝒳_R × 𝒳_D × 𝒴_r × 𝒴 of the form

P(s_D, s_R, x_r, u_r, x_R, x_D, y_r, y) = Q(s_D) Q(s_R) P(x_R | s_R) P(u_r, x_r, x_D | s_D) P(y_r | x_R, s_R) 1{y = f(x_D, x_r, s_D)}
(20)

U_r ∈ 𝒰_r is an auxiliary random variable with

|𝒰_r| ≤ |𝒮_D||𝒮_R||𝒳_D||𝒳_r| + 1
(21)

and 1{·} denotes the indicator function.

Proof The achievability follows from Theorem 1. First note that the joint distribution of (20) can also be written as
P s D , s R , x r , u r , x R , x D , y r , y = Q s D Q ( s R ) P ( x R | s R ) P ( u r , y | s D ) P ( x r , x D | s D , u r , y ) × P ( y r | x R , s R )
(22)
with additional requirement that
y = f x D , x r , s D .
(23)
Note that, when P U r , Y , S D u r , y , s D is fixed, all the items on the right-hand side (RHS) of (19) are fixed except for I(XR; Yr|SR), which is independent of P X r , X D | S D , U r , Y x r , x D | s D , u r , y . Therefore, the maximization over all joint distributions of the form (20) can be replaced by the maximization only over those distributions, where xr and xD are two deterministic functions of (sD, ur, y), i.e., of the form
$$P(s_D, s_R, x_r, u_r, x_R, x_D, y_r, y) = Q(s_D) Q(s_R) P(x_R \mid s_R) P(u_r, y \mid s_D)\,\mathbf{1}\{x_r = g_r(u_r, s_D)\}\,\mathbf{1}\{x_D = g_d(y, u_r, s_D)\}\, P(y_r \mid x_R, s_R) \qquad (24)$$

for some mappings $g_r: (u_r, s_D) \mapsto x_r$ and $g_d: (y, u_r, s_D) \mapsto x_D$, subject to (23). Thus, we only have to prove the achievability of the rates satisfying (19) for some distribution of the form (24).

The achievability follows directly from Theorem 1 by taking $U = Y$ (since $Y = f(X_D, X_r, S_D)$), letting $X_R$ be independent of $(U_r, X_r)$ (since $Y_r$ is determined only by $X_R$ and $S_R$), and setting $x_r = g_r(u_r, s_D)$, $x_D = g_d(y, u_r, s_D)$. Note that with these choices of the random variables, once we choose the stochastic kernels $P_{X_R \mid S_R}$ and $P_{U_r, Y \mid S_D}$ and the two deterministic mappings $g_r: (u_r, s_D) \mapsto x_r$ and $g_d: (y, u_r, s_D) \mapsto x_D$, combined with $Q_{S_D} Q_{S_R}$ and the channel law, the joint distribution (24) satisfying (23) is fully determined.

The proof of the converse is as follows.

Consider an $(\epsilon_n, n, R)$ code with an average error probability $P_e^{(n)} \le \epsilon_n$. By Fano's inequality, we have
$$H(W \mid Y^n) \le nR P_e^{(n)} + 1 = n\delta_n \qquad (25)$$
where $\delta_n \to 0$ as $n \to \infty$. Thus,
$$nR = H(W) \le I(W; Y^n) + n\delta_n \qquad (26)$$
Defining the auxiliary random variable $\bar{U}_{r,i} = (Y^{i-1}, S_{D,i+1}^n)$, we have
$$\begin{aligned}
I(W; Y^n) &\le I(W; Y^n, Y_r^n) \le I(W; Y^n, Y_r^n \mid S_D^n, S_R^n) \\
&= \sum_i I(W; Y_i, Y_{r,i} \mid Y^{i-1}, Y_r^{i-1}, S_D^n, S_R^n) \\
&= \sum_i I(W; Y_{r,i} \mid Y^{i-1}, Y_r^{i-1}, S_D^n, S_R^n) + \sum_i I(W; Y_i \mid Y^{i-1}, Y_r^{i-1}, Y_{r,i}, S_D^n, S_R^n),
\end{aligned} \qquad (27)$$

where the second inequality follows from the fact that $S_D^n$ and $S_R^n$ are independent of $W$.

Calculate the two terms in (27) separately as follows:
$$\begin{aligned}
\sum_i I(W; Y_{r,i} \mid Y^{i-1}, Y_r^{i-1}, S_D^n, S_R^n)
&= \sum_i H(Y_{r,i} \mid Y^{i-1}, Y_r^{i-1}, S_D^n, S_R^n) - H(Y_{r,i} \mid Y^{i-1}, Y_r^{i-1}, S_D^n, S_R^n, W) \\
&\overset{(a)}{=} \sum_i H(Y_{r,i} \mid Y^{i-1}, Y_r^{i-1}, S_D^n, S_R^n) - H(Y_{r,i} \mid Y^{i-1}, Y_r^{i-1}, S_D^n, S_R^n, W, X_{R,i}) \\
&\overset{(b)}{\le} \sum_i H(Y_{r,i} \mid S_{R,i}) - H(Y_{r,i} \mid S_{R,i}, X_{R,i}) = \sum_i I(X_{R,i}; Y_{r,i} \mid S_{R,i}),
\end{aligned} \qquad (28)$$
where (a) holds since $X_{R,i}$ is a function of $(W, S_D^n, S_R^n)$; (b) follows from the fact that conditioning reduces entropy and the Markov chain $(Y^{i-1}, Y_r^{i-1}, S_D^n, S_R^{i-1}, S_{R,i+1}^n, W) \leftrightarrow (X_{R,i}, S_{R,i}) \leftrightarrow Y_{r,i}$.
$$\begin{aligned}
\sum_i I(W; Y_i \mid Y^{i-1}, Y_r^{i-1}, Y_{r,i}, S_D^n, S_R^n)
&= \sum_i H(Y_i \mid Y^{i-1}, Y_r^{i-1}, Y_{r,i}, S_D^n, S_R^n) - H(Y_i \mid Y^{i-1}, Y_r^{i-1}, Y_{r,i}, S_D^n, S_R^n, W) \\
&= \sum_i H(Y_i \mid Y^{i-1}, Y_r^{i-1}, Y_{r,i}, S_D^n, S_R^n) - H(Y_i \mid Y^{i-1}, Y_r^{i-1}, Y_{r,i}, S_D^n, S_R^n, W, X_{D,i}, X_{r,i}) \\
&\overset{(a)}{=} \sum_i H(Y_i \mid Y^{i-1}, Y_r^{i-1}, Y_{r,i}, S_D^n, S_R^n) \\
&\overset{(b)}{\le} \sum_i H(Y_i \mid Y^{i-1}, S_{D,i+1}^n, S_{D,i}) = \sum_i H(Y_i \mid \bar{U}_{r,i}, S_{D,i}),
\end{aligned} \qquad (29)$$

where the second equality holds since $X_{D,i}$ is a function of $(W, S_D^n, S_R^n)$ and $X_{r,i}$ is a function of $(Y_r^{i-1}, S_D^n, S_R^n)$; (a) holds since $Y_i = f(X_{D,i}, X_{r,i}, S_{D,i})$, so the second entropy term vanishes; (b) follows from the fact that conditioning reduces entropy.

Combining (26) to (29), we have
$$R \le \frac{1}{n} \sum_i \left[ I(X_{R,i}; Y_{r,i} \mid S_{R,i}) + H(Y_i \mid \bar{U}_{r,i}, S_{D,i}) \right] + \delta_n \qquad (30)$$
The bound corresponding to the second term in (19) is obtained by bounding $I(W; Y^n)$ as follows:
$$\begin{aligned}
I(W; Y^n) &= \sum_i I(W; Y_i \mid Y^{i-1}) \le \sum_i I(W, Y^{i-1}; Y_i) \\
&= \sum_i I(W, Y^{i-1}, S_{D,i+1}^n; Y_i) - I(S_{D,i+1}^n; Y_i \mid W, Y^{i-1}) \\
&\overset{(a)}{=} \sum_i I(W, Y^{i-1}, S_{D,i+1}^n; Y_i) - I(Y^{i-1}; S_{D,i} \mid W, S_{D,i+1}^n) \\
&\overset{(b)}{=} \sum_i H(Y_i) - H(Y_i \mid W, Y^{i-1}, S_{D,i+1}^n) - I(W, Y^{i-1}, S_{D,i+1}^n; S_{D,i}) \\
&= \sum_i H(Y_i) - H(Y_i \mid W, Y^{i-1}, S_{D,i+1}^n) - \sum_i \left[ I(W, Y^{i-1}, S_{D,i+1}^n, Y_i; S_{D,i}) - I(Y_i; S_{D,i} \mid W, Y^{i-1}, S_{D,i+1}^n) \right] \\
&\overset{(c)}{\le} \sum_i H(Y_i) - I(W, Y^{i-1}, S_{D,i+1}^n, Y_i; S_{D,i}) \\
&\le \sum_i H(Y_i) - I(Y^{i-1}, S_{D,i+1}^n, Y_i; S_{D,i}) = \sum_i H(Y_i) - I(\bar{U}_{r,i}, Y_i; S_{D,i}),
\end{aligned} \qquad (31)$$

where (a) holds due to Csiszár and Körner's sum identity; (b) follows since $S_{D,i}$ is independent of $(W, S_{D,i+1}^n)$; and (c) follows from the fact that $H(Y_i \mid W, Y^{i-1}, S_{D,i+1}^n) \ge I(Y_i; S_{D,i} \mid W, Y^{i-1}, S_{D,i+1}^n)$.
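Since the Csiszár-Körner sum identity in step (a) is an exact identity for any joint distribution, it can be checked numerically. The sketch below verifies, for a random joint pmf over $(X^3, Y^3)$ with binary alphabets, that $\sum_i I(Y_{i+1}^n; X_i \mid X^{i-1}) = \sum_i I(X^{i-1}; Y_i \mid Y_{i+1}^n)$; the pmf and alphabet sizes are illustrative.

```python
import itertools
import math
import random

def H(p):
    """Entropy (bits) of a pmf given as a dict value -> probability."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)

def marginal(p, idx):
    """Marginalize a joint pmf (keyed by tuples) onto the coordinates in idx."""
    m = {}
    for k, v in p.items():
        key = tuple(k[i] for i in idx)
        m[key] = m.get(key, 0.0) + v
    return m

def MI(p, a, b, c=()):
    """Conditional mutual information I(A; B | C) = H(A,C)+H(B,C)-H(A,B,C)-H(C)."""
    return (H(marginal(p, a + c)) + H(marginal(p, b + c))
            - H(marginal(p, a + b + c)) - H(marginal(p, c)))

# Random joint pmf over (X1, X2, X3, Y1, Y2, Y3), coordinates 0..5.
random.seed(0)
keys = list(itertools.product((0, 1), repeat=6))
w = [random.random() for _ in keys]
s = sum(w)
p = {k: v / s for k, v in zip(keys, w)}

# Csiszar sum identity for n = 3 (terms with empty conditioning sets are 0):
lhs = MI(p, (4, 5), (0,)) + MI(p, (5,), (1,), (0,))   # sum_i I(Y_{i+1}^n; X_i | X^{i-1})
rhs = MI(p, (0,), (4,), (5,)) + MI(p, (0, 1), (5,))   # sum_i I(X^{i-1}; Y_i | Y_{i+1}^n)
```

The two sides agree up to floating-point error for any joint pmf, which is exactly why step (a) holds with no assumptions on the code.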

By (26) and (31),
$$R \le \frac{1}{n} \sum_i \left[ H(Y_i) - I(\bar{U}_{r,i}, Y_i; S_{D,i}) \right] + \delta_n \qquad (32)$$
From the above, we have
$$R \le \frac{1}{n} \sum_i \left[ I(X_{R,i}; Y_{r,i} \mid S_{R,i}) + H(Y_i \mid \bar{U}_{r,i}, S_{D,i}) \right] + \delta_n, \qquad R \le \frac{1}{n} \sum_i \left[ H(Y_i) - I(\bar{U}_{r,i}, Y_i; S_{D,i}) \right] + \delta_n \qquad (33)$$
Introduce a time-sharing random variable $T$, uniformly distributed over $\{1, 2, \ldots, n\}$, and denote the collection of random variables
$$(X_R, X_r, Y_r, Y, \bar{U}_r, S_D, S_R) = (X_{R,T}, X_{r,T}, Y_{r,T}, Y_T, \bar{U}_{r,T}, S_{D,T}, S_{R,T}).$$
Considering the first bound in (33), we have
$$\begin{aligned}
\frac{1}{n} \sum_i \left[ I(X_{R,i}; Y_{r,i} \mid S_{R,i}) + H(Y_i \mid \bar{U}_{r,i}, S_{D,i}) \right]
&= I(X_R; Y_r \mid S_R, T) + H(Y \mid \bar{U}_r, S_D, T) \\
&= H(Y_r \mid S_R, T) - H(Y_r \mid X_R, S_R, T) + H(Y \mid \bar{U}_r, S_D, T) \\
&\le I(X_R; Y_r \mid S_R) + H(Y \mid \bar{U}_r, S_D, T),
\end{aligned} \qquad (34)$$

where the last step follows from the fact that $T$ is independent of all the other variables and the Markov chain $T \leftrightarrow (X_R, S_R) \leftrightarrow Y_r$.

Similarly, considering the second bound in (33), we have
$$\frac{1}{n} \sum_i \left[ H(Y_i) - I(\bar{U}_{r,i}, Y_i; S_{D,i}) \right] = H(Y \mid T) - I(\bar{U}_r, Y; S_D \mid T) \le H(Y) - I(\bar{U}_r, T, Y; S_D) + I(T; S_D) = H(Y) - I(\bar{U}_r, T, Y; S_D) \qquad (35)$$
Defining $U_r = (\bar{U}_r, T)$, we get
$$R \le I(X_R; Y_r \mid S_R) + H(Y \mid U_r, S_D) + \delta_n, \qquad R \le H(Y) - I(U_r, Y; S_D) + \delta_n \qquad (36)$$

Therefore, for a given sequence of $(\epsilon_n, n, R)$ codes with $\epsilon_n \to 0$ as $n \to \infty$, there exists a measure of the form $P_{S_D, S_R, X_r, X_R, X_D} = Q_{S_D} Q_{S_R} P_{X_r \mid S_D, S_R} P_{X_R, X_D \mid X_r, S_D, S_R}$ such that the rate $R$ essentially satisfies (19).

Considering the facts that $I(X_R; Y_r \mid S_R)$ is determined by the joint distribution $P_{X_R, S_R, Y_r}$ and that the other terms on the RHS of (19) are independent of $P_{X_R, S_R, Y_r}$, the maximum in (19) taken over all joint probability mass functions $P_{S_D, S_R, X_r, U_r, X_R, X_D, Y_r, Y}$ is equivalent to that taken over all joint probability mass functions of the form
$$P(s_D, s_R, x_r, u_r, x_R, x_D, y_r, y) = Q(s_D) Q(s_R) P(x_R \mid s_R) P(u_r, x_r, x_D \mid s_D) P(y_r \mid x_R, s_R)\,\mathbf{1}\{y = f(x_D, x_r, s_D)\}.$$

The bound on the cardinality of $\mathcal{U}_r$ can be proven in a similar way as in Theorem 1 and is omitted here for brevity.

This concludes the proof.

5 Memoryless Gaussian case

In this section, we study a state-dependent Gaussian relay channel with orthogonal components in which the channel states and the noise are additive and Gaussian. As shown in Figure 3, we consider the state-dependent Gaussian orthogonal relay channel, where channel 1 (dashed line) uses a frequency band different from that used by channel 2 (solid line). The two orthogonal channels, channel 1 and channel 2, are corrupted by two independent additive Gaussian interferences $S_R$ and $S_D$, respectively, which are known to the source and the relay. The channel can be described as
$$Y_r = X_R + S_R + Z_r \qquad (37)$$
$$Y = X_D + X_r + S_D + Z_d \qquad (38)$$
where $Y_r$ and $Y$ are the channel outputs at the relay and the destination, respectively; $(X_R, X_D)$ and $X_r$ are the channel inputs from the source and the relay, with the average power constraints $E[X_R^2] + E[X_D^2] \le P$ and $E[X_r^2] \le \gamma P$. The additive interferences $S_R, S_D$ and the noises $Z_r, Z_d$ are assumed to be zero-mean i.i.d. Gaussian with $E[S_R^2] = Q_R$, $E[S_D^2] = Q_D$, and $E[Z_r^2] = E[Z_d^2] = N$. Further, we assume that $S_R$, $S_D$, $Z_r$, and $Z_d$ are mutually independent. As in the discrete memoryless case, we discuss the capacity of the channel when the additive interference sequences are known to the source and the relay non-causally and causally, respectively.
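A Monte Carlo sketch of the channel model (37)-(38) follows; all numeric values (powers, interference and noise variances, the power split $\beta$) are illustrative assumptions, and Gaussian codebook inputs are drawn only to exercise the model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
P, gamma, Q_R, Q_D, N = 10.0, 1.0, 4.0, 4.0, 1.0
beta = 0.5                                          # fraction of source power on X_D

X_R = rng.normal(0.0, np.sqrt((1 - beta) * P), n)   # source input on channel 1
X_D = rng.normal(0.0, np.sqrt(beta * P), n)         # source input on channel 2
X_r = rng.normal(0.0, np.sqrt(gamma * P), n)        # relay input
S_R = rng.normal(0.0, np.sqrt(Q_R), n)              # interference on channel 1
S_D = rng.normal(0.0, np.sqrt(Q_D), n)              # interference on channel 2
Z_r = rng.normal(0.0, np.sqrt(N), n)
Z_d = rng.normal(0.0, np.sqrt(N), n)

Y_r = X_R + S_R + Z_r        # relay observation, Eq. (37)
Y = X_D + X_r + S_D + Z_d    # destination observation, Eq. (38)
```

With independent inputs, the empirical output powers match the sums of the component variances, which is the bookkeeping used throughout Section 5.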
Figure 3

Gaussian orthogonal relay channel with channel state known at the source and the relay.

5.1 Channel state information non-causally known to the source and the relay

For the channel shown in Figure 3, when the channel state information is known non-causally to the source and the relay, the capacity can be achieved using cooperative DPC and is characterized in the following theorem.

Theorem 4 The capacity of the Gaussian orthogonal relay channel with the channel state information non-causally known to both the source and the relay is given by
$$C(P, \gamma P) = \max_{0 \le \beta, \rho \le 1} \min\left\{ \mathcal{C}\!\left(\frac{\bar{\beta} P}{N}\right) + \mathcal{C}\!\left(\frac{\beta (1 - \rho^2) P}{N}\right),\; \mathcal{C}\!\left(\frac{(\beta + \gamma + 2\rho\sqrt{\beta\gamma}) P}{N}\right) \right\}, \qquad (39)$$

where $\mathcal{C}(x) = \frac{1}{2}\log_2(1 + x)$ and $\bar{\beta} = 1 - \beta$.
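The max-min in (39) is easy to evaluate numerically. The following is a grid-search sketch of the expression (grid resolution and parameter values are illustrative, not from the paper):

```python
import numpy as np

def C(x):
    """C(x) = (1/2) log2(1 + x), the Gaussian capacity function."""
    return 0.5 * np.log2(1.0 + x)

def capacity_noncausal(P, gamma, N, grid=401):
    """Grid-search evaluation of the max-min expression (39)."""
    beta = np.linspace(0.0, 1.0, grid)[:, None]   # source power split
    rho = np.linspace(0.0, 1.0, grid)[None, :]    # source-relay correlation
    term1 = C((1 - beta) * P / N) + C(beta * (1 - rho**2) * P / N)
    term2 = C((beta + gamma + 2 * rho * np.sqrt(beta * gamma)) * P / N)
    return float(np.max(np.minimum(term1, term2)))

cap = capacity_noncausal(P=10.0, gamma=1.0, N=1.0)
```

As sanity checks, the value is at least the direct-link capacity $\mathcal{C}(P/N)$ (take $\beta = 1$, $\rho = 0$) and at most $\max_\beta [\mathcal{C}(\bar\beta P/N) + \mathcal{C}(\beta P/N)] = 2\,\mathcal{C}(P/2N)$, and it is non-decreasing in the relay power ratio $\gamma$.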

Remark 3 As in many other dirty paper channels with channel state information known non-causally at the encoders, with dirty paper coding the capacity of the channel considered here is the same as that of the state-independent relay channel with orthogonal components. In fact, (39) also characterizes the capacity of the state-independent Gaussian orthogonal relay channel. Therefore, whether the channel state information is known to the source and the relay causally or non-causally, (39) serves as an upper bound on the capacity of the channel shown in Figure 3.

Proof We only need to prove the achievability of (39), since the expression in (39) characterizes the capacity of the state-independent orthogonal relay channel [29], which obviously serves as an upper bound for the channel considered in this paper.

For the channel given by (37) and (38), we evaluate the achievable rate in (6) with the jointly Gaussian random variables $U$, $U_r$, $S_R$, $S_D$, $X_R$, $X_D$, and $X_r$ chosen as
$$U_0 = X_{D,0} + \alpha(1 - \alpha_r) S_D \qquad (40)$$
$$U_r = \left(1 + \rho\sqrt{\beta/\gamma}\right) X_r + \alpha_r S_D \qquad (41)$$
$$U = U_0 + \frac{\rho\sqrt{\beta/\gamma}}{1 + \rho\sqrt{\beta/\gamma}}\, U_r \qquad (42)$$
$$X_D = X_{D,0} + \rho\sqrt{\beta/\gamma}\, X_r, \qquad (43)$$

where $E[X_D^2] = \beta P$, $E[X_R^2] = \bar{\beta} P$, $E[X_r^2] = \gamma P$, $E[X_r X_D] = \rho\sqrt{\beta\gamma}\, P$, and $X_{D,0} \sim \mathcal{N}(0, (1 - \rho^2)\beta P)$ is independent of $X_r$. The parameter $\beta$ is the fraction of the source power allocated to $X_D$, while $\bar{\beta} = 1 - \beta$ is the fraction allocated to $X_R$. The parameter $\rho$ is the correlation coefficient between $X_r$ and $X_D$. With the above definitions, the computation of the achievable rate in (39) is straightforward and is omitted for brevity.

However, the calculation above is somewhat algebraic. Proceeding similarly to Costa's dirty paper coding, we extend the result in Theorem 1 for the discrete memoryless (DM) case to memoryless channels with discrete time and continuous alphabets by standard arguments [32]. An alternative proof is outlined in Appendix 2.

5.2 Channel state information known at the source and the relay causally

When the channel state information is known to the source and the relay causally, the capacity is not characterized in general. The following theorem gives a lower bound on the capacity.

Theorem 5 For the Gaussian orthogonal relay channel with the channel state information causally known to the source and the relay, the following rate is achievable:
$$R(P, \gamma P) \ge \max_{\substack{0 \le \beta \le 1 \\ -1 \le \rho_{d,s}, \rho_{r,s}, \rho_{d,r} \le 1}} \min\left\{ \mathcal{C}\!\left(\frac{\bar{\beta} P}{N}\right) + \mathcal{C}\!\left(\frac{(1 - \rho_{d,s}^2)(1 - \rho_{d,r}^2)\beta P}{\left(\sqrt{Q_D} + \rho_{d,s}\sqrt{\beta P} + \rho_{r,s}\sqrt{\gamma P}\right)^2 + N}\right),\; \mathcal{C}\!\left(\frac{(1 - \rho_{d,s}^2)\beta P + (1 - \rho_{r,s}^2)\gamma P + 2\rho_{d,r}\sqrt{(1 - \rho_{d,s}^2)(1 - \rho_{r,s}^2)\beta\gamma}\, P}{\left(\sqrt{Q_D} + \rho_{d,s}\sqrt{\beta P} + \rho_{r,s}\sqrt{\gamma P}\right)^2 + N}\right) \right\} \qquad (44)$$

where $\mathcal{C}(x) = \frac{1}{2}\log_2(1 + x)$ and $\bar{\beta} = 1 - \beta$.
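The lower bound (44) can likewise be evaluated by a grid search. The sketch below assumes the reconstruction above, in which the denominator is the residual interference-plus-noise power $(\sqrt{Q_D} + \rho_{d,s}\sqrt{\beta P} + \rho_{r,s}\sqrt{\gamma P})^2 + N$; grid resolution and parameter values are illustrative.

```python
import numpy as np

def C(x):
    return 0.5 * np.log2(1.0 + np.maximum(x, 0.0))

def rate_causal(P, gamma, Q_D, N, g=21):
    """Grid-search evaluation of the achievable rate (44)."""
    beta = np.linspace(0.0, 1.0, g).reshape(-1, 1, 1, 1)
    r_ds = np.linspace(-1.0, 1.0, g).reshape(1, -1, 1, 1)
    r_rs = np.linspace(-1.0, 1.0, g).reshape(1, 1, -1, 1)
    r_dr = np.linspace(-1.0, 1.0, g).reshape(1, 1, 1, -1)
    # Residual interference-plus-noise power after partial state cleaning.
    denom = (np.sqrt(Q_D) + r_ds * np.sqrt(beta * P)
             + r_rs * np.sqrt(gamma * P))**2 + N
    t1 = C((1 - beta) * P / N) + C((1 - r_ds**2) * (1 - r_dr**2) * beta * P / denom)
    t2 = C(((1 - r_ds**2) * beta * P + (1 - r_rs**2) * gamma * P
            + 2 * r_dr * np.sqrt((1 - r_ds**2) * (1 - r_rs**2) * beta * gamma) * P)
           / denom)
    return float(np.max(np.minimum(t1, t2)))

# With relay power above the Theorem 6 threshold, the rate should approach
# max_beta [C(beta_bar*P/N) + C(beta*P/N)] = log2(6) for P = 10, N = 1.
r = rate_causal(P=10.0, gamma=4.0, Q_D=4.0, N=1.0)
```

At $\gamma = 4$ the condition (45) holds for $P = 10$, $N = 1$, $Q_D = 4$, so by Theorem 6 the grid value should sit just below $\log_2 6 \approx 2.585$.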

Remark 4 Since the interference $S_R$ is additive and known to both the source and the relay, the relay can remove $S_R$ completely before decoding the message from the source. Hence, the interference $S_R$ does not affect the achievable rate.

Remark 5 The source and the relay expend the parts $\rho_{d,s}^2 \beta P$ and $\rho_{r,s}^2 \gamma P$ of their power, respectively, to clean $S_D$ from the channel and use the remaining power for cooperative information transmission. This differs from many other dirty paper channels with non-causal channel state information at the transmitters, where the channel states can be completely cleaned by choosing appropriate auxiliary random variables, e.g., by dirty paper coding. If $Q_D = 0$, the entire power of the source and the relay is used for information transmission, i.e., $\rho_{r,s} = \rho_{d,s} = 0$. This reduces to the capacity of the state-independent relay channel with orthogonal components shown in [29], since $S_R$ does not affect the achievable rate.

Proof The result in Theorem 2 for the discrete memoryless case can be extended to memoryless channels with discrete time and continuous alphabets using standard techniques [32]. The proof follows by evaluating the lower bound of Theorem 2 with the following jointly Gaussian input distribution. Fix $0 \le \beta \le 1$, $-1 \le \rho_{d,s}, \rho_{r,s}, \rho_{d,r} \le 1$, and $\bar{\beta} = 1 - \beta$. Let $X_R \sim \mathcal{N}(0, \bar{\beta} P)$, $U_r \sim \mathcal{N}(0, (1 - \rho_{r,s}^2)\gamma P)$, and $U' \sim \mathcal{N}(0, (1 - \rho_{d,s}^2)(1 - \rho_{d,r}^2)\beta P)$, where $U_r$ and $U'$ are independent. Let
$$U = \rho_{d,r} \sqrt{\frac{(1 - \rho_{d,s}^2)\beta P}{(1 - \rho_{r,s}^2)\gamma P}}\, U_r + U'.$$
We define $X_r = U_r + \rho_{r,s}\sqrt{\gamma P / Q_D}\, S_D$ and $X_D = U + \rho_{d,s}\sqrt{\beta P / Q_D}\, S_D$. With these definitions, it is easily verified that $U \sim \mathcal{N}(0, (1 - \rho_{d,s}^2)\beta P)$, $X_r \sim \mathcal{N}(0, \gamma P)$, and $X_D \sim \mathcal{N}(0, \beta P)$. Note that $U$, $U_r$, and $U'$ are independent of $S_D$. From these definitions, it is evident that $E[X_R^2] + E[X_D^2] \le P$ and $E[X_r^2] \le \gamma P$. Straightforward algebra shows that evaluating the lower bound in Theorem 2 with these choices yields the lower bound in Theorem 5; the computational details are omitted for brevity.
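The variance and independence claims in the construction above can be checked by Monte Carlo simulation; the parameter values below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400_000
P, gamma, Q_D, beta = 10.0, 2.0, 4.0, 0.6
r_ds, r_rs, r_dr = 0.3, -0.4, 0.5

S_D = rng.normal(0.0, np.sqrt(Q_D), n)
U_r = rng.normal(0.0, np.sqrt((1 - r_rs**2) * gamma * P), n)
U_p = rng.normal(0.0, np.sqrt((1 - r_ds**2) * (1 - r_dr**2) * beta * P), n)  # U'
U = r_dr * np.sqrt((1 - r_ds**2) * beta * P
                   / ((1 - r_rs**2) * gamma * P)) * U_r + U_p
X_r = U_r + r_rs * np.sqrt(gamma * P / Q_D) * S_D   # relay input
X_D = U + r_ds * np.sqrt(beta * P / Q_D) * S_D      # source input on channel 2
```

Empirically, $\mathrm{Var}(U) \approx (1 - \rho_{d,s}^2)\beta P$, $\mathrm{Var}(X_r) \approx \gamma P$, $\mathrm{Var}(X_D) \approx \beta P$, and $U$ is uncorrelated with $S_D$, matching the claims in the proof.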

We next characterize the capacity of the state-dependent Gaussian orthogonal relay channel with causal channel state information when the power of the relay is sufficiently large. As shown in Theorem 5, a part of the relay's power is used to clean the interference SD. When the power of the relay is sufficiently large, the interference SD can be cleaned completely and the capacity of the channel can be determined as shown in the following theorem.

Theorem 6 For the Gaussian orthogonal relay channel with the additive interference sequences known causally at the source and the relay, when the power of the relay satisfies
$$\gamma P \ge \left( \frac{P}{4N} + \frac{N}{4P} + \frac{Q_D}{P} + \frac{1}{2} \right) P, \qquad (45)$$
the capacity can be characterized as
$$C(P, \gamma P) = \max_{0 \le \beta \le 1} \left[ \mathcal{C}\!\left(\frac{\bar{\beta} P}{N}\right) + \mathcal{C}\!\left(\frac{\beta P}{N}\right) \right] \qquad (46)$$

Remark 6 When the power of the relay is sufficiently large, the interference $S_D$ is completely cleaned by the relay using part of its power, and the remaining power is large enough that the relay-destination link does not constrain the achievable rate. The message sent from the source is then split into two parts: one part is sent directly to the destination through the point-to-point source-destination channel, and the other is sent through the two-hop source-relay-destination channel with DF relaying. The two parts are sent independently, and the rate is the sum of the rates of the source-destination channel and the two-hop source-relay-destination channel (the latter constrained by the source-relay link).

Proof Define $\rho = (\rho_{d,s}, \rho_{d,r}, \rho_{r,s})$. We denote the two terms on the RHS of (44) as
$$R_1(\beta, \rho) = \mathcal{C}\!\left(\frac{\bar{\beta} P}{N}\right) + \mathcal{C}\!\left(\frac{(1 - \rho_{d,s}^2)(1 - \rho_{d,r}^2)\beta P}{\left(\sqrt{Q_D} + \rho_{d,s}\sqrt{\beta P} + \rho_{r,s}\sqrt{\gamma P}\right)^2 + N}\right) \qquad (47)$$
$$R_2(\beta, \rho) = \mathcal{C}\!\left(\frac{(1 - \rho_{d,s}^2)\beta P + (1 - \rho_{r,s}^2)\gamma P + 2\rho_{d,r}\sqrt{(1 - \rho_{d,s}^2)(1 - \rho_{r,s}^2)\beta\gamma}\, P}{\left(\sqrt{Q_D} + \rho_{d,s}\sqrt{\beta P} + \rho_{r,s}\sqrt{\gamma P}\right)^2 + N}\right) \qquad (48)$$

Let $R(\beta, \rho) = \min\{R_1(\beta, \rho), R_2(\beta, \rho)\}$. Then $R(P, \gamma P) \ge \max_{0 \le \beta \le 1,\, -1 \le \rho_{d,s}, \rho_{r,s}, \rho_{d,r} \le 1} R(\beta, \rho)$ is achievable.

It is easy to verify that if $\gamma P \ge Q_D$, for any fixed $\beta$, $R_1(\beta, \rho)$ is maximized at $\rho = \rho_1^* = \left(0, 0, -\sqrt{Q_D / \gamma P}\right)$. Denote the maximum of $R_1(\beta, \rho)$ by $R_1^*(\beta)$. Therefore, we have
$$R_1^*(\beta) = R_1(\beta, \rho_1^*) = \mathcal{C}\!\left(\frac{\bar{\beta} P}{N}\right) + \mathcal{C}\!\left(\frac{\beta P}{N}\right) \qquad (49)$$
$$R_2(\beta, \rho_1^*) = \mathcal{C}\!\left(\frac{\beta P + \gamma P - Q_D}{N}\right) \qquad (50)$$
Next, we will show the condition under which $R_2(\beta, \rho_1^*)$ is always larger than $R_1^*(\beta)$ for any $\beta$. Let
$$1 + \frac{\beta P + \gamma P - Q_D}{N} \ge \left(1 + \frac{\bar{\beta} P}{N}\right)\left(1 + \frac{\beta P}{N}\right) \qquad (51)$$
The inequality in (51) is equivalent to
$$P^2 \beta^2 - (P^2 - PN)\beta + \gamma PN - Q_D N - PN \ge 0 \qquad (52)$$
It is easy to show that if
$$\gamma \ge \frac{P}{4N} + \frac{N}{4P} + \frac{Q_D}{P} + \frac{1}{2}, \qquad (53)$$
the inequality in (52) holds for any $\beta$. Thus, if $\gamma \ge \frac{P}{4N} + \frac{N}{4P} + \frac{Q_D}{P} + \frac{1}{2}$, the following inequality is always satisfied for any $\beta$:
$$R_2(\beta, \rho_1^*) \ge R_1^*(\beta) \qquad (54)$$
For any $\beta$, we have
$$R(\beta, \rho_1^*) = \min\{R_1(\beta, \rho_1^*), R_2(\beta, \rho_1^*)\} = R_1^*(\beta) \qquad (55)$$
Therefore,
$$R(P, \gamma P) \ge \max_{0 \le \beta \le 1} R_1^*(\beta) = \max_{0 \le \beta \le 1} \left[ \mathcal{C}\!\left(\frac{\bar{\beta} P}{N}\right) + \mathcal{C}\!\left(\frac{\beta P}{N}\right) \right] \qquad (56)$$

is achievable.

As mentioned in Remark 3, (39) serves as an upper bound on the capacity of the channel considered here. The converse proof follows by showing that (46) matches the upper bound in (39) when the condition in (45) is satisfied. We denote the two terms on the RHS of (39) as
$$C_1(\beta, \rho) = \mathcal{C}\!\left(\frac{\bar{\beta} P}{N}\right) + \mathcal{C}\!\left(\frac{\beta(1 - \rho^2) P}{N}\right) \qquad (57)$$
$$C_2(\beta, \rho) = \mathcal{C}\!\left(\frac{(\beta + \gamma + 2\rho\sqrt{\beta\gamma}) P}{N}\right) \qquad (58)$$
Let $C(\beta, \rho) = \min\{C_1(\beta, \rho), C_2(\beta, \rho)\}$. Similar to the steps from (47) to (55), it is easy to prove that for any $\beta$, if $\gamma \ge \frac{P}{4N} + \frac{N}{4P} + \frac{1}{2}$,
$$C(\beta, 0) = \min\{C_1(\beta, 0), C_2(\beta, 0)\} = C_1(\beta, 0) = \mathcal{C}\!\left(\frac{\bar{\beta} P}{N}\right) + \mathcal{C}\!\left(\frac{\beta P}{N}\right) \qquad (59)$$

Next, we have to prove that for any $\beta$, under the condition $\gamma \ge \frac{P}{4N} + \frac{N}{4P} + \frac{1}{2}$, $C(\beta, \rho)$ is maximized at $\rho = 0$. Denote the maximum of $C(\beta, \rho)$ by $C^*(\beta)$, i.e., $C^*(\beta) = \max_{-1 \le \rho \le 1} C(\beta, \rho)$. This can be proven by contradiction.

Assume that $C(\beta, \rho)$ is maximized at some $\rho = \rho'$ with $\rho' \ne 0$ and $C(\beta, \rho') > C(\beta, 0)$. By (59), we get
$$C(\beta, \rho') > C(\beta, 0) = C_1(\beta, 0) \qquad (60)$$
However, we have
$$C(\beta, \rho') = \min\{C_1(\beta, \rho'), C_2(\beta, \rho')\} \le C_1(\beta, \rho') \qquad (61)$$
From (57), it is easy to verify that for any $\beta$, $C_1(\beta, \rho)$ is maximized at $\rho = 0$. Thus, (60) and (61) are contradictory. This proves that for any $\beta$, $C^*(\beta) = C_1(\beta, 0) = \mathcal{C}(\bar{\beta} P / N) + \mathcal{C}(\beta P / N)$. Thus, the maximization problem in (39) is equivalent to
$$C(P, \gamma P) = \max_{0 \le \beta \le 1} \left[ \mathcal{C}\!\left(\frac{\bar{\beta} P}{N}\right) + \mathcal{C}\!\left(\frac{\beta P}{N}\right) \right] \qquad (62)$$

This completes the proof.

6 Numerical examples

In this section, we provide some numerical examples for the achievable rate in Theorem 5. With these examples, we will show the impact of the channel state and the role of the relay in information transmission and in cleaning the channel state.

For $\gamma = 1$, Figure 4 compares the capacity of the state-independent ($Q_R = Q_D = 0$) relay channel with orthogonal components and the achievable rate derived in Theorem 5. Clearly, the larger the power of the additive interference, the more power of the source and the relay is used to clean the interference, which results in a lower achievable rate. As the power $P$ of the source and the relay increases, a larger amount of interference can be cleaned, leaving more power for information transmission. Consequently, the achievable rate approaches the capacity of the state-independent relay channel with orthogonal components as $P$ increases. This can also be verified from (44): if $P \gg Q_D$, the impact of the additive interference $S_D$ is negligible relative to $P$, and the maximization problem in (44) approximates that in (39) with $\rho_{r,s} \to 0$ and $\rho_{d,s} \to 0$.
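The trend in Figure 4 can be reproduced numerically by comparing grid-search evaluations of (39) and (44) at a low and a high source power; the grid resolutions and parameter values below are illustrative.

```python
import numpy as np

def C(x):
    return 0.5 * np.log2(1.0 + np.maximum(x, 0.0))

def cap_39(P, gamma, N, g=201):
    """Grid-search evaluation of the non-causal capacity (39)."""
    beta = np.linspace(0, 1, g)[:, None]
    rho = np.linspace(0, 1, g)[None, :]
    t1 = C((1 - beta) * P / N) + C(beta * (1 - rho**2) * P / N)
    t2 = C((beta + gamma + 2 * rho * np.sqrt(beta * gamma)) * P / N)
    return float(np.max(np.minimum(t1, t2)))

def rate_44(P, gamma, Q_D, N, g=31):
    """Grid-search evaluation of the causal achievable rate (44)."""
    beta = np.linspace(0, 1, g).reshape(-1, 1, 1, 1)
    r_ds = np.linspace(-1, 1, g).reshape(1, -1, 1, 1)
    r_rs = np.linspace(-1, 1, g).reshape(1, 1, -1, 1)
    r_dr = np.linspace(-1, 1, g).reshape(1, 1, 1, -1)
    den = (np.sqrt(Q_D) + r_ds * np.sqrt(beta * P)
           + r_rs * np.sqrt(gamma * P))**2 + N
    t1 = C((1 - beta) * P / N) + C((1 - r_ds**2) * (1 - r_dr**2) * beta * P / den)
    t2 = C(((1 - r_ds**2) * beta * P + (1 - r_rs**2) * gamma * P
            + 2 * r_dr * np.sqrt((1 - r_ds**2) * (1 - r_rs**2) * beta * gamma) * P)
           / den)
    return float(np.max(np.minimum(t1, t2)))

# Gap between the upper bound and the causal rate at low and high P, Q_D = 4.
gap_lo = cap_39(10.0, 1.0, 1.0) - rate_44(10.0, 1.0, 4.0, 1.0)
gap_hi = cap_39(1000.0, 1.0, 1.0) - rate_44(1000.0, 1.0, 4.0, 1.0)
```

As expected from the discussion above, the gap is noticeable at $P = 10$ and nearly vanishes at $P = 1000$, where $P \gg Q_D$.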
Figure 4

Achievable rates vs. SNR under different power values of the interference.

For $P/N = 30$, Figure 5 shows the role of the relay in cleaning the channel state. As the power of the relay increases, the achievable rate increases. In particular, when the power of the relay is large enough that the channel state can be cleaned completely and the relay-destination link does not become the bottleneck for the achievable rate, the achievable rate matches the upper bound, as proven in Theorem 6. Figure 5 vividly illustrates this result.
Figure 5

Achievable rates vs. γ under different power values of the interference.

7 Conclusions

In this paper, we consider a state-dependent relay channel with orthogonal channels from the source to the relay and from the source and the relay to the destination. The orthogonal channels are corrupted by two independent channel states, and the channel state information is known to both the source and the relay either non-causally or causally. In the non-causal case, a lower bound on the capacity of the channel is established using superposition coding at the source, PDF relaying at the relay, and cooperative GP coding at the source and the relay. We further show that if the output of the destination $Y$ is a deterministic function of the relay input $X_r$, the channel state $S_D$, and one of the source inputs $X_D$, i.e., $Y = f(X_D, X_r, S_D)$, and the relay output $Y_r$ is controlled only by the source input $X_R$ and the channel state $S_R$, then the lower bound is tight and the capacity is characterized exactly. For the causal case, a lower bound on the capacity is also derived. The expression for the achievable rate in the causal case can be interpreted as a special case of that in the non-causal case, in which the auxiliary random variables $U$ and $U_r$ are independent of $S_R$ and $S_D$. This parallels the relation between the capacity expression for the state-dependent channel with causal channel state information introduced by Shannon [2] and its non-causal counterpart, the Gel'fand-Pinsker channel [3].

Further, we investigate the Gaussian state-dependent relay channel with orthogonal components, modeling the channel states as additive Gaussian interferences. The capacity is characterized when the additive interference sequences are known non-causally; the expression coincides with the capacity of the state-independent relay channel with orthogonal components, an observation similar to the results for multiple-user state-dependent channels in [6]. When the state information is known causally, however, the capacity is not characterized in general. In this case, an achievable rate is derived with carefully chosen auxiliary random variables, and it is shown that when the power of the relay is sufficiently large, the capacity can be characterized exactly. Finally, two numerical examples illustrate the impact of the channel state and the role of the relay in information transmission and in cleaning the state. The simulation results show that the larger the power of the additive interference, the more power of the source and the relay is spent cleaning the interference, resulting in a lower achievable rate. However, as the power $P$ increases, the impact of the interference becomes negligible when $P \gg Q_D$, and the achievable rate approaches the capacity of the state-independent relay channel with orthogonal components. The simulation results also illustrate that when the power of the relay satisfies $\gamma P \ge \left(\frac{P}{4N} + \frac{N}{4P} + \frac{Q_D}{P} + \frac{1}{2}\right) P$, the capacity of the channel can be characterized.

Appendices

Appendix 1

Proof of Theorem 1: Analysis of probability of error

The average probability of error is given by
$$P_e \le \sum_{s_D^n \notin T_\epsilon^n(S_D)} \sum_{s_R^n \notin T_\epsilon^n(S_R)} \Pr(s_D^n) \Pr(s_R^n) + \sum_{s_D^n \in T_\epsilon^n(S_D)} \sum_{s_R^n \in T_\epsilon^n(S_R)} \Pr(s_D^n) \Pr(s_R^n) \Pr(\text{error} \mid s_D^n, s_R^n) \qquad (63)$$
By the AEP, the first term $\Pr\{s_D^n \notin T_\epsilon^n(S_D)\} \Pr\{s_R^n \notin T_\epsilon^n(S_R)\}$ on the RHS of (63) goes to 0 as $n \to \infty$, so it suffices to upper bound the second term on the RHS of (63). We now examine the probabilities of the error events associated with the encoding and decoding steps. The error event is contained in the union of the following events: $E_{1k}$ and $E_{2k}$ correspond to the encoding steps in block $k$; $E_{3 s_R k}$ and $E_{4 s_R k}$ correspond to decoding the message $\hat{m}_{s_R}(k)$ at the relay in block $k$ given $S_R = s_R$; and $E_{5k}$ and $E_{6k}$ correspond to decoding $w_D(k)$ at the destination in block $k$. The probability of error $\Pr(\text{error} \mid s_D^n, s_R^n)$ is upper bounded as
$$\Pr(\text{error} \mid s_D^n, s_R^n) \le \Pr(E_{1k}) + \Pr(E_{2k}) + \sum_{s_R \in \mathcal{S}_R} P(S_R = s_R) \Pr(E_{3 s_R k} \mid E_{1k}^c, E_{2k}^c) + \sum_{s_R \in \mathcal{S}_R} P(S_R = s_R) \Pr(E_{4 s_R k} \mid E_{1k}^c, E_{2k}^c, E_{3 s_R k}^c) + \Pr(E_{5k} \mid E_{1k}^c, E_{2k}^c) + \Pr(E_{6k} \mid E_{1k}^c, E_{2k}^c, E_{5k}^c),$$

where $E_{mk}^c$ ($m = 1, 2, 3s_R, 5$) denotes the complement of the corresponding event $E_{mk}$.

Let $E_{1k}$ be the event that there is no sequence $u_r^n(w_R(k-1), j_r(k))$ jointly typical with $s_D^n(k)$, i.e.,
$$E_{1k} = \left\{ \nexists\, j_r(k) \in \{1, 2, \ldots, 2^{nR_{r,s}}\} \text{ s.t. } \left(u_r^n(w_R(k-1), j_r(k)), s_D^n(k)\right) \in T_\epsilon^n(U_r, S_D) \right\}$$
For $u_r^n(w_R(k-1), j_r(k))$ and $s_D^n(k)$ generated independently with i.i.d. components according to $P_{U_r}$ and $Q_{S_D}$, respectively, the probability that a given $u_r^n(w_R(k-1), j_r(k))$ is jointly typical with $s_D^n(k)$ is greater than $(1-\epsilon)\, 2^{-n(I(U_r; S_D) + \delta(\epsilon))}$ for $n$ sufficiently large. There are $2^{nR_{r,s}}$ such $u_r^n$ sequences in each bin. Therefore, the probability of the event $E_{1k}$ is bounded by
$$\Pr(E_{1k}) \le \left(1 - (1-\epsilon)\, 2^{-n(I(U_r; S_D) + \delta(\epsilon))}\right)^{2^{nR_{r,s}}} \qquad (64)$$

Taking the logarithm of both sides of (64) and using the inequality $\ln(x) \le x - 1$, we have $\ln \Pr(E_{1k}) \le -(1-\epsilon)\, 2^{n(R_{r,s} - I(U_r; S_D) - \delta(\epsilon))}$. Thus, if $R_{r,s} > I(U_r; S_D) + \delta(\epsilon)$, $\Pr(E_{1k}) \to 0$ as $n \to \infty$, where $\delta(\epsilon) \to 0$ as $\epsilon \to 0$.
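The double-exponential decay implied by the $\ln(x) \le x - 1$ step is easy to see numerically. The sketch below compares the right-hand side of (64) with the bound $\exp\!\left(-(1-\epsilon)\, 2^{n(R_{r,s} - I - \delta)}\right)$; all numeric values ($I$, $\delta$, $\epsilon$, $R_{r,s}$) are illustrative.

```python
import math

I_UrSD, delta, eps, R_rs = 0.5, 0.05, 0.1, 0.6   # illustrative, with R_rs > I + delta

def pr_E1_direct(n):
    """The right-hand side of (64)."""
    return (1.0 - (1.0 - eps) * 2.0 ** (-n * (I_UrSD + delta))) ** (2.0 ** (n * R_rs))

def pr_E1_upper(n):
    """Looser bound obtained via ln(x) <= x - 1: exp(-(1-eps) 2^{n(R_rs - I - delta)})."""
    return math.exp(-(1.0 - eps) * 2.0 ** (n * (R_rs - I_UrSD - delta)))
```

Since $\ln(1 - y) \le -y$, the direct expression never exceeds the exponential bound, and both vanish rapidly as $n$ grows whenever $R_{r,s} > I(U_r; S_D) + \delta(\epsilon)$.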

Let $E_{2k}$ be the event that there is no sequence $u^n(w_D(k), j_d(k) \mid w_R(k-1), j_r(k))$ jointly typical with $s_D^n(k)$, given $u_r^n(w_R(k-1), j_r(k))$, i.e.,
$$E_{2k} = \left\{ \nexists\, j_d(k) \in \{1, 2, \ldots, 2^{nR_{d,s}}\} \text{ s.t. } \left(u^n(w_D(k), j_d(k) \mid w_R(k-1), j_r(k)), u_r^n(w_R(k-1), j_r(k)), s_D^n(k)\right) \in T_\epsilon^n(U, U_r, S_D) \right\}$$

Similar to the analysis of the probability of the event $E_{1k}$, if $R_{d,s} > I(U; S_D \mid U_r) + \delta(\epsilon)$, then $\Pr(E_{2k}) \to 0$ as $n \to \infty$.

For each $s_R \in \mathcal{S}_R$, let $E_{3 s_R k}$ be the event that $x_R^{n(1-\epsilon)p(s_R)}(m_{s_R}(k) \mid s_R, w_R(k-1), j_r(k))$ is not jointly typical with $y_{r,s_R}^{n(1-\epsilon)p(s_R)}(k)$, given $u_{r,s_R}^{n(1-\epsilon)p(s_R)}(w_R(k-1), j_r(k))$, $x_{r,s_R}^{n(1-\epsilon)p(s_R)}(k)$, and $S_R = s_R$, i.e.,
$$E_{3 s_R k} = \left\{ \left(x_R^{n(1-\epsilon)p(s_R)}(m_{s_R}(k) \mid s_R, w_R(k-1), j_r(k)),\, u_{r,s_R}^{n(1-\epsilon)p(s_R)}(w_R(k-1), j_r(k)),\, x_{r,s_R}^{n(1-\epsilon)p(s_R)}(k),\, y_{r,s_R}^{n(1-\epsilon)p(s_R)}(k)\right) \notin T_\epsilon^n(X_R, U_r, X_r, S_R, Y_r) \text{ given } S_R = s_R \right\}$$

By the LLN, for all $s_R \in \mathcal{S}_R$, $\Pr(E_{3 s_R k} \mid E_{1k}^c, E_{2k}^c) \to 0$ as $n \to \infty$. Consequently, $\sum_{s_R \in \mathcal{S}_R} P(S_R = s_R) \Pr(E_{3 s_R k} \mid E_{1k}^c, E_{2k}^c) \to 0$ as $n \to \infty$.

For each $s_R \in \mathcal{S}_R$, let $E_{4 s_R k}$ be the event that $x_R^{n(1-\epsilon)p(s_R)}(\hat{m}_{s_R}(k) \mid s_R, w_R(k-1), j_r(k))$ is jointly typical with $y_{r,s_R}^{n(1-\epsilon)p(s_R)}(k)$, given $u_{r,s_R}^{n(1-\epsilon)p(s_R)}(w_R(k-1), j_r(k))$, $x_{r,s_R}^{n(1-\epsilon)p(s_R)}(k)$, and $S_R = s_R$, for some $\hat{m}_{s_R}(k) \ne m_{s_R}(k)$, i.e.,
$$E_{4 s_R k} = \left\{ \exists\, \hat{m}_{s_R}(k) \in \{1, 2, \ldots, 2^{nR_{s_R}}\},\ \hat{m}_{s_R}(k) \ne m_{s_R}(k), \text{ s.t. } \left(x_R^{n(1-\epsilon)p(s_R)}(\hat{m}_{s_R}(k) \mid s_R, w_R(k-1), j_r(k)),\, u_{r,s_R}^{n(1-\epsilon)p(s_R)}(w_R(k-1), j_r(k)),\, x_{r,s_R}^{n(1-\epsilon)p(s_R)}(k),\, y_{r,s_R}^{n(1-\epsilon)p(s_R)}(k)\right) \in T_\epsilon^n(X_R, U_r, X_r, S_R, Y_r) \text{ given } S_R = s_R \right\}$$
Conditioned on the events $E_{1k}^c$, $E_{2k}^c$, and $E_{3 s_R k}^c$, for all $s_R \in \mathcal{S}_R$, by the joint typicality lemma [31], the probability that $\left(x_R^{n(1-\epsilon)p(s_R)}(\hat{m}_{s_R}(k) \mid s_R, w_R(k-1), j_r(k)),\, u_{r,s_R}^{n(1-\epsilon)p(s_R)}(w_R(k-1), j_r(k)),\, x_{r,s_R}^{n(1-\epsilon)p(s_R)}(k),\, y_{r,s_R}^{n(1-\epsilon)p(s_R)}(k)\right) \in T_\epsilon^{n(1-\epsilon)p(s_R)}(X_R, U_r, X_r, S_R, Y_r)$ given $u_{r,s_R}^{n(1-\epsilon)p(s_R)}(w_R(k-1), j_r(k))$, $x_{r,s_R}^{n(1-\epsilon)p(s_R)}(k)$, and $S_R = s_R$, for a given $\hat{m}_{s_R}(k) \ne m_{s_R}(k)$, is less than $2^{-n(1-\epsilon)p(s_R)\left(I(X_R; Y_r \mid U_r, X_r, S_R = s_R) - \delta(\epsilon)\right)}$ for sufficiently large $n$. There are $2^{nR_{s_R}}$ (exactly $2^{nR_{s_R}} - 1$) such $x_R^{n(1-\epsilon)p(s_R)}$ sequences. Thus, the conditional probability of the event $E_{4 s_R k}$ given $E_{1k}^c$, $E_{2k}^c$, and $E_{3 s_R k}^c$ is upper bounded by
$$\Pr(E_{4 s_R k} \mid E_{1k}^c, E_{2k}^c, E_{3 s_R k}^c) \le 2^{-n\left[(1-\epsilon) p(s_R)\left(I(X_R; Y_r \mid U_r, X_r, S_R = s_R) - \delta(\epsilon)\right) - R_{s_R}\right]} \qquad (65)$$
From (65), $\Pr(E_{4 s_R k} \mid E_{1k}^c, E_{2k}^c, E_{3 s_R k}^c) \to 0$ as $n \to \infty$ if
$$R_{s_R} < (1-\epsilon)\, p(s_R) \left( I(X_R; Y_r \mid U_r, X_r, S_R = s_R) - \delta(\epsilon) \right)$$
Since $R_R = \sum_{s_R \in \mathcal{S}_R} R_{s_R}$, $\sum_{s_R \in \mathcal{S}_R} P(S_R = s_R) \Pr(E_{4 s_R k} \mid E_{1k}^c, E_{2k}^c, E_{3 s_R k}^c) \to 0$ as $n \to \infty$ if
$$R_R < (1-\epsilon) \left( I(X_R; Y_r \mid U_r, X_r, S_R) - \delta(\epsilon) \right),$$

where δ(ϵ) → 0 as ϵ → 0.

Let $E_{5k}$ be the event that $u^n(w_D(k), j_d(k) \mid w_R(k-1), j_r(k))$, $u_r^n(w_R(k-1), j_r(k))$, and $y^n(k)$ are not jointly typical, i.e.,
$$E_{5k} = \left\{ \left(u^n(w_D(k), j_d(k) \mid w_R(k-1), j_r(k)),\, u_r^n(w_R(k-1), j_r(k)),\, y^n(k)\right) \notin T_\epsilon^n(U, U_r, Y) \right\}$$

Conditioned on the events $E_{1k}^c$ and $E_{2k}^c$, we have $\Pr(E_{5k} \mid E_{1k}^c, E_{2k}^c) \to 0$ as $n \to \infty$ by the Markov lemma.

Let $E_{6k}$ be the event that $u^n(\hat{w}_D(k), \hat{j}_d(k) \mid \hat{w}_R(k-1), \hat{j}_r(k))$ and $u_r^n(\hat{w}_R(k-1), \hat{j}_r(k))$ are jointly typical with $y^n(k)$ for some $(\hat{w}_D(k), \hat{w}_R(k-1)) \in \{1, 2, \ldots, 2^{nR_D}\} \times \{1, 2, \ldots, 2^{nR_R}\}$, $\hat{j}_d(k) \in \{1, 2, \ldots, 2^{nR_{d,s}}\}$, and $\hat{j}_r(k) \in \{1, 2, \ldots, 2^{nR_{r,s}}\}$, with $(\hat{w}_D(k), \hat{w}_R(k-1)) \ne (w_D(k), w_R(k-1))$, i.e.,
$$E_{6k} = \left\{ \exists\, (\hat{w}_D(k), \hat{w}_R(k-1)) \ne (w_D(k), w_R(k-1)),\ \hat{j}_d(k),\ \hat{j}_r(k) \text{ s.t. } \left(u^n(\hat{w}_D(k), \hat{j}_d(k) \mid \hat{w}_R(k-1), \hat{j}_r(k)),\, u_r^n(\hat{w}_R(k-1), \hat{j}_r(k)),\, y^n(k)\right) \in T_\epsilon^n(U, U_r, Y) \right\}$$
We split the event $E_{6k}$ into three disjoint parts: first, $\hat{w}_D(k) = w_D(k)$ and $\hat{w}_R(k-1) \ne w_R(k-1)$; second, $\hat{w}_D(k) \ne w_D(k)$ and $\hat{w}_R(k-1) = w_R(k-1)$; third, $\hat{w}_D(k) \ne w_D(k)$ and $\hat{w}_R(k-1) \ne w_R(k-1)$, i.e.,
$$E_{6k\_1} = \left\{ \exists\, \hat{w}_R(k-1) \in \{1, 2, \ldots, 2^{nR_R}\},\ \hat{j}_r(k) \in \{1, 2, \ldots, 2^{nR_{r,s}}\} \text{ s.t. } \hat{w}_R(k-1) \ne w_R(k-1),\ \left(u^n(w_D(k), j_d(k) \mid \hat{w}_R(k-1), \hat{j}_r(k)),\, u_r^n(\hat{w}_R(k-1), \hat{j}_r(k)),\, y^n(k)\right) \in T_\epsilon^n(U, U_r, Y) \right\}$$
E 6 k _ 2 = w ^ D k 1 , 2 , , 2 n R D , j ^ d k 1 , 2 , , 2 n R d , s , s . t . w ^ D k w D ( k ) , u n