
# Inter-relay Handoff in two-hop relaying networks with highly mobile vehicles

*EURASIP Journal on Wireless Communications and Networking*
**volume 2012**, Article number: 300 (2012)

## Abstract

In multihop cellular networks, mobile users can communicate with the base stations via relay stations (RSs) and hand off between different RSs. This type of handoff is referred to as an inter-relay handoff. In networks with highly mobile users, inter-relay handoffs can occur very frequently, so making intelligent inter-relay handoff decisions is important for network performance. In this article, we study the inter-relay handoff decision problem in a two-hop cellular network with highly mobile vehicles using a semi-Markov decision process. The objective is to maximize the total reward, which is defined by taking into consideration the transmission rate of the user’s link, the overhead for performing inter-relay handoffs, and the moving speed of the user. Numerical results demonstrate the effectiveness of the handoff decisions.

## Introduction

The increasing demand for ubiquitous wireless broadband access drives the development of next-generation wireless communications. In line with this trend, multihop relaying has gained much attention in recent years as an appealing strategy for capacity improvement, coverage extension, and quality of service enhancement in cellular communication systems. Many efforts have focused on relaying technologies in cellular networks [1]. More recently, the IEEE 802.16 standard committees have been working on an extension of the basic IEEE 802.16 standard, known as IEEE 802.16j [2], to incorporate the functions of relay stations (RSs) into WiMAX networks. Using RSs to relay traffic between the base stations (BSs) and the mobile users is a promising approach to improving the capacity and coverage of cellular networks. The application of the relaying concept to cellular networks, however, raises many technical issues, one of which is inter-relay handoff [3]. Due to the mobility of the users, the propagation channel conditions between the mobile users and their RSs change with time. When the quality of the channel between a mobile user and its RS is not sufficiently good, the user should hand off to a different RS with better channel quality. This process is called inter-relay handoff.

It is shown in [4, 5] that the Markov decision process (MDP) model can be used to solve the handoff problem in integrated cellular and wireless local area networks. In [4], the vertical handoff problem was formulated as an MDP to decide whether or not to admit a handoff session, and to which network the session should be admitted. The MDP model was proposed to reduce unnecessary handoffs while significantly increasing the resource utilization and decreasing the connection dropping [5]. The handoff decisions for mobile users depend on the channel states, which are affected by the user mobility. Nevertheless, the effect of user mobility at high moving speeds is not considered in most of the literature. As the moving speed changes, the state transition probabilities in the MDP model change accordingly, which affects the handoff decisions. Frequent inter-relay handoffs can increase the signaling overhead, cause a serious ping-pong effect, and degrade the system performance. It has also been shown that the handoff overhead can significantly reduce the benefits in terms of throughput, delay, and jitter if data frames are small. Thus, the handoff overhead is an important factor that should be considered when making handoff decisions.

In this article, we investigate the inter-relay handoff problem for a network with highly mobile users, such as a network serving high-speed trains. The objective of the optimum handoff decisions for a user is to maximize its overall reward, which is defined as a function that incorporates both the link transmission rate and the handoff overhead; the link rate is in turn determined by the SNRs of the two-hop channels. Since the channel states are time-related random variables in wireless fading environments, a first-order finite-state Markov channel (FSMC) is used to model the wireless channels, based on which the state transition matrix of the SNR of the transmission channel is derived as a function of the moving speed of the user. The inter-relay handoff problem is then formulated as a semi-Markov decision process (SMDP) with the objective of maximizing the reward function.

The remainder of the article is organized as follows. We first describe the system model. The inter-relay handoff problem is then formulated as an SMDP and solved. Simulation results are presented next to demonstrate the performance of the proposed handoff decision policy, followed by the conclusions.

## System model

We consider the downlink transmissions in a cellular network, where the traffic source in each cell is the BS, which communicates with highly mobile users, such as high-speed trains, via a number of RSs. The RSs are placed in suitable locations where the BS-to-RS links are reliable, so that the RSs can correctly decode the signals received from the BS. We assume that there is no direct connection between the BS and the trains (for example, due to shadowing). That is, we only consider the two-hop links in this article, although our approach can be generalized to the scenario where either direct transmissions or relayed transmissions can be chosen for communications. This can be achieved by taking account of the channel state of the direct transmissions in the state transition probabilities of the SMDP-based handoff decision problem. Each train is equipped with a wireless controller, which collects the corresponding channel information and makes inter-relay handoff decisions by itself.

Due to the high moving speeds of the trains, inter-relay handoffs may occur frequently. The RSs before and after an inter-relay handoff may be associated with the same BS or with different BSs. The former is referred to as intra-cell inter-relay handoff, and the latter is referred to as inter-cell inter-relay handoff. In Figure 1, train 1 is performing an intra-cell inter-relay handoff when switching from RS 1a to RS 1b in cell 1, and train 2 is performing an inter-cell inter-relay handoff when moving from RS 1c in cell 1 to RS 2a in cell 2. When an inter-relay handoff occurs within the same cell, the BS has relatively simple control over the handoff process, since it communicates directly with both the RS before and the RS after the handoff. On the other hand, performing an inter-cell inter-relay handoff usually causes more signaling overhead, because it requires inter-BS signaling and BS-RS signaling in both cells.

Without loss of generality, we assume that all the nodes are equipped with single antennas and work in half-duplex mode. That is, they cannot receive and transmit at the same time. Channel time is divided into equal-length time slots, each of which carries one packet transmission. We consider the decode-and-forward (DF) relaying scheme, since it has advantages in digital processing and avoids noise amplification, compared to the amplify-and-forward relaying scheme. During the odd time slots, the source node transmits to the RS. The RS then decodes each received signal and forwards it to the user in the next time slot (i.e., an even time slot).

Let *s*, $r_n$, *d* denote the source node, the $n$th RS, and the destination node (train), respectively, where $n\in \mathcal{N}$, $\mathcal{N}=\{1,2,\dots ,N\}$, and *N* is the total number of available relay nodes within this network. We use $g_{ij}$ to denote the link gain between node *i* and node *j*, where the node pair *ij* can be $sr_n$ or $r_nd$. It is assumed that the link gains remain constant for at least one time slot. The average transmission rate is a good indicator of the channel efficiency in practical wireless systems without considering complex coding, detecting, and decoding procedures [6]. Our objective is to maximize the overall reward for each individual train, which takes into consideration the effect of moving speed on the channel SNR and the overhead for performing handoffs. Given the physical designs of the transceivers, such as modulation and coding schemes (MCSs), the transmission rate between two directly communicating nodes is a monotonically increasing function of the received SNR. In other words, once the received SNR is obtained, the MCS can be mapped, and then the transmission rate can be determined based on the corresponding MCS. When using adaptive modulation, the transmission rate between node *i* and node *j* can be represented as $\theta(\gamma_{ij})$, which is the function of the received SNR $\gamma_{ij}$. Accordingly, the transmission rate for DF with the relay node *n* is given in [7] as

where *W* is the bandwidth, ${\gamma}_{s{r}_{n}}$ and ${\gamma}_{{r}_{n}d}$ are the received SNRs at the $n$th RS and the destination node, respectively. The factor 1/2 comes from the fact that every source or relay node transmits for half of the time slots.
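As a rough illustration of the rate expression described above, the following Python sketch computes a two-hop DF rate as half the bandwidth times the rate of the weaker hop. The rate map `hop_rate` is assumed here to be the Shannon spectral efficiency $\log_2(1+\gamma)$; the text only requires it to be monotonically increasing in the SNR, and the exact form used in [7] may differ. All function names are hypothetical.

```python
import math


def hop_rate(snr: float) -> float:
    """Assumed rate map theta(snr): Shannon spectral efficiency in bit/s/Hz."""
    return math.log2(1.0 + snr)


def df_rate(bandwidth_hz: float, snr_sr: float, snr_rd: float) -> float:
    """Two-hop DF rate: the weaker hop limits the link, and the factor 1/2
    reflects that source and relay each transmit in half of the time slots."""
    return 0.5 * bandwidth_hz * min(hop_rate(snr_sr), hop_rate(snr_rd))


def best_relay(bandwidth_hz: float, snr_pairs) -> int:
    """Index n of the relay with the highest two-hop DF rate.

    snr_pairs[n] = (SNR of the s -> r_n hop, SNR of the r_n -> d hop),
    both on a linear scale.
    """
    return max(range(len(snr_pairs)),
               key=lambda n: df_rate(bandwidth_hz, *snr_pairs[n]))
```

Taking the minimum over the two hops is what makes a strong BS-to-RS link useless when the RS-to-train link is poor, which is why the RS-to-train channel drives the handoff decisions in the rest of the article.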

The FSMC models have been widely accepted in the literature as an effective approach to characterizing the correlation structure of the fading process [8]. In the FSMC, the channel state is characterized via the received SNR. The range of the average SNR of a received packet is partitioned (quantized) into *K* levels, each of which is associated with a state of a Markov chain, which has a finite state space denoted as $\mathcal{S}=\{{\mathcal{S}}_{1},{\mathcal{S}}_{2},\dots ,{\mathcal{S}}_{K}\}$. The changes of the channel states follow a Markov process. Let $\Gamma=\{\Gamma_0,\Gamma_1,\dots,\Gamma_K\}$ be the received SNR thresholds in increasing order, where $\Gamma_0=0$ and $\Gamma_K=\infty$. The channel is in state *k* if the received SNR *γ* of a packet lies in the range $[\Gamma_{k-1},\Gamma_k)$. We assume that a one-step transition in the model corresponds to the channel state transition after one time slot. Let $p_{i,j}$ denote the probability that *γ* moves from state *i* to state *j*, i.e., $p_{i,j}=\Pr(\gamma(t+1)=j\mid\gamma(t)=i)$, where $i,j\in \mathcal{S}$. The $K\times K$ channel state transition probability matrix is defined as $P=[p_{i,j}]_{K\times K}$. Based on the SNR thresholds, the transition probability matrix can be obtained [8].
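The quantization step of the FSMC can be sketched in a few lines of Python. The function name `snr_to_state` and the threshold values in the usage lines are hypothetical; the only assumption taken from the text is that state *k* covers the half-open interval from the (*k*−1)th to the *k*th threshold.

```python
import math
from bisect import bisect_right


def snr_to_state(snr: float, thresholds) -> int:
    """Map a received SNR to its FSMC state index k in 1..K.

    thresholds = [G_0, G_1, ..., G_K], increasing, with G_0 = 0 and
    G_K = infinity. State k covers the interval [G_{k-1}, G_k).
    """
    if snr < thresholds[0] or math.isinf(snr):
        raise ValueError("SNR must be finite and not below the first threshold")
    # bisect_right counts the thresholds that are <= snr, which is exactly k.
    return bisect_right(thresholds, snr)


# Hypothetical thresholds for a three-state channel (linear SNR scale).
example_thresholds = [0.0, 1.0, 4.0, math.inf]
example_states = [snr_to_state(x, example_thresholds) for x in (0.5, 1.0, 10.0)]
```

A value that sits exactly on a threshold is assigned to the higher state, which matches the closed lower end of each interval.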

## SMDP-based handoff decisions

In this section, we first derive the channel model of a single hop transmission from the RS to the train, then derive the SMDP model for the two-hop transmissions for making handoff decisions, and finally discuss the implementation of the inter-relay handoffs based on the derived model.

### Channel model for RS-to-train transmissions

The moving speed of the train may change from time to time, while the channel state transition probability is usually obtained at a fixed moving speed, assuming that the state transition probability is the same at different speeds. The inter-channel interference (ICI) cannot be ignored when characterizing the SNRs if users are moving at high speeds, since high moving speeds lead to large ICI, which degrades the SNR of the received signal. The SNR of the RS-to-train link^{a} can be expressed by

where *E*_{rd} is the transmit power of the RS, *g*_{rd} represents the link gain between the RS and the train, *σ*^{2} is the noise power, and ICI is the average power of the ICI at a given moving speed. Because the exact expression of the ICI term is complicated and provides little insight, we utilize the tight upper bound on the ICI power, which is derived in [9] as

where $f_d$ is the maximum Doppler frequency, and $T_s$ is the duration of one symbol. Recall the expression of the maximum Doppler frequency,

$$f_d=\frac{v f_c}{c}, \tag{4}$$

where *v* is the moving speed of the train, $f_c$ is the carrier frequency, and *c* is the speed of light. Combining the above equations, we have

For mathematical tractability, we assume that the noise power is much less than the ICI power, i.e., *σ*^{2} ≪ ICI. With this, (5) becomes

Different moving speeds of a train affect the SNR and thus change the state transition probabilities. The aim of this section is to find the relationship between the state transition probabilities at different moving speeds. Suppose that the moving speed of the train changes from *v*_{0} to *v*_{1}; we derive the corresponding channel state (SNR) transition probabilities. Define ${\gamma}_{\text{rd}}^{0}$ and ${\gamma}_{\text{rd}}^{1}$ as the received SNRs at the train when the moving speeds are *v*_{0} and *v*_{1}, respectively. We can easily obtain the expressions of the two SNRs from (6), and then get

Define ${p}_{{k}_{1},{k}_{2}}$ as the transition probability of the SNR from state *k*_{1} to state *k*_{2}. We assume that the channel can only stay in the same state or move to one of the two neighboring states. From [8], the state transition probabilities can be approximated as

where $N(\Gamma_{k+1})$ and $N(\Gamma_k)$ are the level crossing rates at the thresholds $\Gamma_{k+1}$ and $\Gamma_k$, respectively, and $\Pi_k$ is the steady-state probability of state *k*. It is shown in [10] that the crossing rate at level *γ* for the SNR process, in the positive direction only (or in the negative direction only), is given by

In practice, the channel condition of a moving train is subject to both path loss and random channel fading. Under Rayleigh fading, the received SNR of the train at the moving speed *v*_{1} is exponentially distributed, with the probability density function given by

Accordingly, the probability that the SNR is in state *k* can be obtained as

According to (7)–(12), the relationship between the state transition probabilities of the train at speeds *v*_{0} and *v*_{1} can be derived as

where ${p}_{k,k+1}^{1}$ and ${p}_{k,k-1}^{1}$, respectively, are the SNR state transition probabilities of the train from state *k* to states *k* + 1 and *k*−1 at the moving speed *v*_{1}, and ${p}_{k,k+1}^{0}$ and ${p}_{k,k-1}^{0}$, respectively, are the SNR state transition probabilities of the train from state *k* to states *k* + 1 and *k*−1 at the moving speed *v*_{0}. The factors *φ*^{+} and *φ*^{−} can be derived as

and $m={v}_{0}^{2}/{v}_{1}^{2}$. Detailed derivations of *φ*^{+} and *φ*^{−} are given in the Appendix.
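The channel model of this subsection can be sketched numerically as follows. This is a minimal sketch under stated assumptions, not the article's derivation: it uses the level crossing rate and steady-state expressions commonly associated with Rayleigh-fading FSMC models (assumed to match those of [8, 10]), and it assumes that in the ICI-dominated regime the mean SNR is inversely proportional to the square of the speed, so that it scales by $m=v_0^2/v_1^2$. The thresholds, carrier frequency, and slot length are hypothetical, and states are indexed from 0 in the code.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s


def doppler_hz(speed_mps, carrier_hz):
    """Maximum Doppler frequency f_d = v * f_c / c."""
    return speed_mps * carrier_hz / C_LIGHT


def scaled_mean_snr(mean_snr_v0, v0, v1):
    """Mean SNR at speed v1, assuming SNR is proportional to 1 / v**2
    (ICI-dominated regime), i.e. scaled by m = v0**2 / v1**2."""
    return mean_snr_v0 * (v0 / v1) ** 2


def steady_state(thresholds, mean_snr):
    """Probability of each state for an exponentially distributed SNR."""
    return [math.exp(-lo / mean_snr) - math.exp(-hi / mean_snr)
            for lo, hi in zip(thresholds[:-1], thresholds[1:])]


def crossing_rate(level, mean_snr, f_d):
    """Assumed Rayleigh-fading level crossing rate of the SNR at `level`."""
    if math.isinf(level):
        return 0.0
    x = level / mean_snr
    return math.sqrt(2.0 * math.pi * x) * f_d * math.exp(-x)


def fsmc_matrix(thresholds, mean_snr, f_d, slot_s):
    """Tridiagonal transition matrix; state k (0-based) covers
    [thresholds[k], thresholds[k + 1]). Valid only while f_d * slot_s is
    small enough that every row stays a probability distribution."""
    pi = steady_state(thresholds, mean_snr)
    size = len(pi)
    matrix = [[0.0] * size for _ in range(size)]
    for k in range(size):
        if k + 1 < size:  # move up across the upper threshold of state k
            matrix[k][k + 1] = (crossing_rate(thresholds[k + 1], mean_snr, f_d)
                                * slot_s / pi[k])
        if k > 0:         # move down across the lower threshold of state k
            matrix[k][k - 1] = (crossing_rate(thresholds[k], mean_snr, f_d)
                                * slot_s / pi[k])
        matrix[k][k] = 1.0 - sum(matrix[k])  # remaining mass stays in state k
    return matrix


# Hypothetical setup: normalized mean SNR of 1 at v0, three states.
thresholds = [0.0, 0.5, 2.0, math.inf]
v0, v1 = 100.0, 150.0                 # speeds in m/s
carrier_hz, slot_s = 2.0e9, 1.0e-4    # assumed carrier frequency and slot length
p_v0 = fsmc_matrix(thresholds, 1.0, doppler_hz(v0, carrier_hz), slot_s)
p_v1 = fsmc_matrix(thresholds, scaled_mean_snr(1.0, v0, v1),
                   doppler_hz(v1, carrier_hz), slot_s)
```

In this toy setup, raising the speed both lowers the mean SNR and raises the Doppler frequency, so the probability of dropping out of the best state in one slot grows, which is the qualitative effect that the factors *φ*^{+} and *φ*^{−} capture analytically.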

### Formulation as an SMDP

In this section, we consider the two-hop transmissions from the BS to the train. The aforementioned inter-relay handoff problem can be formulated as an SMDP [11], which is a generalization of an MDP by allowing a decision maker to choose actions whenever the system state changes and allowing the time spent in a particular state to follow an arbitrary probability distribution. A standard form of an SMDP model consists of the following five elements: (1) decision epochs; (2) states; (3) actions; (4) rewards; and (5) state transition probabilities. The decision epochs are the time instants for the wireless controller on the train to make handoff decisions. Other elements of the SMDP are introduced below.

Let *C* be the channel state space. For the two-hop transmissions with multiple RS choices, a composite state *c*(*t*) ∈ *C* at time slot *t* is given as

where *γ*_{sr}(*t*) and *γ*_{rd}(*t*), respectively, are the SNRs of the BS-to-RS link and the RS-to-train link at time slot *t*, and *r* and *r*^{′}, respectively, are the current RS and the RS that the train may handoff to. The state transition probability function from the current state *c*(*t*) to the next state *c*(*t* + 1) is denoted *p*_{c(t),c(t + 1)}, which is given by

where ${p}_{{\gamma}_{\text{sr}}\left(t\right),{\gamma}_{\text{sr}}(t+1)}$ and ${p}_{{\gamma}_{\text{rd}}\left(t\right),{\gamma}_{\text{rd}}(t+1)}$ denote state transition probabilities of the BS-to-RS and RS-to-train links, respectively.

Let *A* be the action space. A decision rule prescribes a procedure for the action selection in each state at a specified decision epoch. Markov decision rules are functions *δ*(*t*):*C*→*A* that specify the action choice *a*(*t*)∈*A* when the system occupies state *c*(*t*)∈*C* at decision epoch *t*. A policy *τ*=(*δ*(1),*δ*(2),…,*δ*(*t*)) is a sequence of decision rules that will be used at all decision epochs. Let *ζ*^{τ}(*c*(0)) denote the expected total rewards from the first decision epoch until the time of interest, given that the policy *τ* is used with an initial composite state *c*(0). Given that the time periods between successive decision epochs are geometrically distributed with mean 1/(1−*λ*), we have the expected value of the total reward

when the policy *τ* is used with an initial state *c*(0), where *R*(*c*(*t*),*a*(*t*)) is the reward function, and 0≤*λ*<1 can be interpreted as a discount factor. The aforementioned model is an infinite-horizon discounted Markov decision model.
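The role of the discount factor can be illustrated with a short Python sketch. It assumes the common convention in which the reward at epoch *t* is weighted by *λ*^{t} starting from *t* = 0; the article's own indexing in (18) may differ by one. The function names are hypothetical.

```python
def discounted_total(rewards, lam):
    """Discounted sum of a finite reward sequence, weighting epoch t by lam**t."""
    return sum((lam ** t) * r for t, r in enumerate(rewards))


def mean_horizon(lam):
    """Mean of the geometric time between decision epochs, 1 / (1 - lam)."""
    if not 0.0 <= lam < 1.0:
        raise ValueError("the discount factor must satisfy 0 <= lam < 1")
    return 1.0 / (1.0 - lam)
```

With the default *λ* = 0.9 used later in the simulations, `mean_horizon` gives a value of about 10, so rewards more than a few tens of epochs ahead contribute little to the total.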

Our optimization problem is to maximize the expected total discounted reward. We define a policy *τ*^{∗} to be optimal if ${\zeta}^{{\tau}^{\ast}}\ge {\zeta}^{\tau}$ for all *τ*. A policy *τ*=(*δ*(1),*δ*(2),…) is said to be stationary if *δ*(*t*)=*δ* for all *t*, so a stationary policy has the form *τ*=(*δ*,*δ*,…). According to [11], an optimal stationary policy exists for an infinite-horizon discounted Markov decision model, so the search can be restricted to stationary policies. Our objective is then to determine an optimal stationary policy *δ*=*δ*^{∗}, which maximizes the expected total discounted reward given by (18).

In our SMDP model, at each decision epoch, the wireless controller on the train first has to decide whether the connection of the train should use the current chosen RS or connect to a different RS. We assume that the wireless controller on the train is in the coverage of no more than two RSs because of the deployment cost. The current composite action *a*(*t*)∈*A* is denoted by

where ${\mathfrak{a}}_{\text{intra}}\left(t\right)$ is the handoff decision between different RSs in the same cell, ${\mathfrak{a}}_{\text{inter}}\left(t\right)$ is the handoff decision between different RSs, which are located in two different cells. When the incoming handoff occurs between different RSs in the same cell, ${\mathfrak{a}}_{\text{intra}}\left(t\right)=1$; when the train decides to keep the connection with the same RS, ${\mathfrak{a}}_{\text{intra}}\left(t\right)=0$. When the incoming handoff occurs between two RSs in different cells, ${\mathfrak{a}}_{\text{inter}}\left(t\right)=1$; when the train stays in the coverage of the current BS, ${\mathfrak{a}}_{\text{inter}}\left(t\right)=0$.

In the formulation of an SMDP, the system reward is the optimization objective. Due to the high moving speeds of trains, inter-relay handoffs may occur frequently, which accordingly may lead to large handoff overheads. The objective of the proposed handoff scheme is to maximize the overall reward for a given user by taking into consideration both the transmission rate and the overhead for performing the inter-relay handoffs. The reward function is defined as

where *O*_{intra} and *O*_{inter} are the handoff overheads for the intra-cell and inter-cell inter-relay handoffs, respectively. Intuitively, having a higher transmission rate increases the reward, while higher overhead for performing the handoff reduces the reward. The exact values of the overheads for performing intra-cell and inter-cell inter-relay handoffs may depend on the specific system, protocol implementations, and the standards [12–14]. The relationship between the handoff overheads and some system parameters, such as link throughput, handoff delay, data packet arrival rate, etc., can be found in [12, 15, 16].
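One simple reward with the qualitative behavior described above is sketched below in Python. The additive form (rate minus the overhead of whichever handoff is performed) is an assumption made purely for illustration and is not claimed to be the article's reward function. The default overheads of 0 and 0.1 are the values used later in the simulation section; the function name is hypothetical.

```python
def handoff_reward(rate, a_intra, a_inter, o_intra=0.0, o_inter=0.1):
    """Assumed additive reward: link rate minus the overhead of the handoff taken.

    a_intra and a_inter are 0/1 handoff indicators. At most one of them
    may be 1, since only one handoff type can occur at a decision epoch.
    """
    if a_intra not in (0, 1) or a_inter not in (0, 1):
        raise ValueError("handoff indicators must be 0 or 1")
    if a_intra and a_inter:
        raise ValueError("intra-cell and inter-cell handoffs are mutually exclusive")
    return rate - a_intra * o_intra - a_inter * o_inter
```

Under this form, an inter-cell handoff pays off only if the rate gain it brings over the following epochs outweighs the one-time overhead, which is the trade-off the SMDP resolves.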

The optimality equation for an SMDP is given by [11]

where *ζ*(*c*) denotes the maximum expected total reward, given the initial composite state *c*, and the next composite state *c*^{′}, i.e.,

The solutions to the optimization problem correspond to the maximum expected total reward *ζ*(*c*) and the SMDP optimal policy *δ*^{∗}. Note that the SMDP optimal policy *δ*^{∗}indicates the decision regarding which action to choose. Various algorithms are available to solve the optimization problem [11]. We use the value iteration algorithm in this article to determine a stationary *є*-optimal policy and the corresponding expected total reward. The value iteration algorithm is described as follows, where *ζ*^{k}(*c*) represents the maximum expected total reward at iteration *k*.

(1) Set *k*=0 and *ζ*^{0}(*c*)=0 for each composite state *c*. Specify *є*>0.

(2) For each state *c*, compute *ζ*^{k + 1}(*c*) by

in iteration *k*.

(3) If ∥*ζ*^{k + 1}(*c*)−*ζ*^{k}(*c*)∥<*є*(1−*λ*)/(2*λ*), go to step 4. Otherwise, increase *k* by one, and return to step 2.

(4) For each *c*∈*C*, compute the stationary *є*-optimal policy as

The value iteration algorithm has been proved to be efficient and stable [11]. The algorithm operates by calculating successive approximations to the optimal value function *ζ*(*c*) in (22). The computational complexity of each iteration is *O*(|*A*||*C*|^{2}).
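The four steps above can be sketched in Python as follows. The sketch assumes that the optimality equation has the standard discounted form, in which the value of a state is the maximum over actions of the immediate reward plus *λ* times the expected value of the next state. The callables `reward` and `trans`, and the two-state toy problem used to exercise them, are hypothetical and are not taken from the article.

```python
def value_iteration(states, actions, reward, trans, lam=0.9, eps=1e-3):
    """Return (zeta, policy) for a discounted decision model with 0 < lam < 1.

    reward(c, a) -> immediate reward R(c, a)
    trans(c, a)  -> dict mapping each next state c2 to its probability
    """
    def q_value(c, a, values):
        return reward(c, a) + lam * sum(p * values[c2]
                                        for c2, p in trans(c, a).items())

    zeta = {c: 0.0 for c in states}                      # step 1
    stop = eps * (1.0 - lam) / (2.0 * lam)
    while True:
        new = {c: max(q_value(c, a, zeta) for a in actions)
               for c in states}                          # step 2
        gap = max(abs(new[c] - zeta[c]) for c in states)  # sup norm
        zeta = new
        if gap < stop:                                   # step 3
            break
    policy = {c: max(actions, key=lambda a: q_value(c, a, zeta))
              for c in states}                           # step 4
    return zeta, policy


# Toy problem: state 0 is a poor link (rate 1), state 1 a good link (rate 4).
# "switch" always lands in state 1 and costs an overhead of 0.1.
RATE = {0: 1.0, 1: 4.0}


def toy_reward(c, a):
    return RATE[c] - (0.1 if a == "switch" else 0.0)


def toy_trans(c, a):
    return {1: 1.0} if a == "switch" else {c: 1.0}


zeta, policy = value_iteration([0, 1], ["stay", "switch"], toy_reward, toy_trans)
```

In the toy problem the computed policy switches away from the poor link and stays on the good one, because the one-time overhead is small compared with the discounted rate gain. Each sweep evaluates every action in every state against every successor state, which is where the |*A*||*C*|^{2} cost per iteration comes from.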

### Implementation issues

In this section, we briefly describe the implementation of our proposed handoff decisions.

To determine the optimal policy *δ*^{∗}, we need to obtain the parameters in the SMDP model. We assume that the initial composite state transition probability in the network can be obtained, such as from field tests for a train at a certain moving speed. The values of the handoff overhead for the intra-cell and inter-cell inter-relay handoffs can be specified according to the communication standard. Since the time periods between successive time epochs are geometrically distributed with mean 1/(1−*λ*), the discount factor *λ* can be determined by the time of interest in our SMDP model.

Given the values of all the parameters, the value iteration algorithm described in the previous subsection can be used to derive the optimal policy *δ*^{∗}. Once the optimal policy is obtained, the optimal action can be determined given the current state. At each decision epoch, the train looks up the obtained policy to find the optimal action that corresponds to its current SNR state and moving speed, and then executes the optimal decision. In this way, the optimal handoff decision can be made by considering the handoff overhead and the moving speed.

## Simulation results

In this section, we present numerical results to demonstrate the performance of the proposed solution for making inter-relay handoff decisions in a relay-assisted cellular network. We use *M*-ary quadrature amplitude modulation (MQAM) and binary phase shift keying (BPSK) with modulation levels {2,4,…} as the available modulation schemes. We assume that the state of the BS-to-RS channel is good enough for 16QAM, since the RSs are deployed in advantageous geographic locations. The modulation scheme of the RS-to-train channel can be BPSK, QAM, or 16QAM, depending on the channel state. The corresponding transmission rates of BPSK, QAM, and 16QAM are 1, 2, and 4, respectively. We set the default discount factor as *λ*=0.9. The duration of one symbol is 0.1 ms. There are three states for the RS-to-train channel. The simulation results are obtained when the train is moving at speed *v*_{1}, which is varied in the simulation.

We first investigate the effectiveness of the proposed method. For the channel states, we allow transitions from a given state to its two adjacent states only [8]. We assume that the RSs are fixed and that the BS-to-RS links are good enough to ensure the transmission rate. That is, the BS-to-RS channel stays in a good state or moves to a good state with a high probability. An example of the state-transition probability matrix for the BS-to-RS channel is given below,

where ${P}_{\text{sr}}={\left[{p}_{{\gamma}_{\text{sr}}\left(t\right),{\gamma}_{\text{sr}}(t+1)}\right]}_{3\times 3}$ and ${P}_{{\text{sr}}^{\prime}}={\left[{p}_{{\gamma}_{{\text{sr}}^{\prime}}\left(t\right),{\gamma}_{{\text{sr}}^{\prime}}(t+1)}\right]}_{3\times 3}$. Depending on the selected SNR thresholds, the state-transition probability matrices of RS-to-train channels, denoted as ${P}_{\text{rd}}={\left[{p}_{{\gamma}_{\text{rd}}\left(t\right),{\gamma}_{\text{rd}}(t+1)}\right]}_{3\times 3}$ and ${P}_{{\text{r}}^{\prime}\text{d}}={\left[{p}_{{\gamma}_{{\text{r}}^{\prime}\text{d}}\left(t\right),{\gamma}_{{\text{r}}^{\prime}\text{d}}(t+1)}\right]}_{3\times 3}$, are changed accordingly. Given the states of the current RS and potential next RS, the overall state transition probability matrices can be easily obtained through Kronecker tensor product, i.e., $\mathcal{P}={P}_{\text{sr}}\otimes {P}_{{\text{sr}}^{\prime}}\otimes {P}_{\text{rd}}\otimes {P}_{{\text{r}}^{\prime}\text{d}}$. Note that only one type of the inter-relay handoff, either intra-cell or inter-cell handoff, may occur for a given user at any decision epoch. For illustration purposes, we consider that the intra-cell inter-relay handoff introduces little overhead, which is set to zero, and the overhead for performing the inter-cell inter-relay handoff is set to 0.1 as the default value. The exact values of these parameters may be different in different systems, depending on specific implementations. We conduct the Monte Carlo simulations over a large number of trials, and the state-transition probability matrices of RS-to-train channels are chosen randomly for each trial based on different SNR thresholds. All the SNR values are normalized to ${\gamma}_{\text{rd}}^{0}$, which has a normalized value of 1.
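The Kronecker construction of the composite transition matrix can be sketched in plain Python as follows. The matrix entries below are hypothetical placeholders, not the values used in the article; they only respect the stated structure (three states per link, transitions to adjacent states only, and BS-to-RS links that favor the good state). The function names are likewise hypothetical.

```python
from functools import reduce


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def composite_matrix(*link_matrices):
    """Joint transition matrix of independent links, in the order given."""
    return reduce(kron, link_matrices)


# Hypothetical BS-to-RS matrices: the channel drifts toward the good state.
P_SR = [[0.10, 0.90, 0.00],
        [0.05, 0.15, 0.80],
        [0.00, 0.10, 0.90]]
P_SR_PRIME = P_SR

# Hypothetical RS-to-train matrices for the current and the candidate RS.
P_RD = [[0.70, 0.30, 0.00],
        [0.20, 0.60, 0.20],
        [0.00, 0.30, 0.70]]
P_RD_PRIME = [[0.60, 0.40, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.40, 0.60]]

P_COMPOSITE = composite_matrix(P_SR, P_SR_PRIME, P_RD, P_RD_PRIME)
```

With three states on each of the four links, the composite chain has 81 states, and each of its rows is again a probability distribution because the product of row-stochastic matrices under the Kronecker product is row-stochastic.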

In order to better demonstrate the performance of the proposed inter-relay handoff method, we also simulated three other methods for making the inter-relay handoff decisions for comparisons. As the proposed method takes account of both moving speed and handoff overhead in the SMDP model, it is referred to as *overhead & speed* below. In contrast, we also consider two other methods, one considers the moving speed but not the handoff overhead, which is referred to as *speed*; and the other one considers the handoff overhead but not the moving speed in the SMDP formulation, which is referred to as *overhead*. In addition, we also simulated a “*conventional*” method, in which case the RS with the best instantaneous channel state is chosen for the train at each time instant. Figures 2, 3, 4 and 5 show the reward performance versus the moving speed and handoff overhead, and Figures 6, 7, 8 and 9 show the average transmission rate of the link versus the moving speed and handoff overhead.

Figures 2 and 3 show the reward performance for different values of *v*_{1}, where *v*_{0}=100 m/s in Figure 2 and *v*_{0}=50 m/s in Figure 3. It is observed that the reward performance of the proposed *overhead & speed* method is better than all the other methods, which indicates the effectiveness of the proposed scheme in improving the reward and the importance of considering both the moving speed and the handoff overhead in making the inter-relay handoff decisions. It is seen that the reward performance of the *overhead* method varies significantly with the moving speed. When the actual moving speed *v*_{1} is close to the original speed, for example, around 100 m/s in Figure 2 and 50 m/s in Figure 3, the difference between the proposed method and the *overhead* method is the minimum. In this case, the effect of the moving speed changes on the reward performance is much less than that of the handoff overhead. As the difference between the actual moving speed (*v*_{1}) and the original moving speed (*v*_{0}) increases, i.e., *v*_{1} either much smaller or much larger than *v*_{0}, the gap between the *overhead* method and the proposed method increases, because of the increasing effect of the moving speed on the SNR, which eventually affects the transmission rate and the reward function. By taking into consideration the moving speed changes in deriving the channel state transition matrix and making the handoff decisions, the reward performance of both the proposed and the *speed* methods keeps almost constant when the train is moving at different speeds. Although the reward performance of the *conventional* method also keeps unchanged with the moving speed, its reward function is lower than both the proposed one and the *speed* one, because it does not consider the effects of the moving speed and handoff overhead on the reward. The effect of the moving speed on the reward function in the *conventional* method can be complicated. 
On one hand, using this handoff method allows the mobile user to always connect to the best RS, which provides it with the highest transmission rate and thus increases the reward; on the other hand, this increases the number of handoffs and reduces the reward. Overall, as we can observe from the figures, the reward is not very sensitive to the moving speed.

Figures 4 and 5 show the reward performance versus the inter-cell inter-relay handoff overhead for *v*_{0}=100 m/s and 50 m/s, respectively. These figures also show that the proposed handoff method achieves the highest reward over a wide range of the handoff overhead values. The reward performance of all four handoff methods degrades as the handoff overhead increases. That is, high overhead can discourage the user from switching to the RS with better channel quality. When the overhead is relatively low, the proposed method achieves a much higher reward than the other three methods, because it takes into account the effects of both the overhead and the moving speed on the reward function. It is also noticed that the *conventional* method achieves a better reward than the *overhead* method when the overhead is relatively low; as the overhead for performing the handoffs increases, the proposed method and the *overhead* method outperform the *conventional* one, because the effect of the overhead becomes increasingly large. Without considering the handoff overhead and the moving speed, using the *conventional* method to make inter-relay handoff decisions results in the worst reward performance among all four methods.

Next, we examine the average rate performance as the moving speed and handoff overhead change. Figures 6 and 7 show the average rate versus the actual moving speed when *v*_{0}=100 m/s and *v*_{0}=50 m/s, respectively. The simulation setting is the same as that in Figures 2 and 3. By always connecting to the RS with the best SNR, the *conventional* method achieves the highest transmission rate. Both Figures 6 and 7 show that the proposed method achieves almost the same average rate as the *conventional* one, indicating that the proposed method can achieve the highest reward without sacrificing much transmission rate. The *speed* method also achieves almost as good a transmission rate as the proposed one, but at the price of reduced reward performance. Without considering the moving speed changes, the average rate achieved by the *overhead* method can change significantly with the actual moving speed, and it is much worse than the average rates achieved by the other handoff methods when the actual moving speed is much larger or smaller than the original speed.

Figures 8 and 9 show the average rate performance versus the inter-cell inter-relay handoff overhead for *v*_{0}=100 m/s and 50 m/s, respectively. The simulation setting is the same as that in Figures 4 and 5. The average rates using both the *conventional* and the *speed* methods do not change with the handoff overhead, as the handoff decisions in both these methods are independent of the overhead. The proposed method achieves approximately the same average rate as the *conventional* method, except when the overhead for handoff is very high, in which case the mobile user may decide not to handoff to a better RS in order to achieve higher reward. The *overhead* method does not consider the effect of the moving speed and achieves the lowest average rate among all the handoff methods.

## Conclusions

In this article, we have studied the problem of inter-relay handoff in a relay-assisted network. The problem is formulated as an SMDP, which considers both the effect of the moving speed on the channel SNR and the overhead for performing handoffs. Our results indicate that the proposed method achieves a much higher reward than handoff methods that consider only the handoff overhead, only the moving speed, or only the channel quality (the conventional method). Furthermore, the proposed handoff method achieves nearly as high an average transmission rate as the conventional handoff method, which always connects the mobile user to the best RS. In addition, numerical results indicate that the proposed method can be applied to networks whose users have a wide range of moving speeds. When the overhead for performing the handoff is very high, the proposed method can be simplified to the *overhead* method without sacrificing much reward performance.

## Appendix: Derivation of the channel state transition probability when the speed changes

Usually, the SNR state transition probability at an original speed *v*_{0} can be measured and can therefore be regarded as known. To derive the SNR state transition probability when the speed changes, we need to find the relationship between the SNR state transition probabilities at the actual speed *v*_{1} and at the original speed *v*_{0}. Based on (8) and (9), the SNR state transition probabilities at the speed *v*_{1} can be obtained as

Substituting (10), (11), and (12) into (25) and (26), we get

According to the expression of the maximum Doppler frequency (4), the SNR state transition probabilities at the actual speed *v*_{1} can be further reformulated as

where $m={v}_{0}^{2}/{v}_{1}^{2}$.
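As a numerical aid, the sketch below evaluates the two speed-dependent quantities used in this step. It assumes that (4) takes the standard form of the maximum Doppler frequency, f_d = v·f_c/c, and it uses a hypothetical carrier frequency of 2.4 GHz; the function names `max_doppler_hz` and `speed_ratio_m` are likewise illustrative and not from the article.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def max_doppler_hz(speed_mps, carrier_hz):
    """Maximum Doppler frequency f_d = v * f_c / c (standard form, assumed for (4))."""
    return speed_mps * carrier_hz / SPEED_OF_LIGHT


def speed_ratio_m(v0_mps, v1_mps):
    """The ratio m = v0^2 / v1^2 defined in the text."""
    return (v0_mps / v1_mps) ** 2


CARRIER_HZ = 2.4e9  # hypothetical carrier frequency, chosen only for illustration

fd_original = max_doppler_hz(100.0, CARRIER_HZ)  # original speed v0 = 100 m/s
fd_actual = max_doppler_hz(50.0, CARRIER_HZ)     # actual speed v1 = 50 m/s
m = speed_ratio_m(100.0, 50.0)
```

Under these assumptions, halving the speed halves the maximum Doppler frequency and gives m = 4.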

Since the SNR state transition probability of the train at speed *v*_{0} is known, we need to express (29) and (30) as functions of the SNR state transition probability at the original speed. In a similar way to the derivation of (29) and (30), the channel state transition probabilities at the speed *v*_{0} can be obtained as

Using the expressions of the SNR state transition probabilities at the original and actual speeds, the SNR state transition probabilities at the actual speed *v*_{1} can be calculated as

where

## Endnote

^{a}Here we only consider the RS-to-train link, because the RS is assumed to be at a fixed location, so that the BS-to-RS link does not change much.

## References

1. Pabst R, Walke B, Schultz D, Herhold P, Yanikomeroglu H, Mukherjee S, Viswanathan H, Lott M, Zirwas W, Dohler M, Aghvami H, Falconer D, Fettweis G: Relay-based deployment concepts for wireless and mobile broadband radio. *IEEE Commun. Mag.* 2004, 42: 80-89.
2. Ni W, Shen G, Jin S, Fahldieck T, Muenzner R: Cooperative relay in IEEE 802.16j MMR. IEEE C802.16j-06_006r1. Technical Report, Alcatel, Shanghai, China; 2006. http://www.ieee802.org/16/relay/contrib/C80216j-06_006.pdf
3. Nourizadeh H, Nourizadeh S, Tafazolli R: Impact of the inter-relay handoff on the relaying system performance. In *IEEE 64th Vehicular Technology Conference*. Montreal, QC, Canada; 2006:2529-2533.
4. Yu F, Krishnamurthy V: Optimal joint session admission control in integrated WLAN and CDMA cellular networks with vertical handoff. *IEEE Trans. Mobile Comput.* 2007, 6: 126-139.
5. Chang BJ, Chen JF, Hsieh CH, Liang YH: Markov decision process-based adaptive vertical handoff with RSS prediction in heterogeneous wireless networks. In *IEEE Wireless Communications and Networking Conference*. Budapest, Hungary; 2009:1-6.
6. Zhang P, Xu Z, Wang F, Xie X, Tu L: A relay assignment algorithm with interference mitigation for cooperative communication. In *IEEE Wireless Communications and Networking Conference*. Budapest, Hungary; 2009:1-6.
7. Laneman JN, Tse DNC, Wornell GW: Cooperative diversity in wireless networks: efficient protocols and outage behavior. *IEEE Trans. Inf. Theory* 2004, 50(12): 3062-3080.
8. Zhang Q, Kassam SA: Finite-state Markov model for Rayleigh fading channels. *IEEE Trans. Commun.* 1999, 47: 1688-1692.
9. Li Y, Cimini LJ: Bounds on the interchannel interference of OFDM in time-varying impairments. *IEEE Trans. Commun.* 2001, 49: 401-404. 10.1109/26.911445
10. Yacoub MD: *Foundations of Mobile Radio Engineering*. CRC, Boca Raton, FL; 1993.
11. Puterman M: *Markov Decision Processes: Discrete Stochastic Dynamic Programming*. Wiley, New York; 1994.
12. Pathak A, Srivatsa AM, Xie J: An analytical model for handoff overhead analysis in Internet-based infrastructure wireless mesh networks. In *IEEE International Conference on Communications*. Beijing, China; 2008:2884-2888.
13. Tseng C, Chi K, Hsieh M, Chang H: Location-based fast handoff for 802.11 networks. *IEEE Commun. Lett.* 2005, 9: 304-306.
14. 3GPP TS 36.331 V10.5.0: Evolved Universal Terrestrial Radio Access (E-UTRA); Radio Resource Control (RRC); Protocol specification, Technical Report. 2009. http://www.3gpp.org/ftp/tsg_sa/WG5_TM/TSGS5_73/_specs_for_checking/32762-a10.doc
15. Chang B, Lin S: Mobile IPv6-based efficient vertical handoff approach for heterogeneous wireless networks. *Wirel. Commun. Mobile Comput.* 2006, 6: 691-709. 10.1002/wcm.418
16. Dimou K, Wang M, Yang Y, Kazmi M, Larmo A, Pettersson J, Muller W, Timner Y: Handover within 3GPP LTE: design principles and performance. In *IEEE 70th Vehicular Technology Conference Fall*. Anchorage, USA; 2009:1411-1415.

## Acknowledgements

This work was supported by the Key Project of State Key Lab. of Rail Traffic and Control under Grant number RCS2012ZZ004 and the Fundamental Research Funds for the Central Universities under Grant number 2013YJS025.

## Additional information

### Competing interests

The authors declare that they have no competing interests.

## Rights and permissions

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## About this article

### Keywords

- Reward Function
- State Transition Probability
- Total Reward
- Decision Epoch
- Handoff Decision