2.1. Average Smoothed Flow Utility
A wireless communication link supports $n$ data flows in a channel that varies with time, which we model using discrete-time intervals $t = 0, 1, 2, \ldots$. We let $f_t \in \mathbf{R}_+^n$ be the data flow rate vector on the link, where $(f_t)_i$, $i = 1, \ldots, n$, is the $i$th flow's data rate at time $t$, and $\mathbf{R}_+$ denotes the set of nonnegative numbers. We let $F_t = \mathbf{1}^T f_t$ denote the total flow rate over all flows, where $\mathbf{1}$ is the vector with all entries one. The flows, and the total flow rate, will depend on the random channel gain (through the flow policy, described below) and so are random variables.
We will work with a smoothed version of the flow rates, which is meant to capture the tolerance of the applications using the data flows to time variations in data rate. This was introduced in [12] using delivery contracts, in which the utility is a function of the total flow over a given time interval; here, we use instead a very simple first-order linear smoothing. At each time $t$, the smoothed data flow rate vector $s_t \in \mathbf{R}_+^n$ is given by
\[
s_t = \Theta s_{t-1} + (I - \Theta) f_t,
\]
where $\Theta = \mathop{\bf diag}(\theta_1, \ldots, \theta_n)$, with $\theta_i \in [0,1)$, $i = 1, \ldots, n$, the smoothing parameter for the $i$th flow, and we take $s_{-1} = 0$. Thus, we have
\[
s_t = \sum_{\tau=0}^{t} \Theta^{t-\tau} (I - \Theta) f_\tau,
\]
where at time $t$, each smoothed flow rate $(s_t)_i$ is the exponentially weighted average of the previous flow rates $(f_0)_i, \ldots, (f_t)_i$.
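To make the recursion concrete, here is a minimal Python sketch for a single flow, using an illustrative smoothing parameter and a synthetic flow sequence (neither taken from this paper); it runs the first-order recursion and checks it against the exponentially weighted closed form.

```python
import numpy as np

# Minimal sketch of the first-order smoothing recursion for one flow.
# The smoothing parameter and flow sequence are illustrative only.
theta = 0.8                                    # smoothing parameter
rng = np.random.default_rng(0)
f = rng.lognormal(0.0, 0.5, 100)               # synthetic flow rates

s = np.empty_like(f)
s_prev = 0.0                                   # s_{-1} = 0
for t in range(len(f)):
    s[t] = theta * s_prev + (1 - theta) * f[t]
    s_prev = s[t]

# Closed form: s_t is an exponentially weighted average of f_0, ..., f_t,
# with weight (1 - theta) * theta**(t - tau) on f_tau.
t = len(f) - 1
w = (1 - theta) * theta ** np.arange(t, -1, -1)
assert np.isclose(s[-1], w @ f)
```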
The smoothing parameter $\theta_i$ determines the level of smoothing on flow $i$. Small smoothing parameter values ($\theta_i$ close to zero) correspond to light smoothing; large values ($\theta_i$ close to one) correspond to heavy smoothing. (Note that $\theta_i = 0$ means that flow $i$ is not smoothed; we have $(s_t)_i = (f_t)_i$.) The level of smoothing can be related to the time scale over which the smoothing occurs. We define
\[
T_i = \frac{1}{\log(1/\theta_i)}
\]
to be the smoothing time associated with flow $i$. Roughly speaking, the smoothing time is the time interval over which the effect of a flow on the smoothed flow decays by a factor $e$. Light smoothing corresponds to short smoothing times, while heavy smoothing corresponds to longer smoothing times.
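As a quick numerical check of this interpretation (with illustrative values of $\theta_i$ only), the weight $\theta_i^{T_i}$ on a flow $T_i$ steps in the past equals $e^{-1} \approx 0.368$:

```python
import numpy as np

# Smoothing time T = 1/log(1/theta): the weight theta**T on a flow T
# steps in the past is exp(-1). Parameter values are illustrative.
for theta in (0.5, 0.9, 0.99):
    T = 1.0 / np.log(1.0 / theta)
    print(f"theta={theta}: T={T:6.1f} steps, decay theta**T={theta**T:.3f}")
```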
We associate with each smoothed flow rate $(s_t)_i$ a strictly concave nondecreasing differentiable utility function $U_i : \mathbf{R}_+ \to \mathbf{R}$, where the utility of $(s_t)_i$ is $U_i((s_t)_i)$. The average utility derived over all flows, over all time, is
\[
\bar U = \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbf{E}\, U(s_t),
\]
where $U(s_t) = \sum_{i=1}^{n} U_i((s_t)_i)$. Here, the expectation is over the smoothed flows $s_t$, and we are assuming that the expectations and limit above exist.
While most of our results will hold for more general utilities, we will focus on the family of power utility functions, defined for $s \geq 0$ as
\[
U_i(s) = \mu_i s^{\alpha_i},
\]
parameterized by $\alpha_i \in (0, 1]$ and $\mu_i > 0$. The parameter $\alpha_i$ sets the curvature (or risk aversion), while $\mu_i$ sets the overall weight of the utility. (For small values of $\alpha_i$, $U_i$ approaches a log utility.)
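The following sketch evaluates this family numerically and checks the standard fact behind the parenthetical remark, namely that the normalized utility $(s^\alpha - 1)/\alpha$ tends to $\log s$ as $\alpha \to 0$; all parameter values are illustrative.

```python
import numpy as np

# Power utility U(s) = mu * s**alpha, and the standard limit behind the
# log-utility remark: (s**alpha - 1)/alpha -> log(s) as alpha -> 0.
# Parameter values are illustrative.
def power_utility(s, alpha=0.5, mu=1.0):
    return mu * s**alpha

s = np.linspace(0.1, 10.0, 1000)
for alpha in (0.5, 0.1, 0.001):
    gap = np.max(np.abs((s**alpha - 1.0) / alpha - np.log(s)))
    print(f"alpha={alpha}: max |(s^a - 1)/a - log s| = {gap:.4f}")
```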
Before proceeding, we make some general comments on our use of smoothed flows. The smoothing can be considered as a type of time averaging; then we apply a concave utility function; finally, we average this utility. The time averaging and utility function operations do not commute, except in the case when the utility is linear (or affine). Jensen's inequality tells us that average smoothed utility is greater than or equal to the average utility applied directly to the flow rates, that is,
\[
\lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbf{E}\, U(s_t)
\;\geq\;
\lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbf{E}\, U(f_t).
\]
So the time smoothing step does affect our average utility; we will see later that it has a dramatic effect on the optimal flow policy.
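A quick Monte Carlo illustration of this inequality, under arbitrary (not from this paper) choices of concave utility, smoothing level, and flow distribution:

```python
import numpy as np

# Monte Carlo illustration of Jensen's inequality: the time-averaged
# utility of the smoothed flow exceeds that of the raw flow. The utility
# U(s) = s**0.5, smoothing level, and flow distribution are arbitrary.
rng = np.random.default_rng(1)
theta, alpha = 0.9, 0.5
f = rng.lognormal(0.0, 1.0, 200_000)

s = np.empty_like(f)
s_prev = 0.0
for t in range(len(f)):
    s[t] = theta * s_prev + (1 - theta) * f[t]
    s_prev = s[t]

print(np.mean(s**alpha), ">=", np.mean(f**alpha))   # smoothed >= raw
```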
2.2. Average Power
We model the wireless channel with time-varying positive gain parameters $g_t \in \mathbf{R}_{++}$, $t = 0, 1, 2, \ldots$, which we assume are independent identically distributed (IID), with known distribution. At each time $t$, the gain parameter affects the power $P_t$ required to support the total data flow rate $F_t$. The power $P_t$ is given by
\[
P_t = \phi(F_t, g_t),
\]
where $\phi : \mathbf{R}_+ \times \mathbf{R}_{++} \to \mathbf{R}_+$ is increasing and strictly convex in $F_t$ for each value of $g_t$ ($\mathbf{R}_{++}$ is the set of positive numbers).
While our results will hold for the more general case, we will focus on the specific power function described here. We suppose that the signal-to-interference-and-noise ratio (SINR) of the channel is given by $g_t P_t$. (Here $g_t$ includes the effect of time-varying channel gain, noise, and interference.) The channel capacity is then $\beta \log(1 + g_t P_t)$, where $\beta > 0$ is a constant; this must be at least the total flow rate $F_t$, so we obtain
\[
P_t = \phi(F_t, g_t) = \frac{e^{F_t/\beta} - 1}{g_t}.
\]
The total average power is
\[
\bar P = \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbf{E}\, P_t,
\]
where, again, we are assuming that the expectations and limit exist.
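As an illustration, the average power needed to sustain a constant total flow rate can be estimated by Monte Carlo; the lognormal gain distribution and all parameter values below are hypothetical, chosen so that $\mathbf{E}\,(1/g_t)$ is finite.

```python
import numpy as np

# Monte Carlo estimate of average power E[(exp(F/beta) - 1)/g] when the
# total flow rate is held constant. The lognormal gain distribution and
# all parameter values are hypothetical.
rng = np.random.default_rng(2)
beta, F = 1.0, 1.0
g = rng.lognormal(0.0, 0.5, 500_000)            # IID channel gains
P = (np.exp(F / beta) - 1.0) / g                # power at each time
print("average power estimate:", P.mean())
```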
2.3. Flow Rate Control Problem
The overall objective is to maximize a weighted difference between average utility and average power,
\[
J = \bar U - \lambda \bar P,
\tag{9}
\]
where $\lambda > 0$ is used to trade off average utility and power.
We require that the flow policy be causal; that is, when $f_t$ is chosen, we know the previous and current values of the flows, smoothed flows, and channel gains. Standard arguments in stochastic control (see, e.g., [13–17]) can be used to conclude that, without loss of generality, we can assume that the flow control policy has the form
\[
f_t = \psi(s_{t-1}, g_t),
\]
where $\psi : \mathbf{R}_+^n \times \mathbf{R}_{++} \to \mathbf{R}_+^n$. In other words, the policy depends only on the current smoothed flows and the current channel gain value.
The flow rate control problem is to choose the flow rate policy $\psi$ to maximize the overall objective $J$ in (9). This is a standard convex stochastic control problem, with linear dynamics.
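For any fixed policy, the objective can be estimated by simulating the linear dynamics forward. The sketch below does this for a crude, hand-built threshold policy for a single flow; the policy and all parameters are our own illustration, not the optimal policy derived in this paper.

```python
import numpy as np

# Estimate U_bar - lam * P_bar by simulation, for a hand-built threshold
# policy psi(s_prev, g): hold off when the gain is poor and the smoothed
# flow is still high. The policy and all parameters are illustrative.
rng = np.random.default_rng(3)
theta, beta, lam, alpha, mu = 0.9, 1.0, 0.1, 0.5, 1.0
T = 200_000
g = rng.lognormal(0.0, 0.5, T)                  # IID channel gains

def psi(s_prev, gt, g_min=1.0, s_target=1.0):
    if gt < g_min and s_prev > 0.5 * s_target:
        return 0.0                              # "no-transmit" region
    # transmit enough to pull the smoothed flow up to the target
    return max(0.0, (s_target - theta * s_prev) / (1.0 - theta))

util = power = 0.0
s_prev = 0.0
for t in range(T):
    ft = psi(s_prev, g[t])
    s_prev = theta * s_prev + (1.0 - theta) * ft
    util += mu * s_prev**alpha
    power += (np.exp(ft / beta) - 1.0) / g[t]
print("objective estimate:", (util - lam * power) / T)
```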
2.4. Our Results
We let $J^\star$ be the optimal overall objective value and let $\psi^\star$ be an optimal policy. We will show that in the general (multiple-flow) case, the optimal policy includes a "no-transmit" zone, that is, a region in the $(s_{t-1}, g_t)$ space in which the optimal flow rate is zero. Not surprisingly, the optimal flow policy can be roughly described as waiting until the channel gain is large, or until the smoothed flow has fallen to a low level, at which point we transmit (i.e., choose nonzero $f_t$). Roughly speaking, the higher the level of smoothing, the longer we can afford to wait for a large channel gain before transmitting. The average power required to support a given utility level decreases, sometimes dramatically, as the level of smoothing increases.
We show that the optimal policy for the case of a single flow is readily computed numerically, working from Bellman's characterization of the optimal policy, and is not particularly sensitive to the details of the utility functions, smoothing levels, or power functions.
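As a sketch of how such a computation can be organized (a discounted approximation to the average-reward problem, on a coarse grid, with hypothetical parameters; the paper's actual numerical method may differ):

```python
import numpy as np

# Discounted value-iteration sketch for the single-flow problem on a
# coarse grid. State: previous smoothed flow s; action: flow rate f;
# stage reward: mu*s_next**alpha - lam*(exp(f/beta) - 1)/g. All values,
# including the discount gamma approximating the average-reward
# objective, are hypothetical.
theta, beta, lam, alpha, mu, gamma = 0.9, 1.0, 0.1, 0.5, 1.0, 0.98
S = np.linspace(0.0, 4.0, 80)                        # grid over s
F = np.linspace(0.0, 4.0, 80)                        # grid over f
G = np.random.default_rng(4).lognormal(0.0, 0.5, 30) # gain samples

V = np.zeros_like(S)
for _ in range(200):                                 # value-iteration sweeps
    Vn = np.empty_like(V)
    for i, s in enumerate(S):
        s_next = theta * s + (1.0 - theta) * F       # next state per action
        cont = gamma * np.interp(s_next, S, V)       # continuation value
        val = 0.0
        for gt in G:                                 # expectation over gain
            r = mu * s_next**alpha - lam * (np.exp(F / beta) - 1.0) / gt
            val += np.max(r + cont)
        Vn[i] = val / len(G)
    V = Vn
# The greedy policy with respect to V exhibits the no-transmit region:
# the maximizing f is zero for low gains and high smoothed flows.
```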
For the case of multiple flows, we cannot easily compute (or even represent) the optimal policy. For this case we propose an approximate policy, based on approximate dynamic programming [18, 19]. By computing an upper bound on $J^\star$, obtained by allowing the flow control policy to use future values of the channel gain (i.e., relaxing the causality requirement [20]), we show in numerical experiments that such policies are nearly optimal.
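One way to compute such a bound is to sample a gain trajectory and solve the resulting convex prescient problem, in which all flow rates are chosen with the entire gain trajectory known; averaging the optimal values over many trajectories estimates an upper bound on $J^\star$. Below is a single-flow sketch using CVXPY, with hypothetical parameters and gain distribution.

```python
import numpy as np
import cvxpy as cp

# Prescient (noncausal) relaxation for a single flow: choose all f_t with
# the whole gain trajectory known, which upper-bounds the causal optimum
# on that trajectory. Parameters and gain distribution are hypothetical.
T, theta, beta, lam, alpha, mu = 300, 0.9, 1.0, 0.1, 0.5, 1.0
g = np.random.default_rng(5).lognormal(0.0, 0.5, T)

f = cp.Variable(T, nonneg=True)      # flow rates, chosen knowing all of g
s = cp.Variable(T, nonneg=True)      # smoothed flows
cons = [s[0] == (1 - theta) * f[0],
        s[1:] == theta * s[:-1] + (1 - theta) * f[1:]]

util = mu * cp.sum(cp.power(s, alpha)) / T                      # concave
power = cp.sum(cp.multiply(1.0 / g, cp.exp(f / beta) - 1)) / T  # convex
prob = cp.Problem(cp.Maximize(util - lam * power), cons)
prob.solve()
print("prescient bound on this gain trajectory:", prob.value)
```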