When estimating TOA in a dense multipath environment, accuracy is degraded not only by noise but also by the many echoes of the signal caused by multipath. In this section, we first demonstrate that when noise is absent and only multipath is present, the proposed estimator yields the correct TOA uniquely, provided the covariance matrix **K**_{h} is exactly known. For the rest of the article, we assume that **D** = **I** without loss of generality.

### 3.1 Uniqueness of the ML-TOA estimation

Assume for the present that noise is absent, i.e., *σ*^{2} = 0. Since **K**_{h} can be factored using the Singular Value Decomposition as **K**_{h} = (**UΛ**^{1/2}**U**^{H}) (**UΛ**^{1/2}**U**^{H}) = **RR**^{H}, the channel can be expressed as

\mathbf{h}=\mathbf{Rz},

(6)

where **z** ∈ *C*^{L} is a zero mean Gaussian random vector with covariance matrix \mathbb{E}\left\{\mathbf{z}{\mathbf{z}}^{H}\right\}=\mathbf{I} and *L* is the rank of **K**_{h}. In this case, the received FFT output vector will be **y** = **G**(*τ*_{0})**h** = **G**(*τ*_{0})**Rz**.

Using this expression, the fact that when noise is absent the **F** matrix reduces to **F** = **R** (**R**^{H}**R**)^{-1} **R**^{H}, and the fact that **G**^{H}(*τ*)**G**(*τ*_{0}) = **G**^{H}(*τ* - *τ*_{0}), the cost function *Q*(*τ*) in (5) becomes *Q*(*τ*) = **z**^{H} **R**^{H}**G**^{H}(*τ*_{0})**G**(*τ*)**R** (**R**^{H} **R**)^{-1} **R**^{H}**G**^{H}(*τ*)**G**(*τ*_{0})**Rz** = ||**P**_{R}**G**^{H}(*τ* - *τ*_{0})**Rz**||^{2}, where **P**_{R} = **R** (**R**^{H} **R**)^{-1} **R**^{H} is the orthogonal projector onto the range space of **R**, i.e., Range (**R**). The last equality follows from the fact that **P**_{R} is Hermitian and {\mathbf{P}}_{\mathbf{R}}^{2}={\mathbf{P}}_{\mathbf{R}}. Since **P**_{R} is an orthogonal projector, it can be seen that given a realization of **z**, *Q*(*τ*) is maximized if and only if **G**^{H} (*τ* - *τ*_{0}) **Rz** ∈ Range (**R**). Obviously, this is the case when *τ* = *τ*_{0} and the **G** matrix reduces to an identity matrix. We would like to investigate whether there are other possible maximizing values of *τ*.
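As an illustration of this noise-free behavior, the following Python sketch evaluates *Q*(*τ*) = ||**P**_{R}**G**^{H}(*τ*)**y**||^{2} on a grid of candidate delays and locates its maximum. The sizes `N`, `T` and `L`, the exponentially decaying path powers, the integer path delays used to build a rank-*L* factor `R` with **K**_{h} = **RR**^{H}, and the helper names `g_diag` and `noise_free_cost` are assumptions made only for this example; they are not the channel models of Section 5. **G**(*τ*) is taken to be diag{1, e^{-j2πτ/T}, ..., e^{-j2π(N-1)τ/T}}.

```python
import numpy as np

def g_diag(tau, N, T):
    """Diagonal of G(tau) = diag{1, e^{-j2*pi*tau/T}, ..., e^{-j2*pi*(N-1)*tau/T}}."""
    k = np.arange(N)
    return np.exp(-2j * np.pi * k * tau / T)

def noise_free_cost(y, R, taus, T):
    """Evaluate Q(tau) = ||P_R G^H(tau) y||^2 for every candidate tau."""
    N = R.shape[0]
    P = R @ np.linalg.pinv(R)  # orthogonal projector onto Range(R)
    return np.array(
        [np.linalg.norm(P @ (np.conj(g_diag(t, N, T)) * y)) ** 2 for t in taus]
    )

rng = np.random.default_rng(0)
N, T, L = 64, 64.0, 4  # illustrative sizes (assumption)

# Hypothetical rank-L factor R: L paths at excess delays 0..L-1 with decaying power
k = np.arange(N)[:, None]
delays = np.arange(L)[None, :]
power = np.exp(-0.5 * np.arange(L))
R = np.exp(-2j * np.pi * k * delays / T) * np.sqrt(power)  # shape (N, L)

# One noise-free realization y = G(tau0) R z
z = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)
tau0 = 5.0
y = g_diag(tau0, N, T) * (R @ z)

taus = np.linspace(0.0, 10.0, 201)
Q = noise_free_cost(y, R, taus, T)
tau_hat = taus[np.argmax(Q)]
print("estimated TOA:", tau_hat)
```

In this toy setting the maximum of `Q` equals ||**Rz**||^{2}, the value obtained when the projector leaves the vector unchanged.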

To simplify the notation, let \theta =\frac{2\pi}{T}\left(\tau -{\tau}_{0}\right) and define **G**(*θ*) ≜ **G**^{H}(*τ* - *τ*_{0}). We are looking for conditions on *θ* such that **G**(*θ*)**Rz** ∈ Range(**R**), *θ* = 0 being an obvious solution. We note first that we can convert this problem into the deterministic one of finding conditions on *θ* such that Range (**G** (*θ*) **R**) ⊆ Range(**R**). Certainly this latter condition is sufficient to guarantee that **G**(*θ*)**Rz** ∈ Range(**R**). It is also true that if Range (**G**(*θ*)**R**) ⊈ Range(**R**), then **G**(*θ*)**Rz** ∉ Range(**R**) with probability one. To see this, note that Range (**G**(*θ*)**R**) ⊈ Range(**R**) is equivalent to [Range (**R**)]^{⊥} ⊈ [Range (**G**(*θ*)**R**)]^{⊥} where ⊥ denotes the orthogonal complement. Let **v** denote any non-zero vector such that **v** ∈ [Range (**R**)]^{⊥} but **v** ∉ [Range (**G**(*θ*)**R**)]^{⊥}. Then, **v**^{H}**G**(*θ*)**R** ≠ **0** and the random variable **v**^{H}**G**(*θ*)**Rz** is Gaussian with non-zero variance and will be non-zero with probability one. Therefore, with probability one, **v**^{H} **G**(*θ*)**Rz** ≠ 0 and **G**(*θ*)**Rz** ∉ Range(**R**) because it is not orthogonal to **v**.

Now, the deterministic condition Range (**G**(*θ*)**R**) ⊆ Range (**R**) is equivalent to the existence of some matrix **A** such that **G**(*θ*)**R** = **RA**. Multiplying on the left by **G** yields **G**^{2}**R** = **GRA** = **RA**^{2}, and continuing this operation yields **G**^{n}**R** = **RA**^{n}, for all positive integers *n*. It follows easily that for any polynomial f\left(\lambda \right)={\sum}_{n}{c}_{n}{\lambda}^{n},

f\left(\mathbf{G}\right)\mathbf{R}=\mathbf{R}f\left(\mathbf{A}\right),

(7)

where f\left(\mathbf{A}\right)={\sum}_{n}{c}_{n}{\mathbf{A}}^{n} is a matrix polynomial. From the structure of **G**, we see that

f\left(\mathbf{G}\right)=\mathsf{\text{diag}}\left\{f\left(1\right),\ f\left({e}^{j\theta}\right),\ f\left({e}^{j2\theta}\right),\ \dots ,\ f\left({e}^{j\left(N-1\right)\theta}\right)\right\}.

(8)

Equation (7) says that any matrix of the form (8) can multiply **R** on the left, and the resulting matrix *f*(**G**)**R** satisfies Range (*f*(**G**)**R**) ⊆ Range (**R**).

Let us now assume that Range(**R**) includes the flat channel vector **h**_{f} = **1** where **1** is a vector with all unit elements. This essentially assumes that a flat fading channel is one of the possible realizations so that there is a vector **z** such that **1** = **Rz**. Multiplying (7) by **z** yields *f*(**G**)**1** = **R** *f*(**A**)**z** which means that, from (6),

f\left(\mathbf{G}\right)\mathbf{1}={\left[f\left(1\right)\ \ f\left({e}^{j\theta}\right)\ \ f\left({e}^{j2\theta}\right)\ \ \cdots \ \ f\left({e}^{j\left(N-1\right)\theta}\right)\right]}^{T}

(9)

is a realizable channel vector for any polynomial *f*(*λ*).

Now, let *L* be the rank of **R** and assume that *L < N*. Then, the *N* values {1, *e*^{jθ}, *e*^{j2θ},..., *e*^{j(N-1)θ}} cannot all be distinct for, if they were, the channel vector (9) could be chosen arbitrarily by a suitable choice of the interpolating polynomial *f*(*λ*), contrary to the fact that the realizable channels are restricted to the *L*-dimensional space Range (**R**). This is due to the well-known fact that a polynomial can always be found that takes arbitrary values on any given set of distinct arguments. In fact, at most *L* of the values {1, *e*^{jθ}, *e*^{j2θ},..., *e*^{j(N-1)θ}} can be distinct, for a similar reason. Now suppose there are actually *q* distinct values. It follows that the first *q* values must be distinct because, for example, if *e*^{jrθ} = *e*^{jpθ} where 0 ≤ *r < p* ≤ *q* - 1, then *e*^{j(p-r)θ} = 1, the sequence repeats with period *p* - *r*, and there would be at most *p* - *r* *< q* distinct values.
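The interpolation fact used above can be checked numerically. The sketch below assumes a small `N` and a value of `theta` chosen so that the *N* values *e*^{jkθ} are distinct, then solves a Vandermonde system for the coefficients *c*_{n} of a polynomial *f* that takes arbitrary prescribed values at those points. The values of `N` and `theta` and the random `target` vector are illustrative assumptions.

```python
import numpy as np

N = 8
theta = 0.7  # hypothetical value; angles k*theta for k = 0..N-1 stay below 2*pi, so the nodes are distinct
nodes = np.exp(1j * theta * np.arange(N))

# Arbitrary vector that f(G) 1 is required to equal
rng = np.random.default_rng(1)
target = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# Solve V c = target, where V[k, n] = nodes[k] ** n
V = np.vander(nodes, N, increasing=True)
c = np.linalg.solve(V, target)

# f(G) 1 = [f(1), f(e^{j theta}), ...]^T; np.polyval expects the highest power first
f_G_one = np.polyval(c[::-1], nodes)
print(np.allclose(f_G_one, target))
```

With distinct nodes the Vandermonde matrix is invertible, so every vector in *C*^{N} can be produced this way, which is what contradicts *L < N*.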

We have now shown that there must be an integer *q* ≤ *L* such that *e*^{jqθ} = 1. Then, the sequence {1, *e*^{jθ}, *e*^{j 2θ},..., *e*^{j(N-1)θ}} cycles as follows {1, *e*^{jθ}, *e*^{j 2θ},..., *e*^{j(q-1)θ}, 1, *e*^{jθ}, *e*^{j 2θ},...}. Suppose for example that *q* = 2. Then, the sequence is {1, *e*^{jθ}, 1, *e*^{jθ},...,1, *e*^{jθ},...}. Choose an interpolating polynomial such that *f*(1) = 1 and *f*(*e*^{jθ}) = -1. Then, from (9) the vector of alternating plus and minus ones, i.e., *f*(**G**)**1** = [1 -1 1 -1 1⋯]^{T} would be a realizable channel vector. But this highly oscillatory channel frequency response would imply a very large channel delay spread. Therefore, if the delay spread of the channel is not too large, the value *q* = 2 would not be realistic. Similar examples of unrealistic channel frequency response can be constructed for any *q* greater than 1. Therefore, we are left with *q* = 1 in which case the only solution is *e*^{jqθ} = 1 so that *θ* = 0 and the solution is unique. The simulation results in Section 5.1 also demonstrate this uniqueness property of the ML-TOA estimator.
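The link between the alternating frequency response and a large delay can be illustrated with an inverse FFT. The sketch below assumes that *T* is the FFT period sampled at *N* points, so that tap *n* of the inverse FFT corresponds to a delay of *nT*/*N*; the value `N = 64` is illustrative.

```python
import numpy as np

N = 64  # illustrative number of subcarriers (assumption)
h_alt = (-1.0) ** np.arange(N)  # frequency response [1, -1, 1, -1, ...]

# Delay-domain taps corresponding to this frequency response
taps = np.fft.ifft(h_alt)
peak_tap = int(np.argmax(np.abs(taps)))
print("dominant tap index:", peak_tap, "of", N)
```

Under this assumption all of the energy sits in tap *N*/2, i.e., at a delay of *T*/2, which is the kind of delay the argument above rules out as unrealistic.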

### 3.2 ML-TOA estimation in NLOS situations

In this section, the effect of NLOS on the ML-TOA estimator is discussed and we show that the NLOS case is very naturally incorporated into the proposed ML-TOA estimator. Recall (1), (2) and (3) which illustrate how the TOA *τ*_{0} is factored out and incorporated into the **G** matrix. These equations were developed with the understanding that *τ*_{0} was the path delay of the direct LOS path. From now on, however, we simply define TOA *τ*_{0} as the time it would take for an electromagnetic wave to travel the straight line that links the MS and AP, whether or not such a direct LOS path actually exists. In the case when a LOS path does not exist, (1) would be modified to read

{y}_{k}={d}_{k}\left(\sum _{i=1}^{L-1}{a}_{i}{e}^{-j\frac{2\pi}{T}k{\tau}_{i}}\right)+{n}_{k},

(10)

where the *i* = 0 term has been removed since the LOS path is absent. Nevertheless, with *τ*_{0} defined as above, we may still express the actual path delays in terms of *τ*_{0} as {\tau}_{i}={\tau}_{0}+\left({\tau}_{i}-{\tau}_{0}\right)={\tau}_{0}+{\bar{\tau}}_{i}, and we obtain a modified (2) as

{y}_{k}={H}_{k}{d}_{k}{e}^{-j\frac{2\pi}{T}k{\tau}_{0}}+{n}_{k},

(11)

where *H*_{k} is now given by {H}_{k}={\sum}_{i=1}^{L-1}{a}_{i}{e}^{-j\frac{2\pi}{T}k{\bar{\tau}}_{i}}, and is the zero delay frequency response at the *k*th subcarrier when no LOS path is present. We maintain the earlier definition of the subcarrier frequency response vector as **h** = [*H*_{0} *H*_{1} ⋯ *H*_{N-1}]^{T} and (11) can be used to express the complete FFT output vector as

\mathbf{y}=\mathbf{G}\left({\tau}_{0}\right)\mathbf{Dh}+\mathbf{n}.

(12)

Note that (12) is exactly the same as (3). The only difference in this NLOS case is the modification of the elements of the **h** vector due to the absence of the direct path. The derivation of the ML estimator now follows exactly as the case in which a direct path is present, and the channel statistics as measured by the procedure outlined below will reflect the actual environment, whether or not there is always a direct path present.

In practice, no matter what the multipath structure, the channel covariance matrix **K**_{h} can be estimated off-line by averaging measurements at each AP while the MS transmits at some known locations chosen in a random fashion. The detailed procedure is as follows:

*Step 1*: For a given, *known*, AP location, measure the received FFT output vector **y**^{(i)} at the AP for the *i*th MS location.

*Step 2*: Since, in this measurement phase, both MS and AP locations are known, the TOA of the *i*th MS transmission (at the *i*th location), i.e., {\tau}_{0}^{\left(i\right)}, can be computed by dividing the distance between them by the speed of light, and the **G**^{(i)} matrix can be determined by {\mathbf{G}}^{\left(i\right)}=\mathsf{\text{diag}}\left\{1,{e}^{-j\frac{2\pi}{T}{\tau}_{0}^{\left(i\right)}},{e}^{-j\frac{2\pi}{T}2{\tau}_{0}^{\left(i\right)}},\dots ,{e}^{-j\frac{2\pi}{T}\left(N-1\right){\tau}_{0}^{\left(i\right)}}\right\}. Then, for this *i*th transmission, the FFT output vector **y** in Equation (12) is measured and an estimated snapshot of **h**^{(i)} can be found by {\hat{\mathbf{h}}}^{\left(i\right)}={\left({\mathbf{G}}^{\left(i\right)}\right)}^{-1}{\mathbf{y}}^{\left(i\right)}={\mathbf{h}}^{\left(i\right)}+{\left({\mathbf{G}}^{\left(i\right)}\right)}^{-1}{\mathbf{n}}^{\left(i\right)}.

*Step 3*: After collecting measurements at *P* different MS locations, the estimated channel covariance matrix is obtained by {\widehat{\mathbf{K}}}_{\mathbf{h}}=\frac{1}{P}{\sum}_{i=1}^{P}{\hat{\mathbf{h}}}^{\left(i\right)}{\left({\hat{\mathbf{h}}}^{\left(i\right)}\right)}^{H}.
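The three steps above can be sketched in Python as follows. This is a minimal illustration under the assumption **D** = **I**; the function names `toa_from_positions`, `g_diag` and `estimate_channel_covariance`, the array layout (one measured vector **y**^{(i)} per row) and the use of Cartesian coordinates in meters are hypothetical choices made for the example, not part of the original procedure.

```python
import numpy as np

C_LIGHT = 299_792_458.0  # speed of light in m/s

def toa_from_positions(ms_pos, ap_pos):
    """Step 2: TOA = straight-line MS-AP distance divided by the speed of light."""
    diff = np.asarray(ms_pos, dtype=float) - np.asarray(ap_pos, dtype=float)
    return np.linalg.norm(diff) / C_LIGHT

def g_diag(tau, N, T):
    """Diagonal of G = diag{1, e^{-j2*pi*tau/T}, ..., e^{-j2*pi*(N-1)*tau/T}}."""
    return np.exp(-2j * np.pi * np.arange(N) * tau / T)

def estimate_channel_covariance(ys, ms_positions, ap_pos, T):
    """Steps 1-3: average h_hat h_hat^H over P measured FFT output vectors (rows of ys)."""
    ys = np.asarray(ys)
    P, N = ys.shape
    K_hat = np.zeros((N, N), dtype=complex)
    for y, ms_pos in zip(ys, ms_positions):
        tau0 = toa_from_positions(ms_pos, ap_pos)
        # G is diagonal with unit-modulus entries, so G^{-1} y is conj(diag(G)) * y
        h_hat = np.conj(g_diag(tau0, N, T)) * y
        K_hat += np.outer(h_hat, np.conj(h_hat))
    return K_hat / P
```

A call such as `estimate_channel_covariance(ys, ms_positions, ap_pos, T)` returns an *N* × *N* Hermitian matrix. With noisy measurements, each snapshot also carries the rotated noise term (**G**^{(i)})^{-1}**n**^{(i)}.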

For future reference we now define the NLOS delay. For the NLOS case in which a LOS path does not exist, *τ*_{1} in (10) will be the first arriving path delay. Then, we define "NLOS delay" = *τ*_{1} - *τ*_{0}, where *τ*_{0} is the line of sight distance divided by the speed of light, as described above. The NLOS delay is sometimes called the excess delay and is the time difference between the first arriving actual NLOS path and the direct LOS time delay, *τ*_{0}.

At this point, we emphasize that the NLOS case is very naturally incorporated into the proposed ML-TOA estimator. Recall that in the entire development, including the estimation procedure for {\widehat{\mathbf{K}}}_{\mathbf{h}} above, TOA is defined as the time it takes for the electromagnetic waves to travel the straight line that links the MS and AP, whether or not such a LOS path actually exists. Therefore, in *Step 2* above the TOA can still be computed from the locations of the MS and AP even in the absence of a LOS path. This rests on the idea that motivates ML-TOA estimation, namely to separate the desired parameter from the statistics of the multipath channel. For the purpose of positioning, the desired parameter is the "generalized" TOA that we defined at the beginning of this section. In this way, the statistical properties of the measured channels naturally incorporate the NLOS properties of the channel, and no extra step or *a priori* information about the NLOS statistics is required to mitigate the NLOS effects. In Section 5.1, we present simulation results which show the TOA estimation performance for both LOS and NLOS cases. Finally, we point out that, since (12) is identical to (3), the uniqueness proof in Section 3.1 applies to the NLOS case as well.

### 3.3 Properties of the cost function *Q*(*τ*)

The TOA estimation is a nonlinear problem and is known to exhibit ambiguities which could result in large errors [21, 22]. In the large error regime, the CRLB cannot be attained. In this section, the behavior of the cost function *Q*(*τ*) is studied for two multipath channel models. It is also shown that for single path channels, the ML-TOA estimator is unbiased and the estimation error variance is inversely proportional to the bandwidth.

Consider first the extreme case in which there is only a direct path at *τ*_{0} = 0 and no additive noise. We have **h** = *a* **1** where *a* is the random path gain. Then, it is easily seen that {\mathbf{K}}_{\mathbf{h}}={\sigma}_{a}^{2}\mathbf{1}{\mathbf{1}}^{T}, where {\sigma}_{a}^{2} is the variance of *a*, so that **R** = *σ*_{a}**1**, **F** = *c* **11**^{T} for some constant *c*, and **y** = **h** = *a* **1**. The cost function (5) becomes Q\left(\tau \right)=\alpha |{\mathbf{1}}^{T}\mathbf{G}\mathbf{1}{|}^{2}=\beta {\left(\frac{\sin \left(\frac{N\pi}{T}\tau \right)}{\sin \left(\frac{\pi}{T}\tau \right)}\right)}^{2}, where *α*, *β* are some constants. The width of the main lobe is inversely proportional to the number of subcarriers *N* or, equivalently, the bandwidth. In Figure 1, one realization of the noise-free cost function *Q*(*τ*) in a single path channel is shown for the 802.11a configuration where *N* = 64 (see Section 5). It can be seen that it closely matches the theoretical curve where the training sequence is assumed to be all 1's.
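The closed form above can be verified numerically. The following sketch compares |**1**^{T}**G1**|^{2}, computed by direct summation of the diagonal of **G**, with the squared ratio of sines. The values of `N` and `T` and the delay grid are illustrative assumptions, and the constants *α* and *β* are set to one.

```python
import numpy as np

N, T = 64, 64.0  # illustrative values (assumption)
k = np.arange(N)

taus = np.linspace(-3.0, 3.0, 121)
taus = taus[np.abs(taus) > 1e-9]  # skip tau = 0, where the closed form is 0/0 (its limit is N**2)

# |1^T G(tau) 1|^2 by summing the diagonal entries e^{-j2*pi*k*tau/T}
direct = np.array([np.abs(np.sum(np.exp(-2j * np.pi * k * t / T))) ** 2 for t in taus])

# Closed form (sin(N*pi*tau/T) / sin(pi*tau/T))^2
closed = (np.sin(N * np.pi * taus / T) / np.sin(np.pi * taus / T)) ** 2

print(np.allclose(direct, closed))
```

The first nulls of this function fall at *τ* = ±*T*/*N*, so the main lobe narrows as *N* grows.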

In the case of multipath, we first investigate the cost function *Q*(*τ*) when noise is absent. In Figure 2, one realization of the noise-free cost function is plotted for the Exponential channel model and for WLAN channel model A (see Section 5 for a detailed description of the channel models used in this article). Note that the noise-free cost function for the Exponential channel is fairly flat. As demonstrated in Section 3.1, if **K**_{h} is perfectly known the actual peak of the cost function is at zero offset, but at high SNR, where the flattening effect is observed, an error in **K**_{h} can result in biased TOA estimation (see Figure 3). The noise-free cost function for WLAN channel model A shows a clear peak and is thus more robust to errors in the estimated channel covariance matrix in the high-SNR region (see Figure 4). For all other WLAN channel models, i.e., B to D, we have observed that the cost functions have similar characteristics to those for channel model A.