Channel model
Generally, underwater acoustic (UWA) channels are random channels that vary in time and space [7]. The received signal is the coherent superposition, at the receiver, of the transmitted signal arriving along all sound rays. Because of the different sound rays, the received signal exhibits spatiotemporal characteristics [34]. This paper ignores the frequency dependence of medium absorption and assumes that there is no dispersion along any path. Suppose there are P paths and the acoustic signal waveform along each propagation path remains unchanged. For the UWA channel, the impulse response can be expressed as:
$$\begin{aligned} h(t)=\sum \limits _{p=1}^{P}{A_{p}\,\delta (t-\tau _{p})} \end{aligned}$$
(1)
where \({{A}_{p}}\) is the amplitude of the pth path, and \({{\tau }_{p}}\) is the delay of the pth path.
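As an illustration of Eq. (1), the following minimal sketch builds a sampled version of the tapped-delay impulse response and applies it to a short test pulse. The sample rate, amplitudes, and delays are illustrative assumptions, not values from this paper, and each delay is rounded to the nearest sample.

```python
import numpy as np

def multipath_response(amplitudes, delays_s, fs, length):
    """Sampled h(t) = sum_p A_p * delta(t - tau_p); delays rounded to the nearest sample."""
    h = np.zeros(length)
    for a_p, tau_p in zip(amplitudes, delays_s):
        h[int(round(tau_p * fs))] += a_p
    return h

fs = 48_000.0                                   # assumed sample rate in Hz
h = multipath_response([1.0, 0.6, 0.3],         # assumed path amplitudes A_p
                       [0.0, 1.0e-3, 2.5e-3],   # assumed path delays tau_p in seconds
                       fs, length=256)

tx = np.ones(4)                                 # short rectangular test pulse
rx = np.convolve(tx, h)                         # superposition of delayed, scaled copies
```

Each nonzero tap of `h` corresponds to one propagation path, so `rx` contains three scaled copies of the pulse at the three path delays.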
An alternative channel model groups the paths by their direction of arrival (DOA) at the receiver. Then, for the coherent multipath channel, the impulse response is expressed as:
$$\begin{aligned} h(t)=\sum \limits _{j=1}^{J}{A_{j}\sum \limits _{l=1}^{L_{j}}{\delta (t-\tau _{j,l})}} \end{aligned}$$
(2)
where J is the number of distinct DOAs over all paths, \({{L}_{j}}\) is the number of paths in the jth direction, \({{\tau }_{j,l}}\) is the delay of the corresponding path, and:
$$\begin{aligned} \sum _{j=1}^{J}L_{j}=P \end{aligned}$$
(3)
The communication system is illustrated in Fig. 1. The receiver uses a uniform linear array (ULA) to receive UWA signals. Assuming that the ULA is located in the far field of the source, the UWA signal incident on the ULA is a plane wave. In the subsequent analysis, the first channel model is adopted.
Received signal model
Assuming that the transmitted signal is a single-frequency carrier spread by an m-sequence, it can be represented as:
$$\begin{aligned} \begin{aligned} C\left( t \right) =c\left( t \right) \text {cos}\left( {{\omega }_{c}}t \right) \ \end{aligned} \end{aligned}$$
(4)
where c(t) is the baseband m-sequence waveform, \(c(t)=\sum \limits _{i=0}^{N_{m}-1}{c_{i}P_{T_{c}}(t-iT_{c})}\), and \(c_{i}\in \{-1,1\}\) is the ith symbol of the m-sequence. \({{T}_{c}}\) is the symbol interval, and \({{N}_{m}}\) is the period of the m-sequence. \({{P}_{{{T}_{c}}}}\) is the symbol pulse shaping filter, for which a root raised cosine filter with roll-off coefficient \(\beta\) is adopted. The bandwidth of the signal is then \(B=(1+\beta )/T_{c}\).
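The spreading symbols \(c_i\) can be generated with a linear feedback shift register. The sketch below is one possible construction, assuming a 5-stage maximal-length register (period 31) with feedback taken from stages 5 and 2; the register length and taps are illustrative choices rather than the ones used in this paper, and pulse shaping is omitted. It also checks the two-valued periodic autocorrelation that the later rank argument relies on.

```python
import numpy as np

def m_sequence(n_bits=5, taps=(5, 2)):
    """One period of a maximal-length binary sequence from a Fibonacci LFSR.

    `taps` lists the 1-indexed register stages that are XORed to form the
    feedback bit. The default is an assumed example giving period 2**5 - 1 = 31.
    """
    state = [1] * n_bits                  # any nonzero initial state works
    out = []
    for _ in range(2 ** n_bits - 1):
        out.append(state[-1])             # output the last stage
        fb = 0
        for tap in taps:
            fb ^= state[tap - 1]
        state = [fb] + state[:-1]         # shift and insert the feedback bit
    return np.array(out)

bits = m_sequence()
chips = 1 - 2 * bits                      # map {0, 1} -> {+1, -1}

# Periodic autocorrelation: N_m at zero lag, -1 at every other lag.
acf = np.array([np.dot(chips, np.roll(chips, k)) for k in range(chips.size)])
```

The sharp peak of `acf` at zero lag is what lets delayed copies of the sequence behave as nearly uncorrelated signals.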
This paper takes the ULA as an example and assumes that the sound source is in the far field. After passing through the channel, the spread spectrum signal arriving along the \({{N}_{p}}\) paths is incident on the M-element ULA [35], as demonstrated in Fig. 2.
The first array element is the reference array element. d represents the distance between the array elements. The signal received by the mth array element is:
$$\begin{aligned} x_{m}(t)=\sum \limits _{p=1}^{N_{p}}{A_{p}C\left[ (1+a)t-\tau _{p}-\tau _{m}(\theta _{p}) \right] }+n_{m}(t),\quad 1\le m\le M \end{aligned}$$
(5)
where \({{A}_{p}}\) represents the gain of the pth path. a is the Doppler compression factor. \({{\tau }_{p}}\) represents the time delay of the pth path that reaches the reference array element. \({{\tau }_{m}}({{\theta }_{p}})\) represents the sound path difference of the mth array element relative to the reference array element. \({{n}_{m}}(t)\) represents the noise of the mth element.
MUSIC Algorithm Based on Spread Spectrum Sequence
The MUSIC algorithm has been widely used because of its high resolution and precision. However, it also has several shortcomings [36]. For instance, its resolution degrades at low SNR and with few snapshots, and the estimation performance drops substantially. Moreover, the algorithm fails in a multipath environment because the signals are highly correlated [37]. These shortcomings limit the application of the algorithm in the communication field. However, when the MUSIC algorithm is applied to a signal modulated by an m-sequence, no decorrelation preprocessing such as spatial smoothing is required, and the signal's DOA can be estimated simply and quickly.
Before using the MUSIC algorithm, quadrature demodulation must first be performed to create a complex signal. In Fig. 3, the signal received by the mth element is multiplied by sine and cosine signals at the carrier frequency and then processed by a low-pass filter (LPF). This process is called quadrature demodulation [38].
The signal after quadrature demodulation can be represented as:
$$\begin{aligned} y_{mI}(t)&=2x_{m}(t)\cos (\omega _{c}t)\\&\overset{LPF}{=}\sum \limits _{p=1}^{N_{p}}{A_{p}c\left[ (1+a)t-\tau _{p}-\tau _{m}(\theta _{p}) \right] }\\&\quad \cdot \cos \left[ a\omega _{c}t-\omega _{c}\tau _{p}-\omega _{c}\tau _{m}(\theta _{p}) \right] +n_{mI}(t) \end{aligned}$$
(6)
$$\begin{aligned} y_{mQ}(t)&=-2x_{m}(t)\sin (\omega _{c}t)\\&\overset{LPF}{=}\sum \limits _{p=1}^{N_{p}}{A_{p}c\left[ (1+a)t-\tau _{p}-\tau _{m}(\theta _{p}) \right] }\\&\quad \cdot \sin \left[ a\omega _{c}t-\omega _{c}\tau _{p}-\omega _{c}\tau _{m}(\theta _{p}) \right] +n_{mQ}(t) \end{aligned}$$
(7)
where \(n_{mIQ}(t)=n_{mI}(t)+jn_{mQ}(t)\), and \(j=\sqrt{-1}\). The output of quadrature demodulation is a complex signal consisting of in-phase and quadrature components:
$$\begin{aligned} y_{m}(t)&=y_{mI}(t)+jy_{mQ}(t)\\&=\sum \limits _{p=1}^{N_{p}}{A_{p}c\left[ (1+a)t-\tau _{p}-\tau _{m}(\theta _{p}) \right] }\\&\quad \cdot e^{j\omega _{c}\left[ at-\tau _{p}-\tau _{m}(\theta _{p}) \right] }+n_{mIQ}(t) \end{aligned}$$
(8)
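The demodulation in Eqs. (6)-(8) can be checked numerically on a single delayed carrier. The sketch below is a simplified illustration under several assumptions: no Doppler (a = 0), no spreading code, no noise, a quadrature branch mixed with \(-2\sin(\omega_c t)\) so that the result matches the complex exponential in Eq. (8), and a moving average over one carrier period standing in for the low-pass filter. The sample rate and carrier frequency are arbitrary.

```python
import numpy as np

fs = 48_000.0            # assumed sample rate in Hz
fc = 6_000.0             # assumed carrier frequency in Hz
amp, phi = 0.7, 0.9      # path gain A_p and carrier phase lag omega_c * tau

t = np.arange(4800) / fs
x = amp * np.cos(2 * np.pi * fc * t - phi)       # real passband signal

# Mix with the in-phase and quadrature carriers.
i_mix = 2.0 * x * np.cos(2 * np.pi * fc * t)
q_mix = -2.0 * x * np.sin(2 * np.pi * fc * t)

# Stand-in low-pass filter: average over exactly one carrier period,
# which cancels the double-frequency term.
period = int(fs / fc)
kernel = np.ones(period) / period
y = (np.convolve(i_mix, kernel, mode="valid")
     + 1j * np.convolve(q_mix, kernel, mode="valid"))
```

Under these assumptions every sample of `y` equals `amp * exp(-1j * phi)`, i.e. the path gain multiplied by the phase term \(e^{-j\omega_c\tau}\) that carries the delay information used for DOA estimation.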
Here, an approximation is made:
$$\begin{aligned} c(t+at-\tau _{p}-\tau _{m}(\theta _{p}))\approx c(t+at-\tau _{p}) \end{aligned}$$
(9)
Experiments indicate that this approximation affects DOA estimation performance, but the DOA can still be estimated. After the approximation, the baseband response for the pth path is:
$$\begin{aligned} \mathbf {Y}_{p}=A_{p}c(t+at-\tau _{p})e^{-j\omega _{c}(\tau _{p}-at)}\mathbf {a}(\theta _{p})+\mathbf {n}_{IQ}(t) \end{aligned}$$
(10)
In Eq. (10), we have:
$$\begin{aligned} \mathbf {a}(\theta _{p})=\left[ \begin{matrix} e^{-j\omega _{c}\tau _{1}(\theta _{p})} & e^{-j\omega _{c}\tau _{2}(\theta _{p})} & \cdots & e^{-j\omega _{c}\tau _{M}(\theta _{p})} \end{matrix} \right] ^{T} \end{aligned}$$
(11)
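For a concrete instance of the steering vector in Eq. (11), the sketch below assumes the usual far-field ULA geometry, \(\tau_m(\theta)=(m-1)d\sin\theta/c_s\), with \(\theta\) measured from broadside and \(c_s\) the sound speed. This delay expression, the function name, and the numerical values are assumptions introduced for illustration; the paper does not state them explicitly.

```python
import numpy as np

def steering_vector(theta_rad, n_elements, spacing_m, fc_hz, sound_speed=1500.0):
    """a(theta) = [exp(-j*omega_c*tau_1), ..., exp(-j*omega_c*tau_M)]^T for a ULA.

    Assumes tau_m = (m - 1) * d * sin(theta) / c_s, with the first element
    as the reference (tau_1 = 0).
    """
    m = np.arange(n_elements)
    tau = m * spacing_m * np.sin(theta_rad) / sound_speed
    return np.exp(-1j * 2.0 * np.pi * fc_hz * tau)

fc = 6_000.0                         # assumed carrier frequency in Hz
d = 1500.0 / fc / 2.0                # half-wavelength element spacing
a_broadside = steering_vector(0.0, 4, d, fc)
a_endfire = steering_vector(np.pi / 2, 4, d, fc)
```

At broadside all elements receive the wavefront simultaneously, so the vector is all ones; at end-fire with half-wavelength spacing the phase advances by \(\pi\) per element, giving alternating signs.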
Then, we get baseband responses for all paths:
$$\begin{aligned} \begin{aligned} \mathbf {Y}=\sum \limits _{p=1}^{{{N}_{p}}}{{{\mathbf {Y}}_{p}}}=\mathbf {AS}+\mathbf {n}\ \end{aligned} \end{aligned}$$
(12)
In Eq. (12):
$$\begin{aligned} \begin{aligned} \mathbf {A}&=\left[ \begin{array}{llll} \mathbf {a}({{\theta }_{1}}) &{} \mathbf {a}({{\theta }_{2}}) &{} \cdots &{} \mathbf {a}({{\theta }_{{{N}_{p}}}}) \\ \end{array} \right] \ \end{aligned} \end{aligned}$$
(13)
$$\begin{aligned} \mathbf {S}&=\left[ \begin{matrix} A_{1}c_{n}(t+at-\tau _{1})e^{-j\omega _{c}(\tau _{1}-at)} \\ A_{2}c_{n}(t+at-\tau _{2})e^{-j\omega _{c}(\tau _{2}-at)} \\ \vdots \\ A_{N_{p}}c_{n}(t+at-\tau _{N_{p}})e^{-j\omega _{c}(\tau _{N_{p}}-at)} \end{matrix} \right] \end{aligned}$$
(14)
$$\begin{aligned} \begin{aligned} \mathbf {n}&={{[\begin{matrix} {{n}_{IQ1}} &{} {{n}_{IQ2}} &{} \cdots &{} {{n}_{IQM}} \\ \end{matrix}]}^{T}}\ \end{aligned} \end{aligned}$$
(15)
The baseband signal’s covariance matrix is represented as:
$$\begin{aligned} \mathbf {R}_{yy}=E\{\mathbf {Y}\mathbf {Y}^{H}\}=\mathbf {A}\mathbf {R}_{s}\mathbf {A}^{H}+\sigma ^{2}\mathbf {I} \end{aligned}$$
(16)
where H stands for the conjugate transpose; \({{\sigma }^{2}}\) stands for the noise power; \(\mathbf {I}\) stands for the identity matrix; \({{\mathbf {R}}_{\text {s}}}\) stands for the signal's covariance matrix:
$$\begin{aligned} \mathbf {R}_{s}=E\{\mathbf {S}\mathbf {S}^{H}\} \end{aligned}$$
(17)
Since a uniform linear array is used, \(\mathbf {A}\) is a Vandermonde matrix. When the number of array elements is larger than the number of signal sources and each path has a different incident direction, \(\mathbf {A}\) has full column rank, so the rank of the signal component of \({{\mathbf {R}}_{yy}}\) (neglecting the noise term) is determined by \({{\mathbf {R}}_{s}}\):
$$\begin{aligned} \begin{aligned} rank({{\mathbf {R}}_{yy}})=rank({{\mathbf {R}}_{s}})\ \end{aligned} \end{aligned}$$
(18)
When coherent signals arrive at the array from different directions, \({{\mathbf {R}}_{s}}\) loses rank, and the signal eigenvectors leak into the noise subspace. In this case, the MUSIC algorithm is ineffective. Since the m-sequence has excellent autocorrelation characteristics, a full-rank matrix can be obtained, and the MUSIC algorithm can then be applied as usual. Equation (19) indicates that each element of the matrix \({{\mathbf {R}}_{s}}\) involves the spreading sequence correlation function \(R(\tau )\), where \(*\) stands for the conjugate operation.
$$\begin{aligned} \mathbf {R}_{s}&=E\left\{ \left[ \begin{matrix} A_{1}c_{n}(t+at-\tau _{1})e^{-j\omega _{c}(\tau _{1}-at)} \\ A_{2}c_{n}(t+at-\tau _{2})e^{-j\omega _{c}(\tau _{2}-at)} \\ \vdots \\ A_{N_{p}}c_{n}(t+at-\tau _{N_{p}})e^{-j\omega _{c}(\tau _{N_{p}}-at)} \end{matrix} \right] \right. \\&\quad \cdot \left. \left[ \begin{matrix} A_{1}c_{n}^{*}(t+at-\tau _{1})e^{j\omega _{c}(\tau _{1}-at)} & \cdots & A_{N_{p}}c_{n}^{*}(t+at-\tau _{N_{p}})e^{j\omega _{c}(\tau _{N_{p}}-at)} \end{matrix} \right] \right\} \\&=E\left\{ \left[ \begin{matrix} A_{1}^{2}R(0) & \cdots & A_{1}A_{N_{p}}R(\tau _{1}-\tau _{N_{p}})e^{-j\omega _{c}(\tau _{1}-\tau _{N_{p}})} \\ \vdots & \ddots & \vdots \\ A_{N_{p}}A_{1}R(\tau _{N_{p}}-\tau _{1})e^{j\omega _{c}(\tau _{1}-\tau _{N_{p}})} & \cdots & A_{N_{p}}^{2}R(0) \end{matrix} \right] \right\} \end{aligned}$$
(19)
The speed of a ship is usually within 10 m/s, and the speed of sound in water is about 1500 m/s, so the Doppler factor ranges from \(-0.014\) to 0.014. As shown in Fig. 4, the spread spectrum sequence retains good autocorrelation even for a large Doppler compression factor [39], so the MUSIC algorithm is largely unaffected by Doppler. Meanwhile, the correlation output decreases quickly as \(|\tau |\) increases, so ideally \({{\mathbf {R}}_{s}}\) can be approximated as a diagonal matrix. On this basis, the rank-deficiency problem is solved, and the MUSIC algorithm can efficiently estimate the DOA.
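The rank argument can be illustrated numerically. The sketch below compares the signal covariance of three fully coherent paths (scaled copies of the same waveform) with that of three paths carrying the same m-sequence at different chip delays. It is a simplified check under stated assumptions: a period-31 sequence from an assumed 5-stage register, integer chip delays applied as circular shifts, no Doppler, no carrier phase, and no noise. Amplitudes and delays are arbitrary.

```python
import numpy as np

def m_sequence_chips(n_bits=5, taps=(5, 2)):
    """One period of a +/-1 m-sequence from an assumed 5-stage Fibonacci LFSR."""
    state = [1] * n_bits
    out = []
    for _ in range(2 ** n_bits - 1):
        out.append(state[-1])
        fb = 0
        for tap in taps:
            fb ^= state[tap - 1]
        state = [fb] + state[:-1]
    return 1 - 2 * np.array(out)

chips = m_sequence_chips()
amps = np.array([1.0, 0.8, 0.5])     # assumed path gains A_p
delays = [0, 3, 7]                   # assumed path delays in chips

# Coherent case: every path carries the same waveform with no relative delay.
S_coherent = amps[:, None] * chips[None, :]
# Spread-spectrum multipath: each path carries a delayed copy of the sequence.
S_delayed = np.stack([a * np.roll(chips, k) for a, k in zip(amps, delays)])

R_coherent = S_coherent @ S_coherent.T / chips.size
R_delayed = S_delayed @ S_delayed.T / chips.size

rank_coherent = np.linalg.matrix_rank(R_coherent)
rank_delayed = np.linalg.matrix_rank(R_delayed)
```

In the delayed case the off-diagonal entries of `R_delayed` are scaled by \(R(\tau)=-1/31\), so the matrix is close to diagonal and keeps full rank, whereas `R_coherent` collapses to rank one.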
Estimated number of sources
Thanks to the correlation properties of the spread spectrum sequence, the MUSIC algorithm can quickly estimate the signal's DOA. However, because spread spectrum systems usually operate at a low SNR, conventional methods cannot estimate the number of sources accurately. To address this problem, this paper proposes to estimate the number of sources from the SVD of a Hankel matrix built from the delay structure of the array elements. The general form of the Hankel matrix [40] is as follows:
$$\begin{aligned} H_{n}=[h_{i+j-1}]_{i,j=1}^{n}=\left[ \begin{matrix} h_{1} & h_{2} & \cdots & h_{n} \\ h_{2} & h_{3} & \cdots & h_{n+1} \\ \vdots & \vdots & \ddots & \vdots \\ h_{n} & h_{n+1} & \cdots & h_{2n-1} \end{matrix} \right] \end{aligned}$$
(20)
where \(h_{i+j-1}\in \mathbb {C}\).
Because the delay between adjacent ULA elements is fixed, the cross-correlation functions between a specific array element signal (the reference signal) and the other array element signals can be arranged, through the delay difference operation, into a Hankel matrix. The reconstructed Hankel matrix has the following form:
$$\begin{aligned} H_{i+k+q,j}=\left[ \begin{matrix} h_{i,j} & h_{i+1,j} & \cdots & h_{i+q-1,j} \\ h_{i+1,j} & h_{i+2,j} & \cdots & h_{i+q,j} \\ \vdots & \vdots & \ddots & \vdots \\ h_{i+k-1,j} & h_{i+k,j} & \cdots & h_{i+(k-1)+(q-1),j} \end{matrix} \right] _{k\times q} \end{aligned}$$
(21)
Here, i, k, and q denote the starting index of the matrix elements, the number of matrix rows, and the number of matrix columns, respectively. The subscript j of \({{H}_{i+k+q,j}}\) is the index of the reference signal, and \({{h}_{i,j}}\) is given by:
$$\begin{aligned} h_{i,j}=E\{x_{i}(t)x_{j}^{H}(t)\}-\sigma ^{2}\delta _{ij},\quad j=1,2,\cdots ,M \end{aligned}$$
(22)
According to the above formula, when \(i=j\), the correlation \(E\{x_{i}(t)x_{j}^{H}(t)\}\) contains the noise component, which is removed by subtracting \({{\sigma }^{2}}\); otherwise, it contains only the signal component because the signal and the noise are uncorrelated. Therefore, the Hankel matrix can be constructed as a matrix with only signal components. Through SVD of the Hankel matrix, we have:
$$\begin{aligned} H_{i+k+q,j}=U_{H}\Sigma _{H}V_{H}^{\text {H}} \end{aligned}$$
(23)
where \(U_{H}=[u_{H1},u_{H2},\cdots ,u_{Hk}]\) is the \(k\times k\) matrix of left singular vectors, \(V_{H}=[v_{H1},v_{H2},\cdots ,v_{Hq}]\) is the \(q\times q\) matrix of right singular vectors, and \(\Sigma _{H}\) is the \(k\times q\) singular value matrix, which satisfies:
$$\begin{aligned} \Sigma _{H}=\mathrm {diag}(\underbrace{\lambda _{1},\lambda _{2},\cdots ,\lambda _{N_{p}}}_{N_{p}},\underbrace{0,\cdots ,0}_{\min (k-N_{p},\,q-N_{p})}) \end{aligned}$$
(24)
where the first \({{N}_{p}}\) singular values correspond to the signal, and the last \(\min (k-N_{p},q-N_{p})\) singular values correspond to the noise.
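The rank property in Eq. (24) can be reproduced with a small noise-free example. The sketch below assumes mutually uncorrelated path signals (a diagonal signal covariance), a half-wavelength ULA, and ideal cross-correlations \(h_{i,j}=\sum_p P_p e^{-j(i-j)\pi\sin\theta_p}\). The path powers \(P_p\), the angles, the array size, and the helper name are illustrative assumptions. The block starts at the element next to the reference so that the \(i=j\) term is never used.

```python
import numpy as np

n_elements = 10
thetas = np.deg2rad([-20.0, 35.0])        # assumed DOAs of N_p = 2 paths
powers = np.array([1.0, 0.5])             # assumed path powers P_p
phase_step = np.pi * np.sin(thetas)       # omega_c * d * sin(theta) / c_s for d = lambda / 2

ref = 0                                   # index j of the reference element
i_idx = np.arange(n_elements)
# Ideal noise-free cross-correlations h_{i,ref}, one value per element i.
h = (powers * np.exp(-1j * np.outer(i_idx - ref, phase_step))).sum(axis=1)

def hankel_block(seq, start, k, q):
    """k x q Hankel matrix whose (r, c) entry is seq[start + r + c]."""
    rows = np.arange(k)[:, None]
    cols = np.arange(q)[None, :]
    return seq[start + rows + cols]

# Start at i = 1 so that the i == ref entry (which carries noise power) is skipped.
H = hankel_block(h, start=1, k=4, q=5)
singular_values = np.linalg.svd(H, compute_uv=False)
n_nonzero = int(np.sum(singular_values > 1e-9 * singular_values[0]))
```

Because `h` is a sum of two complex exponentials in the element index, the Hankel matrix has exactly two nonzero singular values, matching the number of paths.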
The analysis above shows that the Hankel matrix has the following characteristics:
(1) Unlike conventional source number estimation, which relies on a single spatial covariance matrix of the received data and a single eigenvalue decomposition, this approach can construct, for a sufficient number of array elements M, many matrices \({{H}_{i+k+q,j}}\) with different reference signals and different dimensions. This widens the range of available decision methods and greatly improves data utilization, which in turn improves the accuracy of source number estimation.
(2) In the Hankel matrix \({{H}_{i+k+q,j}}\), the number of nonzero singular values equals the matrix's rank and the signal source number \({{N}_{p}}\). In other words, the signal subspace spanned by the left singular vectors corresponding to the nonzero singular values equals that spanned by the steering vectors, and it is orthogonal to the noise subspace.
(3) In practical applications, the noise variance \({{\sigma }^{2}}\) is unknown. Because the construction of the Hankel matrix is flexible, it can be built from elements \({{h}_{i,j}}\) with \(i\ne j\) only. In this way, the influence of noise is avoided.
From the singular values obtained by decomposing the Hankel matrix, the signal source number is estimated with a heuristic decision criterion: for the first index i at which \(C(i)\le 0\), the estimated number of signal sources is \(\widehat{N}_{p}=i-1\). C(i) is defined as:
$$\begin{aligned} C(i)=\widehat{\lambda }_{i}-\frac{1}{\min (k,q)}\sum \limits _{l=1}^{\min (k,q)}{\widehat{\lambda }_{l}} \end{aligned}$$
(25)
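The decision rule of Eq. (25) amounts to counting the singular values that exceed their mean. A minimal sketch follows; the function name and the example singular values are made up for illustration and are not results from this paper.

```python
import numpy as np

def estimate_source_count(singular_values):
    """Return i - 1 for the first 1-indexed i with C(i) <= 0.

    C(i) is the ith largest singular value minus the mean of all of them.
    """
    s = np.sort(np.asarray(singular_values, dtype=float))[::-1]
    c = s - s.mean()
    below = np.nonzero(c <= 0)[0]
    # With 0-based indexing, the first index where C <= 0 is already i - 1.
    # The fallback only guards against floating-point rounding of the mean.
    return int(below[0]) if below.size else len(s)

example = [9.0, 7.5, 0.2, 0.1, 0.05]      # hypothetical singular values
n_hat = estimate_source_count(example)
```

For the hypothetical values above the mean is 3.37, so the first two singular values give \(C(i)>0\) and the third gives \(C(3)\le 0\), yielding an estimate of two sources.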
The algorithm is described as follows:
Step 1: Perform quadrature demodulation on the signals received by the M array elements to obtain complex baseband signals.
Step 2: Construct a Hankel matrix from the M complex baseband signals; different Hankel matrices can be constructed by selecting different values of i, k, q, and j.
Step 3: Conduct SVD on the Hankel matrix obtained in step 2 to obtain the singular values.
Step 4: Determine the signal source number \({{N}_{p}}\) through the heuristic criterion in Eq. (25).
Step 5: Obtain the array's covariance matrix based on the M complex baseband signals following Eq. (16).
Step 6: Conduct eigenvalue decomposition on the obtained covariance matrix according to Eq. (26).
$$\begin{aligned} \begin{aligned} {{\mathbf {R}}_{yy}}=\mathbf {U\Sigma }{{\mathbf {U}}^{H}} \end{aligned} \end{aligned}$$
(26)
where \(\mathbf {\Sigma }=diag\left\{ {{\lambda }_{1}},{{\lambda }_{2}},\cdots ,{{\lambda }_{M}} \right\}\) stands for the eigenvalue matrix, and \(\mathbf {U}\) stands for the eigenvector matrix.
Step 7: Sort the eigenvalues from large to small. Using the signal source number obtained in step 4, form the signal subspace \({{\mathbf {U}}_{s}}\) from the eigenvectors corresponding to the first \({{N}_{p}}\) eigenvalues. Then, form the noise subspace \({{\mathbf {U}}_{N}}\) from the eigenvectors corresponding to the remaining smaller eigenvalues.
Step 8: Calculate the spatial spectrum function according to Eq. (27). Obtain the DOA estimate value of the signal by finding the peak value.
$$\begin{aligned} P_{MUSIC}(\theta )=\frac{1}{\mathbf {a}^{H}(\theta )\mathbf {U}_{N}\mathbf {U}_{N}^{H}\mathbf {a}(\theta )} \end{aligned}$$
(27)
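Steps 5 to 8 can be sketched end to end as follows. This is a simplified illustration, not the paper's full processing chain: the path signals are modeled as uncorrelated complex Gaussian sequences (standing in for paths already decorrelated by the m-sequence), the source number is taken as known instead of coming from step 4, and the array size, spacing (half wavelength), SNR, snapshot count, and angles are all assumed values.

```python
import numpy as np

rng = np.random.default_rng(0)
n_elements, n_sources, n_snapshots = 8, 2, 500
true_deg = np.array([-20.0, 30.0])        # assumed DOAs
phase_step = np.pi                        # omega_c * d / c_s for d = lambda / 2

def steering(theta_rad):
    """Columns are a(theta) for each angle, with tau_m = (m - 1) d sin(theta) / c_s."""
    m = np.arange(n_elements)[:, None]
    return np.exp(-1j * phase_step * m * np.sin(np.atleast_1d(theta_rad))[None, :])

def complex_noise(shape, power):
    return np.sqrt(power / 2) * (rng.standard_normal(shape)
                                 + 1j * rng.standard_normal(shape))

# Y = A S + n with unit-power sources and 10 dB SNR per source.
A = steering(np.deg2rad(true_deg))
S = complex_noise((n_sources, n_snapshots), 1.0)
Y = A @ S + complex_noise((n_elements, n_snapshots), 0.1)

# Step 5: sample covariance matrix.
R = Y @ Y.conj().T / n_snapshots
# Steps 6-7: eigh returns ascending eigenvalues, so the first M - N_p
# eigenvectors span the noise subspace.
_, eigvecs = np.linalg.eigh(R)
U_noise = eigvecs[:, : n_elements - n_sources]

# Step 8: MUSIC spatial spectrum on an angle grid.
grid_deg = np.linspace(-90.0, 90.0, 1801)
proj = U_noise.conj().T @ steering(np.deg2rad(grid_deg))
p_music = 1.0 / np.sum(np.abs(proj) ** 2, axis=0)

# Keep the N_p largest local maxima as the DOA estimates.
interior = p_music[1:-1]
is_peak = (interior > p_music[:-2]) & (interior > p_music[2:])
peak_idx = np.nonzero(is_peak)[0] + 1
best = peak_idx[np.argsort(p_music[peak_idx])[-n_sources:]]
est_deg = np.sort(grid_deg[best])
```

The denominator of Eq. (27) is computed as the squared norm of the projection of \(\mathbf{a}(\theta)\) onto the noise subspace, which is equivalent to \(\mathbf{a}^{H}\mathbf{U}_{N}\mathbf{U}_{N}^{H}\mathbf{a}\) and avoids forming the \(M\times M\) projector explicitly.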