Deep Learning Based BackCom Multiple Beamforming for 6G UAV IoT Networks

Abstract: Combining unmanned aerial vehicles (UAVs) with 6G, the Internet of Things (IoT), and other emerging communication technologies could better satisfy various IoT applications and create more innovative services. This paper develops a novel hierarchical 6G IoT network in which UAVs in the sky are equipped with intelligent reflective surfaces (IRSs). The system employs backscatter communication (BackCom) to transmit data in a free-ride manner. Through beamforming, the IRS enhances the energy of the reflected signal, thereby improving the range and performance of BackCom. Simulation results show that our approach significantly improves the performance of the whole system and has a clear advantage over traditional solutions.


I. INTRODUCTION
In recent years, with the development of 6G and Internet of Things (IoT) technology, the integration of UAVs and cellular systems has become a new network development trend. UAVs are currently showing vigorous development momentum in many industrial applications. They are expected to bring significant economic benefits in fields such as smart city construction, power and oil pipeline inspection, emergency communications, agriculture, forestry and plant protection, mineral exploration, and disaster assessment, and they have broad application prospects. In short, a 6G UAV network could better satisfy various IoT applications and create more innovative services.
In the UAV-cellular converged network, the UAV can act as an aerial base station or access point (AP) to collect information from a large number of IoT nodes distributed over a certain range and connect them to the 6G network. UAVs can also be equipped with IoT devices such as cameras and communication equipment to form a UAV IoT network. In many IoT applications, the collaborative work of multiple UAVs will be a common requirement. UAVs can not only communicate with ground cellular base stations but also form a self-organizing cluster network through a remote intelligent control platform. Multiple UAVs equipped with different IoT devices collect IoT data at different locations at the same time and either transmit the data directly to nearby ground base stations or communicate with ground base stations through a leader UAV with AP capability.
However, both UAV flight and IoT devices depend heavily on their power supply and are energy limited, which seriously hinders the promotion and popularization of UAV-based IoT applications. Moreover, under 6G millimeter wave communication, the system requires sophisticated radio frequency transceiver units and complex signal processing to achieve high-performance communication, which in turn requires the system to have sufficient energy. The cooperative work of multiple UAVs also poses an important challenge to the use of radio spectrum resources. Therefore, it is necessary to carry out innovative research on spectrum use and energy-saving technology under the premise of low hardware cost.
If IoT nodes could perceive the world without being bound by batteries, a battery-free passive IoT would no longer be a dream. Backscatter technology offers hope of freeing the IoT from its battery constraints. When a node backscatters the incident signal, it can encode and modulate the sensed data by modifying three parameters of the signal: amplitude, phase, and frequency. The backscattering system therefore uses the incident electromagnetic waves to load the data that IoT nodes need to transmit onto the scattered signal in a free-riding manner, and then transmits it to the receiver.
The backscatter system "cuts" the power-consuming RF circuit part and obtains energy from the incident signal. IoT devices can therefore transmit data with extremely low power consumption and cost, with energy consumption reduced to the microwatt level, which is an important feature of backscatter technology. At present, backscatter is mainly used in radar systems to measure the distance and azimuth of a target by using the reflected wave.
However, a fatal disadvantage of backscattering technology is that the reflective node harvests only weak energy from the surrounding environment. As a result, the distance between the reflective node and the receiver must be very short, and it is difficult for the receiver to distinguish extremely weak reflected signals from the original signal and other noise. An intelligent reflective surface (IRS) uses beamforming technology to concentrate signal energy in a chosen direction, which may become an effective means for backscattering systems to increase the communication distance.
An IRS is a surface that reflects incident signals in beam form. It consists of a large number of low-cost, reconfigurable passive components, each of which can be phase modulated independently to reflect the incident signal. By properly adjusting the phase shifts of all IRS passive elements, the incident signal received by the IRS can be beamformed and reflected to the receiving end, so that three-dimensional (3D) passive beamforming can be achieved without any transmit radio. Through beamforming, the IRS enhances the energy of the reflected signal, thereby improving the range and performance of the backscatter communication system. The IRS has no transmitter; it reflects the received signal as a passive array, so there is no transmission power consumption. From an implementation point of view, the IRS has low cost, strong adaptability, and very convenient deployment. Although the IRS has been used in radar systems, remote sensing, and satellite communications, it is rarely used in mobile wireless communications.
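The effect of phase alignment described above can be illustrated with a minimal sketch. It assumes a hypothetical single-antenna link with N reflecting elements, unit-modulus cascaded channel coefficients with random phases, and ideal continuous phase shifters; none of these values come from the paper.

```python
import cmath
import math
import random

def irs_gain(cascaded, phases):
    """Power of the coherent sum of element reflections for given phase shifts."""
    total = sum(h * cmath.exp(1j * p) for h, p in zip(cascaded, phases))
    return abs(total) ** 2

random.seed(0)
N = 64  # hypothetical number of reflecting elements
# Cascaded (incident times reflected) channel per element: unit modulus, random phase.
cascaded = [cmath.exp(1j * random.uniform(-math.pi, math.pi)) for _ in range(N)]

# Co-phasing: each element cancels the phase of its own cascaded channel.
aligned = [-cmath.phase(h) for h in cascaded]
unaligned = [0.0] * N

gain_aligned = irs_gain(cascaded, aligned)      # N**2 for unit-modulus channels
gain_unaligned = irs_gain(cascaded, unaligned)  # random-phase sum, typically on the order of N
```

With co-phased elements the received power grows with the square of the number of elements, which is the reason a passive surface can extend the backscatter range without any transmit power of its own.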
The remainder of this paper is organized as follows. Section II and Section III present the general system model and the improved MIMO IRS model, respectively. Section IV develops an A-LSTM based trajectory prediction scheme to handle the high-speed mobility of UAVs. Section V gives numerical results to justify the performance of the proposed system, and Section VI concludes the paper.

A. System architecture
In the face of future 6G IoT demand for ultra-high coverage, expanding communication coverage has become an inevitable trend in the development of smart cities. Because adding base stations is costly and time-consuming, drones are currently one of the most effective solutions for improving communication coverage. There are two main differences between UAV communication and traditional ground wireless communication. First, because UAVs usually have a strong line-of-sight connection with ground nodes, they provide better channel conditions than ground fading channels, and the channel state information (CSI) of UAVs at different 3D positions can even be predicted from the location information of ground nodes and the communication performance. Second, a UAV has fully controllable 3D maneuverability: it can adjust its height and horizontal position at any time to optimize its communication performance with ground nodes.
As shown in the communication scenario of Fig. 1, the use of a cylindrical antenna array can effectively enhance the robustness of the communication link. To ensure the communication efficiency between the UAV and the base station, the base station needs to perform accurate beamforming and capture the specific position of the UAV in the air. The current position of the UAV can be obtained with a traditional direction-of-arrival (DOA) estimation algorithm for cylindrical antenna arrays. However, because the UAV moves quickly and the estimate is subject to a certain time delay, traditional DOA estimation of the current position alone is not enough to meet the needs of the entire system. Therefore, an angle predictor is needed to predict the position of the drone at the next moment.
To accurately capture the status information of the UAV, a new backscatter communication scenario for drone communication is considered, as shown in Fig. 1, which consists of a backscatter device (BD), a receiver, and a UAV. It is assumed that the drone can freely adjust its heading at a fixed height H and that its limited flying time is T. To make the problem easier to handle, the horizontal position of the UAV at time n is denoted by w. In energy-efficient network architectures, we first discuss power issues in cellular, D2D, and IoT networks, and then review various attempts to address energy constraints. Fig. 1 shows the ambient backscatter communication (AmBC) system: the source S is a UAV, the receiver R is equipped with M antennas arranged as a uniform linear array (ULA), and the passive tag G has a single antenna. R not only receives the signal directly from S but also collects the backscattered signal from G. G first harvests energy from the drone signal.
By deliberately changing its load impedance, G either reflects or absorbs the received signal and thereby carries its information on the UAV carrier.
Let s(n) be the signal of the UAV source with power Ps, and let B(n) ∈ {0, 1} be the modulated signal at the tag, which remains unchanged during N consecutive UAV signals. Define θ0 ∈ [-π/2, π/2] and θ1 ∈ [-π/2, π/2] as the azimuth angles, or directions of arrival, of paths S−R and G−R, respectively. Denote the channel gains of S−R, S−G, and G−R as hsr, hsg, and hgr, respectively. The attenuation factor inside the tag is denoted as η ∈ (0, 1]. The received signal at the reader is then the superposition of the direct S−R signal and the signal backscattered by G over the S−G and G−R paths.
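A form commonly used for the received signal in this kind of AmBC setting, written with the symbols defined above, can be sketched as follows. The ULA steering vector a(θ), the element spacing d, the carrier wavelength λ_c, and the noise vector z(n) are assumptions introduced for illustration and are not taken from the paper.

```latex
\mathbf{y}(n) = h_{sr}\,\mathbf{a}(\theta_0)\,s(n)
              + \eta\,h_{sg}\,h_{gr}\,\mathbf{a}(\theta_1)\,B(n)\,s(n)
              + \mathbf{z}(n),
\qquad
\mathbf{a}(\theta) = \left[\,1,\ e^{-j 2\pi d \sin\theta/\lambda_c},\ \ldots,\ e^{-j 2\pi (M-1) d \sin\theta/\lambda_c}\right]^{T}
```

The first term is the direct S−R path, and the second term is the backscattered S−G−R path, which is present only when the tag state B(n) equals 1.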

C. Reconfigurable Reflectarray
The reflectarray antenna is a directional antenna that behaves somewhat like a parabolic reflector. Instead of relying on the physical shape of the antenna to determine the reflection characteristics, the reflecting surface is composed of many reflective elements. Since the elements can provide phase compensation, the antenna reflects the incident wave from the electromagnetic radiation source and forms the main beam in a specific direction. In this way, the reflected wave is beamformed: the reflectarray antenna receives the input wave and reflects it toward a predetermined spatial direction, as shown in Fig. 2. The reflectarray antenna is composed of an array of reflectarray elements and a feed. It collimates the wave radiated by the feed by adjusting the reflection phase of each reflecting element. In reflectarray design, the key issue is how to change the reflection phase of each element. A reflectarray antenna usually works at a single frequency and has a fixed main beam. The reconfigurable reflectarray antenna (RRA) combines features of a parabolic antenna and a phased array antenna. It adopts a planar structure and is easy to fabricate. In addition, the RRA is more flexible than traditional reflectors, which realize beam scanning through mechanical scanning. The feeding network is simple, the transmission loss is reduced, and the radiation efficiency is greatly improved. The element design is flexible, and different resonant elements can be designed to achieve multi-beam and beam scanning functions [2].
The topology of the novel slot-coupled digitally reconfigurable reflectarray element is shown in Fig. 2(a). The DC bias circuit controls the switching of the PIN diode on the phase delay line, thereby changing the propagation path of the electromagnetic wave and producing the 180° phase difference used for beam scanning.
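A two-state element of this kind can only approximate the ideal compensation phase. The following minimal sketch quantizes hypothetical ideal phases to the two states 0 and π and measures the resulting loss in coherent gain; the element count and the uniformly random ideal phases are assumptions for illustration only.

```python
import cmath
import math
import random

def quantize_1bit(phase):
    """Map a desired phase to the nearer of the two element states, 0 or pi."""
    wrapped = (phase + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi)
    return 0.0 if abs(wrapped) <= math.pi / 2 else math.pi

random.seed(1)
N = 256  # hypothetical number of elements
desired = [random.uniform(-math.pi, math.pi) for _ in range(N)]
states = [quantize_1bit(p) for p in desired]

# Residual phase error per element after quantization (at most pi/2 in magnitude).
residual = [cmath.exp(1j * (p - s)) for p, s in zip(desired, states)]
coherent = abs(sum(residual)) ** 2

# Loss relative to ideal continuous phase control, whose coherent power is N**2.
loss_db = 10 * math.log10(coherent / N ** 2)
```

For uniformly distributed phase errors, the loss of a 1-bit design is expected to be a few decibels, which is the price paid for the very simple PIN-diode control.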
Reconfigurable reflectarrays can change the delay of each element and direct the reflected wave in different directions at different times. These elements are represented by dots in Fig. 2. The elements of the reflective surface are called meta-atoms or reflective elements. In short, each element can be thought of as an antenna that captures the radio signal, holds it for a short time, and then re-radiates it. A reflectarray can thus be regarded as a passive MIMO array.
From a conceptual point of view, building future networks in this way is an exciting prospect. However, research on this topic is still in its infancy. The most important task is to demonstrate practically important use cases of reconfigurable reflectarrays.

III. IMPROVED MIMO IRS SYSTEM MODEL
We consider IRS-assisted downlink communication in a single-cell network, where an IRS is deployed to assist communication from a multi-antenna AP to K single-antenna users on a given frequency band. The number of transmitting antennas at the AP and the number of reflecting units at the IRS are denoted by M and N, respectively. The IRS is equipped with a controller that coordinates its switching between two working modes, namely the receiving mode for channel estimation and the reflection mode for data transmission [3]. Due to the high path loss, the power of signals reflected by the IRS two or more times is assumed to be negligible and is ignored. To characterize the theoretical performance gain brought by the IRS, we assume that the AP has perfect channel state information (CSI) for all channels involved. In addition, all channels follow a quasi-static flat fading model. Since the IRS is a passive reflection device, we consider a time division duplex (TDD) protocol for uplink and downlink transmission and assume channel reciprocity, so that downlink CSI is acquired from uplink training.
We consider linear transmit precoding at the AP. The complex baseband transmitted signal at the AP, the parameters G and Θ, the SINR of user k, and the SINR of the system are given in the ANNEX chapter, which contains the detailed formula description of the system model.
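For orientation, a formulation that is widely used for this type of IRS-assisted multi-user downlink can be sketched as follows. It is an illustrative assumption, not the paper's own set of equations: the precoding vectors w_k, the data symbols s_k, the direct channels h_{d,k}, the IRS-to-user channels h_{r,k}, the element amplitudes β_n, the phase shifts ϑ_n, and the noise power σ_k² are all symbols introduced here.

```latex
\mathbf{x} = \sum_{k=1}^{K} \mathbf{w}_k s_k, \qquad
\mathbf{G} \in \mathbb{C}^{N \times M}, \qquad
\boldsymbol{\Theta} = \operatorname{diag}\!\left(\beta_1 e^{j\vartheta_1}, \ldots, \beta_N e^{j\vartheta_N}\right)

\mathrm{SINR}_k =
\frac{\left| \left( \mathbf{h}_{r,k}^{H} \boldsymbol{\Theta} \mathbf{G} + \mathbf{h}_{d,k}^{H} \right) \mathbf{w}_k \right|^{2}}
     {\sum_{j \neq k} \left| \left( \mathbf{h}_{r,k}^{H} \boldsymbol{\Theta} \mathbf{G} + \mathbf{h}_{d,k}^{H} \right) \mathbf{w}_j \right|^{2} + \sigma_k^{2}}
```

In this sketch G plays the role of the AP-to-IRS channel and Θ collects the per-element reflection coefficients, so the effective channel of user k is the sum of the reflected and direct paths.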

A. Detailed Explanation of A-LSTM
The structure of the A-LSTM model is similar to that of the encoder-decoder model. The A-LSTM model, which combines an LSTM with an attention mechanism, is widely used in sequence prediction tasks [8,9], including machine translation, document extraction, and question answering.
LSTM, a more elaborate recurrent neural network (RNN), excels at processing temporal information. Through gradient truncation and gated regulation of the information flow, it addresses long-term dependence, vanishing gradients, and exploding gradients in back propagation through time (BPTT) [10]. Fig. 3(a) shows the LSTM cell structure at time t. As shown in Fig. 3(a), xt is the input vector, ct is the cell state, and ht is the hidden state at the current time. There are three gates in the figure: the forget gate f, the input gate i, and the output gate o. The forget gate controls whether cell state information is forgotten or passed on. The input gate controls how new information is merged into the cell state. The output gate determines how much of the current cell state is exposed as the output. Specifically, the mapping from an input vector sequence x=(x1,x2,…,xT) to an output sequence h=(h1,h2,…,hT) is specified by the gate equations, where ft, it, ot, and ct represent the forget gate, input gate, output gate, and cell state vectors at the current time, respectively, σ is the logistic function, which maps to values between 0 and 1, and W* and b* represent the weight matrices and bias vectors, respectively [11,12].
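The gate computations can be sketched as a single LSTM time step. The sketch below uses the common formulation without peephole connections, which is an assumption, since it may differ in detail from the formulation in [11,12]. The dictionary keys, toy sizes, and weight values are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Compute W v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; each gate acts on the concatenation [h_prev, x_t]."""
    z = h_prev + x_t  # list concatenation
    f_t = [sigmoid(a) for a in affine(params["W_f"], z, params["b_f"])]    # forget gate
    i_t = [sigmoid(a) for a in affine(params["W_i"], z, params["b_i"])]    # input gate
    o_t = [sigmoid(a) for a in affine(params["W_o"], z, params["b_o"])]    # output gate
    g_t = [math.tanh(a) for a in affine(params["W_c"], z, params["b_c"])]  # candidate state
    # New cell state: keep part of the old state, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    # Hidden state: gated, squashed view of the cell state.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Toy sizes and weights, chosen only for illustration.
hidden, inputs = 2, 2
W = [[0.1] * (hidden + inputs) for _ in range(hidden)]
params = {f"W_{g}": W for g in "fioc"}
params.update({f"b_{g}": [0.0] * hidden for g in "fioc"})

h, c = [0.0] * hidden, [0.0] * hidden
for x in ([0.5, -0.2], [0.4, -0.1], [0.3, 0.0]):  # e.g. normalized angle pairs
    h, c = lstm_step(x, h, c, params)
```

Because the output gate lies in (0, 1) and tanh lies in (-1, 1), every hidden state component stays strictly inside (-1, 1), which is one reason the inputs are normalized before training.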
The A-LSTM model [13] mainly deals with sequence-to-sequence ("Seq-to-Seq") problems. The details of the A-LSTM model are shown in Fig. 3(b). The encoding, storage, and decoding parts of the figure correspond to the sequence data xj, the relational vector Ci, and the sequence output yi, respectively. As shown in Fig. 3(b), xj and hj represent the input sequence data and the hidden state in the encoder, and yi and Si represent the output sequence data and the hidden state in the decoder. eij represents the correlation between the encoder hidden state and the decoder state. aij is the weight derived from eij: the higher the value of aij, the greater the influence of xj on yi. In Equation (11), f and g denote the activation functions, xj is the input vector, j=1,…,Tx, and yi is the output data, i=1,…,Ty.
For automatic information generation, the encoder is implemented as an LSTM. As shown in Fig. 3(b), each xj represents the input vector at one time step. As time goes by, the hidden state hj of the LSTM is updated with each successive input xj. The decoder is also defined as an LSTM, which outputs the sequence data yi. The relational vector Ci, which is calculated through a series of function transformations of the encoded input vectors xj and the decoder hidden state Si-1, is the only link between the encoder and the decoder. By assigning weights to hj through the softmax function, Ci places different degrees of attention on each hj. Because Ci summarizes the useful information of the input sequence xj, it guides the output of the decoder. The results show that capturing useful sequence information during training is necessary to effectively increase the precision of decoding prediction [14].
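The computation of the weights aij and the relational vector Ci can be sketched for one decoder step. The scoring function is an assumption: a plain dot product between Si-1 and hj is used here for brevity, whereas the model in [13] may use a learned scoring function. The toy encoder states are hypothetical.

```python
import math

def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def context_vector(s_prev, encoder_states):
    """Return (a_i, C_i) for one decoder step, using dot-product scores."""
    # e_ij: correlation between decoder state S_{i-1} and encoder state h_j.
    e = [sum(a * b for a, b in zip(s_prev, h_j)) for h_j in encoder_states]
    a = softmax(e)  # a_ij: non-negative weights that sum to one
    dim = len(encoder_states[0])
    # C_i: weighted sum of the encoder hidden states.
    C = [sum(a_j * h_j[d] for a_j, h_j in zip(a, encoder_states)) for d in range(dim)]
    return a, C

encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy h_1..h_3
weights, C_i = context_vector([2.0, 0.0], encoder_states)
```

In this toy case the first and third encoder states score equally against the decoder state, so they receive equal weight and dominate the context vector.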
The above description introduces the LSTM and the A-LSTM, which combines the LSTM with an attention mechanism. During training of the A-LSTM model, the network parameters are continuously updated according to the loss function, so as to realize time series prediction.

B. A-LSTM Location Prediction Model
This section focuses on the prediction process of the A-LSTM model, whose structure is shown in Fig. 3(b). Since the UAV moves fast and is susceptible to external factors, it is necessary to predict the location of the UAV at the next time step with the A-LSTM location model. First, the current UAV location information, namely the pitch angle θ and the horizontal angle φ, is obtained from the DoA spatial spectrum of the uniform rectangular array (URA). Next, a preprocessing system processes the angle data to obtain the redefined angle information θ* and φ*. Finally, the redefined information is fed to the input layer and mapped to the next-time data by the A-LSTM model. As the number of training epochs increases, the best A-LSTM model parameters are kept and used to predict the pitch and horizontal angles.

1) Acquisition of A-LSTM training samples
For the A-LSTM model, the acquisition of training samples is crucial, so we have also studied DOA estimation. Following the discussion above, the covariance matrix of the received signal x(t) in Equation (29) is defined in terms of the following quantities: Rx is the source covariance matrix, σ² is the common variance, and I denotes an MN×MN identity matrix.
In addition, the standard subspace method can be applied to decompose the covariance matrix of x(t) into its eigenvalues and eigenvectors, where λ1 ≥ λ2 ≥ ··· ≥ λP > λP+1 = ··· = λMN are the eigenvalues and e1,e2,…,eMN are the associated eigenvectors. Es contains the eigenvectors of the P largest eigenvalues, and En contains the eigenvectors of the MN-P smallest eigenvalues. The number of sources P is estimated using the minimum description length principle. Further, the subspace spanned by a(θ1, φ1),…, a(θP, φP) is the same as the signal subspace Es and orthogonal to the noise subspace En. Thus, each steering vector a(θp, φp), p = 1, 2, …, P, is orthogonal to En, where the URA steering vector is formed as a Kronecker product. The DoA estimate of the URA can then be obtained, and the spatial spectrum is defined according to the multiple signal classification (MUSIC) method [15].
According to Equation (32), two maxima of the spectrum search can be obtained, which correspond to the pitch angle and the horizontal angle of the incident signal source, respectively.
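The subspace steps above follow the usual MUSIC construction, which can be summarized as below. This is a sketch in standard textbook notation, assuming spatially white noise of variance σ²; the symbols R, Λ_s, and S_MUSIC are introduced here and may not match the numbering or notation of Equations (29) to (32).

```latex
\mathbf{R} = E\{\mathbf{x}(t)\,\mathbf{x}^{H}(t)\}
           = \sum_{i=1}^{MN} \lambda_i\, \mathbf{e}_i \mathbf{e}_i^{H}
           = \mathbf{E}_s \boldsymbol{\Lambda}_s \mathbf{E}_s^{H} + \sigma^{2}\, \mathbf{E}_n \mathbf{E}_n^{H}

\mathbf{E}_n^{H}\, \mathbf{a}(\theta_p, \varphi_p) = \mathbf{0}, \qquad p = 1, 2, \ldots, P

S_{\mathrm{MUSIC}}(\theta, \varphi) = \frac{1}{\left\| \mathbf{E}_n^{H}\, \mathbf{a}(\theta, \varphi) \right\|^{2}}
```

Because the denominator vanishes at the true arrival angles, the spectrum shows sharp peaks there, and a 2D search over (θ, φ) returns the angle estimates used as training samples.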
Meanwhile, the A-LSTM location prediction model, which is defined by its parameters and structure, is treated in this paper as a mapping function. Once the URA receives the signal of the BS, the BS can obtain feedback from the URA about the received signal vector x(t). The two peak values of the spatial spectrum then give the current pitch angle and horizontal angle of the UAV, which means that 2D DoA angle information can be obtained continuously. Consequently, the received signals can be acquired continuously during the air communication of the UAV, yielding the 2D arrival angles at different moments, which serve as training samples.
Fig. 4 Data processing
To further enhance the feasibility and stability of the A-LSTM prediction, a series of preprocessing steps is applied to the acquired dataset. Data cleaning aims at detecting errors and inconsistencies in the data and eliminating or correcting them to improve data quality, so that the location information data become uniform. Therefore, we first clean the angle data in the preprocessing system.

2) A-LSTM prediction system
In addition, the stationarity of the training set directly affects the error of the training result of the proposed A-LSTM location prediction model. Therefore, the Augmented Dickey-Fuller (ADF) test is carried out after data cleaning. The ADF test statistic and its critical values indicate whether the data are stationary. The null hypothesis of the ADF test is that a unit root exists. If the test statistic is smaller than the critical value at a given significance level, the null hypothesis is rejected and the raw data set is considered stationary; the rejection is most significant when the statistic is below the 1% critical value. The ADF results are shown in TABLE I. As the table shows, the test statistic is far below the 1% critical value, and therefore also below the 5% and 10% critical values. At the same time, the probability value (P value) of the test is close to zero. Hence, we can conclude that the obtained angle data are stationary.
TABLE I. ADF test result: Critical value (10%) = -2.56682
Since the position of the UAV varies over time, specific angle data are generated at different moments. Data integration is therefore used to gather the pitch angle θ and the horizontal angle φ at different time nodes into the training angle database. Taking the pitch angle as the X axis and the horizontal angle as the Y axis forms an angle coordinate system, so that information from two unrelated perspectives is combined and given new meaning. In addition, data normalization reduces the dominance of any single attribute in multi-attribute sample data and improves the speed and accuracy of finding the optimal solution by gradient descent.
Finally, data reconstruction is realized to adapt the data to the input data structure of A-LSTM model.
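The normalization step can be illustrated with a minimal sketch. Min-max scaling to [0, 1] is an assumption, since the normalization method is not specified above, and the angle values are hypothetical.

```python
def min_max_normalize(values):
    """Scale a sequence to [0, 1]; return the scaled list and the (min, max) used."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant series: avoid division by zero
        return [0.0 for _ in values], (lo, hi)
    return [(v - lo) / (hi - lo) for v in values], (lo, hi)

def denormalize(scaled, bounds):
    """Map scaled values back to the original range, e.g. for predicted angles."""
    lo, hi = bounds
    return [s * (hi - lo) + lo for s in scaled]

pitch_deg = [30.0, 32.5, 35.0, 40.0]       # hypothetical pitch angles
azimuth_deg = [-120.0, -60.0, 0.0, 120.0]  # hypothetical horizontal angles
pitch_n, pitch_bounds = min_max_normalize(pitch_deg)
azimuth_n, azimuth_bounds = min_max_normalize(azimuth_deg)
```

Scaling each attribute separately keeps the pitch and horizontal angles on the same numeric range even though their physical ranges differ, and the stored bounds allow predictions to be mapped back to degrees.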
To further enhance the performance of the A-LSTM location prediction model, we adopt a sliding window to guarantee real-time prediction. If the length of the sliding window is n, the (n+1)th data point is predicted automatically by the A-LSTM model. While the UAV and BS communicate continuously, the latest UAV azimuth information is obtained at every moment. Over time, the latest data are imported into the data structure to be predicted, while outdated data are deleted automatically.
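The sliding-window scheme can be sketched as follows, assuming the angle history is stored as a list of (θ, φ) pairs. The sample values, the window length, and the helper name make_windows are hypothetical.

```python
from collections import deque

def make_windows(series, n):
    """Pair every n consecutive samples with the sample that follows them."""
    return [(series[i:i + n], series[i + n]) for i in range(len(series) - n)]

angles = [(10, 40), (12, 41), (14, 42), (16, 43), (18, 44)]  # hypothetical (theta, phi)
pairs = make_windows(angles, n=3)  # training pairs: window of 3 -> next sample

# Online use: a bounded deque drops the oldest sample when a new one arrives.
window = deque(angles[:3], maxlen=3)
window.append((16, 43))
```

The first helper builds supervised training pairs offline, while the bounded deque mirrors the online behaviour in which the newest measurement replaces the oldest one before each prediction.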
After the above processing, the training data, arranged in the required structure, are fed into the A-LSTM. After training, the structure and parameters of the A-LSTM model can guarantee the accuracy of prediction, so as to achieve azimuth prediction. Predicting the location of the UAV at the next moment helps the BS achieve accurate beamforming, which improves the communication quality within the coverage of the UAV. The following experiments are used to demonstrate the reliability, stability, and prediction accuracy of the A-LSTM.
The results verify that the A-LSTM model is suitable for trajectory prediction and performs well, which shows that the A-LSTM model pays more attention to the trajectory angle of the UAV.

V. RESULTS AND DISCUSSION
The simulation scenario is shown in Fig. 1: black dots represent base stations (BSs); the red dot represents the main unmanned aerial vehicle (Muav); the blue dot represents a small IoT unmanned aerial vehicle (Iuav); and the hexagon represents the service area covered by a base station, which is composed of three sectors, each corresponding to a phased array antenna. In this part, we discuss two types of links in the simulation: the first type of link is Iuav->Muav; the second type of link is BS->Muav.
For Iuav->Muav, Iuav is an IoT terminal. To ensure the power continuity of Iuav, Iuav transmits to Muav in backscattering mode. The ambient electromagnetic wave source for Iuav is the BS. The BS points its beam at Iuav through beamforming to ensure that Iuav receives enough energy for reflection. The beamforming direction needs to be predicted with LSTM to ensure the accuracy of BS beamforming. Iuav is equipped with an intelligent reflective surface (IRS), which only reflects signals and does not emit signals. After LSTM prediction and calibration, the IRS beam points to Muav to ensure the strength of the backscattering link.
For BS->Muav, Muav is the main unmanned aerial vehicle (UAV). It is equipped with a phased array antenna, has strong capabilities, and acts as the central node in the air. The BS->Muav link is an important backhaul/fronthaul path and is the gateway between the ground network and the air network. Normally, only ray1 exists on the BS->Muav link, for which the BS needs to point its beam at Muav. Beforehand, the position of Muav is also predicted with LSTM to ensure the accuracy of beamforming. However, due to the existence of the first type of link (Iuav->Muav), Muav has not only the receiving path of ray1 but also the receiving path of ray2, which is based on artificial reflection. The signal receiving paths of Muav are shown in Fig. 1. Therefore, making full use of the ray2 energy through precoding can increase the strength of the BS->Muav link.
The two links are compared in Fig. 5. The first type of link, Iuav->Muav, uses backscattering technology, IRS, and LSTM to save resources (including energy, spectrum, and computing power) on Iuav and complete the IoT transmission. The second type of link, BS->Muav, uses beamforming, precoding, and LSTM to increase the strength of the link. LSTM predicts the beamforming direction and thereby increases the antenna gain, as shown in Fig. 6.
The simulation parameters are listed in Table 2: the number of base stations is 7, the cell radius is 450 meters, the angle of each sector is 120°, the number of Muav in each sector is 1, the number of Iuav in each sector is 3, and the number of LSTM trainings is 250.
The simulation results are shown in Fig. 7. Link2 capacity (ray1): with only ray1 present, the reception of link2 is shown as a CDF (lower), and the SINR corresponds to the capacity C (upper). Link2 capacity (ray1+ray2): the simultaneous presence of ray1 and ray2 raises the received energy, so this curve is higher than the Link2 (ray1) capacity curve. Link2 capacity (ray1+ray2+backscatter): the addition of backscatter essentially adds noise to ray2 of link2, which causes some performance loss, so this curve lies between the Link2 (ray1) and Link2 (ray1+ray2) capacity curves. Link1 capacity (backscatter): the capacity generated by backscatter is essentially taken from the capacity generated by ray2 of link2. Dotted lines: imperfect LSTM prediction makes the beamforming of the entire system inaccurate and causes a partial loss of antenna gain, so each dotted line is slightly worse than the corresponding solid line. Here, multiple sets of dotted lines can be added to correspond to different RNN algorithms and parameter configurations.
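The relation between SINR, capacity, and the CDF curves can be sketched as follows. The Shannon formula is an assumption, since the capacity computation behind Fig. 7 is not stated, and the SINR samples are hypothetical.

```python
import math

def capacity_bps_hz(sinr_db):
    """Shannon spectral efficiency log2(1 + SINR) for an SINR given in dB."""
    return math.log2(1.0 + 10.0 ** (sinr_db / 10.0))

def empirical_cdf(samples):
    """Return sorted samples and the fraction of samples at or below each one."""
    ordered = sorted(samples)
    n = len(ordered)
    return ordered, [(k + 1) / n for k in range(n)]

sinr_db_samples = [0.0, 3.0, 10.0, 20.0]  # hypothetical per-drop SINR values
caps = [capacity_bps_hz(s) for s in sinr_db_samples]
xs, cdf = empirical_cdf(caps)
```

Under this assumption, any antenna-gain loss caused by imperfect angle prediction lowers the SINR samples and shifts the capacity CDF to the left, which is the qualitative behaviour described for the dotted lines.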

VI. CONCLUSIONS
Driven by the market, UAV industries are starting to push the digital transformation of their products and services. We are entering the era of ubiquitous IoT, with all kinds of things equipped with computing and communication capabilities. This paper develops a novel hierarchical 6G IoT network of UAVs equipped with BackCom IRS. We focus on deep learning based BackCom multiple beamforming in order to improve the energy of the reflected signal. Simulation results indicate that our approach can not only save precious spectrum but also promote green communication by cutting energy consumption. The author keeps the analysis and simulation data sets, but the data sets are not public.