 Research
 Open Access
Gesture recognition method based on misalignment mean absolute deviation and KL divergence
EURASIP Journal on Wireless Communications and Networking volume 2022, Article number: 96 (2022)
Abstract
At present, channel state information (CSI) can be conveniently collected from ubiquitous commercial WiFi network cards, and the location or activity of a person who affects the CSI can be recognized by analyzing how the CSI changes. Therefore, CSI-based wireless sensing technology has received widespread attention. However, existing CSI-based gesture recognition methods still have some problems: subcarrier selection is not optimized and motion interval extraction is not accurate enough, so the accuracy of gesture recognition still needs to be improved. In response to these problems, this paper proposes a gesture recognition method based on misalignment mean absolute deviation (MMAD) and KL divergence, called the MMADKLGR method. This method uses the proposed MMAD algorithm to extract the CSI amplitude intervals containing gesture information, then selects subcarriers by comparing the KL divergence of the CSI amplitudes, and finally uses the subspace K-nearest neighbor (KNN) algorithm to recognize the gestures. Several experiments show that the MMADKLGR method can effectively improve the accuracy of gesture recognition.
Introduction
Since the beginning of the 21st century, the rapid development of big data, cloud computing and the Internet of Things has promoted the development of various intelligent technologies, such as smart homes, smart schools, smart cars, and robots. Human–computer interaction technology is the basis for realizing the above-mentioned intelligent technologies. In human–computer interaction, a user gives instructions to a device through different gestures. Therefore, how the device recognizes the gestures is a key technology of human–computer interaction.
At present, gesture recognition technology mainly falls into the following four categories. Video-based gesture recognition technology captures videos through cameras and then recognizes gestures by extracting motion features from the videos. The advantage of this technology is that it can detect tiny gestures with high recognition accuracy, but the disadvantages are that it is sensitive to ambient light, cannot recognize gestures in non-line-of-sight situations, and cannot protect user privacy [1, 2].
Gesture recognition technology based on infrared light uses the principle of infrared radiation to recognize gestures. The disadvantages are that gestures cannot be recognized in non-line-of-sight conditions, the equipment costs are high, and large-scale deployment is difficult [3].
Gesture recognition technology based on wearable devices requires a user to carry wearable devices which integrate a variety of sensors, such as accelerometers and gyroscopes, and these sensors can record the data about user gestures [4]. The disadvantage of this technology is that it is inconvenient to carry the device around. If a user forgets to wear the device, the gesture recognition will stop.
Commercial WiFi devices are ubiquitous, and WiFi signals cover almost every corner of people’s lives. In 2011, a tool that can obtain channel state information (CSI) data from Intel 5300 wireless network cards was released, which makes it very convenient to obtain the CSI data of each communication link by using commercial WiFi devices [5]. In the communication process of WiFi devices, each communication link contains multiple subcarriers, and the CSI of each subcarrier carries amplitude and phase information, so the CSI data can stably reflect the signal changes caused by reflection and diffraction from a human body. Based on these principles, many scholars use CSI data to recognize the locations and activities of a human [6, 7], including gestures. Compared with traditional gesture recognition technologies, CSI-based technology has the following advantages: it requires no special equipment deployment and no additional wearable sensors, it does not leak privacy, and it is insensitive to light intensity and line-of-sight (LOS) conditions [8, 9]. In the early days of WiFi-based sensing, most scholars used received signal strength (RSS) to recognize gestures. RSS is a coarse-grained measurement that is greatly affected by the environment. For example, when someone moves in a monitoring area, the RSS may increase, decrease, or even remain unchanged. In recent years, many scholars have devoted themselves to exploring the CSI of wireless signals. Compared with RSS, CSI is a fine-grained measurement that is less affected by the environment and has better stability, so CSI-based recognition technology has received extensive attention from the academic community [6,7,8,9,10,11,12].
Although the existing methods have good performance in their respective experimental environments, there are still some problems, which include that subcarrier selection is not optimized and motion interval extraction is not accurate, so the accuracy of gesture recognition methods needs to be further improved. In response to the above problems, we propose a gesture recognition method based on misalignment mean absolute deviation (MMAD) and KL divergence, which is called MMADKLGR method. The main contributions are as follows:

1. Before recognizing gestures, it is necessary to extract the CSI amplitude interval affected by the gestures in order to improve the accuracy of gesture recognition. For this reason, we propose an MMAD motion interval extraction algorithm. Based on the MAD algorithm, this algorithm considers the various situations of the starting point and the ending point of the motion interval and improves the accuracy of the motion interval extraction.

2. Different subcarriers are affected by gestures to different degrees, so it is necessary to select subcarriers that are more affected by gestures and less affected by noise in order to obtain higher gesture recognition accuracy. For this reason, we propose a subcarrier selection algorithm based on KL divergence. This algorithm compares the KL divergences of the CSI amplitudes to select subcarriers and obtains better results.

3. To further improve the accuracy of gesture recognition based on CSI, we propose the MMADKLGR method, which uses the subspace K-nearest neighbor (KNN) algorithm for classification. Through several experiments, we verified the good performance of the proposed MMADKLGR method and analyzed the influence of training samples, transceiver spacing, human body position, indoor environment and features on the proposed method.
The rest of the paper is organized as follows. Section 2 introduces the related works of the paper. Section 3 describes in detail the MMADKLGR method proposed in this paper. Section 4 gives a detailed analysis of the experimental results. Section 5 summarizes this paper.
Related works
In recent years, a large number of CSI-based wireless sensing methods have emerged, such as activity recognition and gesture recognition methods.
Dang et al. used the principal component analysis (PCA) algorithm to build a fingerprint database of CSI amplitude data, used the Kalman filter algorithm to obtain data for classification, and then used the support vector machine (SVM) algorithm and fingerprint database matching for activity recognition [6]. Cheng et al. proposed a CSI-based human continuous activity recognition system. This system uses the CSI phase difference matrix and a method based on threshold and label to segment the continuous activities and then uses a Gaussian mixture model–hidden Markov model (GMM-HMM) to recognize activities [10]. The above methods use the time-domain features of CSI to recognize activities, and some methods also use the frequency-domain features. Wavelet transform can locally analyze time and frequency and is an ideal tool for time–frequency analysis and processing of signals [13,14,15,16,17,18,19]. Therefore, Wang et al. used a wavelet transform to extract the features of CSI and designed a two-stage recognition method to jointly recognize the locations and activities of multiple targets [11]. Tian et al. constructed a time–frequency matrix by using signal preprocessing and wavelet transform and then extracted multidimensional features in the time domain and frequency domain as the input feature vector of a bidirectional long short-term memory (BLSTM) network [12].
The above methods based on CSI can recognize the activities with large amplitude. However, the gestures with small amplitude can also be recognized by using CSI.
Tian et al. proposed a CSI-based device-free gesture recognition system, namely the WiCatch system. Firstly, a new interference cancellation algorithm based on data fusion is proposed to capture weak reflected signals. Secondly, the motion trajectories of gestures are reconstructed by constructing a virtual antenna array of time-domain signal samples. Finally, the SVM algorithm is used to complete the classification [20]. Zhang et al. proposed a gesture recognition system called the WiGrus system. The system first uses the PCA method and the first-order difference method to denoise CSI data, then extracts multiple features that can characterize the gestures, and finally proposes a two-stage radio frequency algorithm to classify the gestures [21]. Thariq et al. proposed a sign language recognition system, namely the DF-WiSLR system. The system can be used to recognize 30 static gestures and 19 dynamic gestures and can obtain better recognition accuracy for dynamic gestures composed of compound word symbols [22]. Jiang et al. proposed the WiGAN gesture recognition system, which not only solves the problem of performance degradation caused by small samples and strong environmental dependence, but also incorporates more diverse features to improve the accuracy of gesture recognition [23]. Hao et al. proposed a fine-grained sign language recognition method, which first filters out environmental interference in the frequency domain through a Butterworth filter, then uses wavelet transform to smooth the CSI data, and finally builds a low-complexity KSB classification model [24]. Tan et al. proposed a finger gesture recognition algorithm, which effectively removes environmental noise and develops a measure that can recognize gestures by dealing with individual diversity and gesture inconsistency. Experimental results show that the algorithm has good recognition accuracy and robustness in a changing environment [25]. Zhang et al. proposed a WiFi-based cross-domain gesture recognition system, WiGr. The system proposes a dual-path network composed of a depth feature extractor and a dual-path recognizer, which can extract domain-independent gesture features, so that good gesture recognition accuracy can be obtained without retraining in a new domain [26]. Gu et al. proposed a gesture recognition system, WiGRUNT, based on a dual-attention network for cross-domain recognition. The system dynamically extracts the domain-independent features of CSI by using a spatial-temporal dual-attention mechanism and then recognizes fine-grained gestures by using a depth residual network [27].
MMADKLGR method
The framework of the proposed MMADKLGR method is shown in Fig. 1. Firstly, a transmitter and a receiver equipped with WiFi devices are deployed. Volunteers complete the required motions between the transceivers. The receiver collects CSI data affected by the gestures and stores the collected data in the computer for training and testing. Then, the collected CSI data are preprocessed: a Hampel identifier removes the abnormal values in the CSI data, and a Gaussian filter removes the high-frequency noise while retaining the low-frequency features caused by the gestures. Next, the MMAD algorithm is used to detect the starting point and ending point of a gesture in order to extract the motion interval of the CSI amplitude, and the motion interval data are interpolated into a sequence of 50 data points by cubic spline and then normalized. The proposed subcarrier selection algorithm based on KL divergence is then used to select the subcarriers most conducive to gesture recognition. Finally, the mean, median, upper quartile, lower quartile, variance, root-mean-square and skewness coefficient of the normalized data are calculated. These features are combined with the normalized data into a feature matrix for training and testing the subspace KNN algorithm.
Data preprocessing
Due to interference from the WiFi device itself, the complex indoor environment and various electromagnetic signals in space, there are outliers and high-frequency noise in the original CSI data [28]. These outliers and high-frequency noise can reduce the accuracy of gesture recognition. To eliminate the influence of the outliers, we use the Hampel identifier algorithm [29] to remove them, and the specific method is as follows:
A sliding window with a length of \(2h+1\) is defined on the CSI sequence, and the amplitude of the window midpoint is \(x_i\). The median \(m_i\) and the median absolute deviation \(MAD_i\) of the window are calculated as follows:
where median() represents the median function. If \(x_i\) satisfies \(|x_i-m_i|>n\cdot MAD_i\), the Hampel identifier algorithm determines that \(x_i\) is an outlier and replaces \(x_i\) with the median \(m_i\), where n is a positive integer. Through some experiments, we have verified that the Hampel identifier algorithm removes outliers well when \(h=5\) and \(n=3\).
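The outlier rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the function name `hampel_filter` is hypothetical, the window is simply truncated at the ends of the sequence, and the MAD is used without any scale factor.

```python
from statistics import median


def hampel_filter(x, h=5, n=3):
    """Replace outliers in x by the local median of a (2h+1)-point window.

    Assumptions (not taken from the paper): the window is truncated at the
    sequence boundaries, and MAD is the plain median absolute deviation.
    """
    y = list(x)
    for i in range(len(x)):
        lo, hi = max(0, i - h), min(len(x), i + h + 1)
        window = x[lo:hi]
        m = median(window)                            # local median m_i
        mad = median([abs(v - m) for v in window])    # local MAD_i
        if abs(x[i] - m) > n * mad:                   # |x_i - m_i| > n * MAD_i
            y[i] = m                                  # replace the outlier
    return y
```

With the values reported above, a call would look like `hampel_filter(amplitudes, h=5, n=3)`, where `amplitudes` is the CSI amplitude sequence of one subcarrier.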
Figure 2 shows the CSI amplitude curve before and after removing outliers by using the Hampel identifier algorithm, where the blue dashed line and the red solid line represent the CSI amplitude curve before and after removing outliers and the black boxes represent the outliers. From Fig. 2, it can be seen that the outliers in the CSI amplitudes have been effectively removed.
After the Hampel identifier algorithm removes the outliers, there is still a lot of high-frequency noise in the CSI amplitudes. To eliminate the influence of the high-frequency noise and retain the low-frequency fluctuations caused by the gestures, we use a one-dimensional Gaussian filter in this paper. The specific process is as follows:
A sliding window with a length of \(2k+1\) is defined on the CSI sequence, the amplitude of the window midpoint is \(x_i\), and the weighted normal distribution function \(Q(x_j)\) is calculated as follows:
Then, the Gaussian filter function \(G(x_i)\) is:
The Gaussian filter is centered at \(x_i\), and its weights are symmetrically distributed. The closer a sample is to \(x_i\), the greater its influence on \(x_i\) and the greater its weight; the farther away it is, the smaller its influence and weight. The parameters affecting the weighted normal distribution function are k and \(\sigma ^2\): the larger the k, the larger the range of CSI amplitudes that affects \(x_i\), and the smaller the variance \(\sigma ^2\) of the normal distribution, the more concentrated the weight is in the center. Through some experiments, we verify that the Gaussian filter achieves good performance when \(k=30\) and \(\sigma ^2=20\).
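A one-dimensional Gaussian filter of this kind can be sketched as follows. The sketch assumes that the weights are proportional to exp(-(j-i)^2 / (2*sigma^2)) and are normalized to sum to one, and that taps falling outside the sequence are dropped and the remaining weights renormalized; these details and the name `gaussian_filter` are assumptions and may differ from the exact form of \(Q(x_j)\) and \(G(x_i)\).

```python
import math


def gaussian_filter(x, k=30, sigma2=20.0):
    """Smooth x with a (2k+1)-point Gaussian window of variance sigma2.

    Assumption: weights are normalized, and the window is truncated
    (then renormalized) at the sequence boundaries.
    """
    offsets = range(-k, k + 1)
    kernel = [math.exp(-(o * o) / (2.0 * sigma2)) for o in offsets]
    y = []
    for i in range(len(x)):
        num = den = 0.0
        for o, w in zip(offsets, kernel):
            j = i + o
            if 0 <= j < len(x):       # ignore taps outside the sequence
                num += w * x[j]
                den += w
        y.append(num / den)           # weighted average around x_i
    return y
```

Because the weights are normalized, a constant sequence passes through unchanged, while rapid sample-to-sample fluctuations are averaged out.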
Figure 3 compares the CSI amplitudes before and after Gaussian filtering, where the blue curve is the CSI amplitudes before filtering and the red curve is the CSI amplitudes after filtering. As shown in Fig. 3, the Gaussian filter removes the high-frequency noise of the CSI amplitudes and turns them into a smooth curve. The low-frequency variation of the CSI amplitudes affected by the gestures is mainly concentrated in the \(L_1\) interval, which is well-preserved to ensure the accuracy of the gesture recognition.
Motion interval extraction
In general, the CSI samples not only include the motion interval affected by the gestures, but also the no motion interval that is not affected by the gestures. The no motion interval is invalid information for the gesture recognition. If the filtered CSI amplitudes are directly used as the input of a machine learning algorithm, the accuracy of the gesture recognition will decrease. Therefore, we need to accurately extract the CSI motion interval. The MAD threshold method is a commonly used motion interval extraction method [30, 31]. This method needs to calculate the MAD value of the data sequence and compare it with the threshold to determine the starting and ending points of the motion interval. However, there are some problems in this method, such as inaccurate judgment of the starting point and possible misjudgment of the ending point. To solve these problems, we improve the MAD threshold method and propose the MMAD algorithm as follows.
A sliding window with a length of \(2c+1\) is defined on the CSI sequence, and the amplitude of the window midpoint is \(x_d\), and the MAD and MMAD value of each point in the sliding window is calculated as follows:
where \({\bar{x}}_{MAD}(d)\) is the mean value of the CSI amplitudes of \(2c+1\) points centered at point d, \({\bar{x}}_{MMAD}(d)\) is the mean value of the CSI amplitudes of \(2c+1\) points to the left of point d (including point d), MAD(d) and MMAD(d) are the MAD value and MMAD value of point d, respectively, d is an integer ranging from \(2c+1\) to \(D-2c\), and D is the total number of data points.
To illustrate the effectiveness of the MMAD algorithm, we selected two typical CSI samples, calculated the MAD values and the MMAD values of the two sample sequences, and drew the MAD and MMAD curves, as shown in Fig. 4. In Fig. 4a, when the MMAD value is greater than the threshold T for the first time, the corresponding CSI data point \(S_1\) is the starting point of the motion interval. When the MMAD value is less than the threshold T for the first time after \(S_1\), the corresponding CSI data point \(S_3\) is the ending point of the motion interval. The interval from \(S_1\) to \(S_3\) is the extracted motion interval. However, as shown in Fig. 4a, the starting point of the motion interval obtained by the MAD algorithm is \(S_2\), although the interval from \(S_1\) to \(S_2\) contains part of the gesture information. Figure 4b shows that the MAD algorithm incorrectly assigns the interval from \(S_4\) to \(S_5\) to the no motion interval. Therefore, the MMAD algorithm is better than the MAD algorithm at judging the starting point and the ending point of the motion interval. Through some experiments, we verify that the MMAD algorithm has good performance when \(c=10\).
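The threshold rule described above can be sketched as follows. This is an illustration only, not the exact formulation of the paper. The MAD curve follows the centered-window description. The window placement in `mmad_curve` (mean taken over the \(2c+1\) points ending at d, deviations taken over the \(2c+1\) points starting at d) is an assumption, chosen because it matches the stated index range of d. All function names and the threshold value are hypothetical.

```python
def mean_abs_dev(values, center):
    """Mean absolute deviation of values about a given center."""
    return sum(abs(v - center) for v in values) / len(values)


def mad_curve(x, c=10):
    """Centered MAD: mean and deviations both use x[d-c .. d+c]."""
    curve = {}
    for d in range(c, len(x) - c):
        w = x[d - c:d + c + 1]
        curve[d] = mean_abs_dev(w, sum(w) / len(w))
    return curve


def mmad_curve(x, c=10):
    """Assumed 'misaligned' variant (0-based indices).

    The mean comes from the 2c+1 points ending at d, while the deviations
    are taken over the 2c+1 points starting at d. This is one plausible
    reading of the description, not a confirmed formula.
    """
    curve = {}
    for d in range(2 * c, len(x) - 2 * c):
        left = x[d - 2 * c:d + 1]
        right = x[d:d + 2 * c + 1]
        curve[d] = mean_abs_dev(right, sum(left) / len(left))
    return curve


def extract_interval(curve, threshold):
    """Start: first point above the threshold. End: first later point below it."""
    start = end = None
    for d in sorted(curve):
        if start is None:
            if curve[d] > threshold:
                start = d
        elif curve[d] < threshold:
            end = d
            break
    return start, end
```

Calling `extract_interval(mmad_curve(x, c=10), T)` returns the indices playing the roles of \(S_1\) and \(S_3\); if the curve never drops back below the threshold, the returned end point is `None` and the caller has to handle that case.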
Since the motion interval data length of each sample is different, and the subsequent classification algorithm requires that the data length of each sample must be the same, we use cubic spline interpolation method to interpolate the extracted motion interval data, so as to obtain a unified motion interval data length.
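A possible way to resample a motion interval to a fixed length of 50 points and normalize it is sketched below. The paper does not state the boundary condition of the cubic spline or the type of normalization, so the natural boundary condition (zero second derivative at both ends) and min-max normalization are assumptions here, and the function names are hypothetical.

```python
def natural_spline_second_derivs(y):
    """Second derivatives of a natural cubic spline through uniformly spaced y."""
    n = len(y)
    m = [0.0] * n                      # natural boundary: m[0] = m[n-1] = 0
    cp = [0.0] * n                     # modified super-diagonal (Thomas algorithm)
    dp = [0.0] * n                     # modified right-hand side
    for i in range(1, n - 1):
        rhs = 6.0 * (y[i - 1] - 2.0 * y[i] + y[i + 1])
        denom = 4.0 - cp[i - 1]
        cp[i] = 1.0 / denom
        dp[i] = (rhs - dp[i - 1]) / denom
    for i in range(n - 2, 0, -1):      # back substitution
        m[i] = dp[i] - cp[i] * m[i + 1]
    return m


def resample_cubic(y, n_out=50):
    """Evaluate the spline at n_out evenly spaced positions (needs len(y) >= 2)."""
    n = len(y)
    m = natural_spline_second_derivs(y)
    out = []
    for k in range(n_out):
        pos = k * (n - 1) / (n_out - 1)
        i = min(int(pos), n - 2)       # index of the spline segment
        t = pos - i                    # position inside the segment, 0..1
        out.append(m[i] * (1 - t) ** 3 / 6 + m[i + 1] * t ** 3 / 6
                   + (y[i] - m[i] / 6) * (1 - t)
                   + (y[i + 1] - m[i + 1] / 6) * t)
    return out


def min_max_normalize(v):
    """Scale v to the range [0, 1] (assumed normalization)."""
    lo, hi = min(v), max(v)
    if hi == lo:
        return [0.0] * len(v)
    return [(a - lo) / (hi - lo) for a in v]
```

For example, `min_max_normalize(resample_cubic(interval, 50))` maps a motion interval of any length of at least two points to a 50-point sequence in [0, 1].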
Subcarrier selection
In order to ensure the stability of data transmission, a commercial WiFi network card uses one or more antennas, and each communication link contains multiple subcarriers when sending and receiving signals. Therefore, the CSI data of each sample collected at the receiver contain multiple subcarriers. Because the communication link, transmission frequency and multipath effect may differ, the CSI amplitudes of each subcarrier also differ [32], so it is important to select a better subcarrier from a sample to improve the accuracy of gesture recognition. For this reason, we propose an algorithm for selecting subcarriers based on KL divergence.
KL divergence is an asymmetric measure of the difference between two probability distributions [33]. In the field of communication, the KL divergence can be calculated as the difference between the information entropy of two sets of data, where the information entropy is related to the appearance probability of data and is a measure of the time series complexity. Let U(y) and V(y) be the two probability distributions of the random variable y. When y is a discrete random variable, the KL divergence can be defined as follows:
\[KL(U\parallel {V})=\sum _{y}U(y)\log \frac{U(y)}{V(y)}\]
The properties of the KL divergence are: (i) KL divergence is always nonnegative, that is, \(KL(U\parallel {V})\ge {0}\). (ii) KL divergence is an asymmetric measure of two probability distributions, namely, in general \(KL(U\parallel {V})\ne {KL(V\parallel {U})}\).
Using the properties of the KL divergence, we calculate \(KL(U\parallel {V})\) by using the motion interval sequence of a subcarrier as the probability distribution V(y) and the no motion interval sequence as the probability distribution U(y), as shown in Fig. 5. The larger the \(KL(U\parallel {V})\), the greater the difference between the motion interval and the no motion interval of the subcarrier, and the greater the change of the CSI amplitudes caused by the gestures. Therefore, we can select the CSI amplitudes of the subcarrier whose \(KL(U\parallel {V})\) is the largest to recognize the gestures. To ensure good stability of the selected subcarrier, we sum the KL divergence of each subcarrier over all samples. Denoting by \(KL_{a,b}(U\parallel {V})\) the KL divergence of subcarrier a in sample b, the sum is calculated as follows:
\[Sum_a=\sum _{b=1}^{B}KL_{a,b}(U\parallel {V})\]
where \(a=1,\cdots ,A\), and A is the number of subcarriers in a CSI sample, and B is the total number of CSI samples. We select the data of the subcarrier corresponding to the largest \(Sum_a\) for gesture recognition.
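The selection rule can be sketched as follows. The paper does not specify how an amplitude sequence is converted into a probability distribution, so this sketch assumes a histogram over shared bins with a small additive constant to avoid empty bins; the bin count, the constant and all function names are hypothetical.

```python
import math


def histogram(values, lo, hi, bins=20, eps=1e-6):
    """Empirical distribution of values over shared bins (assumed estimator)."""
    counts = [eps] * bins                  # eps avoids zero-probability bins
    width = (hi - lo) / bins or 1.0        # fall back to 1.0 if all values are equal
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(u, v):
    """KL(U || V) for two discrete distributions on the same support."""
    return sum(p * math.log(p / q) for p, q in zip(u, v) if p > 0)


def subcarrier_kl(no_motion, motion, bins=20):
    """KL(U || V) with U from the no motion interval and V from the motion interval."""
    lo = min(min(no_motion), min(motion))
    hi = max(max(no_motion), max(motion))
    return kl_divergence(histogram(no_motion, lo, hi, bins),
                         histogram(motion, lo, hi, bins))


def select_subcarrier(samples):
    """samples[b][a] = (no_motion_seq, motion_seq) for sample b, subcarrier a.

    Returns the index of the subcarrier with the largest summed KL divergence
    and the list of sums (one per subcarrier).
    """
    n_sub = len(samples[0])
    sums = [sum(subcarrier_kl(*s[a]) for s in samples) for a in range(n_sub)]
    best = max(range(n_sub), key=sums.__getitem__)
    return best, sums
```

Summing over all samples before taking the maximum means that a subcarrier is only selected if it responds strongly to the gestures consistently, rather than in a single sample.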
Feature extraction
Extracting features that are highly relevant to the gestures from the motion intervals is an important part of improving the accuracy of gesture recognition. In this paper, we use the mean, median, upper quartile, lower quartile, variance, root-mean-square and skewness factor, together with the CSI amplitudes themselves, to construct the feature vectors of samples, which can represent the statistical characteristics and change trend of the CSI amplitudes [34].
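The feature vector described above can be assembled as in the following sketch. The quartile convention (the default of `statistics.quantiles`), the use of the population variance and the moment-based skewness are assumptions, since the paper does not state which definitions were used; the function names are hypothetical.

```python
import math
from statistics import mean, median, pvariance, quantiles


def statistical_features(x):
    """Return [mean, median, upper quartile, lower quartile, variance, RMS, skewness]."""
    mu = mean(x)
    var = pvariance(x, mu)                       # population variance (assumed)
    q1, _, q3 = quantiles(x, n=4)                # default quartile convention (assumed)
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    std = math.sqrt(var)
    skew = 0.0 if std == 0 else sum((v - mu) ** 3 for v in x) / (len(x) * std ** 3)
    return [mu, median(x), q3, q1, var, rms, skew]


def feature_vector(x):
    """Seven statistics followed by the normalized amplitudes themselves."""
    return statistical_features(x) + list(x)
```

For a 50-point normalized motion interval, `feature_vector` returns 57 values, which would form one row of the feature matrix.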
Subspace KNN algorithm
The subspace KNN algorithm is an improved KNN algorithm. The basic principle of the KNN algorithm is shown in Fig. 6. The algorithm assumes that all existing samples have a definite class. When it is necessary to determine the class of a new sample, the KNN algorithm calculates the distance between each sample in the existing sample set and the new sample (in this paper, Euclidean distance is used) and finds the K samples with the smallest distance. The new sample is then assigned to the class that occurs most often among these K samples [35].
Assuming that the feature matrix has R rows and C columns, the steps of the subspace KNN algorithm are as follows:

1. From the C columns of the feature matrix, M columns are randomly selected to construct a sub-feature matrix, and the step is repeated N times to obtain N sub-feature matrices.

2. The N sub-feature matrices are used to train the KNN algorithm, and N sub-classification models are obtained.

3. The N sub-classification models are used to classify a new sample. Then, we use the majority principle to determine the category of new samples.
The subspace KNN algorithm samples the feature matrix to form multiple sub-feature matrices and then trains the KNN algorithm multiple times, thereby improving the classification accuracy and stability of the KNN algorithm.
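The three steps can be sketched as follows. This is a minimal random-subspace ensemble written for illustration; the function names and the default values of K, M and N are hypothetical and are not the settings used in the experiments. Since KNN stores the training data rather than fitting parameters, each sub-classification model is represented here simply by its set of selected columns.

```python
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k, cols):
    """Classify query by majority vote among its k nearest training samples,
    using only the feature columns in cols."""
    # Squared Euclidean distance gives the same ranking as Euclidean distance.
    dists = sorted(
        (sum((row[c] - query[c]) ** 2 for c in cols), label)
        for row, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def subspace_knn_predict(train_x, train_y, query, k=3, m=5, n_models=30, seed=0):
    """Random-subspace KNN: n_models KNN classifiers, each on m random columns,
    combined by majority vote."""
    rng = random.Random(seed)
    n_cols = len(train_x[0])
    votes = Counter()
    for _ in range(n_models):
        cols = rng.sample(range(n_cols), m)      # step 1: pick M of the C columns
        votes[knn_predict(train_x, train_y, query, k, cols)] += 1   # steps 2 and 3
    return votes.most_common(1)[0][0]
```

A fixed `seed` makes the column sampling reproducible, which is convenient when comparing runs.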
Experiment evaluation
Experimental setup and data collection
We conducted experiments in a laboratory with an area of 11.1 m \(\times\) 9.6 m. The layout of the laboratory is shown in Fig. 7. In the experiment, we use two desktop computers with Intel 5300 wireless network cards as the transmitter and the receiver, and both computers run the Ubuntu 12.04 operating system. The transmitter sends signals through one antenna and the receiver receives signals through three antennas. The working frequency of the wireless network card is 2.4 GHz, and the channel bandwidth is 20 MHz. There are 30 subcarriers in each communication link.
During the experiment, the volunteers always sat on a chair at the designated position. When CSI data collection started, the volunteers first remained still, then completed the prescribed motions, and then remained still again. The process lasted 4 seconds in total. In each experiment, the volunteers carried out three motions: two-handed crossing, one-handed sliding and one-handed swing, and 130 samples were collected for each motion. To analyze and verify the MMADKLGR method proposed in this paper, we conducted nine sets of experiments with different combinations of transceiver distance, volunteer position, and interference factors. The information of the experimental samples is shown in Table 1. The positions of the transceiver and the human body are shown in Fig. 8. From the samples of each motion, we randomly select 60 samples as the training set and the remaining 70 samples as the testing set. Therefore, in each experiment, the training set contains 180 samples and the testing set contains 210 samples. We use four machine learning algorithms to evaluate the performance of the MMADKLGR method: the bagging tree, the subspace KNN, the linear SVM and the medium decision tree.
Motion interval extraction and subcarrier selection algorithm evaluation
Analysis of MMAD motion interval extraction algorithm
To verify the effectiveness of the MMAD algorithm, we randomly selected 165 samples from the training set of the second group of experimental data to train the four machine learning algorithms, and tested them with the 210 samples of the testing set. The MMAD and MAD algorithms were each used to extract the motion interval. The experimental results are shown in Fig. 9. Figure 9 shows that the accuracy obtained with the MMAD algorithm is better than that obtained with the MAD algorithm. This is because the MMAD algorithm judges the starting point more accurately and better avoids truncating the motion interval when determining the ending point, so its gesture recognition accuracy is higher. Figure 9 also shows that the subspace KNN algorithm achieves the higher gesture recognition accuracy with both the MMAD algorithm and the MAD algorithm.
Analysis of KL divergence selection subcarrier algorithm
To verify the effectiveness of the KL divergence subcarrier selection algorithm, we calculate the KL divergence of 30\(\times\)3510 subcarriers (390\(\times\)9=3510 samples in Table 1) and then calculate the sum \(Sum_a\) of the KL divergence of each subcarrier over the 3510 samples, where \(a=1,2,\cdots ,30\). The experimental results are shown in Fig. 10. Figure 10 shows that the sum of the KL divergence of the second subcarrier is the largest. Therefore, we select the data of the second subcarrier for the gesture recognition.
To verify whether the data of the second subcarrier is better than that of other subcarriers for the gesture recognition, we use the same data as in Sect. 4.2.1 for experiments. In the experiment, we only use the data of the 2nd, 19th, and 28th subcarriers, which rank 1st, 15th and 30th in the sum of the above KL divergence, to perform the gesture recognition. The experimental results are shown in Fig. 11. Figure 11 shows that the accuracy of the gesture recognition using the data of the second subcarrier is the highest for all four algorithms: 96.67%, 99.52%, 99.52%, and 98.10%, respectively. The accuracy of gesture recognition using the data of the 28th subcarrier is the lowest. This is because the sum of KL divergence of the second subcarrier is the largest, indicating that the gestures cause large changes in its CSI amplitudes. Therefore, the extracted features can more accurately characterize the corresponding gestures, thus improving the accuracy of the gesture recognition. The experiment verifies the reliability and effectiveness of the proposed KL divergence subcarrier selection algorithm.
Evaluation of MMADKLGR method
Impact of training samples
When training a machine learning algorithm, the number of training samples is an important factor affecting its accuracy. To evaluate the impact of training samples, we use the second group of experimental data, randomly select 15 to 180 training samples in steps of 15 to train the four machine learning algorithms, and test these algorithms by using the remaining 210 samples. The experimental results are shown in Fig. 12. From 15 to 180 training samples, the subspace KNN algorithm always achieves the highest gesture recognition accuracy and the best stability. The advantage is especially obvious when the number of training samples is small. When the number of training samples reaches 165, the accuracy stabilizes at 99.52%. The reason is that the subspace KNN does not directly use the feature matrix for training, but instead samples the feature matrix to form multiple sub-training sets before training. Although the number of features is reduced for each sub-training set, the number of features is increased for the classification algorithm as a whole. Therefore, the subspace KNN algorithm effectively improves the accuracy of the gesture recognition and has better stability. According to the above experimental results, the gesture recognition accuracy increases with the number of training samples. When the number of training samples reaches a certain value, the gesture recognition accuracy remains stable. Therefore, the number of training samples does not need to be too large, because too many training samples greatly increase the workload of sample collection and make the training time of the classification algorithm too long. In the paper, 165 samples are randomly selected from the 180 training samples for training.
Impact of transceiver spacing
To verify the performance of the MMADKLGR method with different transceiver spacings, we use the first, second, third, and fourth sets of data in Table 1 to conduct experiments, and the experimental results are shown in Fig. 13. Figure 13 shows that when the distance between the transceivers is 1.5 meters, the gesture recognition accuracies of the four machine learning algorithms are all greater than 99%. As the distance between the transceivers increases, the accuracy of the gesture recognition gradually decreases. Although the accuracy of the subspace KNN algorithm also declines, it remains much higher than that of the other algorithms. This shows that the subspace KNN algorithm still has a high gesture recognition accuracy and good robustness for a large transceiver spacing.
Impact of human body position
To verify the impact of human body position on the MMADKLGR method, we set the human body at the center of transceiver connection, 0.6 meters away from the center vertically, and 1.2 meters away from the center vertically, as shown in Fig. 8. The experiment is carried out by using the data of groups 2, 8 and 9 in Table 1. The experimental results are shown in Fig. 14. Figure 14 shows that the gesture recognition accuracy of the four machine learning algorithms is high when the human body is at the center of the transceiver connection, and the accuracy of the other two experiments is low. The experimental results show that the accuracy of the gesture recognition decreases rapidly when the human body gradually moves away from the center of the transceiver connection. Therefore, to improve the accuracy of the gesture recognition, it is better for the human body to be at the center of the LOS path of the transceiver.
Impact of indoor environment
The application scenario of the MMADKLGR method is indoors, and there are usually many environmental changes in the indoor environment. Therefore, we use the second, fifth, sixth, and seventh groups of data in Table 1 for experiments, and the experimental results are shown in Fig. 15. On LOS path of the transceiver, the second, fifth, sixth and seventh groups of data are respectively collected in the following four cases: no obstacle and no interference (referred to as no interference), no obstacle but Bluetooth headset interference (referred to as Bluetooth), computer case but no interference (referred to as computer case) and blackboard (2 m\(\times\)1.2 m) but no interference (referred to as blackboard). As shown in Fig. 15, the gesture recognition accuracy of the subspace KNN algorithm is the highest in the three cases of no interference, Bluetooth and blackboard, and the recognition accuracy in the case of computer case is slightly lower than that of bagging tree and linear SVM algorithm. The interference of Bluetooth headset has little impact on the accuracy of the gesture recognition. This is because the transmission distance of Bluetooth is short and the power is small, so the interference to the CSI is also small. When the computer case blocks the LOS path of the transceiver, according to the multipath effect theory, the WiFi signal can also be transmitted to the receiver through other reflection paths in the surrounding environment. However, when the blackboard blocks the LOS path of the transceiver, the power reduction in the received signals is very large, and the accuracy of the gesture recognition is very low because the blackboard blocks too many transmission paths of signals. Therefore, in the application environment of the MMADKLGR method, it is better not to place any obstacles on the LOS path of the transceiver.
Impact of features
In the MMADKLGR method, the features are important factors that determine the recognition accuracy. In the paper, we use mean, median, variance, root-mean-square, upper quartile, lower quartile, skewness factor and CSI amplitude as the features of the gesture recognition. However, these are common statistical features in the time domain. Other features such as energy, zero-crossing rate and entropy are used in some state-of-the-art methods [12, 36–39]. To verify the performance of different combinations of features and classifiers, we use the seven features given in Sect. 3.4, together with the sample entropy, time-domain energy and frequency-domain energy proposed in our previous work [12], to carry out some experiments. In the experiments, we combine different subsets of these features with the four classifiers used in this paper. The experimental results are shown in Fig. 16, where the meanings of the feature combinations are given in Table 2. From Fig. 16, it can be seen that each classifier obtains similar recognition accuracy with Comb1 and Comb2 features, while the recognition accuracies of all classifiers improve with Comb4 features, and the recognition accuracy of the subspace KNN is always the highest. When Comb3 features are used, the recognition accuracies of the four classifiers are not significantly different, while they are also improved when Comb5 features are used. However, compared with Comb4 features, Comb5 features reduce the recognition accuracy of the subspace KNN but improve the recognition accuracies of the other three classifiers, so that the bagging tree outperforms the subspace KNN. This shows that the classifier should be selected according to the feature combination when designing a gesture recognition algorithm.
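As an illustration of the time-domain statistical features named above, the following minimal sketch computes them for one CSI amplitude segment using only the Python standard library. The function name `time_domain_features`, the use of population variance, the inclusive quartile method and the definition of the skewness factor as the third standardized moment are assumptions made for this example; the exact definitions used in the paper are those of Sect. 3.4 and may differ.

```python
import math
import statistics


def time_domain_features(amplitude):
    """Statistical features of one CSI amplitude segment (illustrative sketch)."""
    n = len(amplitude)
    mean = statistics.fmean(amplitude)
    # Assumption: population variance; the paper may use the sample variance.
    variance = statistics.pvariance(amplitude, mu=mean)
    std = math.sqrt(variance)
    # quantiles(n=4) returns the three quartile cut points [Q1, Q2, Q3].
    q1, _, q3 = statistics.quantiles(amplitude, n=4, method="inclusive")
    # Assumption: skewness as the third standardized moment (0.0 if constant).
    if std == 0:
        skewness = 0.0
    else:
        skewness = sum((x - mean) ** 3 for x in amplitude) / (n * std ** 3)
    return {
        "mean": mean,
        "median": statistics.median(amplitude),
        "variance": variance,
        "rms": math.sqrt(sum(x * x for x in amplitude) / n),
        "upper_quartile": q3,
        "lower_quartile": q1,
        "skewness": skewness,
    }


# Hypothetical amplitude values, used only to exercise the function.
segment = [1.0, 2.0, 3.0, 4.0, 5.0]
features = time_domain_features(segment)
```

In a pipeline of this kind, one such feature vector would typically be computed per selected subcarrier and per extracted motion interval, and the vectors concatenated before classification.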
Discussion and limitation
To verify the generalization of the MMADKLGR method, we analyzed the impact of training samples, transceiver spacing, human body position and indoor environment on the performance of the method. The experimental results show that the recognition accuracy of the MMADKLGR method can meet the needs of most applications and remains very high even when the number of training samples is small. As the transceiver spacing increases, the recognition accuracy of the MMADKLGR method gradually decreases, so the transceiver spacing should not be too large when using the method. The human body position has a great impact on the recognition accuracy: when the human body gradually moves away from the LOS path, the recognition accuracy decreases rapidly. Therefore, the human body should be on the LOS path when using the method; otherwise, high recognition accuracy cannot be guaranteed. The indoor environment also has a great impact on the recognition accuracy. If there are large obstacles or walls between the transceivers, the recognition accuracy is low, but Bluetooth interference or small obstacles have little impact on the accuracy. In summary, the MMADKLGR method can obtain high recognition accuracy with few training samples when the transceiver spacing is small, there are no large obstacles on the LOS path, and the human body is located on the LOS path. If these conditions cannot be met, the recognition accuracy of the method will decrease, and whether the method is applicable must be judged comprehensively from the actual situation and the degree of accuracy reduction.
Conclusion
To address the problems of subcarrier selection and motion interval extraction in the existing gesture recognition methods based on CSI, we propose a gesture recognition method based on the MMAD and KL divergence, which is called the MMADKLGR method. This method uses the MMAD algorithm to extract the motion interval of the CSI data, uses the properties of KL divergence to select subcarriers, and uses the extracted features to recognize the gestures through the subspace KNN algorithm. Through experimental comparison, we analyze the proposed MMAD algorithm and KL divergence subcarrier selection algorithm. The experimental results show that the proposed interval extraction and subcarrier selection algorithms can effectively improve the accuracy of the gesture recognition. To comprehensively evaluate the MMADKLGR method, we also analyzed the impact of the number of training samples, the transceiver spacing, the human body position, the indoor environment and the features on the proposed method through a large number of experiments, and identified preferable application parameters of the method. In future work, we will test more gestures to further expand the application range of the MMADKLGR method, and study the impact of different gestures and application scenarios on the selection of features and classifiers.
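The KL divergence that underlies the subcarrier selection step can be illustrated with a short sketch. The code below estimates two discrete distributions from CSI amplitude samples by histogramming them over a shared range and then evaluates \(D_{KL}(P\Vert Q)=\sum_i p_i \ln (p_i/q_i)\). It is not the selection rule of the MMADKLGR method itself: the helper names `histogram` and `kl_divergence`, the number of bins, the smoothing constant `eps` and the sample values are all assumptions chosen for illustration.

```python
import math


def histogram(samples, lo, hi, bins, eps=1e-9):
    """Normalized histogram over [lo, hi]; eps avoids log(0) for empty bins."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in samples:
        # Clamp so that x == hi falls into the last bin.
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1
    probs = [c / len(samples) + eps for c in counts]
    norm = sum(probs)
    return [p / norm for p in probs]


def kl_divergence(p, q):
    """D_KL(P || Q) in nats for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Hypothetical amplitudes of one subcarrier without and with a gesture.
static = [10.0, 10.1, 9.9, 10.0, 10.2, 9.8]
gesture = [8.0, 12.0, 9.0, 11.5, 10.0, 7.5]
lo, hi = min(static + gesture), max(static + gesture)
p = histogram(gesture, lo, hi, bins=5)
q = histogram(static, lo, hi, bins=5)
divergence = kl_divergence(p, q)
```

Under these assumptions, a larger divergence means that the amplitude distribution of a subcarrier changes more between the two conditions; how such values are compared across subcarriers is defined by the method described earlier in the paper, not by this sketch.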
Data availability
Not available online. Please contact the author for data requests.
Abbreviations
CSI: Channel state information
MMAD: Misalignment mean absolute deviation
GR: Gesture recognition
KNN: K-nearest neighbor
RSS: Received signal strength
PCA: Principal component analysis
MAD: Mean absolute deviation
SVM: Support vector machine
LOS: Line of sight
References
S.S. Rautaray, A. Agrawal, Vision based hand gesture recognition for human computer interaction: a survey. Artif. Intell. Rev. 43, 1–54 (2015)
S. Herath, M. Harandi, F. Porikli, Going deeper into action recognition: a survey. Image Vis. Comput. 60, 4–21 (2017)
J. Wang, T. Liu, X. Wang, Infrared hand gesture recognition with convolutional neural networks in double-teachers instruction mode classroom. Infrared Phys. Technol. 111, 103464 (2020)
C. Shen, Y. Chen, G. Yang et al., Toward hand-dominated activity recognition systems with wristband-interaction behavior analysis. IEEE Trans. Syst. Man Cybern. Syst. 50, 2501–2511 (2020)
D. Halperin, W. Hu, A. Sheth et al., Tool release: Gathering 802.11n traces with channel state information. ACM SIGCOMM Comput. Commun. Rev. 41, 53 (2011)
X. Dang, Y. Huang, Z. Hao et al., PCA-Kalman: device-free indoor human behavior detection with commodity WiFi. EURASIP J. Wirel. Commun. Netw. 2018, 214 (2018)
L. Zhang, E. Ding, Y. Hu et al., A novel CSI-based fingerprinting for localization with a single AP. EURASIP J. Wirel. Commun. Netw. 2019, 51 (2019)
J. Wang, L. Zhang, C. Wang et al., Device-free human gesture recognition with generative adversarial networks. IEEE Internet Things J. 7, 7678–7688 (2020)
X. Shen, Z. Ni, L. Liu et al., WiPass: 1D-CNN-based smartphone keystroke recognition using WiFi signals. Pervasive Mob. Comput. 73, 101393 (2021)
X. Cheng, B. Huang, CSI-based human continuous activity recognition using GMM-HMM. IEEE Sens. J. (2022). https://doi.org/10.1109/JSEN.2022.3198248
J. Wang, X. Zhang, Q. Gao et al., Device-free simultaneous wireless localization and activity recognition with wavelet feature. IEEE Trans. Veh. Technol. 66, 1659–1669 (2017)
Y. Tian, S. Li, C. Chen et al., Small CSI samples-based activity recognition: a deep learning approach using multidimensional features. Secur. Commun. Netw. 2021, 5632298 (2021)
L. Yang, H. Su, C. Zhong et al., Hyperspectral image classification using wavelet transform-based smooth ordering. Int. J. Wavelets Multiresolut. Inf. Process. 17, 1950050 (2019)
E. Guariglia, Primality, fractality and image analysis. Entropy 21, 304 (2019)
X. Zheng, Y. Tang, J. Zhou, A framework of adaptive multiscale wavelet decomposition for signals on undirected graphs. IEEE Trans. Signal Process. 67, 1696–1711 (2019)
X. Liu, H. Zhang, Y. Cheung et al., Efficient single image dehazing and denoising: An efficient multiscale correlated wavelet approach. Comput. Vis. Image Underst. 162, 23–33 (2017)
E. Guariglia, S. Silvestrov, Fractional-wavelet analysis of positive definite distributions and wavelets on \(\mathcal{D}'(\mathbb{C})\), in Engineering Mathematics II, Springer Proceedings in Mathematics and Statistics, pp. 337–353 (2017)
Y.Y. Tang, Document Analysis and Recognition by Wavelet And Fractal Theories (The World Scientific Publishing Co, Singapore, 2012)
E. Guariglia, Harmonic Sierpinski gasket and applications. Entropy 20, 714 (2018)
Z. Tian, J. Wang, X. Yang et al., WiCatch: A WiFi based hand gesture recognition system. IEEE Access 6, 16911–16923 (2018)
T. Zhang, T. Song, D. Chen et al., WiGrus: a WiFi-based gesture recognition system using software defined radio. IEEE Access 7, 131102–131113 (2019)
H. Thariq, H. Ahmad, K. Narasingamurthi et al., DF-WiSLR: device-free WiFi-based sign language recognition. Pervasive Mob. Comput. 69, 101289 (2020)
D. Jiang, M. Li, C. Xu, WiGAN: a WiFi based gesture recognition system with GANs. Sensors 20, 4757 (2020)
Z. Hao, Y. Duan, X. Dang et al., WiSL: contactless fine-grained gesture recognition uses channel state information. Sensors 20, 4025 (2020)
S. Tan, J. Yang, Y. Chen, Enabling fine-grained finger gesture recognition on commodity WiFi devices. IEEE Trans. Mob. Comput. 21, 2789–2802 (2022)
X. Zhang, C. Tang, K. Yin et al., Wifi-based cross-domain gesture recognition via modified prototypical networks. IEEE Internet Things J. 9, 8584–8596 (2022)
Y. Gu, X. Zhang, Y. Wang et al., WiGRUNT: WiFi-enabled gesture recognition using dual-attention network. IEEE Trans. Hum. Mach. Syst. 52, 736–746 (2022)
L. Davies, U. Gather, The identification of multiple outliers. J. Am. Stat. Assoc. 88, 782–792 (1993)
F.R. Hampel, The influence curve and its role in robust estimation. J. Am. Stat. Assoc. 69, 383–393 (1974)
K. Ali, A.X. Liu, W. Wei et al., Keystroke recognition using WiFi signals, in The 21st Annual International Conference on Mobile Computing and Networking, 7–11 September 2015, Paris, France, pp. 90–102 (2015)
W. Wang, A.X. Liu, M. Shahzad et al., Device-free human activity recognition using commercial WiFi devices. IEEE J. Sel. Areas Commun. 35, 1118–1131 (2017)
J. Liu, Y. Wang, Y. Chen et al., Tracking vital signs during sleep leveraging off-the-shelf WiFi, in The 16th ACM International Symposium on Mobile Ad Hoc Networking and Computing, 22–25 June 2015, Hangzhou, China, pp. 267–276 (2015)
S. Kullback, R.A. Leibler, On information and sufficiency. Ann. Math. Stat. 22, 79–86 (1951)
Z. Akhtar, H. Wang, WiFi-based gesture recognition for vehicular infotainment system—an integrated approach. Appl. Sci. 9, 5268 (2019)
Z. Chikr-Elmezouar, I.M. Almanjahie, A. Laksaci et al., FDA: strong consistency of the KNN local linear estimation of the functional conditional density and mode. J. Nonparametr. Stat. 31, 175–195 (2019)
R.C. Guido, A tutorial on signal energy and its applications. Neurocomputing 179, 264–282 (2016)
R.C. Guido, ZCR-aided neurocomputing: a study with applications. Knowl.-Based Syst. 105, 248–269 (2016)
R.C. Guido, A tutorial-review on entropy-based handcrafted feature extraction for information fusion. Inf. Fus. 41, 161–175 (2018)
R.C. Guido, Enhancing teager energy operator based on a novel and appealing concept: signal mass. J. Franklin Inst. 356, 2346–2352 (2019)
Acknowledgements
The authors would like to thank the anonymous reviewers for their valuable comments and suggestions that helped to improve the quality of this manuscript.
Funding
This work is supported by the Natural Science Foundation of China under Grant 62076114 and Grant 71874025, and the Humanities and Social Sciences Research Planning Foundation of the Ministry of Education of China under Grant 20YJA630058.
Author information
Contributions
All authors have contributed equally. All authors have read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Tian, Y., Zhuang, C., Cui, J. et al. Gesture recognition method based on misalignment mean absolute deviation and KL divergence. J Wireless Com Network 2022, 96 (2022). https://doi.org/10.1186/s13638-022-02178-4
DOI: https://doi.org/10.1186/s13638-022-02178-4
Keywords
 CSI
 Gesture recognition
 KL divergence
 Misalignment mean absolute deviation