 Research
 Open Access
 Published:
Time series classification based on statistical features
EURASIP Journal on Wireless Communications and Networking volume 2020, Article number: 46 (2020)
Abstract
This paper presents a statistical feature approach in fully convolutional time series classification (TSC), which is aimed at improving the accuracy and efficiency of TSC. This method is based on fully convolutional neural networks (FCN), and there are the following two properties: statistical features in data preprocessing and finetuning strategies in network training. The key steps are described as follows: firstly, by the window slicing principle, dividing the original time series into multiple equallength subsequences; secondly, by extracting statistical features on each subsequence, in order to form a new sequence as the input of the neural network, and training neural network by the finetuning idea; thirdly, by evaluating the classification performance about test sets; and finally, by comparing the sample sequence complexity and network classification loss accuracy with the FCN using the original sequence. Our experimental results show that the proposed method improved the classification effects of FCN and the residual network (ResNet), which means that it has a generalization ability to the network structures.
Introduction
Time series data are widely used for representing special data such as biological observations, stock prices, weather readings, and health monitoring data. How to extract useful information from time series data is becoming more and more important. Time series classification (TSC) is one of the key tasks, which is aimed at predicting class labels for time series. Nowadays, TSC is widely used successfully in medical diagnosis and early warning [1].
Many scholars have studied TSC and provided some methods. The approaches can be divided into the following three categories: distancebased approach, featurebased approach, and ensemblebased ones [2]. The key idea of distancebased approach is to measure the similarity between any given two time series. Commonly used classification algorithms are Knearest neighbor (KNN), support vector machines (SVM) with similaritybased kernels, and dynamic time warping (DTW). The key idea of featurebased ones is mainly extracting a set of features representing the global time series patterns, and then commonly form a bagofwords (BoW) provided to classifiers [3]. These kind of classification algorithms are the bagoffeatures framework (TSBF) [4], bagofSFAsymbols (BOSS) [5], and BOSSVS method [6]. The key idea of third ones is to combine different classifiers together to achieve a higher accuracy. For example, PROP combines 11 classifiers based on elastic distance measures with a weighted ensemble scheme [7]. Shapelet ensemble (SE) provides the classifiers through the shapelet transform in conjunction with a heterogeneous ensemble [8]. In summary, these methods described above commonly require data preprocessing and feature extraction.
In recent years, deep neural networks have been used for TSC tasks, for example, multiscale convolutional neural network (MCNN) [1], fully convolutional network (FCN) [2], and residual network (ResNet) [2]. FCN and ResNet do not require a lot of data preprocessing or feature engineering and can provide a good classification effect. Karim et al. proposed longshortterm memory fully convolutional networks (LSTMFCN), which further improved the performance of the FCN [9]. But it increased the complexity of calculations and classification. Based on FCN, this paper studies the time series preprocessing and found a way to improve the performance of FCN classification. The key idea is described as follows: given some original time series of UCR data sets [10], the method uses the window slicing principle to divide the original time series into multiple equallength subsequences and then extracts statistical features on each subsequence to form a new sequence as the input of the neural network. In the network training process, we introduce the finetuning idea and evaluate the classification performance of the network by using a test set. Finally, the sample sequence complexity and network classification loss accuracy are compared with the FCN using the original sequence. Experimental results show that FCN performs better on the samples of the statistical features of the extracted subsequences, and the found method also improves the classification performance of ResNet, which means that it has generalization ability to the network structures.
The remainder of the paper is organized as follows. Section 2 gives some basic notations such as time series classification. Section 3 introduces the network architecture and finetuning of models. Section 4 provides experiments and results analysis. Section 5 improves the experiments from two aspects: increase the number of original sequences and extract more statistical features. Finally, we present conclusions and future work.
Some basic notations
This section will introduce some basic notations such as TSC [11].
Definition 1
An ordered set of real values with the form X = [x_{1}, x_{2}, ..., x_{t}] is called a univariate time series (UTS). The length of X is equal to number t.
Definition 2
Given N different UTS with X^{i} ∈ R^{T}, X = [X^{1}, X^{2}, ..., X^{N}] is called an Mdimensional multivariate time series (MTS).
Definition 3
A data set D = {(X_{1}, Y_{1}), (X_{2}, Y_{2}), ..., (X_{N}, Y_{N})} is a set of pairs (X_{i}, Y_{i}) where X_{i} could either be a UTS or MTS with Y_{i} as its onehot label vector. For a data set containing N classes, Y_{i} is the vector of length N, where j ∈ [1, N] is equal to 1 ifX_{i} is j and 0 otherwise.
Definition 4
Given some time series of length t, X = {X^{t}, t ∈ 1, 2, ...N} and their labels {Y_{1}, Y_{2}, ..., Y_{N}}, TSC refers to input a new time series and output its labels Y_{i}.
Network architecture and finetuning of models
Network module and architecture
In FCN, a time convolution block is used as a time series feature extraction module, which consists of a convolution layer, a batch normalization layer, and an activation function [4, 12,13,14,15]. Some researchers provided methods for training deep networks [15,16,17,18]. The activation function may be a linear correction unit or a parametric linear correction unit. Convolution layer is shown in Fig. 1. Generally, the input to a temporal convolutional network is a time series signal. Given L convolutional layers, filters can capture how the input signals evolve over the course of an action. For each layer, all of the filters are parameterized by tensor \( {W}^{(l)}\in {R}^{F_l\times d\times {F}_{l1}} \) and biases\( {b}^{(l)}\in {R}^{F_l} \), where l ∈ {1, ..., L} is the layer index and d is the filter duration. For the l − th layer, the i − th component of the incoming (normalized) activation matrix \( {E}^{\left(l1\right)}\in {R}^{F_{l1}\times {T}_{l1}} \) from the previous layer for each time t where f(·) is a rectified linear unit [9, 12, 15].
When training neural networks, standardized input can increase the speed of neural network training. The method is to normalize the training data, which means that raw data is subtracted from its mean and divided by its variance. Standard processing is also required at each hidden layer of the neural network, which is the batch normalization, as shown in Fig. 2.
Batch normalization processes lth hidden layer input Z^{l − 1} as follows, ignoring the superscript l − 1:
where m is the number of single minibatch samples and ε prevents the denominator from being zero, which can be a value 10^{−8}. In this way, all the inputs of the hidden layer have a mean of 0 and a variance of 1.
In most cases, we do not expect all z^{(i)} means equal to 0 and the variance is 1. So, it usually needs to be further processed onz^{(i)}:
where γ and β are hyperparameters, which can be obtained by gradient descent. Setting γ and β to different values, you can get any mean and variance. Activation function ReLU is shown in Fig. 3. Compared with other activation functions, ReLU has the advantages as follows: for linear functions, ReLU is more expressive; for nonlinear functions, ReLU is due to nonnegative interval gradients. Hence, there is no vanishing gradient problem.
After the time series is extracted by the time convolutional block, the extracted features are input to the global average pooling module to output the classification result. The global average pooling module is composed of the global average pooling layer and the Softmax layer. Global average pooling layer is shown in Fig. 4.
The last layer of the traditional CNN is a fully connected layer. The number of parameters is very large, which is easy to cause overfitting (such as AlexNet network). Therefore, the paper proposes to use the global mean pooling layer instead of the fully connected layer [13]. For different traditional connection layers, global averaging is performed for each feature sequence so that each feature sequence can get an output. This can greatly reduce the network parameters and avoid overfitting. On the other hand, each feature sequence is equivalent to an output feature, and this feature represents the characteristics of the output class.
TSC is multicategory. Generally, the most common method for solving multiclassification is to design noutput nodes. For example, the neural network can design an ndimensional array as an output. Each dimension in the array corresponds to a category of the time series. The Softmax layer transforms the output of a neural network into a probability distribution. Each dimension value represents the probability of belonging to this category, and hence by cross entropy, it can be predicted about the distance between the probability distribution and one of the real answers.
FCN is performed as a feature extract. Its overall structure is shown in Fig. 5. The FCN contains three time convolutional blocks and one global average pooling one, where filter sizes of convolutional blocks are 128, 256, and 128, respectively.
The convolution operation is fulfilled by three 1D kernels without striding, whose sizes are 8, 5, and 3, respectively. The 1 × 1 convolution kernel can operate on a crosschannel aggregation and further reduce the dimension. The calculation of the time convolution block [2] is as follows:
where ⊗ is the convolution operator.
The global average pooling block further processes the feature sequence, and the final classification result is given by the Softmax layer. Residual network is shown in Fig. 6. By adding a long jump connection to each fragment, ResNet extends the neural networks to very deep structures, which can allow the gradient flow to pass directly through the bottom layer. It achieves the most advanced performance in target detection and other visually related tasks [14]. The residual network also performs well in solving TCS. The convolutional blocks in Eq. 4 are used to build each residual block. Let Block_{k} denote the convolutional block with the number of filters k, and the residual block is formalized as follows [2]:
The number of filters is the pair (64, 128, 128). The residual network used consists of three residual blocks and a global mean pool block, as shown in Fig. 6.
Finetuning of models
Finetuning can be described as transfer learning on the same data set [9]. The training procedure consists of two distinct phases. In the initial phase, the model is trained on a given data set with selected optimal hyperparameters. In the second one, finetuning is applied to the initial model. In fact, the procedure of transfer learning is iterated over in finetuning phase. Each repetition is initialized using the model weight of the previous iteration. At each iteration, the learning rate is halved, which accelerates the iterative process. The procedure is repeated K times, until the initial learning rate gets the lowest threshold, where K is an arbitrary constant. The algorithm is shown in Algorithm 1.
Experiments and result analysis
Experiment settings
In the experiment, FCN neural network is trained with original sequence data set and data set based on statistical feature processing, and the network performance is evaluated on the test set. In order to present the generalization ability of the neural network structure based on statistical feature sequence, the same experiment on ResNet is carried out to evaluate and compare. The original sequence data set comes from UCR time series data, which contains 44 classes, divided into training sets and test sets, as shown in https://www.cs.ucr.edu/~eamonn/time_series_data_2018/. Among them, the class with the longer time series is selected as the input sample. The training sample format in the data set is as follows:
where N is the number of training samples and t is the length of the time series in the training samples. The only preprocessing of the original sequence before the experiment is standardization. Let μ be the average and σ^{2} be the variance of the data, then
Then the training samples are extracted from the data to form a matrix.
The class label y_{i} corresponding to the training sample is converted to a onehot vector Y_{i}, which is composed of only 0 and 1. If the training sample X^{i} belongs to a class, the corresponding class label is 1, and otherwise is 0.
where C is the number of types of time series. Normalize the time series samples (X, Y) into the FCN neural network for training. We call statistical feature data set, which is shown in Fig. 7.
With regard to time series preprocessing, the traditional machine learning method is to extract the statistical features of the entire time series and then use the algorithm of random forest to classify. The modern deep learning method is to convolute directly on the normalized original sequence [2], then enter into FCN. In this paper, we have found a better time series preprocessing method. Before the standardization, the original sequence is divided into multiple subsequences according to the given step size and subsequence length, and then the statistical features are extracted from each subsequence (maximum, mean, polar difference, variance, peak trough, kurtosis, etc.), and let these statistical features constitute a new time series as the input of the neural network. On the data set, the final input data samples are described as follows:
where sl is the length of the subsequence, s denotes to stride, and nsf refers to the number of statistical features selected. Moreover, new time series samples (X_{new}, Y) are standardized and then input into FCN neural network for training.
Hyperparameter setting
FCN and ResNet are trained with the Adam algorithm. Using the learning rate attenuation (the learning rate decreases as the number of iterations increases), the initial value of the learning rate is set to 0.001, and the final value of the finetuning 0.0001; in order to avoid the small amount of data and when the neural work is overfitting, the method of early termination is adopted, in which the optimal value does not change for 50 generations, and the network determines that the optimal value is found, and the program ends early. Batch normalization parameters β_{1} = 0.9, β_{2} = 0.999, ε = 10^{−8}. The loss function for all tested models is categorical cross entropy. The number of iterations is 2000.
Experiment and evaluation
In order to present the generalization ability based on the statistical feature method, the same experiments are performed on the residual network. The performance of FCN and ResNet has been presented to be superior to many other deep learning algorithms in dealing with the task of time series classification. According to the standardized MPCE score based on the mean t test given in [2], the rankings of FCN and ResNet are shown in Fig. 8. In the experiment, FCN and ResNet are trained with standardized raw data sets and new data sets based on statistical features, and then the test set is used for error and loss analysis to draw a conclusion. The experimental environment uses Google Colaboratory and the GPU model is Tesla K80. Google Colaboratory is a research tool open to Google, mainly used for machine learning development and research, where you can run python code to train neural networks.
Mean per class error (MPCE) is the arithmetic mean of each type of error, which can be used to evaluate the classification performance of the specific models on multiple data sets. The calculation formula is as follows:
The basic idea of MPCE is as follows: the expected error rate for a single class across all the data sets. By considering the number of classes, MPCE is more robust as a baseline criterion [2].
Results and analysis
The experiment limited the length of the time series, selected 14 classes with a time series length exceeding 350, and calculated the original time series and the time series dimensions after extracting the statistical features. The dimensions of the two time series are shown in Table 1, where the original time series (OTS) and new sequence of extracted features (SEF).
From the data in Table 1, it can be found that the original time series is divided into multiple subsequences and the data dimension of the new time series formed after extracting the statistical features is significantly reduced, and the new sequence maintains the statistical characteristics of the original series and removed redundant part, so if the new sequence is input to FCN, the calculation of the neural network can be simplified and the operation efficiency can be improved while ensuring the correct rate. The original sequence and the extracted features are first tested in FCN. The final loss and accuracy of the various types, as well as the final results of the network’s runtime, are shown in Table 2.
From Table 2, in many time series categories, the new sequence based on statistical features is more accurate in the FCN than the original sequence in terms of the loss function, and the overall performance is better. The FCN network has a short classification time and a small average error. The loss and accuracy of the original sequence and the new sequence based on statistical features are obtained and visualized. Accuracy images of the original sequence and feature subsequence on the FCN network, as shown in Fig. 9, the results show that the accuracy of the FCN based on the statistical feature subsequence is superior to the original sequence in most of the classes (the latter several classes are more obvious), and the trend of the two curves in the first few classes is more consistent, and the algorithm runs relatively. In the visual image of the loss of the original sequence and the feature subsequence on the FCN network, from Fig. 10, we can find that the loss of the FCN based on the statistical feature subsequence is generally smaller than that of the original sequence, and the trends of the two curves are relatively consistent.
In order to verify the generalization ability of the method based on statistical features, the same experiments are performed on the residual network and the same analysis. The accuracy, loss, and time of the original sequence and feature subsequence on ResNet are shown in Table 3.
Table 3 shows that the ResNet performs well in the statistical feature subsequence classification. In many time series categories, when the loss function is equivalent, the new sequence based on statistical features is more accurate than the original sequence in ResNet, and the overall performance is better than the original sequence in terms of accuracy or loss. The ResNet network has a short classification time and a small class average error. The loss and accuracy of the original sequence and the new sequence based on statistical features are obtained and visualized. From the visual image of the accuracy on the original sequence and feature subsequence using the ResNet network, Fig. 11 shows that the accuracy of ResNet based on the statistical feature subsequence is still better than that of the original sequence in many classes, but there are some fluctuations in curve changes. In the previous classes, the performance of ResNet based on statistical feature subsequence is improved. After the seventh time series, classes begin to improve. In the visual image of loss degree on the original sequence and feature subsequence, it can be seen that the loss of ResNet network on statistical feature subsequence is generally smaller than that of the original sequence, and the trend of change of the two curves is relatively consistent, as shown in Fig. 12.
Improved experiments and result analysis
In this section, we improve the experiments from the following two aspects: increase the number of original sequences and extract more statistical features.
Increasing the number of original sequence classes
In the previous experiment, we used sequence data from the UCR time series data set, as shown in https://www.cs.ucr.edu/~eamonn/time_series_data_2018/. Fourteen classes were selected from 85 classes with the length of time series exceeding 350 steps. This improvement increased the number of the original time series classes. The number is 22. In this case, the whole convolutional neural network still performs better in the subsequence based on the statistical characteristics. Accuracy and loss are shown in Figs. 13 and 14, respectively.
Extracting more statistical features
In the data experiment stage, more statistical features can be extracted, such as korder origin center distance, linear fitting best curve slope, peak value, maximum transform amplitude, and average transform rate. Then we provide on the contrast experiment from the accuracy and the loss aspect, as shown in Figs. 15 and 16.
From Figs. 15 and 16, although the 9 and 19 subsequences are not as accurate as the original sequences, the accuracy of the subsequences on most classes is improved with the addition of some features.
Conclusions
FCN has been shown to achieve stateoftheart performance on the task of TSC. Based on FCN, this paper studies the time series preprocessing and found a way to improve the performance of FCN classification. The method uses the window slicing principle to divide the original time series into multiple equallength subsequences, then extracts statistical features on each subsequence to form a new sequence as the input of the neural network, adds the finetuning idea in the network training process, and then evaluates the classification performance of the network by using test set. Finally, the sample sequence complexity and network classification loss accuracy are compared with the FCN using the original sequence. It is proved that the FCN performs better on the samples of the statistical features of the extracted subsequences. In order to prove that the proposed method has generalization ability to the network structure and the same experiment and analysis on the ResNet, the method based on statistical feature subsequence also improves the classification performance of ResNet.
Several problems remain unsolved. Some of the interesting problems are how to adjust subsequence length/step length and regularize a network and how to use LSTMFCN, ALSTMFCN [9], hierarchies of feature [19], or time aware [20, 21] to improve the performance of time series classification. In addition, we will try to apply the method of classification service in distributed cloud/fog environment [22,23,24,25,26,27,28,29,30,31,32] in the future.
Availability of data and materials
The original sequence data set comes from the UCR time series and the data can be found in https://www.cs.ucr.edu/~eamonn/time_series_data_2018/.
Abbreviations
 DTW:

Dynamic time warping
 FCN:

Fully convolutional neural networks
 KNN:

Knearest neighbor
 MPCE:

Mean per class error
 OTS:

Original time series
 SEF:

Sequence of extracted features
 SVM:

Support vector machines
 TSC:

Time series classification
References
 1.
Z. Cui, W. Chen, Y. Chen, Multiscale convolutional neural networks for time series classification [J]. CoRR1603(06995) (2016)
 2.
Z. Wang, W. Yan and T. Oates. Time series classification from scratch with deep neural networks: a strong baseline [J]. International Joint Conference on Neural Networks (IJCNN). 2017, 15781585.
 3.
J. Lin, E. Keogh, L. Wei, S. Lonardi, Experiencing SAX: a novel symbolic representation of time series [J]. Data Mining and knowledge discovery15(2), 107–144 (2007)
 4.
M.G. Baydogan, G. Runger, E. Tuv, A bagoffeatures framework to classify time series. IEEE Transactions on Pattern Analysis and Machine Intelligence35(11), 2796–2802 (2013)
 5.
P. Schäfer. The boss is concerned with time series classification in the presence of noise [J] Data Mining and Knowledge Discovery, 2015, 29(6): 15051530.
 6.
P. Schäfer, Scalable time series classification [J]. Data Mining and Knowledge Discovery30(5), 1273–1298 (2016)
 7.
J. Lines, A. Bagnall, Time series classification with ensembles of elastic distance measures [J]. Data Mining and Knowledge Discovery29(3), 565–592 (2015)
 8.
A. Bagnall, J. Lines, J. Hills, A. Bostrom, Timeseries classification with COTE: the collective of transformationbased ensembles [J]. IEEE Transactions on Knowledge and Data Engineering 27(9), 2522–2535 (2015)
 9.
F. Karim, S. Majumdar, H. Darabi, S. Chen, LSTM fully convolutional networks for time series classification [J]. IEEE Access., 1662–1669 (2018)
 10.
Y. Chen, E. Keogh, B. Hu, N. Begum, A. Bagnall, A. Mueen, and G. Batista, The UCR time series classification archive [Online], 2015, www.cs.ucr.edu/~eamonn/time_series_data/
 11.
H I Fawaz, G. Forestier, J. Weber, et al. Deep learning for time series classification: a review [J]. Data Mining and Knowledge Discovery, 2019, 33(4): 917963, 2019.
 12.
C. Lea, R. Vidal, A. Reiter, and G. D. Hager, Temporal convolutional networks: a unified approach to action segmentation [C]. ECCV 2016: Computer Vision – ECCV 2016 Workshops, 2016: 4754.
 13.
J. R. Chang, Y.S. Chen. Batchnormalized maxout network in network [J]. Computer Science, 2015.
 14.
He K, Zhang X, Ren S, et al. Deep residual learning for image recognition [C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 770778.
 15.
S. Loffe and C. Szegedy. Batch normalization: accelerating deep network training by reducing internal covariate shift [J], International Conference on Machine Learning, pp.448456, 2015.
 16.
N. Srivastava, G.E. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov, Dropout: a simple way to prevent neural networks from overfitting [J]. Journal of Machine Learning Research15(1), 1929–1958 (2014)
 17.
Y. Zheng, Q. Liu, E. Chen, Y. Ge, J.L. Zhao, Exploiting multichannels deep convolutional neural networks for multivariate time series classification [J]. Frontiers of Computer Science10(1), 96–112 (2016)
 18.
J. Long, E. Shelhamer, and T. Darrell, Fully convolutional networks for semantic segmentation, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 34313440.
 19.
Y. Lei , Y. Yan, Y. Han, and F. Jiang. The hierarchies of multivalued attribute domains and corresponding applications in data mining [J]. Wireless Communications and Mobile Computing, Volume 2018, Article ID 1789121, 111, 2018.
 20.
L.Y. Qi, R.L. Wang, S.C. Li, Q. He, X.L. Xu, C.H. Hu, Timeaware distributed service recommendation with privacypreservation [J]. Information Sciences480, 354–364 (2019)
 21.
L. Y. Qi, X.Y. Zhang, W.C Dou, C.H. Hu, C. Yang, J.J. Chen. A twostage localitysensitive hashing based approach for privacypreserving mobile service recommendation in crossplatform edge environment [J]. Future Generation Computer Systems, 2018, 88:636643.
 22.
H.W. Liu, H.Z. Kou, C. Yan, L.Y. Qi, Link prediction in paper citation network to construct paper correlation graph [J]. EURASIP Journal on Wireless Communications and Networking (2019). https://doi.org/10.1186/s1363801915617
 23.
W. W. Gong, L.Y. Qi, Y.W. Xu. Privacyaware multidimensional mobile service quality prediction and recommendation in distributed fog environment [J]. Wireless Communications and Mobile Computing, vol. 2018, Article ID 3075849, 8 pages, 2018.
 24.
Y. W. Xu, L.Y. Qi, W.C. Dou, J.G. Yu. Privacypreserving and scalable service recommendation based on SimHash in a distributed cloud environment [J]. Complexity, Volume 2017, Article ID 3437854, 9 pages, 2017.
 25.
X. K. Wang, Laurence T. Yang, L.W. Kuang, X.G. Liu, Q.X. Zhang and M. Jamal Deen, A tensorbased bigdatadriven routing recommendation approach for heterogeneous networks [J], IEEE Network Magazine, 2019, 33(1): 6469.
 26.
X.K. Wang, Laurence T. Yang, H.G. Li, M. Lin, J.J. Han, Bernady O. Apduhan, NQA: a nested anticollision algorithm for RFID systems [J]. ACM Transactions on Embedded Computing Systems18(4) (2019). https://doi.org/10.1145/3330139
 27.
X. K. Wang, Laurence T. Yang, Y.H. Wang, X.G. Liu, Q.X. Zhang, and M. Jamal Deen, A distributed tensortrain decomposition method for cyberphysicalsocial services [J], ACM Transactions on CyberPhysical Systems, 3(4), doi:https://doi.org/10.1145/3323926, 2019.
 28.
X.K. Wang, L.T. Yang, X. Xie, J.R. Jin, M.J. Deen, A cloudedge computing framework for cyberphysicalsocial services [J]. IEEE Communications Magazine55(11), 80–85 (2017)
 29.
X.L. Xu, R.C. Mo, F. Dai, W.M. Lin, S.H. Wan, W.C. Dou, Dynamic resource provisioning with fault tolerance for dataintensive meteorological workflows in cloud [J]. IEEE Transactions on Industrial Informatics (2019). https://doi.org/10.1109/TII.2019.2959258
 30.
X.L. Xu, X.H. Liu, Z.Y. Xu, C.J. Wang, S.H. Wan, X.X. Yang, Joint optimization of resource utilization and load balance with privacy preservation for edge services in 5G networks [J]. Mobile Networks and Applications (2019). https://doi.org/10.1007/s11036019014488
 31.
X.L. Xu, S.C. Fu, W.M. Li, F. Dai, H.H. Gao, V. Chang, Multiobjective data placement for workflow management in cloud infrastructure using NSGAII [J]. IEEE Transactions on Emerging Topics in Computational Intelligence (2019)
 32.
X. L. Xu, C. X. He, Z. Y. Xu, L. Y. Qi, S.H. Wan, MZA Bhuiyan. Joint optimization of offloading utility and privacy for edge computing enabled IoT [J]. IEEE Internet of Things Journal, 2019, Doi: 10.1109/JIOT.2019.2944007.
Acknowledgements
Zhang Linkun and Wang Zhengyan help us read through the text and polish some sentences. The authors thank the anonymous reviewers for their valuable suggestions and contributions to the quality of the paper.
Funding
This work is partly supported by the Natural Science Foundation of China (under grant no. 61502272), the Promotive Research Fund for Excellent Young and MiddleAged Scientists of Shandong Province (under grant no. BS2014DX005), and Undergraduate Education Reform Project in Shandong Province (no. Z2018S022).
Author information
Affiliations
Contributions
YL and ZW designed the study. ZW performed the simulation experiments and wrote the paper. YL and ZW reviewed and edited the manuscript. Both authors read and approved the final manuscript.
Authors’ information
Yuxia Lei obtained his Ph.D. degree in computer software and theory from the Institute of Computing Technology, CAS, in 2010. He is an associate professor. His current research interests include concept lattice, artificial intelligence, and machine learning. Zhongqiang Wu received the B.Sc. degree in computer science from Qufu Normal University, Rizhao, China, in 2019.
Corresponding author
Correspondence to Yuxia Lei.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Lei, Y., Wu, Z. Time series classification based on statistical features. J Wireless Com Network 2020, 46 (2020). https://doi.org/10.1186/s1363802016614
Received:
Accepted:
Published:
Keywords
 Time series classification
 Statistical features
 Full convolutional neural network