# Comparative evaluation of ARIMA and ANFIS for modeling of wireless network traffic time series

- Rajnish K Yadav^{1} and
- Manoj Balakrishnan^{1}

**2014**:15

https://doi.org/10.1186/1687-1499-2014-15

© Yadav and Balakrishnan; licensee Springer. 2014

**Received:** 7 August 2012

**Accepted:** 12 December 2013

**Published:** 22 January 2014

## Abstract

Network traffic modeling significantly affects various considerations in networking, including network resource allocation, quality of service provisioning, network traffic management, congestion control, and bandwidth efficiency. These are also important issues in network protocol design. This paper presents a comparison of the adaptive neuro fuzzy inference system (ANFIS) and autoregressive integrated moving average (ARIMA) approaches for modeling wireless network traffic, in terms of a typical statistical indicator and computational complexity. ARIMA has been widely used in this area for many years. ANFIS, on the other hand, is comparatively new, and to the best of our knowledge no network traffic modeling using ANFIS was attempted until recently. Moreover, a detailed comparative performance evaluation of ANFIS against other approaches in traffic modeling could not be found in the existing literature. Reportedly, ANFIS provides good prediction precision in terms of statistical indicators and also gives an effective description of network conditions at different times. However, the computational complexity of ANFIS for traffic modeling is a major concern and deserves closer inspection. For our wireless network traffic data, we find that the ANFIS model performs better than the best ARIMA model in three different scenarios.

## 1 Introduction

Network traffic modeling plays an important role in many areas of computer networks including, but not limited to, network traffic management, quality of service (QoS) provisioning, network protocol design, and bandwidth allocation. This has led to great interest among researchers in accurate modeling of network traffic. Early attempts concentrated mainly on *Poisson modeling*, which did not compare well with the observations made at that time [1]. Then, a groundbreaking work by Leland et al. [2] showed that network traffic exhibits *self similarity* and that its nature is therefore entirely different, which explains the inaccurate results of the Poisson models. This seminal paper also laid the foundation for subsequent network traffic modeling attempts.

Many different modeling approaches have been tried since then to accurately capture this self-similar nature of network traffic. One important category is statistical (or regressive) models, which include autoregressive (AR), autoregressive moving average (ARMA), generalized autoregressive moving average (GARMA), autoregressive integrated moving average (ARIMA), and fractional autoregressive integrated moving average (FARIMA) [3]. Another approach uses fractional Gaussian noise (fGn) and fractional Brownian motion (fBm), which generally result in better accuracy than regressive models for *long-range dependent* data [4]. Artificial neural network and fuzzy logic-based methods have also gathered significant attention [5–8]. Some authors have used modeling approaches based on least-mean kurtosis [9] and chaos theory [10].

ARIMA is a widely used statistical model for time series analysis and has also been used successfully in network traffic modeling [11, 12]. The adaptive neuro fuzzy inference system (ANFIS) model [13] has been applied to forecast Internet traffic time series in [14]. Although other soft computing approaches had been tried earlier, ANFIS was not attempted prior to [14] to our knowledge. ANFIS is a combination of fuzzy logic and neural network approaches and inherently carries the advantages of both. This makes ANFIS quite an attractive option for this purpose. However, one major drawback of ANFIS is that it is computationally expensive and complex. This paper compares the ARIMA and ANFIS modeling approaches under different scenarios in order to assess their relative suitability for computer network traffic modeling.

The paper has been organized as follows. In Section 2, we briefly highlight the related work in this area. Section 3 of this paper provides necessary description of ARIMA and ANFIS. Section 4 contains description of network traffic data collection and then data pre-processing. Modeling results and discussion have been presented in Section 5. Section 6 contains conclusions.

## 2 Related work

ARIMA has been discussed in [11, 12], highlighting its use in modeling and prediction of network traffic. The author of [4] discusses ARIMA modeling of traffic in an institutional wireless network. A good discussion of the application of ANFIS to forecasting Internet traffic time series can be found in [14]. ANFIS is compared with ARIMA in [15] for forecasting WiMAX traffic time series. The authors of [15] argue that ARIMA is better than ANFIS based on their comparison, which showed lower root mean square error (RMSE) and processing time for ARIMA. However, no proper reason was given there for choosing the particular ARIMA model used in the comparison. Weather forecasting is another prominent area in which attempts have been made to compare the ARIMA and ANFIS approaches. The results of these comparisons, however, are contrasting. Some authors have found ARIMA preferable to ANFIS [16], while others recommend ANFIS as the better approach [17]. We must note here that these are very specific applications in which the dataset varies drastically from case to case, leading to different results. In [18], the two approaches are compared for forecasting electrical energy consumption, and the authors conclude that ANFIS is more appropriate than ARIMA.

## 3 Modeling approaches: ARIMA and ANFIS

We now describe the basic framework of modeling approaches of ARIMA and ANFIS.

### 3.1 Autoregressive integrated moving average

In the framework of regression models, the computation of present output is done as a linear combination of some pre-specified number of past outputs and moving average of random white Gaussian noise [3].

Let us denote $\Omega$ as the lag operator such that $\Omega X(t) = X(t-1)$. In general, we write $\Omega^{\tau} X(t) = X(t-\tau)$. Let us also denote $\Delta$ as the difference operator so that $\Delta X(t) = X(t) - X(t-1)$. It can be observed that $\Delta^{\tau} X(t) = (1-\Omega)^{\tau} X(t)$. Finally, let us define two polynomial functions $\phi(\Omega) = 1 - \phi_{1}\Omega - \cdots - \phi_{m}\Omega^{m}$ and $\theta(\Omega) = 1 - \theta_{1}\Omega - \cdots - \theta_{n}\Omega^{n}$, where $\phi_{1}, \phi_{2}, \ldots, \phi_{m}$ and $\theta_{1}, \theta_{2}, \ldots, \theta_{n}$ are the coefficients of the powers of the lag operator $\Omega$, and $m$ and $n$ are the degrees of the respective polynomials.
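The identity between the repeated difference and the expansion of the lag-operator polynomial can be checked numerically. The following is a minimal Python sketch, not part of the original study; the helper names `lag` and `difference` and the sample series are illustrative assumptions.

```python
import numpy as np
from math import comb


def lag(x, k=1):
    """Apply the lag operator k times: (Omega^k X)(t) = X(t - k).

    Samples that have no predecessor k steps back are set to NaN.
    """
    x = np.asarray(x, dtype=float)
    if k == 0:
        return x.copy()
    out = np.full_like(x, np.nan)
    out[k:] = x[:-k]
    return out


def difference(x, tau=1):
    """tau-th order difference via the binomial expansion of (1 - Omega)^tau."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    for k in range(tau + 1):
        out += comb(tau, k) * (-1) ** k * lag(x, k)
    return out[tau:]  # the first tau samples are undefined and are dropped


# Illustrative series (perfect squares): its second difference is constant.
x = np.array([1.0, 4.0, 9.0, 16.0, 25.0, 36.0])
d2 = difference(x, tau=2)
print(d2)
print(np.diff(x, n=2))  # NumPy's repeated first difference, for comparison
```

Both printed arrays should agree, since `np.diff(x, n=2)` applies the first difference twice while `difference` expands the operator polynomial directly.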

Given these notations, the definition of regressive models follows next.

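An autoregressive model of order $m$, generally denoted by AR($m$) [3], has the form

$$\phi(\Omega)\,X(t) = \varepsilon(t),$$

where $\varepsilon(t)$ is random white Gaussian noise.

An autoregressive moving average model of order ($m$, $n$), generally denoted by ARMA($m$, $n$) [3], has the form

$$\phi(\Omega)\,X(t) = \theta(\Omega)\,\varepsilon(t).$$

An autoregressive integrated moving average model of order ($m$, $\tau$, $n$), which is generally denoted by ARIMA($m$, $\tau$, $n$) [3], has the form

$$\phi(\Omega)\,\Delta^{\tau} X(t) = \theta(\Omega)\,\varepsilon(t).$$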

It can be seen that ARIMA is the most general of all the three regressive models discussed above. Although other more generalized regressive models are also available, ARIMA will be the focus of our study in this paper.
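To make the regressive models concrete, the sketch below simulates the simplest case and recovers its coefficient. It is only an illustration under stated assumptions: the coefficient value 0.6, the sample size, the seed, and the function names `simulate_ar1` and `fit_ar1` are hypothetical, and the least-squares estimate is a simple stand-in, not the fitting procedure used in this paper.

```python
import numpy as np


def simulate_ar1(phi1, n_samples, seed=0):
    """Simulate (1 - phi1 * Omega) X(t) = eps(t), i.e. X(t) = phi1 X(t-1) + eps(t)."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, 1.0, n_samples)  # white Gaussian noise
    x = np.zeros(n_samples)
    for t in range(1, n_samples):
        x[t] = phi1 * x[t - 1] + eps[t]
    return x


def fit_ar1(x):
    """Least-squares estimate of phi1 from consecutive sample pairs."""
    past, present = x[:-1], x[1:]
    return float(past @ present / (past @ past))


# An ARIMA(1, 1, 0) series: its first difference follows an AR(1) model.
w = simulate_ar1(phi1=0.6, n_samples=10_000)
x_int = np.cumsum(w)           # integrate once (tau = 1)
w_rec = np.diff(x_int)         # differencing once undoes the integration
phi_hat = fit_ar1(w_rec)
print(f"estimated phi1: {phi_hat:.3f}")
```

Differencing the integrated series returns the stationary AR(1) component, which is why the order $\tau$ is chosen first and the AR and MA orders are fitted on the differenced data.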

### 3.2 Adaptive neuro fuzzy inference system

A fuzzy inference system (FIS) consists of three main components, *viz*., rule base, database, and reasoning mechanism. The rule base is a collection of fuzzy if-then rules which decide the system’s behavior and response under different possible situations. The database contains the information about the membership functions in terms of their type and shape. Finally, the reasoning or decision-making mechanism is used to infer and derive output from the system. It may be noted that a FIS may need a fuzzification interface to convert crisp input values to fuzzy values suitable for processing. However, when the inputs themselves are fuzzy, this may not be required. Similarly, at the output side, a defuzzification interface is used because almost all real-world applications need a crisp output value.

A *Sugeno-type FIS* (see [19]) is shown in Figure 2, and the equivalent ANFIS architecture is shown in Figure 3. The following common rule set for a first-order Sugeno fuzzy model with inputs *x* and *y* can easily be verified:

Rule 1. If *x* is *P*_{1} and *y* is *Q*_{1}, then *d*_{1} = *a*_{1}*x* + *b*_{1}*y* + *c*_{1}.

Rule 2. If *x* is *P*_{2} and *y* is *Q*_{2}, then *d*_{2} = *a*_{2}*x* + *b*_{2}*y* + *c*_{2}.

In the ANFIS architecture shown in Figure 3, all nodes in the same layer have similar functions. Here we denote the output of the $i$th node in layer $l$ by $O_{i}^{l}$.

In layer 1, a linguistic label is associated with each input in terms of its membership grade. This membership grade can be defined by suitable membership functions $\mu_{P_{i}}(x)$ and $\mu_{Q_{i}}(y)$ with appropriate parameters. The parameters associated with these membership functions are called premise or nonlinear parameters.

In layer 2, the incoming membership grades of each rule are combined, typically by multiplication. The output of this layer is often called the firing strength of the corresponding rule.

The coefficients $a_{i}$, $b_{i}$, and $c_{i}$ ($i = 1, 2$) of the rule outputs are called consequent or linear parameters of ANFIS. The total number of parameters of ANFIS is the sum of premise and consequent parameters.

It must be noted that the structure of ANFIS explained above is not unique, and, in fact, arbitrary but meaningful assignment of node functions and configurations is possible.
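The layer-by-layer computation can be sketched for the two-rule model of Rules 1 and 2. This follows the standard five-layer formulation of [13] rather than anything recovered from this section; the generalized bell membership function, the function names, and all parameter values are illustrative assumptions.

```python
import numpy as np


def gbell(v, a, b, c):
    """Generalized bell membership function (an assumed choice of MF shape)."""
    return 1.0 / (1.0 + np.abs((v - c) / a) ** (2 * b))


def anfis_forward(x, y, premise, consequent):
    """Forward pass of a two-input, two-rule first-order Sugeno ANFIS."""
    # Layer 1: membership grades of x in P_i and of y in Q_i (premise parameters).
    mu_p = [gbell(x, *premise["P"][i]) for i in range(2)]
    mu_q = [gbell(y, *premise["Q"][i]) for i in range(2)]
    # Layer 2: firing strength of each rule (product of its membership grades).
    w = np.array([mu_p[i] * mu_q[i] for i in range(2)])
    # Layer 3: normalized firing strengths.
    w_bar = w / w.sum()
    # Layer 4: rule outputs d_i = a_i x + b_i y + c_i (consequent parameters).
    d = np.array([a * x + b * y + c for a, b, c in consequent])
    # Layer 5: overall output as the weighted sum of rule outputs.
    return float(w_bar @ d), w_bar, d


# Hypothetical parameter values, chosen only for illustration.
premise = {"P": [(2.0, 2.0, 0.0), (2.0, 2.0, 5.0)],
           "Q": [(2.0, 2.0, 0.0), (2.0, 2.0, 5.0)]}
consequent = [(1.0, 1.0, 0.0), (2.0, -1.0, 1.0)]

out, w_bar, d = anfis_forward(1.0, 2.0, premise, consequent)
print(out, w_bar, d)
```

Because the output is a convex combination of the rule outputs, it always lies between the smallest and largest $d_{i}$. Training adjusts the premise parameters (nonlinear) and the consequent parameters (linear).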

## 4 Network traffic collection and data pre-processing

### 4.1 Network traffic data collection

Having briefly introduced the ARIMA and ANFIS modeling approaches, we now apply them to real-world network traffic data.

To begin, a real-time network traffic trace is required. In the networking research community, Wireshark [20] is the most popular and sophisticated network traffic monitoring tool. It captures data packets from the network and provides important information such as packet size, packet transfer rate, and packet capture time, as well as data packet contents. We collected the packet statistics from an institutional wireless network, discarding the user data for this study. Matshark [21] was used to extract the data from Wireshark and export it into the MATLAB workspace.

### 4.2 Data pre-processing

The collected network traffic data had a nonuniform time scale, whereas a time series requires samples at uniform intervals. Data samples were therefore extracted from the traffic trace at intervals of 0.1 s in MATLAB, resulting in a time series.
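One way to obtain uniformly spaced samples from a packet trace is to aggregate packets into fixed-width bins. The sketch below is in Python rather than MATLAB and rests on assumptions not stated in the source: that the sampled quantity is the number of bytes per 0.1 s interval, and that the trace is available as arrays of capture times and packet sizes. The function name and the sample values are hypothetical.

```python
import numpy as np


def to_uniform_series(timestamps, sizes, interval=0.1):
    """Aggregate packet sizes into consecutive bins of width `interval` seconds."""
    timestamps = np.asarray(timestamps, dtype=float)
    sizes = np.asarray(sizes, dtype=float)
    start = timestamps.min()
    n_bins = int(np.floor((timestamps.max() - start) / interval)) + 1
    # Index of the bin each packet falls into, clipped to the last bin.
    idx = np.floor((timestamps - start) / interval).astype(int)
    idx = np.minimum(idx, n_bins - 1)
    # Sum the sizes per bin; bins without packets stay at zero.
    return np.bincount(idx, weights=sizes, minlength=n_bins)


# Hypothetical capture times (s) and packet sizes (bytes).
times = [0.00, 0.03, 0.12, 0.35, 0.36]
sizes = [100, 200, 50, 400, 60]
series = to_uniform_series(times, sizes)
print(series)
```

Intervals with no traffic appear as zeros in the resulting series; with this binning, a packet whose capture time lies exactly on a bin edge can land on either side of it because of floating-point rounding.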

The samples were normalized so that all values lie within the range [0, 1], which allows the RMSE values obtained from different models to be compared directly.

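The RMSE is computed as

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{m=1}^{N}\left(y_{m} - \widehat{y}_{m}\right)^{2}},$$

where $y_{m}$ and $\widehat{y}_{m}$ denote the $m$th actual and model-trained data samples, respectively, and $N$ is the sample size.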
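The scaling to [0, 1] and the error measure can be sketched as follows. Min-max scaling is an assumption here: it is consistent with the minimum of 0 and maximum of 1 reported for the data, but the exact normalization formula is not given in this section. The function names and sample values are illustrative.

```python
import numpy as np


def min_max_normalize(x):
    """Scale a series linearly so that its minimum maps to 0 and its maximum to 1."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        raise ValueError("a constant series cannot be min-max normalized")
    return (x - x.min()) / span


def rmse(y, y_hat):
    """Root mean square error between actual and modeled samples."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


raw = [20.0, 60.0, 100.0, 40.0]          # hypothetical raw samples
norm = min_max_normalize(raw)
print(norm)
print(rmse(norm, norm))                   # a perfect fit gives zero error
print(rmse([1.0, 1.0], [0.0, 0.0]))       # a constant offset of 1 gives an RMSE of 1
```

Because every series is mapped to the same range, an RMSE obtained on one case can be read on the same scale as an RMSE obtained on another.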

## 5 Results and discussion

We considered three different cases to evaluate and compare the performances of ARIMA and ANFIS. In the first case, *N* = 500 samples of the time series data were used for modeling. In the second and third cases, *N* = 1,000 and *N* = 1,500 samples were used respectively. Below we describe the individual cases using each approach.

### 5.1 ARIMA approach

**Descriptive statistics of data samples and best fit model in three cases**

Case | Number of samples | Min | Max | Mean | Best fit model |
---|---|---|---|---|---|
Case 1 | 500 | 0.000 | 1.000 | 0.0986912 | ARIMA (1,0,0) |
Case 2 | 1,000 | 0.000 | 1.000 | 0.1074067 | ARIMA (1,0,0) |
Case 3 | 1,500 | 0.000 | 1.000 | 0.1084097 | ARIMA (0,0,6) |

### 5.2 ANFIS approach

MATLAB Fuzzy Logic Toolbox [23] was used for ANFIS modeling. The following discussion details the results obtained under each case.

**ANFIS specification for case 1**

Specification | Value |
---|---|
Selected ANFIS architecture | 2-2-2-2 |
Number of nodes | 55 |
Number of linear parameters | 80 |
Number of nonlinear parameters | 24 |
Total number of parameters | 104 |
Number of training data pairs | 500 |
Number of fuzzy rules | 16 |

**ANFIS specification for case 2**

Specification | Value |
---|---|
Selected ANFIS architecture | 2-2-2-2-2-2 |
Number of nodes | 161 |
Number of linear parameters | 448 |
Number of nonlinear parameters | 48 |
Total number of parameters | 496 |
Number of training data pairs | 1,000 |
Number of fuzzy rules | 64 |

A careful reader might have observed that the error along the *Y*-axis in Figure 6 remains almost equal to 0.0871 (shown to four decimal places). This means that the error does not decrease appreciably with increasing epochs.

**ANFIS specification for case 3**

Specification | Value |
---|---|
Selected ANFIS architecture | 2-2-3-2-3-2 |
Number of nodes | 325 |
Number of linear parameters | 1,008 |
Number of nonlinear parameters | 56 |
Total number of parameters | 1,064 |
Number of training data pairs | 1,500 |
Number of fuzzy rules | 144 |

Noting that the number of parameters of ARIMA($m$, $\tau$, $n$) is $m + n + 1$, we summarize our results in Table 5.

**Summary of results**

Case | Model | Number of parameters | RMSE |
---|---|---|---|
Case 1 (500 samples) | ARIMA | 2 | 0.085 |
Case 1 (500 samples) | ANFIS | 104 | 0.080 |
Case 2 (1,000 samples) | ARIMA | 2 | 0.089 |
Case 2 (1,000 samples) | ANFIS | 496 | 0.087 |
Case 3 (1,500 samples) | ARIMA | 7 | 0.083 |
Case 3 (1,500 samples) | ANFIS | 1,064 | 0.081 |

From Table 5, we see that the ANFIS model yields a lower RMSE than ARIMA in all three cases considered here, although the margin is small (0.002 to 0.005). At the same time, the number of parameters in ANFIS is much larger than in ARIMA in each case: 104 versus 2, 496 versus 2, and 1,064 versus 7. Computational complexity is empirically related to the number of parameters of a model, which means that ANFIS is computationally more expensive and complex than ARIMA. The difference between the numbers of parameters becomes even larger when the number of inputs and the number of membership functions (MFs) per input of ANFIS are increased.
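The growth in parameter count can be reproduced with a short calculation. The sketch below assumes a grid-partitioned first-order Sugeno ANFIS and reads each architecture string as the number of MFs per input. The number of parameters per MF (3 in case 1, 4 in cases 2 and 3) is not stated in the source; it is inferred so that the counts match the reported tables. Function names are hypothetical.

```python
from math import prod


def anfis_parameter_counts(mfs_per_input, params_per_mf):
    """Parameter counts for a grid-partitioned first-order Sugeno ANFIS."""
    n_inputs = len(mfs_per_input)
    n_rules = prod(mfs_per_input)                   # one rule per MF combination
    linear = n_rules * (n_inputs + 1)               # consequent coefficients per rule
    nonlinear = sum(mfs_per_input) * params_per_mf  # premise (MF shape) parameters
    return {"rules": n_rules, "linear": linear,
            "nonlinear": nonlinear, "total": linear + nonlinear}


def arima_parameter_count(m, n):
    """Number of ARIMA(m, tau, n) parameters as counted in the text: m + n + 1."""
    return m + n + 1


case1 = anfis_parameter_counts([2, 2, 2, 2], params_per_mf=3)
case2 = anfis_parameter_counts([2, 2, 2, 2, 2, 2], params_per_mf=4)
case3 = anfis_parameter_counts([2, 2, 3, 2, 3, 2], params_per_mf=4)
print(case1, case2, case3)
print(arima_parameter_count(1, 0), arima_parameter_count(0, 6))
```

The rule count is a product over inputs, so adding one input with two MFs doubles the number of rules and more than doubles the linear parameters, whereas an ARIMA model gains only one parameter per additional AR or MA term.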

Hence, although ANFIS performs better than ARIMA, this is achieved at the cost of computational complexity, which must be taken into consideration when ANFIS is used for network traffic modeling.

## 6 Conclusions

Network traffic modeling demands algorithms that are capable of dealing with the self-similar behavior of traffic data, where conventional methods and assumptions fall short in terms of accuracy. Two different modeling approaches, autoregressive integrated moving average (ARIMA) and adaptive neuro fuzzy inference system (ANFIS), were applied to model institutional wireless network traffic data. Three different cases were considered, with 500, 1,000, and 1,500 samples, respectively. We find that ANFIS performs better than ARIMA in all three cases. However, this accuracy is achieved at the expense of computational complexity. Hence, we recommend the ANFIS approach only in those cases in which carrying out large computations is possible.

In future work, the coactive neuro fuzzy inference system (CANFIS) [19] can be used for network traffic modeling. Since it is a generalized form of ANFIS and avoids some constraints inherent to ANFIS in its original form, we expect even better modeling results from CANFIS.

## References

1. Paxson V, Floyd S: Wide-area traffic: the failure of Poisson modeling. *IEEE/ACM Transac Network* 1995, 3: 226-244. 10.1109/90.392383
2. Leland W, Taqqu M, Willinger W, Wilson D: On the self-similar nature of Ethernet traffic (extended version). *IEEE/ACM Transac Network* 1994, 2(1):1-15. 10.1109/90.282603
3. Ghaderi M: *On the relevance of self-similarity in network traffic prediction (School of Computer Science, University of Waterloo, Waterloo, Canada)*. https://cs.uwaterloo.ca/research/tr/2003/28/TR-CS-2003-28.pdf. Accessed 23 July 2011
4. Yadav RK: Modeling of self similar traffic in wireless networks. In *IEEE International Conference on High Performance Computing (HiPC) at Proceedings of Workshop on Next Generation Wireless Networks (WoNGeN)*. Bangalore; 2011.
5. Chabaa S, Zeroual A, Antari J: Identification and prediction of internet traffic using artificial neural networks. *J Intell Learning Syst Appl* 2010, 2(3):147-155. 10.4236/jilsa.2010.23018
6. Wang F, Xia H: Network traffic prediction based on grey neural network integrated model. In *International Conference on Computer Science and Software Engineering*. Wuhan; 2008:915-918.
7. Piedra N, Chicaiza J, López J, García J: Study of the application of neural networks in internet traffic engineering. In *Information Science and Computing*. Institute of Information Theories and Applications, FOI ITHEA; 2008:3-47. ISSN: 1313-0455
8. Rahman A, Kennedy P, Simmonds A, Edwards J: Fuzzy logic based modelling and analysis of network traffic. In *8th International Conference on Computer and Information Technology*. Sydney; 2008:652-657.
9. Zhao H, Ansari N, Shi YQ: Self-similar traffic prediction using least mean kurtosis. In *Proceedings of International Conference on Information Technology: Coding and Computing Computers and Communications ITCC 2003*. Las Vegas; 2003:352-355.
10. Li D, Ji B, Xiang H: The on-line prediction of self-similar traffic based on chaos theory. In *International Conference on Wireless Communications, Networking and Mobile Computing, 2006. WiCOM 2006*. Wuhan: Wuhan University; 2006:1-4.
11. Wang L, Li Z, Song C: Network traffic prediction based on seasonal ARIMA model. *5th World Congress Intell Control Auto* 2004, 2: 1425-1428.
12. Zhou B, He D, Sun Z: Traffic modeling and prediction using ARIMA/GARCH. In *Modelling and Simulation Tools for Emerging Telecommunication Networks*. New York: Springer; 2006:101-121.
13. Jang JSR: ANFIS: adaptive-network-based fuzzy inference system. *IEEE Trans. Syst. Man and Cybernetics* 1993, 23(3):665-685. 10.1109/21.256541
14. Chabaa S, Antari J, Zeroual A: ANFIS method for forecasting Internet traffic time series. In *Mediterranean Microwave Symposium (MMS)*. Tangiers, Morocco; 2009:1-4.
15. Hernandez CAS, Pedraza LFM, Salcedo OJP: Comparative analysis of time series techniques ARIMA and ANFIS to forecast wimax traffic. *Online J Electron Electric Eng (OJEEE)* 2010, 2(2):223-228.
16. Rahman M, Islam AHMS, Nadvi SYM, Rahman RM: Comparative study of ANFIS and ARIMA model for weather forecasting in Dhaka. In *International Conference on Informatics, Electronics and Vision (ICIEV)*. Dhaka; 2013:1-6.
17. Tektas M: Weather forecasting using ANFIS and ARIMA models. A case study for Istanbul. *Environ Res Eng Manage* 2010, 1(51):5-10.
18. Yayar R, Hekim M, Yilmaz V, Bakirch F: A comparison of ANFIS and ARIMA techniques in the forecasting of electrical energy consumption of Tokat province in Turkey. *J Econ Social Stud* 2011, 1(2):87-110.
19. Jang JSR, Sun CT, Mizutani E: *Neuro-Fuzzy and soft computing: a computational approach to learning and machine intelligence*. Englewood Cliffs: Prentice-Hall; 1997.
20. Wireshark. http://www.wireshark.org. Accessed 10 November 2011
21. Matshark. http://www.wireshark.org/lists/wireshark-users/201011/msg00028.html. Accessed 14 November 2011
22. IBM SPSS Statistics. http://www-01.ibm.com/software/in/analytics/spss/products/statistics. Accessed 20 November 2011
23. MATLAB Fuzzy Logic Toolbox. http://www.mathworks.in/products/fuzzy-logic. Accessed 25 November 2011

## Copyright

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.