A proactive resource allocation method based on adaptive prediction of resource requests in cloud computing

With the development of big data and artificial intelligence, cloud resource requests present more complex features, such as being sudden, arriving in batches and being diverse, which cause resource allocation to lag far behind resource requests and lead to unbalanced resource utilization that wastes resources. To solve this issue, this paper proposes a proactive resource allocation method based on the adaptive prediction of resource requests in cloud computing. Specifically, this method first uses an adaptive prediction method based on the runs test to improve the prediction accuracy of resource requests, and then it builds a multiobjective resource allocation optimization model that alleviates the latency of the resource allocation and balances the utilization of the different types of resources of a physical machine. Furthermore, a multiobjective evolutionary algorithm, the Nondominated Sorting Genetic Algorithm with the Elite Strategy (NSGA-II), is improved to further reduce the resource allocation time by accelerating the solution of the multiobjective optimization model. The experimental results show that this method achieves balanced utilization of the CPU and memory resources and reduces the resource allocation time by at least 43% (10 threads) compared with the Improved Strength Pareto Evolutionary Algorithm (SPEA2) and NSGA-II methods.

(1) We propose a runs test (RT)-based adaptive prediction algorithm for resource requests. This algorithm builds on our previously studied ensemble empirical mode decomposition (EEMD)-autoregressive integrated moving average (ARIMA) and EEMD-RT-ARIMA algorithms [5,6], and it selects the more accurate of the two to implement the short-term prediction of resource requests via an adaptive prediction strategy. (2) We propose a proactive resource allocation strategy that combines the active prediction of and the passive response to resource requests, which allocates resources in advance for future sudden resource requests to guarantee the timeliness of the resource allocation. (3) We further propose a resource proportion matching model to ensure the uniform usage of the different types of server resources, which reduces resource waste. A mathematical multiobjective optimization problem of the resource allocation is then formulated. (4) We improve the Nondominated Sorting Genetic Algorithm with the Elite Strategy (NSGA-II) to accelerate the solution of the multiobjective optimization problem, which further ensures the timeliness of the resource allocation.
The rest of this paper is organized as follows. Section II introduces related works. Section III describes the proactive resource allocation approach. Section IV presents the experiments and analysis. Section V concludes this paper.
A list of the mathematical notations used in this paper is given in Table 1.

Related works
Cloud resource allocation methods exist for big data applications [7], cloud-based software services [8], scientific applications [9], cloud manufacturing [10], workflows [11] and cloud healthcare [12]. Various algorithms and mechanisms have been applied to resource allocation, such as the grasshopper optimization algorithm (GOA) [13], ant-colony optimization and deep reinforcement learning [14], data-driven probabilistic models [15], and auction mechanisms [16]. Several surveys cover existing resource and task scheduling methods. A systematic review classifies task scheduling approaches by aim into single-cloud, multicloud and mobile cloud environments [17]. A comprehensive survey divides scheduling techniques into three categories: heuristic, meta-heuristic and hybrid schemes [18]. Recently, state-of-the-art multiobjective VM placement mechanisms have been surveyed [19], and a comprehensive review of auction-based resource allocation mechanisms has been conducted [20]. Resource allocation methods pursue aims such as reducing cost, minimizing energy consumption, improving resource utilization and guaranteeing quality of service (QoS). A performance-cost grey wolf optimization (PCGWO) algorithm has been proposed to reduce the processing time and cost of tasks [21]. A JAYA algorithm has been used to optimize VM placement and minimize energy consumption [22]. A fair resource allocation method has been proposed to rapidly and fairly allocate resources and maximize resource utilization via a flow control policy [23]. However, these methods cannot provide an effective mechanism to ensure the timeliness of the resource allocation for a large number of sudden resource requests.
A multidimensional resource allocation model, MDCRA, that uses a single weight algorithm (SWA) and a double weight algorithm (DWA) to minimize the number of physical servers, save energy and maximize the resource utilization in cloud computing has been proposed [24]. It models the multidimensional resource allocation problem as a vector bin packing problem, which is NP-hard; at present, no polynomial-time optimization algorithm is known to solve it. Moreover, it only handles the resource capacity constraint without considering incompatibility constraints. An energy-efficient resource allocation scheme considers the energy consumption of the CPU and RAM to reduce the overall energy costs and maintain the service level agreement (SLA) [25]. An empirical adaptive cloud resource provisioning model has been proposed to reduce the latency of the resource allocation and SLA violations via speculative analysis [26]. Both methods focus on workload consolidation and prediction with the single target of reducing SLA violations, while our method considers the trade-off among the number of physical machines, resource performance matching and resource proportion matching. A Lévy-based particle swarm optimization algorithm has been proposed to minimize the number of running physical servers and balance the load of physical servers by reducing the particle dispersion loss [27]. A dynamic resource allocation algorithm has been proposed to solve resource scheduling and resource matching problems [28]. Here, the Tabu search algorithm solves the resource scheduling problem, a weighted bipartite graph solves the resource matching problem for the tasks on the edge servers, and an optimal solution is further proposed to schedule resources between the edge servers and a cloud data center. This algorithm concentrates on the resource scheduling between the edge servers and the cloud, whereas our method focuses on VM placement in a cloud data center.
In addition, a cloud workflow scheduling algorithm has been proposed based on an attack-defense game model, where a task-VM mapping algorithm improves the workflow efficiency and different VMs are provided for workflow executions [29]. A fog computing trust management approach that assesses and manages the trust levels of the nodes has been proposed to reduce malicious attacks and the service response time in the fog computing environment [30]. The Everything-as-a-Resource paradigm has been proposed for designing collaborative applications for the web [31]. Proactive resource allocation based on prediction is an effective way to ensure the timeliness of resource allocation. One type of prediction method is based on machine learning. A prediction-based dynamic multiobjective evolutionary algorithm, called NN-DNSGA-II [32], combines an artificial neural network with the NSGA-II [33]. This algorithm first uses the neural network to predict the Pareto-optimal solutions as the initial population of the NSGA-II and then solves the multiobjective optimization problem. The empirical results demonstrate that this algorithm outperforms nonprediction-based algorithms in most cases for the Pegasus workflow management system. However, this algorithm does not predict future VM requests; it only predicts a better initial solution for the workflow scheduling problem, and thus it cannot alleviate the resource allocation delay caused by future increases in VM requests. A hybrid wavelet neural network method has been proposed to improve the prediction accuracy by training the wavelet neural network with two heuristic algorithms [34]. Machine learning-based prediction requires training on a large amount of data, which increases the time consumption and thus cannot guarantee a timely resource allocation.
A genetic algorithm (GA)-based prediction method has been proposed whose prediction accuracy is better than that of the grey model at improving the resource utilization of VMs and physical machines (PMs) [35]. An anti-correlated VM placement algorithm, in which the VMs and the overloaded hosts are predicted to provide suitable VM placement, has been proposed to reduce the energy consumption [36]. Another type of prediction method is based on statistics. ARIMA is a classical prediction model for time series, and it is often combined with other methods to predict nonstationary time series. A model combining ARIMA and fuzzy regression, in which the prediction accuracy is improved by setting sliding windows, has been proposed to predict network traffic [37]. An adaptive workload forecasting method dynamically selects the best method from the simple exponential smoothing (SES), ARIMA and linear regression (LR) methods to improve the workload forecasting accuracy [38]. However, this method uses the previous predictions of a set of models and different amounts of training data to execute the next prediction, which increases the prediction cost. A framework combining the ARIMA and LR methods has been used to predict VM and PM workloads, PM power consumption and their total costs [39]. The combination of the ARIMA and back propagation neural network (BPNN) methods improves the workload prediction accuracy and helps minimize the cost of an edge cloud cluster [40]. An adaptive prediction model selects the best of the LR, ARIMA and support vector regression (SVR) methods according to the workload features to obtain better prediction results [41].
An ensemble model, ESNemble, combines five different prediction algorithms and extracts their features to forecast the workload time series based on an echo state network, and it outperforms each single algorithm in terms of the prediction accuracy [42]. The above methods combine the classic ARIMA model with other prediction methods, which improves the prediction accuracy to a certain degree. However, these methods may achieve low prediction accuracy for the current resource request sequences with complex characteristics and strong fluctuations. Data preprocessing should be performed to smooth extremely nonstationary sequences and enhance the prediction accuracy. Our previously proposed EEMD-ARIMA and EEMD-RT-ARIMA algorithms improve the prediction accuracy by decomposing a nonstationary sequence into a few relatively stationary component sequences via the EEMD method [5,6]. The main difference between them is that, when the original sequence fluctuates weakly, the EEMD-RT-ARIMA method reduces the cumulative error and the prediction time by selecting and reconstructing the component sequences with similar characteristics into fewer component sequences based on RT values. The runs test (RT) [43] is a method to check the randomness of a sequence. A run is defined as a maximal block of identical successive symbols (0 or 1). For instance, the sequence '111001110011' includes three runs of successive "1"s and two runs of successive "0"s, i.e., five runs in total. The total number of runs reflects the random fluctuation of the sequence, and any time series can be converted into a sequence of symbols (0 or 1) [44]. The larger the total number of runs, the more strongly the sequence fluctuates. When the RT indicates that the original sequence fluctuates strongly, the EEMD-ARIMA method achieves higher prediction accuracy than EEMD-RT-ARIMA because its component sequences are more stationary.
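For illustration, the run counting described above can be sketched in Python. Binarizing a numeric series against its median is one common way to map a time series to 0/1 symbols; that mapping is an assumption of this sketch, not a detail fixed by the text.

```python
def runs_test_value(series):
    """Count the total number of runs (the RT value) of a numeric sequence.

    The sequence is first binarized against its median: values above the
    median become 1, the rest 0 (an assumed, commonly used mapping).
    A run is a maximal block of identical successive symbols.
    """
    med = sorted(series)[len(series) // 2]
    symbols = [1 if x > med else 0 for x in series]
    total = 1
    for prev, cur in zip(symbols, symbols[1:]):
        if cur != prev:
            total += 1
    return total

# The example sequence from the text: '111001110011'
symbols = [1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1]
runs = 1 + sum(a != b for a, b in zip(symbols, symbols[1:]))
print(runs)  # 5: three runs of 1 and two runs of 0
```

A larger returned value indicates a more strongly fluctuating (more random) sequence.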

Methods
To address the problem of resource allocation lagging behind resource requests, we propose a proactive resource allocation method based on the prediction of resource requests. Figure 1 shows the implementation process of this method. First, an RT-based adaptive prediction method is used to forecast the future resource requests based on the past data of resource requests. Then, a proactive resource allocation strategy is proposed based on the prediction of resource requests. Finally, a multiobjective resource allocation method is proposed and solved by an improved NSGA-II algorithm.

RT-based adaptive prediction method
The prediction method is designed with two goals: reducing the prediction time and improving the prediction accuracy. The prediction procedure is shown in Fig. 2. First, m component sequences are extracted from a VM request sequence using the principal component analysis method. Next, these component sequences are checked to find and preprocess outliers. Then, the RT values of the preprocessed sequences are calculated. Finally, the sequences are predicted by adaptively selecting the EEMD-ARIMA or EEMD-RT-ARIMA method according to the comparison between their RT values and the thresholds.
A cloud platform provides many VM flavors, such as 2CPU4G (2 CPU cores, 4G memory) and 4CPU8G (4 CPU cores, 8G memory). We cannot predict each type of VM request separately because of the high prediction time. Therefore, principal component analysis is first used in our prediction method to extract the major component sequences and reduce the prediction time. For example, a VM request sequence with n types of VMs is denoted as S = <s_1, ..., s_i, ..., s_k>, where s_i represents the VM number of the ith request. A component sequence S_l = <s_l1, ..., s_li, ..., s_lk> can be extracted from S for the VM type l, where s_li denotes the number of type-l VMs in the ith request. Thus, an original VM sequence can be divided into many component sequences. We select the fewest component sequences for which the ratio of the sum of their VM requests to the total number of VM requests (called the proportion of VM requests) exceeds the predefined threshold T_th at each sampling point; these sequences are regarded as the major component sequences. For example, consider two component sequences S_h = <s_h1, ..., s_hi, ..., s_hk> and S_g = <s_g1, ..., s_gi, ..., s_gk>, where s_hi and s_gi are the quantities of the two corresponding types of VM requests. If (s_hi + s_gi)/s_i ≥ T_th holds for ∀s_hi ∈ S_h and ∀s_gi ∈ S_g, these two component sequences are selected into a set S_main of major component sequences to implement the prediction. Intuitively, the higher the threshold T_th, the more component sequences are selected and the more accurate the prediction of VM requests, which ensures a more correct resource allocation. However, the more component sequences there are, the higher the prediction cost. For instance, if a threshold T'_th higher than T_th is set, that is, T'_th > T_th, three major component sequences S_h, S_g and S_l may be selected into the set S_main, such that (s_hi + s_gi + s_li)/s_i ≥ T'_th for ∀s_hi ∈ S_h, ∀s_gi ∈ S_g and ∀s_li ∈ S_l.
Then, each major component sequence is decomposed into many subsequences to perform the prediction using the EEMD-ARIMA or EEMD-RT-ARIMA method. Suppose each component sequence is decomposed into m subsequences and each subsequence costs n seconds to predict (the running time of each subsequence prediction is almost identical). If two component sequences S_h and S_g are selected under the threshold T_th, the prediction cost is 2·m·n seconds; if three component sequences S_h, S_g and S_l are selected under the higher threshold T'_th, the prediction cost rises to 3·m·n seconds.
Although selecting more component sequences by setting a higher threshold can improve the prediction accuracy, the prediction cost increases greatly, which delays the resource allocation and endangers the normal running of applications. Therefore, setting the threshold T_th is important: it must reflect the major VM requests while keeping the prediction time cost low. Suppose p major component sequences have been selected to predict the future VM requests and an unselected component sequence S_l has more VM requests than the other unselected component sequences. The threshold T_th can be set as an approximation of the minimum proportion of VM requests according to the following formula when one of two conditions is satisfied. The threshold T_th impacts both the prediction accuracy and the prediction time cost. ε_l and ε_t denote a threshold on the proportion of VM requests and a threshold on the ratio of the prediction time cost, respectively. When the added proportion of VM requests s_li/s_i exceeds the threshold ε_l and the added ratio of the prediction time cost 1/(p + 1) is below the threshold ε_t, the component sequence S_l is also selected to predict the future VM requests.
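The selection of the major component sequences can be sketched greedily as follows. The exact selection formulas are not reproduced here; the two extension conditions (added request proportion above ε_l, added cost ratio 1/(p + 1) below ε_t) follow the description above, and the data and default thresholds are illustrative.

```python
def select_major_components(totals, t_th=0.85, eps_l=0.05, eps_t=0.20):
    """Greedy sketch of major-component selection.

    totals: dict mapping VM type -> total number of requests in the window.
    Types are added in descending order of volume until their cumulative
    share of all requests reaches t_th; one further type is added only if
    its share exceeds eps_l while the added prediction-cost ratio
    1 / (p + 1) stays below eps_t.
    """
    grand_total = sum(totals.values())
    ordered = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    selected, covered = [], 0.0
    for vm_type, count in ordered:
        if covered >= t_th:
            # Candidate extension beyond the threshold (the two conditions).
            share = count / grand_total
            if share > eps_l and 1.0 / (len(selected) + 1) < eps_t:
                selected.append(vm_type)
                covered += share
            break
        selected.append(vm_type)
        covered += count / grand_total
    return selected, covered

# Illustrative request totals per VM type (hypothetical numbers).
types = {"4cpu": 457, "8cpu": 62, "2cpu": 30, "1cpu": 10}
sel, cov = select_major_components(types)
print(sel, round(cov, 3))
```

With these thresholds the third type is rejected: its share exceeds ε_l, but adding it would raise the prediction cost by 1/3 > ε_t.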
The quartile method is adopted to detect the outlier points of these major component sequences. First, we calculate the first quartile Q1, the third quartile Q3 and the interquartile range IQR = Q3 − Q1; we then flag as outliers the points above Q3 + 1.5·IQR or below Q1 − 1.5·IQR and finally replace these outliers via a cubic spline interpolation method.
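A minimal sketch of this preprocessing step, using NumPy for the quartiles and SciPy's cubic spline for the replacement (the paper does not specify an implementation, so these libraries and the sample data are assumptions):

```python
import numpy as np
from scipy.interpolate import CubicSpline

def replace_outliers(series):
    """Quartile (IQR) outlier detection followed by cubic-spline repair.

    Points above Q3 + 1.5*IQR or below Q1 - 1.5*IQR are treated as
    outliers and replaced with values interpolated from the remaining
    points by a cubic spline.
    """
    y = np.asarray(series, dtype=float)
    q1, q3 = np.percentile(y, [25, 75])
    iqr = q3 - q1
    mask = (y < q1 - 1.5 * iqr) | (y > q3 + 1.5 * iqr)
    if mask.any():
        x = np.arange(len(y))
        spline = CubicSpline(x[~mask], y[~mask])
        y[mask] = spline(x[mask])
    return y

data = [10, 12, 11, 13, 95, 12, 14, 11, 13, 12]  # 95 is an outlier
cleaned = replace_outliers(data)
print(cleaned)
```

Only the flagged point is modified; the rest of the sequence passes through unchanged.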
The preprocessed component sequences are then analyzed using the RT method, on which we build an adaptive prediction method (APMRT). If the RT value of a component sequence is higher than a predefined threshold R_th, the EEMD-ARIMA method is selected to predict the future resource requests; otherwise, the EEMD-RT-ARIMA method is selected. Thus, the prediction accuracy is improved by preprocessing the outliers of the major component sequences and selecting the more accurate prediction method. We can then determine the future number of each type of VM request and proactively allocate resources to guarantee the timeliness of the resource allocation.
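The adaptive selection itself reduces to a threshold test, sketched below; the threshold value of 20 is the trace-specific setting reported in the experiments later in the paper.

```python
def apmrt_select(rt_value, r_th=20):
    """APMRT method selection (sketch).

    A high RT value signals strong fluctuation, for which the fully
    decomposed EEMD-ARIMA pipeline is chosen; otherwise the cheaper
    EEMD-RT-ARIMA variant with reconstructed components is used.
    """
    return "EEMD-ARIMA" if rt_value > r_th else "EEMD-RT-ARIMA"

# RT values of the experimental sequences S1-S3 reported later in the paper.
for name, rt in [("S1", 19), ("S2", 21), ("S3", 19)]:
    print(name, apmrt_select(rt))
```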
In this method, the time complexity of extracting one component sequence is O(k), so the time complexity of extracting m component sequences and preprocessing the data is O(m · k). Then, the RT values of the extracted sequences are calculated, and the sequences are predicted using the adaptive prediction algorithm APMRT. The time complexity of predicting one component sequence is O(Q · k) or O(P · k), where Q and P are, respectively, the number of decomposed component sequences and the number of reconstructed component sequences (P < Q). Therefore, the time complexity of the APMRT algorithm is O(Q · m · k) or O(P · m · k), which is less than the time complexity O(Q · n · k) or O(P · n · k) of predicting all n component sequences. The prediction time is thus largely reduced by extracting the main component sequences.

Proactive resource allocation strategy
A cloud resource allocation algorithm should actively predict future resource requests and allocate resources in advance to cope with sudden increases in resource requests. The proactive resource allocation framework is shown in Fig. 3. The RT-based adaptive prediction method is used to predict the future number of VM requests based on past data. A hybrid VM request queue is formed by combining the predicted future VM requests with the current VM requests.
Suppose that the current VM request sequence is denoted as V(t) = <v_1(t), ..., v_i(t), ..., v_n(t)>, where v_i(t) indicates the VM number of the ith request at time t. The number D(t + h) of the future l major types of VM requests at time t + h, predicted via the adaptive prediction method APMRT, is denoted as D(t + h) = <D_1(t + h), ..., D_i(t + h), ..., D_l(t + h)>.
D_i(t + h) is the ith major type of VM requests at time t + h. The total number of VM requests N(t) at time t is the sum of the current number of VM requests V(t) and the proactively allocated share of the predicted VM requests, i.e., N(t) = V(t) + C(t) · P(t) · D(t + h), where V(t) is the current number of VM requests at time t. If the predicted number of VM requests D(t + h) is not less than the threshold N_th, some VMs should be allocated resources in advance: the parameter C(t) equals 1, and P(t) is the percentage (e.g., 30%) of the predicted VM requests D(t + h) to be allocated resources in advance. Otherwise, no VMs need to be provided in advance, that is, C(t) = 0.
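Assuming the hybrid request volume takes the form N(t) = V(t) + C(t) · P(t) · D(t + h), consistent with the strategy described above (the exact formula and rounding rule are our reading, with illustrative threshold values), the trigger rule can be sketched as:

```python
def total_requests(v_t, d_future, n_th=100, p_ratio=0.3):
    """Hybrid request volume N(t) = V(t) + C(t) * P(t) * D(t+h) (sketch).

    v_t: current number of VM requests V(t); d_future: predicted number
    D(t+h). When the prediction reaches the threshold n_th, C(t) = 1 and
    a fraction p_ratio of the predicted requests is provisioned early;
    otherwise C(t) = 0. n_th and p_ratio are illustrative values.
    """
    c_t = 1 if d_future >= n_th else 0
    return v_t + c_t * round(p_ratio * d_future)

# The S1 scenario from the experiments: 408 current requests plus 30% of
# the 519 predicted requests gives 564 VMs to create in advance.
print(total_requests(408, 457 + 62))
```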
After the predicted number of VM requests D(t + h) is determined, the VM request sequence should be established. Assuming that the predicted VM requests are ordered in descending order from VM type 1 to l, the largest VM requests (i.e., the type 1 VM requests) are placed at the front of the VM request sequence, and the smallest VM requests (i.e., the type l VM requests) are placed at the end. The predicted VM request sequence can be expressed as follows.
v_i^j(t + h) and v_i^{j+1}(t + h) are the quantities of the jth and (j+1)th VM requests of the same VM type i. Thus, the VM request sequence at time t can be expressed as follows.
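The descending ordering of the predicted request queue described above can be sketched as follows (the type names and counts are illustrative):

```python
def build_request_queue(predicted):
    """Order predicted VM requests by volume, largest type first (sketch).

    predicted: dict mapping VM type -> predicted number of requests.
    Returns the types sorted in descending order of request count, so the
    largest VM requests sit at the front of the request sequence.
    """
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)

queue = build_request_queue({"type1": 457, "type2": 62, "type3": 31})
print([t for t, _ in queue])
```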

Multiobjective resource allocation method
Our previous work presented a multiobjective resource allocation method [45]. That method builds a multiobjective function that minimizes the number of used PMs and the resource performance matching distance min{Σ_k (npv_ik − npp_jk)^2}, where npv_ik represents the normalized resource performance variable of VM v_i, npp_jk represents the corresponding normalized resource performance variable of the PM p_j, and k = 1, 2, 3 denotes the CPU, memory and disk resources, respectively. This paper proposes a new resource allocation method based on the prediction of VM requests (RAMPVR), which improves the previous method in two respects. The first is to reduce the waste of physical resources. If the proportions of the different resource types requested by a VM are close to the proportions of the free resources of a PM, placing the VM on that PM is less likely to waste resources. That is, the closer the resource proportion v_i1 : v_i2 : v_i3 of a VM is to the proportion p_j1 : p_j2 : p_j3 of a PM, the lower the resource waste, where v_i1, v_i2 and v_i3 represent the requested number of CPU cores, memory capacity and disk size of the VM v_i, respectively, and p_j1, p_j2 and p_j3 denote the free number of CPU cores, memory capacity and disk size of the PM p_j, respectively. Therefore, we build the resource proportion matching distance model shown in formula (12), where p_jk and v_ik represent the free capacity of resource type k of the PM p_j and the requested capacity of resource type k of the VM v_i, respectively, and R_k denotes a coefficient that rebalances the values of the parameter H = p_jk · v_i1/p_j1 − v_ik across resource types. For instance, if the values of the parameter H for the CPU and disk resources are 2 and 200, the disk would dominate the distance; therefore, the adjustment coefficient R_k for the disk resource should be set lower than that for the CPU resource, such as R_k = 1 for the CPU resource and R_k = 0.1 for the disk resource.
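The proportion matching idea can be sketched as follows. Formula (12) itself is not reproduced in this excerpt, so the aggregation of the per-resource terms H = p_jk · v_i1/p_j1 − v_ik (here, a sum of absolute scaled terms) is an assumption of this sketch:

```python
def proportion_match_distance(vm, pm, r=(1.0, 1.0, 0.1)):
    """Sketch of the resource proportion matching distance.

    vm = (v_i1, v_i2, v_i3): requested CPU cores, memory, disk of VM v_i.
    pm = (p_j1, p_j2, p_j3): free CPU cores, memory, disk of PM p_j.
    For each resource k, the term H = p_jk * v_i1 / p_j1 - v_ik measures
    how far the PM's free proportions deviate from the VM's requested
    proportions; R_k rescales resource types of different magnitudes
    (e.g., a small R_k for disk). Summing |R_k * H| is our assumed
    aggregation, not necessarily the paper's exact formula (12).
    """
    v1, p1 = vm[0], pm[0]
    return sum(r_k * abs(p_k * v1 / p1 - v_k)
               for r_k, p_k, v_k in zip(r, pm, vm))

# A VM whose 1:2:10 request proportions exactly match the PM's free
# proportions yields distance 0.
print(proportion_match_distance((2, 4, 20), (8, 16, 80)))
```

A zero distance means the VM consumes the PM's free resources in exactly their available proportions, leaving no stranded capacity.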
Thus, we set up a multiobjective optimization problem of resource allocation according to the number of used PMs, the total resource performance matching distance and the total resource proportion matching distance under a solution S, as follows.
The first goal of the multiobjective optimization problem M of resource allocation is to minimize the total number of used PMs, as shown in formula (13), which depends on the value of each mapping element x_ij between the VM v_i and the PM p_j under a solution S. The second goal is to minimize the total resource performance matching distance under a solution S, as shown in formula (14), which depends on the resource performance matching distance MD_ij between the VM v_i and the PM p_j. The third goal is to minimize the total resource proportion matching distance under a solution S, as shown in formula (15), which depends on the resource proportion matching distance MPM_ij between the VM v_i and the PM p_j. In addition, the total CPU, memory and disk capacities requested by the VMs placed on PM p_j must not exceed its free CPU, memory and disk capacities, respectively; these constraint conditions are shown in formulas (16), (17) and (18).
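As a sketch of how one candidate solution is scored against the three objectives and the capacity constraints, consider the following function. All names and the toy data are illustrative; the matching distances are taken as precomputed inputs:

```python
def evaluate_solution(x, md, mpm, vm_req, pm_free):
    """Evaluate one placement solution S of the multiobjective model (sketch).

    x[i][j] = 1 if VM i is placed on PM j. md[i][j] and mpm[i][j] are the
    precomputed performance and proportion matching distances. Returns the
    three objectives (used PMs, total MD, total MPM), or None when a
    capacity constraint is violated.
    """
    n_vm, n_pm = len(x), len(x[0])
    for j in range(n_pm):                      # capacity constraints
        for k in range(3):                     # CPU, memory, disk
            used = sum(vm_req[i][k] * x[i][j] for i in range(n_vm))
            if used > pm_free[j][k]:
                return None
    used_pms = sum(1 for j in range(n_pm)
                   if any(x[i][j] for i in range(n_vm)))
    total_md = sum(md[i][j] * x[i][j]
                   for i in range(n_vm) for j in range(n_pm))
    total_mpm = sum(mpm[i][j] * x[i][j]
                    for i in range(n_vm) for j in range(n_pm))
    return used_pms, total_md, total_mpm

x = [[1, 0], [1, 0]]                 # both VMs on PM 0
md = [[0.1, 0.4], [0.2, 0.3]]
mpm = [[0.5, 0.9], [0.6, 0.8]]
vm_req = [(2, 4, 20), (2, 4, 20)]
pm_free = [(8, 16, 80), (8, 16, 80)]
print(evaluate_solution(x, md, mpm, vm_req, pm_free))
```

In a real solver, infeasible individuals would typically be penalized or repaired rather than discarded outright.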
The second improvement is to optimize the solution algorithm to accelerate the solution of the multiobjective optimization function. NSGA-II is a classical algorithm for solving multiobjective optimization problems [46][47][48]. As a nondominated sorting genetic algorithm, it has been widely applied to multiobjective problems with good effectiveness [39][40][41]. However, NSGA-II must compute the fitness values (i.e., the objective functions) of a large number of individuals during the population evolution, and this computation takes too long to ensure the timeliness of the resource allocation. Hence, we improve the NSGA-II algorithm to accelerate the solution speed using the parallel computation of the fitness function: multicore processors calculate the fitness values of the individuals in parallel, which accelerates the convergence of the proposed algorithm. The fitness values of each individual are calculated as follows.
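The parallel fitness evaluation can be sketched as follows. For simplicity, this sketch uses a thread pool and a placeholder fitness function; the actual implementation would evaluate the three objectives of the model in worker processes to exploit multiple cores:

```python
from concurrent.futures import ThreadPoolExecutor

def fitness(individual):
    # Stand-in for the three objective computations of one individual
    # (number of used PMs and the two matching distances in the paper).
    return (sum(individual), sum(v * v for v in individual))

def evaluate_population(population, workers=4):
    # Farm the independent per-individual fitness evaluations out to a
    # pool of workers; the paper uses multicore processing for this step.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fitness, population))

population = [[i, i + 1, i + 2] for i in range(8)]
print(evaluate_population(population))
```

Because each individual's fitness is independent of the others, this step parallelizes without changing the algorithm's results, only its wall-clock time.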

Prediction of VM requests
We select two time series, S1 and L1, of continuous container requests taken from the Alibaba cluster data [49] as the experimental dataset of VM requests. These time series only include data on the CPU and memory resources. We use the sequence S1 as an example to illustrate the adaptive prediction process. S1 includes 95 sampling points (475 min) and 28 types of VMs, where each sampling point counts the total number of VMs in a 5-min period. We use the principal component analysis method to extract its component sequences S2 and S3 and calculate the threshold T_th = 85% according to the predefined thresholds ε_l = 5% and ε_t = 20% and formulas (5)-(7). These sequences are all shown in Fig. 4, where the sequences S2 and S3 represent the VM numbers for the types with a 4-core CPU and 1.56 memory (CPU = 400 means a 4-core CPU and mem = 1.56 means 1.56 memory) and with an 8-core CPU and 3.13 memory (CPU = 800 and mem = 3.13), respectively. Note that the number of CPU cores and the amounts of memory are normalized.
It can be observed that the number of VM requests changes dynamically and exhibits sudden bursts, which makes the future resource requests difficult to predict. The sequence S2, with the 4-core CPU and 1.56 memory, is consistent with the trend of the sequence S1. The sequence S3, with the 8-core CPU and 3.13 memory, roughly follows the trend of S1, but they differ in the detailed fluctuations.
Next, we use the quartile method to detect the outliers, marked with a red "+" in Fig. 5. Subsequently, they are replaced by new data generated by a cubic spline interpolation method, which yields the preprocessed sequences shown in Fig. 6.
Then, we use the adaptive prediction method APMRT on these preprocessed sequences. The RT values of the sequences are first calculated, and the threshold R_th is set to 20, a value observed roughly from experimental testing on the Alibaba cluster data [6]. Note that this threshold R_th differs across traces and scenarios and needs to be obtained from experimental testing or expert experience. We select the first 80 sampling points as the training data and the next 5 points, 10 points and 15 points as the testing data, respectively. When the RT value of a sequence is lower than the predefined threshold R_th, the EEMD-RT-ARIMA method is selected to execute the prediction; otherwise, the EEMD-ARIMA method is selected. Figure 7 shows the mean absolute percentage error (MAPE) of the prediction results. The MAPEs of the 10-point and 15-point predictions increase greatly compared with those of the 5-point prediction. For instance, the EEMD-RT-ARIMA method achieves a MAPE of 9.87% for the 5-point prediction of the sequence S1, but MAPEs of 29.62% and 54.99% for the 10-point and 15-point predictions, respectively. Similarly, the EEMD-ARIMA method achieves a MAPE of 11.28% for the 5-point prediction of the sequence S2, but MAPEs of 38.31% and 64.51% for the 10-point and 15-point predictions, respectively. This implies that both methods are suitable for short-term rather than long-term prediction, mainly because of the strong fluctuation of the sampling data in a short time. The EEMD-RT-ARIMA method achieves lower MAPEs than the EEMD-ARIMA method for the sequences S1 and S3, while the opposite holds for the sequence S2.
Fig. 5 Box plots of the VM request sequences. The first, second and third subgraphs show the outliers (red "+") of the S1, S2 and S3 sequences, respectively.
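The MAPE metric used to score these predictions can be computed as follows (the sample values are illustrative, not taken from the experiments):

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - p) / abs(a)
                       for a, p in zip(actual, predicted)) / len(actual)

# Illustrative 3-point comparison of actual vs. predicted request counts.
print(round(mape([100, 200, 400], [110, 180, 400]), 2))
```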
We find that the RT values of S1-S3 are 19, 21 and 19, respectively. Hence, APMRT selects EEMD-ARIMA for S2 (whose RT value exceeds the threshold R_th = 20) and EEMD-RT-ARIMA for S1 and S3, which matches the observed accuracies and indicates that the proposed prediction method is effective. When a sequence fluctuates strongly, the cumulative prediction error of the component sequences obtained by EEMD decomposition can be less than the error caused by predicting the nonstationary sequence directly; thus, the EEMD-ARIMA method achieves higher prediction accuracy. Otherwise, the EEMD-RT-ARIMA method reduces more of the accumulated prediction error than EEMD-ARIMA and thus achieves higher prediction accuracy. Figure 8 depicts the future 5-point values predicted via the proposed APMRT method. In the same way, we predict the future 5-point values for the sequence L1, as in Fig. 9. It is also possible that other factors impact the prediction accuracy; we will study this issue in future work. This paper pays more attention to virtual resource allocation based on an adaptive prediction of resource requests.
Fig. 6 The preprocessed sequences of VM requests. The S1, S2 and S3 curves represent the sequences obtained after a cubic spline interpolation method replaces the outliers of the original S1, S2 and S3 sequences, respectively.
Fig. 7 MAPEs of the prediction results. The last three columns represent the results obtained by the EEMD-RT-ARIMA method; the blue, red and green parts represent the MAPEs of the S1, S2 and S3 sequences in each column, respectively.

Simulation of the resource allocation
As shown in Fig. 8, 457 VMs (4-core CPU and 1.56 memory) and 62 VMs (8-core CPU and 3.13 memory) are predicted for sampling point 425, a sudden growth of over 300 VM requests. Therefore, we should allocate resources for some VMs in advance at sampling point 424 to alleviate the latency of the resource allocation. We set the ratio of the proactive resource allocation to P_i = 0.3. Thus, the number of VMs that need to be created at sampling point 424 is 408 + (457 + 62) × 0.3 ≈ 564. The number of available PMs is 2972, so the resource allocation problem becomes that of creating 564 VMs on 2972 available PMs. Similarly, Fig. 9 shows that the method predicts 281 VMs (4-core CPU and 1.56 memory) and 31 VMs (8-core CPU and 3.13 memory) for sampling point 1065 of the sequence L1. If we set the ratio of the proactive resource allocation to P_i = 1/3, we should create 423 + (281 + 31)/3 = 527 VMs on 2972 PMs at sampling point 1064.
Even if the prediction fails, the proactive resource allocation will not be greatly affected. For example, if the predicted number of VM requests is 893 or 297 for the 425th sampling point, the MAPE exceeds 50%, that is, the prediction fails. According to the proactive resource strategy, we would then create 268 or 89 VMs more than the original number of 408 at the 424th sampling point, which is still not more than the actual number of VM requests at the 425th sampling point. However, the larger the proactive number of VM requests is, the longer the resource allocation takes at the 424th sampling point. Therefore, the prediction error should be limited to a certain range.
To verify the effectiveness of the proposed RAMPVR method, we adopt the following metrics to compare our method with others.
(1) Number of the used PMs: If fewer PMs are used, some idle PMs can be closed to reduce the energy consumption and cost. (2) Resource performance matching distance: The smaller the resource performance matching distance is, the better the VMs match the PMs regarding their resource performance. (3) Resource proportion matching distance: The smaller the resource proportion matching distance is, the less resource is wasted. (4) Resource utilization: A good resource allocation method should maximize and homogenize each type of resource utilization. (5) Time cost of resource allocation: Our prediction-based resource allocation method reduces the VM creation time by allocating resources for the future VM requests in advance. This paper mainly focuses on reducing the solving time of this method; the lower the solving time is, the more the resource allocation time is reduced.
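Metric (3) penalizes PMs whose CPU and memory utilizations diverge. A plausible sketch, assuming the distance is the per-PM gap between CPU and memory utilization summed over the used PMs (the paper defines its exact formula earlier; this version is only illustrative):

```python
def proportion_matching_distance(pms):
    """Illustrative resource proportion matching distance:
    sum over used PMs of the gap between CPU and memory utilization.
    A smaller value means more balanced usage and less resource waste."""
    return sum(abs(cpu - mem) for cpu, mem in pms)

# Two allocations of three PMs, given as (cpu_util, mem_util) pairs:
balanced   = [(0.63, 0.64), (0.60, 0.62), (0.65, 0.63)]   # RAMPVR-like
unbalanced = [(0.75, 0.29), (0.74, 0.28), (0.76, 0.30)]   # RR-like
print(proportion_matching_distance(balanced))    # small -> less waste
print(proportion_matching_distance(unbalanced))  # large -> more waste
```

The utilization pairs are made-up values that mimic the balanced (RAMPVR) and unbalanced (RR) behavior reported in Table 2.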
We set the population size, the crossover probability, the crossover distribution index and the mutation distribution index as 200, 0.85, 20 and 20, respectively, and set the mutation probability to the reciprocal of the number of variables in the simulation. The maximum number of fitness evaluations and the maximum number of population iterations are set to 20,000 and 100, respectively. We compare the proposed RAMPVR method with the round robin (RR), SPEA2 and NSGA-II methods in terms of the number of the used PMs, the resource performance matching distance, the resource proportion matching distance, the resource utilization and the solving time. SPEA2 is another representative elitist multiobjective evolutionary algorithm [50], which can obtain multiple Pareto-optimal solutions in a single run. It has been widely used in different domains [41,52] and has become a standard baseline for the performance comparison of multiobjective evolutionary algorithms [53,54]. Each method is executed 10 times, and the respective average results are computed. The experimental results are shown in Tables 2 and 3.
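The compared NSGA-II (and its RAMPVR improvement) ranks candidate allocations by Pareto dominance. A minimal sketch of the fast nondominated sorting step at the core of NSGA-II, for minimization objectives such as (number of used PMs, proportion matching distance); this is a generic textbook routine, not the paper's improved variant:

```python
def fast_nondominated_sort(objs):
    """Fast nondominated sorting from NSGA-II (minimization).
    objs: list of objective vectors; returns a list of fronts (index lists)."""
    def dominates(a, b):
        return all(x <= y for x, y in zip(a, b)) and any(
            x < y for x, y in zip(a, b))

    n = len(objs)
    dominated = [[] for _ in range(n)]  # solutions dominated by i
    counts = [0] * n                    # number of solutions dominating i
    fronts = [[]]
    for i in range(n):
        for j in range(n):
            if dominates(objs[i], objs[j]):
                dominated[i].append(j)
            elif dominates(objs[j], objs[i]):
                counts[i] += 1
        if counts[i] == 0:
            fronts[0].append(i)         # nondominated -> first front
    k = 0
    while fronts[k]:
        nxt = []
        for i in fronts[k]:
            for j in dominated[i]:
                counts[j] -= 1
                if counts[j] == 0:      # only front k+1 members remain
                    nxt.append(j)
        k += 1
        fronts.append(nxt)
    return fronts[:-1]

# Toy objective vectors (used PMs, matching distance), both minimized:
pts = [(3, 9), (5, 7), (4, 8), (6, 6), (5, 9)]
print(fast_nondominated_sort(pts))  # -> [[0, 1, 2, 3], [4]]
```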
It can be seen from Table 2 that the SPEA2, NSGA-II and RAMPVR methods use different numbers of PMs. Even the RAMPVR method uses different numbers of PMs in different experiments, such as 460 and 462 PMs. The fewer PMs are used, the more resource cost is saved. The CPU and memory utilizations of the used PMs are more balanced via resource proportion matching, which reduces resource waste. The SPEA2 method achieves CPU utilization of 58.62% and memory utilization of 60.28%, the NSGA-II method obtains 64.01% and 65.45%, and the RAMPVR method achieves 62.80% and 64.29% under the parallel computing of 8 threads, respectively. In addition, these three methods keep a similar number of used PMs, similar resource performance matching and similar resource proportion matching because they achieve a trade-off among these objectives. The RR method shows large differences in these aspects. It uses the most PMs because it adopts a polling mechanism. Furthermore, it yields the highest resource performance and proportion matching distances and the most unbalanced resource utilization, with CPU utilization of 74.90% and memory utilization of 28.76%, due to the polling mechanism, which causes high resource waste. However, it has a lower solution time of only 0.3 s because it uses a simple heuristic algorithm to solve the problem. Compared with the SPEA2 and NSGA-II methods, the RAMPVR method uses less time to solve the multiobjective functions. For instance, the SPEA2 and NSGA-II methods, respectively, use 1593 and 1551 s to solve the multiobjective problem, but the RAMPVR method only costs 886 s to solve it with the parallel computing of 10 threads. Table 3 also demonstrates this situation. Thus, the VM creation time can be greatly reduced.
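The "at least 43%" reduction claimed for the solving time follows directly from the reported figures. A quick check of that arithmetic:

```python
def reduction_pct(baseline_s, ours_s):
    """Percentage reduction of solving time relative to a baseline method."""
    return (baseline_s - ours_s) / baseline_s * 100

# Reported solving times with 10 threads: SPEA2 1593 s, NSGA-II 1551 s, RAMPVR 886 s
print(round(reduction_pct(1593, 886)))  # -> 44 (vs. SPEA2)
print(round(reduction_pct(1551, 886)))  # -> 43 (vs. NSGA-II), hence "at least 43%"
```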

Conclusion
Cloud resource requests demonstrate the characteristics of being diverse, arriving in bursts and being uncertain, which causes the resource allocation to lag behind the resource requests and the quality of service not to be ensured in a cloud platform. This paper proposes a multiobjective resource allocation method based on an adaptive prediction method for resource requests. This method can allocate virtual resources in advance to alleviate the delay problem of resource provision by using an adaptive method to predict the future resource requests. The timeliness of the resource allocation is further guaranteed by improving the NSGA-II algorithm to reduce the solving time of the multiobjective optimization problem. In addition, the various types of resources in a PM are evenly utilized, which reduces resource waste. Two experiments are conducted to verify the effectiveness of our proposed method. The experimental results show that this method realizes the balance between CPU and memory resources and reduces the resource allocation time by at least 43% (10 threads) compared with the SPEA2 and NSGA-II methods.