To address the resource allocation lagging behind resource requests, we propose a proactive resource allocation method based on the prediction of resource requests. Figure 1 shows the implement process of this method. First, a RT-based adaptive prediction method is used to forecast the future resource requests based on the past data of resource requests. Then, a proactive resource allocation strategy is proposed based on the prediction of resource requests. Finally, a multiobjective resource allocation method is proposed and solved by an improved NSGA-II algorithm.
RT-based adaptive prediction method
The prediction method is designed with two goals: reduce the prediction time and improve the prediction accuracy. The prediction procedure is shown in Fig. 2. The m component sequences are extracted from a VM request sequence using the principal component analysis method. Next, these component sequences are detected to find and preprocess the outliers. Then, the RT values of these preprocessed sequences are calculated. Finally, these sequences are predicted by adaptively selecting the EEMD-ARIMA or EEMD-RT-ARIMA method according to the comparison between their RT values and the thresholds.
A cloud platform provides many VM flavors, such as 2CPU4G (2 CPU cores, 4G memory) and 4CPU8G (4 CPU cores, 8G memory). We cannot predict each type of VM requests due to the high prediction time. Therefore, principal component analysis is first used in our prediction method, which can extract the major component sequences to reduce the prediction time. For example, a VM request sequence with \(n\) types of VMs is denoted as \(S = < s_{1} ,...,s_{i} ,...,s_{k} >\), where \(s_{i}\) represents the VM number of the ith request. A component sequence \(S_{l} {\text{ = < s}}_{l1} {,}...{\text{,s}}_{li} {,}...{\text{,s}}_{lk} { > }\) can be extracted from this sequence \(S\) for the VM type \(l\), where \(s_{li}\) denotes the VM number of the ith request. Thus, an original VM sequence can be divided into many component sequences. We can select the fewest component sequences to implement the prediction of VM requests, where the ratio of the sum of their VM requests to the total number of VM requests (it is called the proportion of VM requests) is beyond the predefined threshold \(T_{th}\) at each sampling point. These components sequences can be regarded as the major component sequences. For example, there are two component sequences \(S_{h} = < s_{h1} ,...,s_{hi} ,...,s_{hk} >\) and \(S_{g} = < s_{g1} ,...,s_{gi} ,...,s_{gk} >\), where \(s_{hi}\) and \(s_{gi}\) are the quantities of the different types of VM requests. For \(\forall s_{hi} \in S_{h}\) and \(s_{gi} \in S_{g}\),
$${\text{if}}\;(s_{hi} + s_{gi} )/s_{i} \ge T_{th} ,\;\;\;select(S_{h} ,S_{g} ) \to S_{main}$$
(1)
These two component sequences are selected into a set \(S_{main}\) with the major component sequences to implement the prediction. Intuitively, the higher the threshold \(T_{th}\), the more the selected component sequences. The prediction of VM requests is more accurate, which ensures resource allocation to be more correct. However, the more the component sequences, the higher the prediction cost. For instance, if the threshold is set as \(T_{th}^{\prime }\) higher than the one \(T_{th}\), that is \(T_{th}^{\prime } > T_{th}\), three major component sequences \(S_{h}\),\(S_{g}\) and \(S_{l}\) may be selected into the set \(S_{main}\). For \(\forall s_{hi} \in S_{h}\), \(s_{gi} \in S_{g}\) and \(s_{li} \in S_{l}\).
$${\text{if}}\;(s_{hi} + s_{gi} + s_{li} )/s_{i} \ge T_{th}^{\prime } ,\;\;\;select(S_{h} ,S_{g} ,S_{l} ) \to S_{main}$$
(2)
.
Then, each component sequence is decomposed into many subsequences to perform the prediction using EEMD-ARIMA or EEMD-RT-ARIMA method. Supposing a component sequence is decomposed into m subsequences and each subsequence cost n seconds to finish the prediction. Here, the running time of each prediction is almost identical, we suppose that it is n seconds for each subsequence. If two component sequences \(S_{h}\) and \(S_{g}\) are selected under the threshold \(T_{th}\), the prediction cost can be calculated as follows.
$$C\left( {S_{h} ,S_{g} ,T_{th} } \right) = 2m \cdot n$$
(3)
However, if three component sequences \(S_{h}\), \(S_{g}\) and \(S_{l}\) are selected under the threshold \(T_{th}^{\prime }\), the prediction cost will be calculated as follows.
$$C^{\prime}\left( {S_{h} ,S_{g} ,S_{l} ,T_{th}^{\prime } } \right) = 3m \cdot n$$
(4)
It can be seen that the prediction cost will greatly increase though more component sequences selected by setting a higher threshold can improve the prediction accuracy. This will cause a delayed resource allocation not to ensure the normal running of applications. Therefore, setting the threshold \(T_{th}\) is important, which not only need to reflect the major VM requests but also reduce the prediction time cost. Supposing p major component sequences have been selected to predict the future VM requests and a unselected component sequence \(S_{l}\) has more VM requests than other component sequences. The threshold \(T_{th}\) can be set as an approximation of the minimum value of the proportion of VM requests according to the following formula when one of two conditions is satisfied. The threshold \(T_{th}\) impact the prediction accuracy and the prediction time cost.
$$T_{{th}} \leftarrow \min \left\{ {\left( {s_{{11}} + ... + s_{{1k}} } \right)/s_{1} ,...,\left( {s_{{i1}} + ... + s_{{ik}} } \right)/s_{i} ...,\left( {s_{{p1}} + ... + s_{{pk}} } \right)/s_{p} } \right\}$$
(5)
$${\text{S}}.{\text{T}}.\;\;\;s_{li} /s_{i} < \varepsilon_{l}$$
(6)
$${\text{or}}\;\;\;1/(p + 1) > \varepsilon_{t}$$
(7)
\(\varepsilon_{l}\) and \(\varepsilon_{t}\) indicate a threshold of the proportion of VM requests and a threshold of the ratio of prediction time cost, respectively. When both the added proportion of VM requests \(s_{li} /s_{i}\) is beyond this threshold \(\varepsilon_{l}\) and the added ratio of prediction time cost \(1/(p + 1)\) is below this threshold \(\varepsilon_{t}\), the component sequence \(S_{l}\) will be selected to predict the future VM requests.
The quartile method is adopted to detect the outlier points of these major component sequences. Firstly, we calculate the first quartile \(Q1\), the third quartile \(Q3\) and the interquartile range (IQR) by the formula \(IQR = Q3 - Q1\), detect the outliers more than 1.5 times over \(Q3\) or less than 1.5 times below \(Q1\) and finally replace these outliers via a cubic spline interpolation method.
The preprocessed component sequences are executed using RT method. Then, we set up an adaptive prediction method based on the RT (APMRT). If the RT value of a component sequence is higher than a predefined threshold \(R_{th}\), the EEMD-ARIMA method is selected to predict the future resource requests. Otherwise, the EEMD-RT-ARIMA is selected to make the prediction. Thus, the prediction accuracy can be improved by preprocessing the outliers of the major component sequences and selecting a more accurate prediction method. We can determine the future number of each type of VM requests and proactively allocate resources to guarantee the timeliness of the resource allocation.
In this method, the time complexity of extracting a component sequence is \(O(k)\). Thus, the time complexity of extracting m component sequences and data preprocessing becomes \(O(m \cdot k)\). Then, the RT values of the extracted sequences are calculated and predicted by using the adaptive prediction algorithm APMRT. The time complexity becomes \(O(2m \cdot k + q \cdot m \cdot k)\) or \(O(2m \cdot k + p \cdot m \cdot k)\), where \(q\),\(p\) are separately the number of the decomposed component sequences and the new component sequences reconstructed (\(p < q\)). Therefore, the time complexity of the APMRT algorithm is \(O(Q \cdot m \cdot k)\) or \(O(P \cdot m \cdot k)\), which is less than the time complexity \(O(Q \cdot n \cdot k)\) or \(O(P \cdot n \cdot k)\) of all \(n\) component sequences. The prediction time is largely reduced by extracting the main component sequences.
Proactive resource allocation strategy
A cloud resource allocation algorithm should actively predict the future resource requests and allocate resources in advance to cope with the sudden increase of resource requests in the future. The proactive resource allocation framework is shown in Fig. 3. The RT-based adaptive prediction method is used to predict the future number of VM requests based on past data. A hybrid VM request queue is formed by combining the future VM requests predicted with the current VM requests.
Suppose that the current VM request sequence is denoted as \(V(t) = < v_{1} (t),...,v_{i} (t),...v_{n} (t) >\), where \(v_{i} (t)\) indicates the VM number of the \(i\) th request at time \(t\). The number \(D(t + h)\) of the future \(l\) major types of VM requests at time \(t + h\) predicted via the adaptive prediction method APMRT is denoted as follows.
$$D(t + h) = D^{1} \left( {t + h} \right) + ... + D^{i} (t + h) + ... + D^{l} (t + h)$$
(8)
\(D^{i} (t + h)\) is the \(i\) th major type of VM requests at time \(t + h\). The total number of VM requests \(N(t)\) at \(t\) time should be the sum of the current number of VM requests \(V(t)\) and the predicted number of VM requests \(D(t + h)\) as follows.
$$N(t) = M(t) + D(t + h) \cdot C(t) \cdot P(t)$$
(9)
\(M(t){ = }v_{1} (t) + ... + v_{i} (t) + ... + v_{n} (t)\) is the current number of VM requests at \(t\) time. If the predicted number of VM requests \(D(t + h)\) is not less than the threshold \(N_{th}\), some VMs should be allocated resources in advance. The parameter \(C(t)\) should equal to 1 and \(P(t)\) is a percentage (e.g., 30%) of VM requests to be allocated resources in advance with respect to the predicted number of VM requests \(D(t + h)\). Otherwise, it does not need to provide VMs in advance, that is, \(C(t) = 0\).
After the predicted number of VM requests \(D(t + h)\) is determined, the VM request sequence should be established. Assuming that the predicted number of VM requests is ordered in descending order from VM type 1 to \(l\), the largest VM requests (i.e., the type 1 VM requests) are placed at the front of the VM request sequence, and the smallest VM requests (i.e., the type \(l\) VM requests) are placed at the end of the VM request sequence. The predicted VM request sequence can be expressed as follows.
$$V(t + h) = < v_{1}^{1} (t + h),v_{2}^{1} (t + h),v_{3}^{1} (t + h),...,v_{j}^{i} (t + h),v_{j + 1}^{i} (t + h)...v_{m}^{l} (t + h) >$$
(10)
\(v_{j}^{i} (t + h)\) and \(v_{j + 1}^{i} (t + h)\) are the quantities of the \(j\) th and \(j{ + }1\) th VM requests with the same VM type \(i\). Thus, the VM request sequence at time \(t\) can be expressed as follows.
$$V^{\prime}(t) = < v_{1} (t),...v_{n} (t),v_{1}^{1} (t + h),v_{2}^{1} (t + h),v_{3}^{1} (t + h),...,v_{j}^{i} (t + h),v_{j + 1}^{i} (t + h),...,v_{m}^{l} (t + h) >$$
(11)
Multiobjective resource allocation method
Our previous work has presented a multiobjective resource allocation method [45]. This method builds a multiobjective function with the minimum number of the used PMs \(\min \{ \sum\limits_{S} {x_{ij} } \}\) and the minimum total resource performance matching distance between VMs and PMs \(\min \{ \sum\limits_{S} {MD_{ij} } \}\), where \(x_{ij}\) denotes the mapping element between the VM \(v_{i}\) and the PM \(p_{j}\). If the VM \(v_{i}\) is placed on the PM \(p_{j}\), \(x_{ij}\) equals 1. Otherwise, \(x_{ij}\) equals 0. Thus, the formula \(\sum\limits_{S} {x_{ij} }\) represents the total number of the used PMs under a solution S. In the formula of the resource performance matching distance \(MD_{ij} = \sqrt {\sum\limits_{k = 1}^{3} {(npv_{ik} - npp_{jk} )^{2} } }\), \(npv_{ik}\) represents the normalized resource performance variable of VM \(v_{i}\), \(npp_{jk}\) represents the corresponding normalized resource performance variable of the PM \(p_{j}\) and \(k{ = }1,2,3\) denote the CPU, memory and disk resources, respectively.
This paper proposes a new resource allocation method based on the prediction of VM requests (RAMPVR), which further considers two issues to improve the previous resource allocation method. One is to reduce the waste of physical resources. If the proportion of different types of resources from a VM request is closer to those free resources of a PM, it is less likely to cause resource waste for this PM. That is, the closer the resource proportion \(v_{i1} :v_{i2} :v_{i3}\) of a VM is to that \(p_{j1} :p_{j2} :p_{j3}\) of a PM, the lower the resource waste, where \(v_{i1}\), \(v_{i2}\) and \(v_{i3}\) represent the requested number of CPU cores, memory capacity and disk size of the VM \(v_{i}\), respectively; and \(p_{j1}\), \(p_{j2}\) and \(p_{j3}\) denote the free number of CPU cores, memory capacity and disk size of the PM \(p_{j}\), respectively. Therefore, we build the resource proportion matching distance model shown in formula (12), where \(p_{jk}\) and \(v_{ik}\) represent the free capacity of resource type k of the PM \(p_{j}\) and the requested resource capacity of the VM \(v_{i}\), respectively, and \(R_{k}\) denotes the coefficient that adjusts the imbalanced values of parameter \(H = p_{jk} \cdot v_{i1} /p_{j1} - v_{ik}\) for different resource types. For instance, if the values of the parameter \(H\) for CPU and disk resources are 2 and 200, the disk will become the dominant resource. Therefore, the adjustment coefficient \(R_{k}\) for the disk resource should be adjusted to a lower value than that for the CPU resource, such as using \(R_{k} = 1\) for the CPU resource and \(R_{k} = 0.1\) for the disk resource.
$$MPM_{ij} = \sqrt {\sum\limits_{k = 1}^{3} {\left( {\left( {\frac{{p_{jk} \cdot v_{i1} }}{{p_{j1} }} - v_{ik} } \right) \cdot R_{k} } \right)^{2} } }$$
(12)
Thus, we set up a multiobjective optimization problem of resource allocation according to the number of the used PMs \(\sum\limits_{S} {x_{ij} }\), the total resource performance matching distance \(\sum\limits_{S} {MD_{ij} }\) and the total resource proportion matching distance \(\sum\limits_{S} {MPM_{ij} }\) as follows.
$$M:\min \left\{ {\sum\limits_{S} {x_{ij} } } \right\}$$
(13)
$$\min \left\{ {\sum\limits_{S} {MD_{ij} } } \right\}$$
(14)
$$\min \left\{ {\sum\limits_{S} {MPM_{ij} } } \right\}$$
(15)
The first goal of the multiobjective optimization problem M of resource allocation is to minimize the total number of the used PMs, as shown in formula (13), which depends on the value of each mapping element \(x_{ij}\) between the VM \(v_{i}\) and the PM \(p_{j}\) under a solution \(S\). The second goal of the problem M is to minimize the total resource performance matching distance under a solution \(S\), as shown in formula (14), which depends on the resource performance matching distance \(MD_{ij}\) between the VM \(v_{i}\) and the PM \(p_{j}\). The third goal of the problem M is to minimize the total resource proportion matching distance under a solution \(S\), as shown in formula (15), which depends on the resource proportion matching distance \(MPM_{ij}\) between the VM \(v_{i}\) and the PM \(p_{j}\). In addition, the total CPU, memory and disk capacities requested by the VMs placed on PM \(p_{j}\) are less than its free CPU, memory and disk capacities, respectively. Thus, the constraint conditions are shown in formulas (16), (17) and (18), respectively.
$${\text{S}}.{\text{T}}.\;\;\sum\limits_{S} {v_{i1} } \cdot x_{ij} \le p_{j1}$$
(16)
$$\sum\limits_{S} {v_{i2} } \cdot x_{ij} \le p_{j2}$$
(17)
$$\sum\limits_{S} {v_{i3} } \cdot x_{ij} \le p_{j3}$$
(18)
Another is to optimize the solution algorithm that accelerates the solution speed of the multiobjective optimization function. The NSGA-II is a classical algorithm for solving a multiobjective optimization problem [46,47,48]. As a Nondominated Sorting Genetic Algorithm, it has been widely applied in solving the multiobjective problem and achieves good effectiveness [39,40,41]. However, the NSGA-II algorithm has a problem that the computation time of the fitness values (i.e., the objective functions) is too long to ensure the timelessness of the resource allocation. Furthermore, the fitness values of a large number of individuals need to be calculated in the population evolution. Hence, we will improve the NSGA-II algorithm to accelerate the solution speed using the parallel computation of the fitness function. We adopt multicore processors to calculate the fitness values of the individuals in parallel, which can accelerate the convergence of the proposed algorithm. The fitness values of each individual are calculated as follows.
$$f_{1} (I_{k} ) = \sum\limits_{S} {x_{ij} }$$
(19)
$$f_{2} (I_{k} ) = \sum\limits_{S} {MD_{ij} }$$
(20)
$$f_{3} \left( {I_{k} } \right) = \sum\limits_{S} {MPM_{ij} }$$
(21)