Automatic UMTS system resource dimensioning based on service traffic analysis

Mobile network operators base their TDM network capacity dimensioning on Erlang B models. This approach was valid in legacy GSM networks. However, current Universal Mobile Telecommunications System networks deal with different resource consumption services such as voice, video call or data, and different limiting resources such as baseband processing capacity, transmission link capacity to the RNC, or spreading code tree. Operators need models to decide which resource must be upgraded, according to the demand of the services, in order to achieve expected overall service accessibility (i.e., the complementary of blocking probability). Network operation requires detecting when degradation is due to a lack of resources or to a hardware malfunction. Also, when operators need to prevent blockage in a high-capacity demanding event (for which they only have traffic predictions for each service) it is far from trivial to dimension resources. We have implemented a Kaufman Roberts approach to characterize the multiservice resource demand. Using real reported traffic Key Performance Indicators to calibrate the model, an estimated accessibility is obtained at a per-resource level and combined to find global estimated accessibility. The proposed model is intended to assist network operation, estimating individual resource shortage, differentiating congestion from hardware failures, and predicting the necessary resources to be deployed to tackle a high-capacity demanding event.


Introduction
Capacity management in mobile networks implies two main tasks: solving present blockage issues (detecting limiting resources, in order to upgrade them or differentiating congested from malfunctioning resources), and adequate resource provisioning to avoid blockage in a future event from which service demand is estimated (massively populated sports events, concerts, etc.). Mobile network vendors do not provide operators with the tools to tackle these issues. Universal Mobile Telecommunications System (UMTS) network equipment reports key performance indicators (KPIs) related to capacity shortage, such as failed connection attempts, but no clue is provided in order to detect the limiting resource or a subsystem malfunction, nor is any advice given on how many resources to provision in order to avoid blockage for a certain forecasted service demand. As an example, a high concentration of Smartphones in a concert may increase uplink data traffic, requiring certain uplink resources to be improved. Modeling how each service demands capacity from different resources becomes crucial in order to adequately dimension the network.
In this study, we have modeled each resource behavior for a given service (voice, data, video call) demand. We obtain the overall accessibility and validate the model in real scenarios. We also propose how to detect when a specific resource is congested (or whether it is malfunctioning), and what resources must be upgraded to attend to a certain service traffic mix.
Current UMTS Terrestrial Radio Access Network (UTRAN), commonly referred to as 3G (3rd Generation Wireless Mobile Communication Technology), is the incumbent mobile technology in Europe. The authors have studied real 3G network scenarios in Spain, where, at the moment, Long-Term Evolution (LTE)/4G (e-UTRAN) clusters are being deployed, but only in trial scenarios. So, not enough network KPIs are being collected in order to extend the dimensioning model to LTE. We have intentionally based our work on currently deployed 3G networks, where we have plenty of field experience to calibrate the model. We also propose how to extend this approach to future LTE-based networks.
As stated by the 3GPP, services in UMTS are classified as Circuit Switched (CS) and Packet Switched (PS). CS services are connection-oriented, such as voice and video telephony, while PS services are data services, such as HSDPA, HSUPA, and Release 99 (R99). CS services are considered guaranteed traffic, while PS services are considered non-guaranteed.
A basic scheme of a UMTS access network is shown in Figure 1. Uu is the radio interface between UE and Node B. Iub is the transmission interface from node B to the Radio Network Controller (RNC). CS and PS services demand resources at different Radio Access Network (RAN) levels: radio interface (spreading codes, SCs), baseband processing capacity (channel elements, CEs), and Iub capacity. PS and CS services share the consumption of the CE and SC pools. Meanwhile, CS and PS services are carried by Iub_CS and Iub_PS, respectively. PS traffic uses the remaining capacity in the Iub once the CS traffic has been allocated.
Whenever a resource shortage is detected, more resources need to be installed in the system. At radio level, capacity enhancements can be made by the addition of new carriers. Each 5-MHz carrier contains a full SC tree. As stated by 3GPP, each service has a different Spreading Factor (SF) Code consumption. At node B, processing capacity is measured in terms of CE. One CE is the baseband processing capacity required in node B to provide one voice channel, including the control plane signaling. Each service has a different demand of CEs. Increasing the processing capacity of node B through additional CEs involves the installation of extra hardware baseband cards in node B. At the Iub, capacity upgrades will imply increasing the number of virtual channel connections (VCC) when using ATM, or increasing IP throughput when using IP. Iub CS is usually implemented through legacy ATM Constant Bit Rate Virtual Circuits while Iub PS is being migrated from ATM Variable Bit Rate VCs to IP.
Thus, three key points of constraint have been identified: radio, node B, and Iub capacity, hereinafter referred to as SC, CE, and Iub, respectively. A shortage of resources at any point implies a connection reject, degrading the quality perceived by the user. Consequently, being able to appropriately dimension UMTS resources at every subsystem is a must: insufficient resource allocation leads to service downgrade, while overdimensioning is economically inefficient.
This study addresses the modeling of a UMTS scheme, considering resource availability at the three mentioned stages, i.e., SC, CE, and Iub. These are the resources that can be upgraded by network operation. Other radio interface conditions, such as uplink intercellular and traffic load interference level, or downlink power consumption (depending on traffic and propagation loss), are not considered in this study. The impact of these factors can be diminished by increasing the number of nodes in a specific area and reducing the soft handover area (decreasing the overlap between nodes). In real network operation, however, CE, SC, and Iub are the main limiting resources, and upgrading them is part of the daily operation tasks. The difficulty for operators is to detect which one is the limiting resource and whether the accessibility is restricted due to congestion or to hardware failure. It is also crucial to know how to improve capacity in each resource (downlink/uplink CEs, SC, downlink/uplink Iub) in order to keep accessibility at acceptable levels for a forecast service traffic mix.
The model described in this article evolved from the initial propositions made in [1]. The proposed model uses as input data the traffic for each service (CS and PS), and estimates the blocking probability for each resource. A system overall blocking probability (or its complementary, the accessibility) is then computed. In contrast with the work in [1], the current study considers the interdependence between resources in the UMTS model. Full details of the model development are given in this study. The total combined accessibility is compared with the KPI-reported accessibility in three different scenarios, now also including a known hardware impaired scenario to better demonstrate the capabilities of the model. Services offered by the UMTS network under study are characterized in terms of their resource consumptions. Then, for each resource, a Kaufman-Roberts (K-R)-based algorithm [2,3] is applied using connection attempts to produce the estimated accessibility for each service class (CS and PS). The accessibilities per resource will be combined to obtain an overall accessibility. Overall accessibility is available in network-reported KPIs, though it is not available at a 'per resource' level. The fact that the model does reach this detail is a key feature.
The rest of the article is organized as follows. Section 2 discusses related works. Sections 3 and 4 provide details of the model design and implementation. Section 5 gives a numerical example to illustrate the model operation. Section 6 debates the model results and compares them with reported KPIs. The same modeling techniques used in 3G/UTRAN can be extended to include 4G/LTE (E-UTRAN), as will be discussed in Section 7. Finally, concluding remarks and future work are presented in Section 8.

Related work
Most models found in the literature study the capacity of a UMTS system theoretically, but do not deal with the set of hardware subsystems that network operation can upgrade or repair for a specific traffic demand. Moreover, the results of those works are mostly validated through simulations and do not rely on a real service-providing network, as the present study does.
Several articles have been dedicated to discussing radio interface issues, such as interference or power budget [4][5][6][7]. The common scope is to enhance network capacity associated with radio conditions, by reducing the soft handover area, controlling overshooting, or providing better power control algorithms. The Iub dimensioning has explicitly been discussed in several articles [8][9][10][11][12][13].
The main studies related to multiservice environments use the K-R approach, or variations of this algorithm. Staehle and Mäder [14] proposed a revised K-R algorithm which considers state-dependent blocking probabilities to obtain a good approximation of uplink blocking probabilities. They first determine the blocking probability as a function of the own-cell interference and the other-cell interference. They then combine this approach with the K-R model to cover a multiservice environment. Similarly, Iversen [15,16] considers uplink interference, using a modified K-R recursive algorithm. He discusses the usage of state-dependent blocking probabilities and the particularities considered to allow reversibility and realism at the same time.
In this study, we have not considered limiting factors such as uplink and inter-cell interference, as those are not adjustable by network operation. They are aspects more closely related to network planning and initial dimensioning. We have instead focused on system features whose value can be changed by regular network operation (such as baseband capacity, Iub capacity, and number of carriers).
Mäder and Staehle [17] obtain a model considering the effects of soft blocking and imperfect power control. A K-R approach is used for the calculation of transmission power depending on the number of power-controlled mobiles. Although very interesting, such approaches are not the main concern of daily network operation, according to the authors' experience.
Vassilakis et al. [18] investigate blocking probabilities in the uplink considering handoff. They also investigate the impact of elastic and streaming traffic on network capacity [19,20]. Sallent et al. [21] propose the radio conditions enhancement through antenna tilt optimization, obtaining better channel quality index and EcNo values, with impact on node capacity.
Renard et al. [22] provide an analytical model to dimension the X2 link for LTE. They also use a K-R-derived formula, which supports the feasibility of extending the present work to upcoming LTE networks.
As the LTE medium access scheme is orthogonal frequency-division multiple access (OFDMA), several articles dimension multiservice demand using K-R for OFDMA access. Blaszczyszyn and Karray [23] use an Erlang loss model to dimension the downlink in OFDMA networks. Karray [24] particularizes the downlink QoS study to streaming and elastic traffic.
As can be seen from the related work on LTE, most authors use the K-R approximation to forecast system blocking in multiservice environments. Network resources in LTE are broadly similar to those in UTRAN, as both rely on baseband, radio carrier, and transmission interface capacity as critical resources. Nevertheless, these LTE studies lack real field validation. In contrast, our study is based on field experience and is especially practical for actual field network operation.

Blocking probability models
In UTRAN, a connection attempt is blocked whenever one of the necessary resources is not available. At the same time, resources need to attend different services simultaneously.
Algorithms such as K-R [2,3] provide a valid method for obtaining the multiservice blocking probability for a single resource. Indeed, the industry has accepted K-R-based algorithms to model multiservice network blocking probability. As explained in Section 2, multiple extensions have been studied to improve the basic K-R model, particularly including issues regarding CDMA features. Nevertheless, the original algorithm and the efficient implementation provided in Stasiak et al. already provide notable results regarding degradation detection.
According to the original definition of the algorithm, given a limited capacity $C$ for a single resource, the blocking probability of service $i$, $P_{b,i}$, can be described as in Equation (1). The distribution $q$ is expressed in terms of the resource consumption of service $i$, $b_i$, and of the offered traffic intensity $a_i$ (in Erlang), as explained in Equation (2). The basic definition of the K-R algorithm is outlined in Figure 2.
Initial implementations of the basic K-R recursive algorithm are computationally inefficient and thus time-consuming [15,16]. While the traditional recursive algorithm is widely used to obtain the blocking probabilities in multiservice scenarios [25,26], the authors have used an improved version, based on the fast Fourier transform (FFT) as proposed in [3], which notably reduces the computational cost and produces faster results. The authors have verified that the FFT implementation matches the accuracy of the recursive implementation while improving the computational performance.
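Equations (1) and (2) are not reproduced in this text. As a reference sketch, the textbook form of the K-R recursion [2,3] is recalled below; the notation of the original equations may differ. Here $q(j)$ is the unnormalized probability that $j$ resource units are occupied, and $K$ (a symbol introduced here for illustration) is the number of services.

```latex
q(j) = \frac{1}{j}\sum_{i=1}^{K} a_i\, b_i\, q(j-b_i), \qquad j = 1,\dots,C,
\qquad q(0) = 1,\quad q(j) = 0 \ \text{for}\ j < 0

P_{b,i} = \frac{\displaystyle\sum_{j=C-b_i+1}^{C} q(j)}{\displaystyle\sum_{j=0}^{C} q(j)}
```

For a single service with $b_1 = 1$, the recursion gives $q(j) = a_1^j / j!$ and $P_{b,1}$ reduces to the Erlang B formula.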
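The equivalence between the recursive and the transform-based computation can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, and the second function uses a direct convolution of per-service distributions, which is the operation that an FFT would accelerate. The traffic values in the usage lines are arbitrary.

```python
from math import isclose


def kr_recursive(capacity, offered, demand):
    """Per-service blocking via the textbook K-R recursion.

    capacity: resource units C; offered: Erlangs a_i; demand: units b_i.
    Note: q is unnormalized, so very large loads may overflow floats.
    """
    q = [0.0] * (capacity + 1)
    q[0] = 1.0
    for j in range(1, capacity + 1):
        acc = 0.0
        for a, b in zip(offered, demand):
            if 0 < b <= j:
                acc += a * b * q[j - b]
        q[j] = acc / j
    total = sum(q)
    # Service i is blocked in the last b_i occupancy states.
    return [sum(q[max(0, capacity - b + 1):]) / total for b in demand]


def kr_convolution(capacity, offered, demand):
    """Same result from the product form: convolve one truncated
    Poisson-like vector per service, then normalize."""
    dist = [1.0] + [0.0] * capacity
    for a, b in zip(offered, demand):
        cls = [0.0] * (capacity + 1)
        term, k = 1.0, 0          # term holds a**k / k!
        while k * b <= capacity:
            cls[k * b] = term
            k += 1
            term *= a / k
        # Direct convolution truncated at C (an FFT would replace this step).
        dist = [sum(dist[m] * cls[j - m] for m in range(j + 1))
                for j in range(capacity + 1)]
    total = sum(dist)
    return [sum(dist[max(0, capacity - b + 1):]) / total for b in demand]


# Arbitrary illustration: two services needing 1 and 4 units on 20 units.
rec = kr_recursive(20, [5.0, 1.5], [1, 4])
conv = kr_convolution(20, [5.0, 1.5], [1, 4])
print(all(isclose(r, c, rel_tol=1e-9) for r, c in zip(rec, conv)))
```

With a single service of unit demand, both functions reduce to Erlang B, which gives a quick sanity check of either implementation.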
Finally, each probability distribution will be composed as indicated in Equation (2). At each constrained resource, the modeling algorithm in Equation (1) calculates the blocking probability of each service $i$ at the corresponding limited resource $n$. The total blocking probability at this resource, integrating all services, is calculated by the expression in (5), where $P_{b,ni}$ denotes the blocking probability of service $i$ at resource $n$, and $ca_{ni}$ stands for the connection attempts of service $i$ at the subsystem or constrained resource $n$. The blocking probability at resource $n$ is used to shrink the traffic that reaches the next constrained resource, $n + 1$:
$$ca_{ni} = ca_{(n-1)i}\left(1 - P_{b,(n-1)i}\right), \qquad n > 1 \qquad (6)$$
where $ca_{ni}$ for $n = 1$ are the call attempts attended at the first subsystem.
Thus, the proposed model requires each scenario to be defined in the following terms: resources available at each subsystem, traffic demand, and connection attempts for each service. The unitary consumption for each service at each resource must also be known.
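The per-resource aggregation and the thinning of attempts between consecutive resources can be sketched as follows. The attempt-weighted average used for the total blocking probability is an assumed reading of expression (5), which is not reproduced in this text; the function names are hypothetical.

```python
def resource_blocking(blocking, attempts):
    """Total blocking at one resource, taken here as the average of the
    per-service blocking probabilities weighted by connection attempts
    (an assumption about expression (5))."""
    total = sum(attempts)
    if total == 0:
        return 0.0
    return sum(p * ca for p, ca in zip(blocking, attempts)) / total


def attempts_for_next_resource(blocking, attempts):
    """Only non-blocked attempts of each service reach resource n + 1."""
    return [ca * (1.0 - p) for p, ca in zip(blocking, attempts)]


# Arbitrary illustration: two services at one resource.
p_total = resource_blocking([0.1, 0.5], [100, 20])
next_attempts = attempts_for_next_resource([0.1, 0.5], [100, 20])
print(p_total, next_attempts)
```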
The resources at each subsystem, C, depend on the hardware configuration of the particular scenario. Iub CS and Iub PS are carried through different transmission systems, with different capabilities for UL and DL.
At the CE subsystem, C is expressed as the amount of CE available in UL and DL. A CE is the baseband processing capacity required in node B to provide one voice channel, including the control plane signaling. The particular hardware configuration of a node determines the number of cards and the amount of CE per card available (in the nodes under study, typically each card has 384 CEs). At SC, the factor C is defined as the number of SCs available in the tree for each service. There can also be more than one tree (adding a new 5-MHz carrier). A regular SC tree has codes ranging from SF 16 to SF 256 and each service consumes one or more codes at a certain SF level.
The services considered also depend on the particular scenario. Common UMTS services include voice, PS384 service, HSPA (UL and DL), and Video Telephony. Traffic and connection attempts for each service are obtained from network KPI. In addition to user data channels, signaling is also considered.
Once the resource consumption has been characterized for the different services and the traffic demand for each one is known, the modeling algorithm can be applied to estimate the per-resource and the overall accessibility.

Why model UMTS accessibility
Accessibility is a measure of accepted versus attempted connections, expressed as a percentage. The network-reported KPI for accessibility accounts for connection blockage due to both hardware failure and congestion. Theoretical accessibility, as calculated in this study, measures just the blockage due to congestion. Thus, it is possible to discriminate whether a node rejects connections due to lack of resources or to resource malfunction. This process is depicted in Figure 3. Also, as this theoretical accessibility is obtained per resource, it becomes clear which one to upgrade for a certain known or forecasted traffic demand. As node B subsystems can be classified into CE, SC, and Iub, K-R is applied to each one considering all service traffic connection attempts. The K-R model has been used for every resource in the system, as each of them offers a shared pool of resources to attend a collection of services with different resource consumption profiles. Later, the three accessibilities are aggregated to obtain the overall accessibility. The same exercise is repeated in DL and UL. In order to compare the overall theoretical accessibility with the real one, the model uses as input the traffic demands reported by network KPIs for each service.
In the UMTS architecture, services follow different paths along node B depending on the nature of the traffic. The paths are described in Figure 4. The resources involved in DL channel include Iub DL, CE, and SC, while resources comprised in UL exclude SC.
Each subsystem receives a collection of connection attempts; some are rejected, and those accepted are handed to the next subsystem. The model considers this successive leakage of attempts, as stated in Equation (6).
On the DL, connection attempts first enter the Iub. Accepted connections through Iub are then processed at DL CE to decide whether sufficient resources are available. Finally, the remaining non-blocked traffic reaches the SC block. The output of the entire chain corresponds to the successful connections. On the UL, the connection attempts first reach the UL CE and then the Iub UL. The output traffic corresponds to the successful connections (Figure 4). SC is not involved in the uplink, as each user terminal has its own set of orthogonal codes. Each block in Figure 4 is implemented as a K-R node, as formulated in Section 3. As previously stated, blocked attempts do not try the next resource: the blocking probability at each resource is used to reduce the input traffic to the next block, and so on. Traffic intensity, $a_i$, for each service is directly obtained from network-reported KPIs [27]. Earlier studies [1] did not include such particularization, as resources were considered independent with regard to the traffic input statistics.
To be able to detect short-term accessibility downgrades, it is convenient to work with traffic information at a resolution of at least one minute. As most vendors provide only 15-min resolution, it became necessary to estimate the values at a minute scale by means of interpolation. Input traffic from KPIs is interpolated using a Poisson distribution. For large values of the distribution statistic, this process produces a reconstruction bias, as demonstrated in [1]. A value of 1 for the lambda parameter proved to be sufficiently accurate, as the reconstruction bias was negligible.
Once the accessibility at each resource is calculated, it is necessary to estimate the overall accessibility. Accessibility is considered separately for rigid (CS) and elastic (PS) services. Nevertheless, the formulation is equivalent for both and, for the sake of clarity, only a general expression is shown here.
The overall accessibility for each traffic direction, UL and DL, is the result of combining the per-resource probabilities, taking into account the serial concatenation of the subsystems. The accessibility is therefore the ratio of successful to attempted calls. Accordingly, the combined accessibility responds to the following expressions for DL and UL, respectively.
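The serial chain of K-R nodes described above can be sketched as follows for the DL path. This is an illustrative sketch only: the capacities, unit demands, and traffic values are made-up placeholders rather than values from the tables, and `kr_blocking` and `run_chain` are hypothetical names. A demand of 0 would mean that a service does not use that resource.

```python
def kr_blocking(capacity, offered, demand):
    """Per-service blocking at one resource (textbook K-R recursion)."""
    q = [0.0] * (capacity + 1)
    q[0] = 1.0
    for j in range(1, capacity + 1):
        q[j] = sum(a * b * q[j - b]
                   for a, b in zip(offered, demand) if 0 < b <= j) / j
    total = sum(q)
    return [sum(q[max(0, capacity - b + 1):]) / total for b in demand]


def run_chain(chain, offered):
    """Pass traffic through the resources in order; the traffic blocked
    at one resource never reaches the next one."""
    per_resource = {}
    traffic = list(offered)
    for name, capacity, demand in chain:
        blocking = kr_blocking(capacity, traffic, demand)
        per_resource[name] = blocking
        traffic = [a * (1.0 - p) for a, p in zip(traffic, blocking)]
    return per_resource, traffic


# Placeholder DL chain: (resource, capacity in units, units per service).
chain = [("Iub_DL", 30, [1, 4]), ("CE_DL", 64, [1, 8]), ("SC", 256, [1, 16])]
offered = [20.0, 3.0]  # Erlangs for two hypothetical services
per_resource, carried = run_chain(chain, offered)
for name, blocking in per_resource.items():
    print(name, [round(p, 4) for p in blocking])
print("carried traffic:", [round(a, 3) for a in carried])
```

The UL path would use the same function with the chain reordered as UL CE followed by Iub UL, without the SC stage.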
In the case of Iub, formulation is valid both for CS and PS services.
To be able to obtain a fair comparison of the estimated accessibility with that reported by the network, global CS accessibility including UL and DL is combined in a single expression obtained as the total succeeded calls normalized by the total input call attempts in both directions.
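As a hedged illustration of this combination, the sketch below assumes that the accessibility of a serial chain is the product of the per-resource accessibilities, and that the global figure is the total number of succeeded calls divided by the total attempts in both directions. Expressions (7) and (8) are not reproduced in this text, so the product form is an assumption; all names and numbers are illustrative.

```python
from math import prod


def direction_accessibility(blocking_by_resource):
    """Accessibility of one direction as the product of (1 - P_b) over
    the serially concatenated resources (assumed form)."""
    return prod(1.0 - p for p in blocking_by_resource)


def global_accessibility(attempts_dl, acc_dl, attempts_ul, acc_ul):
    """Total succeeded calls normalized by total attempts, UL plus DL."""
    total = attempts_dl + attempts_ul
    if total == 0:
        return 1.0
    return (attempts_dl * acc_dl + attempts_ul * acc_ul) / total


# Arbitrary illustration: DL over Iub, CE, SC; UL over CE, Iub.
acc_dl = direction_accessibility([0.1, 0.2, 0.0])
acc_ul = direction_accessibility([0.1, 0.0])
print(acc_dl, acc_ul, global_accessibility(100, acc_dl, 50, acc_ul))
```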

Numerical example
In order to illustrate the basic operation of the developed model, a simple numerical example will be outlined here.
First, we define the amount of resources available at the node. Let us consider a simple scenario with different resources, SC, CE, and Iub (in the DL path) and CE and Iub in the uplink. Iub PS DL uses IP, while Iub CS and Iub PS UL use ATM. Thus, all services but HSDPA use ATM. The number of carriers is one so we are considering one SC tree.
Iub_PS DL uses a DSL line with 23 Mbps capacity. The node has two E1 lines with one VCC channel for CS. The node has also two CE cards for UL and another set of two for DL. The resources are defined as Table 1 indicates.
Each of the services demanded has a unitary consumption on the different resources. The unitary consumptions for the services are detailed in Table 2. Voice consumes 1 CE in both UL and DL. 12.2 kbps codecs are used for voice. Consumption of CE for HSPA services is defined as 21 CE for each 16 users in the DL and 32 CE for each 16 users in the UL. The throughput for HSPA services is variable. Voice uses one SC with SF 256. HSUPA does not consume SC in node B, as it relies on the user equipment uplink carrier SC tree.
Next, we define an arbitrary combination of traffic intensities and connection attempts for each service at a certain instant $t_i$, as shown in Table 3. The developed model will obtain the accessibility at that instant. The actual implementation considers an array of traffic instants corresponding to the daily demand. The intensities shown at the instant $t_i$ are inspired by the values registered in the real scenarios for a 15-min observation period. To calculate the traffic demand in Erlangs for PS services, Equation (9) is used. λ stands for the consumption in bps, as reported in KPIs, whereas μ stands for the data channel capacity in bps. In this example, μ is defined as 1 kbps. The accessibilities reported in Table 4 are obtained at a per-resource level. Note that SC accessibility is not applicable to the UL for the reasons explained before. The combination of different resources is performed as indicated in expressions (7) and (8).
Finally, through the combination method described previously, we obtain the following values for the overall scenario: $ACC_{CS} = 51.07\%$ and $ACC_{PS} = 43.35\%$. These values are suitable for comparison with the values reported in KPIs in a given real scenario.
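The conversion of PS throughput into offered traffic can be illustrated with a minimal sketch. Equation (9) is not reproduced in this text, so the sketch assumes the simplest reading, the ratio of λ to μ; the function name and the sample throughput are hypothetical.

```python
def ps_offered_traffic(throughput_bps, channel_bps=1_000.0):
    """Offered PS traffic in Erlangs, assumed here to be lambda / mu.

    throughput_bps: KPI-reported consumption (lambda), in bps.
    channel_bps: data channel capacity (mu), 1 kbps as in the example.
    """
    if channel_bps <= 0:
        raise ValueError("channel capacity must be positive")
    return throughput_bps / channel_bps


# A hypothetical 250 kbps demand expressed in 1 kbps channel units.
print(ps_offered_traffic(250_000.0))
```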

Validation case studies
Several real UMTS network scenarios have been selected to calibrate and test this model. KPIs from these scenarios have been used to validate the estimation of the overall accessibility. Customer devices in the tests were mainly smartphones with HSPA available. Most smartphones had single-carrier capability. Some 3G USB cards with dual-carrier capability were also used. The performance of the model is measured in terms of its ability to detect congestion and to identify the congested resources. For that purpose, we have studied the behavior of two different scenarios with network reports from before and after a resource upgrade. Reports for these scenarios were collected in consecutive weeks, from the same node and with similar traffic demands. A further scenario, whose reports were collected under a known hardware failure condition, demonstrates how the model helps operators distinguish network congestion from hardware failure.
For the scenarios in this study, it is mandatory to know the available capacity at each resource. Services involved in the study range from voice connections to HSPA data. Traffic intensities and connection attempts for each service at every scenario are obtained from network KPIs.
The first case study shows a node with a deep degradation at a peak hour. The CS accessibility is the most affected parameter and, for clarity, PS is omitted. Figure 5 shows the comparison between real and estimated CS accessibility in the first case study before and after performing an upgrade by augmenting the Iub CS capacity. The Iub CS capacity is doubled, as presented in the spider plot within Figure 5. It should be noted that the resource plots have been normalized to the maximum values for each resource across scenarios. The shadowed parts in the CS accessibility plots indicate the period where the node accessibility is remarkably degraded.
As can be seen in the figures, theoretical and real accessibilities follow analogous trends, allowing the operator to detect capacity degradations by observing only the estimated accessibility. The offset in the theoretical values is due to factors not taken into account in the model (such as radio conditions, propagation loss, interference, power limitations, etc.). The benefit of the model is to obtain values per resource, and to exclude the effect of allocation rejections due to hardware failures.
After upgrading the Iub CS, both the real CS accessibility and the CS model estimated accessibility confirm the capacity improvement. In addition, estimated accessibility can also be dissected at resource level (CE, SC, and Iub). This is a clear advantage of the model, as network-reported KPIs provide only the overall accessibility.
The supplementary accessibility values provided by the model at each resource constitute a valuable asset for network engineers. In the first scenario, Iub is the most limiting resource for CS accessibility. Accessibility for CS services before and after the upgrading of the Iub is shown in Figure 6, left. The improvement of the accessibility at Iub is roughly the same as the improvement registered in the overall accessibility. It is noticeable how closely the shape of the overall accessibility follows that of the Iub resource accessibility. This confirms that the Iub was the most constrained resource at this node.
The second case study, depicted in Figure 7, demonstrates the performance of the model regarding PS accessibility. According to the accepted threshold for node degradation (around 99%), the real accessibility in the second case study, before the upgrade is performed, indicates that the node is consistently degraded after the eighth hour. Observing the estimated accessibility and focusing on the relative value of the estimated data, more than on the absolute value, degradation is observed at around the ninth hour.
When observing the per-resource accessibilities, the UL CEs are the most constrained resource. After upgrading the UL CEs by adding another 384 CE, the PS accessibility has significantly improved according to both the real and the estimated accessibilities in Figure 7.
Again, the accessibility at the CE UL resource has improved by about 30% after the upgrade. Similar to the results observed in the first scenario, the accessibility at the CE UL is very close to the overall accessibility, supporting the hypothesis that this was the resource that most contributed to the node degradation.
Finally, in Figure 8, a case study consisting of a scenario with a known hardware failure is used to show the performance of the model regarding hardware failure identification. Reported accessibility is lower than that estimated by the model. This reveals the presence of hardware failure rejections, as the theoretical accessibility only considers congestion-originated rejections.
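The decision logic suggested by these case studies can be summarized in a rough sketch. The 99% degradation threshold is the one quoted in the text; the `margin` used to decide that the reported accessibility is clearly below the estimate is a hypothetical tuning parameter, as are the function name and the sample values. Given the known offset between theoretical and real values, such a rule would need calibration per network.

```python
def diagnose(reported, estimated, threshold=0.99, margin=0.05):
    """Classify a node from reported (KPI) and estimated (model)
    accessibility, both given as fractions in [0, 1]."""
    if reported >= threshold:
        return "ok"
    # Reported clearly below the congestion-only estimate: rejections
    # that the model cannot explain, so a hardware fault is suspected.
    if estimated - reported > margin:
        return "suspected hardware failure"
    if estimated < threshold:
        return "congestion"
    return "degraded, cause unclear"


print(diagnose(0.995, 0.999))
print(diagnose(0.80, 0.97))
print(diagnose(0.60, 0.62))
```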

Extension to LTE
The approach developed and tested with 3G/UMTS can be migrated to 4G/LTE-based systems, albeit with a few modifications due to technological differences. The baseband processing in the network nodes is performed by similar resources, conceptually equivalent to CE. On the other hand, the radio carrier capacity is no longer defined by the SC tree, which is standard in CDMA. In LTE, the radio channel is divided into multiple subcarriers and transmitted using frequency multiplexing techniques such as OFDMA. The fully IP-based S1 interface replaces Iub and links the eNodeB directly to the gateway, bypassing the current RNC. Consequently, the model can be extended by redefining the module of radio interface resource occupation and appropriately modeling the new IP interface, S1. In addition, the table of CE consumption per service needs to be updated. The future development roadmap includes a 4G/LTE version of the model, incorporating evaluations of operational 4G networks as soon as they become available. At the time of publication, small demonstrative clusters are deployed, but none is yet in production service.

Conclusion
This study presents a model for the estimation of accessibility for UMTS services. It has proven to be very useful for network operation, serving as a tool to provision network resources to manage a certain capacity demand in a multiservice scenario. It also helps to detect poorly dimensioned and malfunctioning resources.
A K-R approach is implemented, using an FFT-based version instead of the recursive algorithm in order to reduce computation time. The model is validated through comparison between estimated and real network-reported accessibilities. The validation of the model includes real scenarios with different blocking conditions.