Performance prediction and enhancement of 5G networks based on linear regression machine learning

The feature-rich nature of 5G introduces complexities that make its performance highly conditional and dependent on a broad range of key factors, each with unique values and characteristics that further complicate 5G deployments. To address these complexities, this work develops a new modular, machine learning-based model of the architecture and service factors that actively contribute to variations in 5G network performance (5GPA factors). The objectives are to address the complications that arise during the design and planning phases according to the requirements before 5G deployment, simplify the whole feature-selection process for different deployments, and optimize 5G network performance. The model is implemented, and the results are used to determine the correlation between the 5GPA factors and the overall performance. Additionally, a simulated 5G dataset is generated and used to make predictions on 5G performance based on unseen factors and values of interest. The reliability of the model is validated by comparing the predicted and actual results in the context of quality of service requirements. The results show a high level of accuracy, with an average of 95%, and low error rates in terms of mean absolute error, mean squared error, and root mean squared error, averaging 7.60e−03, 1.18e−04, and 8.77e−03, respectively.


Introduction
Due to its potential capabilities, the fifth-generation new radio (5G NR) technology can offer promising solutions for a variety of networking requirements by providing improved capacity, latency, and reliability. While these aspects are the key characteristics that differentiate 5G from previous network technologies, they depend on a broad range of 5G performance-affecting (5GPA) factors. This work categorizes the 5GPA factors into two groups: those related to 5G architecture, such as operating frequency, channel bandwidth, connection density, coverage area, and deployment environments, and those related to 5G services, such as the distribution of inter-arrival time, packet size, and protocol type, as detailed below.
To provide a wide range of services, 5G can operate in multiple radio frequency bands, each with its own set of benefits and challenges. The bands are divided into three categories: low, medium, and high. The low-band frequencies are those below 1 GHz, and the medium-band frequencies are between 1 and 6 GHz. The frequency bands below 6 GHz are commonly called Sub-6 or FR1. The high-band frequencies, which are also known as millimeter wave (mmWave) or FR2, generally refer to any frequency above 24 GHz [1,2]. In 5G networks, the operating bands are identified by names with the prefix "n". As these operating frequencies have different capabilities, they contribute differently to the performance of 5G networks. While higher-frequency mmWaves are used for faster data transmission, lower frequencies are better suited for remote access links due to their ability to cover larger areas. To transmit data, users connect to the 5G node base station (gNB), which forwards the data to the end recipient via 5G core components. In this regard, the proximity of users to the gNB is a significant challenge in 5G networks. As users move away from the gNB, transmission speed and received power decrease, resulting in increased deployment complexity and cost due to the need for additional cells, gNBs, and antennas. The 5G frequency bands also differ in the amount of bandwidth they support per communication channel. There are multiple narrowband and wideband channel bandwidths, ranging from 5 to 400 MHz [3]. While the Sub-6 supports channel bandwidths between 5 and 100 MHz, larger bandwidths of up to 400 MHz are available with the mmWave [4]. Wider channels with higher frequencies can elevate the performance of data-intensive services to the next level. However, they also generate a higher level of noise, which can result in data errors and transmission failures. Additionally, bandwidth sharing among users directly affects connection density and, therefore, the scalability of 5G networks.
For 5G deployment, allocation of the operating frequency and bandwidths may differ for outdoor and indoor environments. Therefore, the environment in which the 5G network is deployed can also affect the performance. The outdoor environments are categorized as rural macrocell (RMa) and urban macrocell (UMa), whereas the indoor environments include urban microcell (UMi) and indoor hotspot (InH) [5]. In either of these environments, the channel state between the end-users and gNB can be either line of sight (LOS) or non-line of sight (NLOS) [6]. These environments differ in their requirements due to varying characteristics. Urban environments are densely populated with people, numerous tall buildings, and other barriers, resulting in high levels of congestion. In contrast, rural environments have fewer and smaller buildings and are less congested. Additionally, the location of the gNB to which users connect depends on the environment type. In UMa and UMi, for example, the presence of tall buildings requires gNBs to be placed above and below the roofs of the buildings, respectively [5]. In addition to these architectural factors, the factors related to the characteristics of mobile services, including distribution of inter-arrival time, packet size, and protocol type, can further affect 5G transmission reliability, capacity, and latency.
As can be seen, the complexities associated with 5G make its capabilities highly conditional and dependent on a broad range of contributing key factors. The uncertainties about the performance efficiency of different 5G factors in different environments and use cases complicate efforts to quantify their impact. To address these complexities, this work develops a 5G modular model based on machine learning (ML). The model is, to the best of our knowledge, the first to take into account both architectural and service key factors that contribute to variations in 5G performance, generate a simulated 5G dataset, and predict the impact of various factor combinations and interactions on 5G performance. The primary objectives of the model are to address the complications that arise during the design and planning phases according to the requirements before deployment, simplify the whole feature-selection process for different 5G deployments, and thereby optimize overall 5G network performance. The key contributions of this work are summarized as follows.
• An in-depth evaluation is conducted on various 5G architecture-level and service-level factors (5GPA) that actively contribute to variations in 5G network performance. The factors include the most current operating frequencies in Sub-6 and mmWave, channel bandwidth, environment, connection density, coverage area, and characteristics of mobile data, including inter-arrival time, protocol type, and packet length distribution.
• The correlation between the 5GPA factors (predictor variables) and the overall performance of 5G networks (response variables) is measured in terms of quality of service (QoS) requirements. The effectiveness level of the given factors, as well as the influence of their dependency and interaction on the performance of different 5G deployments, are further determined.
• Based on these evaluations, a 5G dataset is generated and utilized by the 5G multivariate linear regression (5GMLR) prediction module in the model. The 5GMLR can make predictions on 5G performance based on the factors and values of interest.
• The predicted results are compared to the actual results to further validate the accuracy and reliability of the model. The reliability of the results is crucial in accurately determining 5G performance before actual deployments and prioritizing the factors and combinations that contribute to optimizing 5G performance.
To achieve these objectives, the network simulator (NS3) is used to implement the model and generate the 5G dataset. Additionally, for the prediction component of the model, Python is used along with the NumPy, pandas, and SKlearn libraries, running on the Kali Linux 2022 operating system.
The rest of the work is organized as follows. Section 2 discusses the most recent relevant works. The model and implementation details are provided in Sect. 3. Section 4 presents the results and corresponding analysis, and Sect. 5 concludes the work.

Related works
As an emerging technology, 5G is expected to have a profound impact on the diverse requirements of different networks and overcome the limitations that current technologies are unable to resolve. This results in a wide range of use cases that require further research. Therefore, there is an increased interest in various 5G aspects, most of which focus on the mmWave portion of the spectrum and multiple-input multiple-output (MIMO) antennas [7]. Regarding the 5G environments and frequency spectrum, the authors in [8] take into consideration the mmWave with n257 and n260 as the frequency bands for the deployment of 5G networks in UMa and UMi environments with the NLOS condition. Their main focus is to measure the uplink and downlink throughput achieved when the users move away from their gNB. However, the work is limited in that it does not take into account Sub-6, factors other than distance, or other important performance metrics. The 5G mmWave is also investigated in [9] using performance-affecting factors, including distance, output power, antenna gain, MIMO, and bandwidth. The efficiency of 5G deployment is measured in terms of the received signal power, while no assessment is made of the other metrics and factors or of the Sub-6 bands. In contrast, the authors in [10] investigate both Sub-6 and mmWave frequency bands in UMa and UMi environments.
Both LOS and NLOS conditions are evaluated to measure the maximum throughput achievable by the users. Apart from not considering other 5G key factors, the work is limited to throughput performance, while other network performance metrics are not evaluated. Massive MIMO offered by 5G is investigated in [11]. In terms of power consumption and energy efficiency, the performance of 5G using a variable number of antennas, equal to and larger than 4G, is investigated and compared to 4G performance. Despite this, the work lacks an evaluation of other performance factors and metrics associated with 5G networks. The MIMO technology in 5G networks is also investigated in [12,13]. Due to the importance of MIMO technology, the ultramassive MIMO effects in 6G networks are discussed in [14]. While the bit error rate is measured to determine different MIMO configurations, a comparison with 5G and other effective factors is not provided. In contrast, the effects of connection density on the energy efficiency of 6G networks are evaluated in [15]. However, the evaluation factor is limited to the connection density, and no other contributing factors are taken into account. The maximum transmission power and minimum rate are the two factors discussed in [16] to optimize 6G performance in terms of energy efficiency.
5G network scalability through increasing the number of users in the range of 10 to 50 and the corresponding effects on mmWave is investigated in [17]. In this regard, the measurements of throughput, spectral efficiency of the network, and fairness parameters are provided. However, the investigation does not consider environments, conditions, Sub-6 bands, or coverage area, and no assessment of real-time requirements such as latency, loss ratio, and jitter is provided. In contrast, latency measurements for real-time video and voice data exchanged on 28 GHz 5G networks with a variable number of users are provided in [18], alongside throughput and fairness measurements. Despite that, the work is limited to the number of users, ignoring other important 5G key factors. The authors in [19] consider 1, 10, and 100 users to measure throughput, delay, and jitter in 5G networks with no regard to other 5G performance-affecting factors. The increasing number of users is also taken into account in [20] for the Internet of Vehicles (10, 20, 30, 40, 50, and 60 active vehicles). The throughput, delay, and loss ratio are measured to evaluate the 5G performance, while the evaluation scope does not cover other 5G factors. The distance between the users and gNB is another important factor that affects the 5G performance. The authors in [21] consider the mmWave spectrum and evaluate the performance of 5G networks when the users' distance from gNB varies in the range of 0 to 100 m. In this regard, SINR and throughput are measured without considering other key factors or metrics.
The distance is also investigated in [22] and [23] to determine the received power and path loss, respectively. However, other performance metrics and factors are not considered in these works.
Traffic modeling with regard to packet length is investigated in [24]. Different lengths of packets are transmitted to a maximum of two users on 5G networks in both uplink and downlink directions. The results are measured in terms of throughput, delay, and error rate to be compared with the 4G networks. However, no investigation of other 5G performance-affecting factors and metrics is provided to extend the findings, and a very low number of users with zero competition for media access cannot accurately determine the resource allocation. The packet length for uplink and downlink directions to measure 5G throughput is also investigated in [25]. Moreover, traffic modeling in the context of inter-arrival time and packet length for the TCP protocol is taken into account in [26]. The TCP performance in terms of latency is evaluated in the mmWave spectrum with different users and different scheduling algorithms. However, an assessment of the other performance-affecting factors and metrics is not provided. With regard to TCP performance in 5G, the TCP variants are evaluated in [27,28] in terms of throughput and delay. The traffic modeling in the context of data rate is evaluated in [29] and [30] in comparison with Wi-Fi 6 and 4G, respectively, while other 5G factors are not investigated. The beamforming and scheduling algorithms used to allocate resources to 5G users are among the other factors evaluated in 5G networks, in [31,32] and [33,34], respectively.
The related works reveal three main research gaps that need to be addressed. First, mmWave as a new spectrum in 5G has become the primary concern for the majority of these works, as compared to Sub-6. However, 5G Sub-6, as the spectrum used by the most recent wireless technology (802.11ax), is also required to enable coexistence and further expand 5G capabilities and services. Second, due to the feature-rich structure of 5G, there is a lack of information on how the resulting combinations and interactions of certain factors affect the actual performance experienced by 5G end-users. Third, the complexity of 5G networks demands predictive models to assist developers in fully determining performance requirements and ensuring the successful deployment of 5G networks by incorporating various aspects of 5G and analyzing outcomes in a flexible and reliable manner. To address these gaps, this work proposes a modular machine learning-based model applied to the key contributing factors of 5G architecture and services. The model supports major 5G factors, including various frequency bands consisting of both mmWave and Sub-6, different deployment environments (RMa, UMa, UMi), available bandwidths for the given frequency bands, area of coverage, connection density, and traffic characteristics related to inter-arrival time, packet length, and protocol type.

Material and methods
To meet the increasing expectations and future needs, 5G includes a wide range of performance-affecting (5GPA) factors. Measuring the correlation between these factors and verifying every possible combination, interaction, and effect is essential to determine the constraints and challenges that must be addressed for 5G to achieve its goals. This further facilitates accurate pre-selection of the factors on the basis of the requirements before network deployment, which leads to the desired level of performance. However, while the involvement of many contributing factors brings flexibility to 5G deployments, accurately predicting performance based on all these interacting factors remains a major challenge in practice. To address the challenge, this work proposes a 5G model consisting of five distinct modules: design, system-level, service-level, performance modeling, and performance prediction. The proposed model is implemented using the network simulator (NS3) to collect performance results and generate a simulated 5G dataset. For the prediction module of the model, Python is used along with the NumPy, pandas, and SKlearn libraries, running on the Kali Linux 2022 operating system. Figure 1 presents the visual design of the model, and detailed explanations are provided below.

Design module
In order to develop the underlying 5G network, a design module is required to take the data from the relevant modules as input and provide the output. This module implements the 5G new radio (NR) network, which consists of the 5G radio access network (5G RAN) and core components [35]. Within the 5G RAN, the 5G NR connection is provided to connect the user equipment (UE) to the gNB, which in turn is connected to the 5G core. As in real-world deployments, the gNB antennas at the Sub-6 frequency band are arranged in a square pattern with 4 × 4 elements, while for mmWave, a massive MIMO with 64 × 64 elements is provided [36]. This helps to prevent overestimation or underestimation of the performance of 5G, thus preserving the accuracy of the results. The module also includes a 5G remote server to transmit mobile data to the UEs via the 5G network, with the transmission being characterized by both the system-level and service-level modules.

System-level module
To support diverse use cases, the model includes different 5GPA factors that are relevant to both 5G architecture and services. This provides 5G networks with a high level of flexibility in the factor planning and selection process. The system-level module is developed based on the constraints and complexities associated with different 5G architectural factors that have varying features. These factors are as follows.

Spectrum management
5G defines different carrier frequencies to offer different services. Higher carrier frequencies improve transmission speed and latency, making mmWave superior to Sub-6 for achieving faster and more responsive communication. Moreover, the shorter wavelength of the mmWave allows for the design of smaller antennas, which can be beneficial in various applications. Additionally, the mmWave spectrum is less congested than Sub-6, which is already occupied by other wireless technologies, including 4G, Wi-Fi, and Bluetooth. Despite the advantages, mmWaves have some drawbacks. They have greater path loss and difficulty in overcoming obstacles, which limit the overall coverage area of 5G networks. In contrast, 5G deployments in the Sub-6 bands can provide greater network coverage at the cost of lower transmission speed. Therefore, to achieve the advantages that come with each frequency band, their proper integration according to the requirements must be ensured. This requires evaluating the available operating frequencies and precisely determining the performance achieved. Consequently, the model supports the low, mid, and high bands. For 5G to operate in the low band, the n28 (784 MHz) band is used, while in the mid band of Sub-6, the n3 (1865 MHz), n7 (2655 MHz), and n78 (3500 MHz) bands are used. Moreover, for high-band mmWave, n258 (24 GHz) and n260 (37 GHz) are used. These operating frequencies are selected based on two criteria: first, availability, because for 5G as a growing technology there is still a large amount of unused spectrum at present; and second, international utilization, in terms of the operating bands most widely used by operators and service providers across the world.
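As a minimal sketch of the band selection described above, the following Python snippet maps the six operating bands to the low, mid, and high categories and computes the free-space wavelength of each, which illustrates why mmWave permits smaller antennas. The function names, the dictionary, and the handling of frequencies between 6 and 24 GHz are assumptions made for illustration only; the band names and frequencies are those quoted in the text.

```python
SPEED_OF_LIGHT_M_S = 299_792_458  # exact by definition of the metre

# Operating bands and frequencies (MHz) as quoted in the text.
MODEL_BANDS_MHZ = {
    "n28": 784,
    "n3": 1865,
    "n7": 2655,
    "n78": 3500,
    "n258": 24_000,
    "n260": 37_000,
}


def classify_band(freq_mhz: float) -> str:
    """Return 'low', 'mid', or 'high' using the thresholds given in the text.

    Assumption: 24 GHz itself is treated as high band because the text
    lists n258 at 24 GHz under mmWave; 6-24 GHz is left unclassified.
    """
    if freq_mhz < 1_000:
        return "low"
    if freq_mhz <= 6_000:
        return "mid"
    if freq_mhz >= 24_000:
        return "high"
    return "unclassified"


def wavelength_cm(freq_mhz: float) -> float:
    """Free-space wavelength in centimetres (lambda = c / f)."""
    return SPEED_OF_LIGHT_M_S / (freq_mhz * 1e6) * 100


def summarize_bands(bands=MODEL_BANDS_MHZ):
    """Map each band name to (category, wavelength in cm rounded to 2 places)."""
    return {
        name: (classify_band(f), round(wavelength_cm(f), 2))
        for name, f in bands.items()
    }


if __name__ == "__main__":
    for name, (category, lam) in summarize_bands().items():
        print(f"{name}: {category} band, wavelength ~{lam} cm")
```

The wavelength drops from tens of centimetres in the low band to roughly one centimetre at mmWave frequencies, which is consistent with the antenna-size argument made above.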

Environment
The urban environments (UMa or UMi) are characterized by many blocking objects, such as houses, tall buildings, and bridges, making them highly congested. In contrast, buildings in rural environments (RMa) are smaller in number and size, resulting in less congestion. The distinctive characteristics, challenges, and requirements of the environments directly affect the performance of 5G networks deployed around them. As a result, the model implements 5G in all three environments (RMa, UMa, and UMi) to assure compliance with real-world deployments.

Channel bandwidth
Each 5G frequency band has multiple channel bandwidths ($C_{BW}$). Wider channels can enhance transmission speed and scalability in terms of the number of simultaneous connections, but they are also more susceptible to radio interference. While increasing transmit power can address this issue, it is not optimal for battery-powered devices and may restrict the use of 5G in leading-edge applications such as the energy industry, remote monitoring in healthcare systems, and 5G-based IoT implementations. These considerations imply the importance of bandwidth management according to the network requirements. The model supports all available 5G bandwidths to measure their effectiveness, identify possible benefits and limits in practice, and determine the optimal bandwidths that contribute to 5G performance optimization. For non-bandwidth use cases, the maximum supported network bandwidth for each 5G frequency band is provided.

Coverage area
Radio coverage area (CA) is a major concern for wireless systems. Higher frequencies have a shorter wavelength, leading to a limited coverage area. This can cause link stability and reliability issues, necessitating the deployment of additional cells and gNBs. However, such deployments can be expensive and further complicate the overall 5G implementation. Therefore, it is essential to determine both the maximum and effective coverage area of gNB ($CA_{gNB}$) across which 5G signals of different bands can travel. The model considers seven alternative distances, D1 to D7, to cover near, intermediate, and far locations. The objectives are to evaluate the impact of relocating UEs away from their associated gNBs in different 5G deployments and thereby determine the effective as well as maximum supported distances. To ensure compliance with cellular implementation, a default range of 100 m is considered for non-CA use cases. The testing distances are listed in Table 1.

Connection density
Because bandwidth is limited, sharing it among 5G users can lead to a decrease in network performance. As user density increases and more users connect to 5G networks, the demand for sharing bandwidth increases, resulting in more collisions and higher latency. Consequently, mobile operators often limit the number of users on the network to avoid these issues, but at the cost of extra cell installation. Thus, user density remains a critical issue in cellular communication, and it is necessary to balance the number of users on the network to ensure optimal performance. In this context, the model supports a variable number of UEs, $N_{UEs} \in \{5, 10, 15, 20, 25, 30, 35, 40\}$, to achieve two objectives. First, to measure the performance of 5G networks and their ability to meet resource demands as connection density increases and congestion occurs. Second, to determine the density limit as the maximum number of allowed connections for which 5G networks provide services at a satisfactory level from the end-user's perspective. This further facilitates a high-level scalability analysis of the 5G networks to evaluate how efficiently they scale up in crowded areas. The default $N_{UEs}$ is 10 for non-scalability use cases.

Service-level module
Unlike the system-level module, which is architecture-based, the service-level module relies on mobile data modeling. Accurate and efficient traffic modeling is crucial to enhance network service quality, pre-determine necessary buffer and bandwidth resources, and meet real-time and reliability requirements of mobile data. The module characterizes the mobile traffic as follows.

Packet length distribution
5G is a potential connectivity solution for a variety of existing technologies, each with specific limitations on packet length. For instance, Ethernet has a maximum packet length of 1500 bytes, while IoT implementations, such as Zigbee, LoRa, and NB-IoT, have packet length limits of 128, 256, and 1600 bytes, respectively. Therefore, for 5G to be adopted as a connectivity standard for different technologies, it must meet the packet length criteria. The model supports various packet lengths, $P_{Len} \in \{64, 128, 256, 512, 1024, 2048, 4096\}$ bytes, to cover short (64, 128, 256 B), medium (512, 1024 B), and large (2048, 4096 B) packets with two main purposes. First, to determine the performance of 5G networks using different packet lengths and identify those that contribute to optimization. Second, to evaluate 5G capabilities and verify whether it supports diverse services and meets the criteria of existing technologies in terms of packet length restrictions. The default $P_{Len}$ in the model is 1024 B.
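The packet-length grouping above can be sketched in a few lines of Python. The snippet below is illustrative only: the function names are hypothetical, the class boundaries for lengths that are not in the tested set (for example, 300 B) are an assumption, and the per-technology limits are the figures quoted in the text, taken here to be in bytes.

```python
# Packet lengths tested by the model and the limits quoted in the text (bytes).
MODEL_PACKET_LENGTHS_B = (64, 128, 256, 512, 1024, 2048, 4096)
TECH_MAX_PACKET_B = {"Zigbee": 128, "LoRa": 256, "Ethernet": 1500, "NB-IoT": 1600}


def packet_class(length_b: int) -> str:
    """Classify a packet length as short, medium, or large.

    Assumption: the boundaries fall at the largest tested value of each
    class (256 B and 1024 B); the text only classifies the tested values.
    """
    if length_b <= 256:
        return "short"
    if length_b <= 1024:
        return "medium"
    return "large"


def lengths_within_limit(limit_b: int, lengths=MODEL_PACKET_LENGTHS_B):
    """Return the tested packet lengths that fit within a technology's limit."""
    return [n for n in lengths if n <= limit_b]


if __name__ == "__main__":
    for tech, limit in TECH_MAX_PACKET_B.items():
        usable = lengths_within_limit(limit)
        print(f"{tech} (max {limit} B): {usable}")
```

Under these assumptions, every technology listed is covered by at least the two shortest tested lengths, while the large packets exceed all of the quoted limits.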

Protocol type
The transmission of different applications varies on networks based on the specific requirements of each application. Time-sensitive applications carry real-time data, and their performance depends on network latency and jitter, while high-performance applications demand greater bandwidth. The transport protocols used for time-sensitive and high-performance services are UDP and TCP, respectively, and either can use IPv4 or IPv6 as the network protocol. As the type of data can directly affect the performance of 5G, the model supports high-performance and real-time data delivery on both IPv4 and IPv6. The objectives are to evaluate the specific demands of diverse applications in 5G networks and determine to what extent 5G networks meet those demands in the context of data type. The default protocols are UDP and IPv6 for use cases that do not vary the protocol type.

Inter-arrival time distribution
Due to the limited resources of wireless devices, the distribution of packet inter-arrival time ($P_{IAT}$) is critical for resource management. Given that shorter intervals increase the rate of data transmission, rate adaptation mechanisms are required to manage the high utilization of computing resources. Otherwise, exchanging a large number of packets in a short time can lead to poor performance and major issues, such as an increase in power consumption for packet processing, instability, and packet loss. The model evaluates the inter-arrival time distribution in conjunction with the channel bandwidth variations. The default inter-arrival time ($P_{IAT} = 8 \times 10^{-4}$ s) is reduced to one-fifth ($P_{IAT} = 1 \times 10^{-4}$ s), which increases the radio data rate ($R_{DR}$) from 10 to 50 Mbps. The objectives are, first, to verify how well 5G can adapt its available capacity to the processing changes and maintain stability, and second, to determine the effective rate to meet the QoS requirements.

Performance modeling module
The model includes a performance modeling module to evaluate different 5G deployments. The evaluation is conducted in terms of average throughput, end-to-end delay, packet loss ratio, jitter, fairness index (FI), channel spectral efficiency, throughput efficiency, and signal to interference and noise ratio (SINR) parameters. Due to limited and shared resources, wireless devices have to compete for available resources, which creates a challenge in resource allocation. Therefore, the fair and proper distribution of available resources between devices in order to preserve their satisfaction determines the efficiency of wireless networks. The most common way to measure fairness is Jain's index (FI) [37]. The FI is a value between 0 and 1, with a number closer to 1 indicating a better allocation of resources in the system. To evaluate the effectiveness of 5G networks and their management abilities in allocating available resources among associated users, the model measures FI from the end user's perspective as shown in Eq. (1).
$$FI = \frac{\left(\sum_{i=1}^{N_{UEs}} T_i\right)^2}{N_{UEs} \sum_{i=1}^{N_{UEs}} T_i^2} \tag{1}$$
where $N_{UEs}$ is the number of UEs, $N_{UEs} \in \{5, 10, 15, 20, 25, 30, 35, 40\}$, and $T_i$ is the throughput of the $i$th UE, $1 \le i \le N_{UEs}$.
In addition to fairness, the quality of data transmission can also be characterized in terms of signal to interference and noise ratio (SINR). The SINR is a key consideration in wireless measurements as an indicator of signal quality, with a higher level representing a better signal. The module considers the SINR from the user's point of view and performs all the measurements on the UE side. Accordingly, the SINR of the $i$th user is calculated as shown in Eq. (2) and expressed in dB.
$$SINR_i = \frac{PS_i}{PN_i + \sum_{m} PI\_INTER_m + \sum_{j} PI\_INTRA_j} \tag{2}$$
In the above equation, $PS_i$ is the power of the signal received by the $i$th user from the gNB to which it is connected, $PN_i$ is the power of noise for the communication channel of the $i$th user, $PI\_INTER_m$ is the power of inter-cell interference received at the $i$th user from nearby gNBs, and $PI\_INTRA_j$ is the power of intra-cell interference received at the $i$th user from the other UEs within the cell, summed over all of them [38]. With a single-cell 5G network, there is no explicit inter-cell interference from the nearby cells. In this case, intra-cell interference generated by other UEs within the same cell will constitute the total interference. It is important to address both forms of interference because, although intra-cell interference may be avoided by using orthogonal modulation methods, it persists in 5G networks when multiuser MIMO is used.
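The two user-side measures described above can be sketched as follows. This is an illustrative sketch rather than the simulator's implementation: Jain's index is written in its standard form, the SINR is taken as received signal power over noise plus the summed inter-cell and intra-cell interference and then converted to decibels, and all powers are assumed to be supplied in linear units (watts). The function and parameter names are hypothetical.

```python
import math


def jain_fairness_index(throughputs):
    """Jain's fairness index: (sum T_i)^2 / (N * sum T_i^2), in (0, 1]."""
    n = len(throughputs)
    sum_sq = sum(t * t for t in throughputs)
    if n == 0 or sum_sq == 0:
        raise ValueError("need at least one non-zero throughput")
    total = sum(throughputs)
    return (total * total) / (n * sum_sq)


def sinr_db(signal_w, noise_w, inter_cell_w=(), intra_cell_w=()):
    """SINR of one UE in dB.

    Assumption: every argument is a linear power in watts; the two
    iterables hold one interference term per interfering gNB or UE.
    """
    interference_w = sum(inter_cell_w) + sum(intra_cell_w)
    return 10 * math.log10(signal_w / (noise_w + interference_w))


if __name__ == "__main__":
    # Equal shares give FI = 1; one UE taking everything gives FI = 1/N.
    print(jain_fairness_index([5.0, 5.0, 5.0, 5.0]))
    print(jain_fairness_index([10.0, 0.0, 0.0, 0.0]))
    # Single-cell case: no inter-cell terms, only intra-cell interference.
    print(sinr_db(1e-9, 1e-12, intra_cell_w=[9e-12]))
```

In the single-cell case discussed above, the inter-cell argument is simply left empty, so the intra-cell terms make up the whole interference sum.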
The module further evaluates the channel spectral efficiency ($C_{SE}$). As mentioned, each 5G frequency band has multiple channel bandwidths, and efficient use of them can improve performance. Bandwidth usage is considered efficient when the maximum amount of data can be transmitted through it. The $C_{SE}$ is defined in Eq. (3) to measure the amount of data that each 5G bandwidth can transfer in bits/s/Hz [39].
$$C_{SE} = \frac{AvgT_i}{C_{BW}} \tag{3}$$
where $AvgT_i$ is the average throughput of the $i$th UE over time and $C_{BW}$ is the 5G NR channel bandwidth.
In addition to the efficiency of service delivery, it is also important to determine its quality. In this regard, the model takes into account the throughput efficiency ($\eta$). It is an indication of service quality with a value between 0 and 1, where a value closer to 1 indicates higher quality [40]. Throughput efficiency is calculated according to Eq. (4).
$$\eta = \frac{AvgT_i}{R_{DR}} \tag{4}$$
where $R_{DR}$ is the radio data rate and $AvgT_i$ is the average throughput of the $i$th UE over time.
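As a brief sketch of these two efficiency measures, the snippet below assumes that channel spectral efficiency is the average throughput divided by the channel bandwidth and that throughput efficiency is the average throughput divided by the radio data rate. These forms are an assumption chosen to match the units (bits/s/Hz) and the 0 to 1 range stated in the text; the function names and the sample numbers are illustrative, not measured results.

```python
def channel_spectral_efficiency(avg_throughput_bps, channel_bw_hz):
    """Average throughput per unit of channel bandwidth, in bits/s/Hz."""
    if channel_bw_hz <= 0:
        raise ValueError("channel bandwidth must be positive")
    return avg_throughput_bps / channel_bw_hz


def throughput_efficiency(avg_throughput_bps, radio_data_rate_bps):
    """Fraction of the offered radio data rate that is actually delivered."""
    if radio_data_rate_bps <= 0:
        raise ValueError("radio data rate must be positive")
    return avg_throughput_bps / radio_data_rate_bps


if __name__ == "__main__":
    # Hypothetical UE: 9.5 Mbps delivered of a 10 Mbps offered rate
    # over a 20 MHz channel.
    print(channel_spectral_efficiency(9.5e6, 20e6))
    print(throughput_efficiency(9.5e6, 10e6))
```

A throughput efficiency close to 1 means nearly all offered traffic is delivered, whereas the spectral efficiency also depends on how wide a channel was needed to deliver it.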
Consequently, by incorporating the given 5GPA factors, values, and criteria and providing over 2700 distinct scenarios, the model allows a wide range of use cases to precisely measure and analyze the performance of 5G networks in different deployments. Table 2 presents the factors used for performance modeling, along with their testing and default values. Table 3 provides the common 5G parameters used by the model.

Performance prediction module
The results from implementing the model are utilized for two primary purposes. While they are used directly in the data analysis process to evaluate 5G performance and identify the appropriate factor-value pairs for the specific use cases, they are also used to generate a simulated 5G dataset. Due to the large number of factors and values that can affect 5G performance, this dataset is highly important. It can be applied to predict 5G performance based on the 5GPA factors and also to validate the actual measurement results by comparing them with the predicted results. To achieve these, the model develops 5G multivariate linear regression (5GMLR) and incorporates it into the performance prediction module. It uses the simulated 5G dataset derived from the measurement results of the performance modeling module. The reason for using regression machine learning is that the simulated dataset contains multiple predictor variables, each with a different set of continuous response variables. The predictor variables are in the system-level and service-level modules, whereas the response variables are in the performance modeling module. The 5GMLR can make predictions about 5G performance, verify general groupings of the 5GPA factors, and determine their correlation and influence level to provide the basis for further analysis and high-level decision making.
To implement the 5GMLR, Python (PyCharm 2022.1) with the NumPy, pandas, and scikit-learn libraries [41] is used on the Kali Linux 2022 operating system. In order to provide a better understanding of all the factor and value combinations and prepare the dataset accordingly, we initially use the classification and regression tree (CART) to divide the dataset. Here, the $X_i$ predictor values are the 5GPA factors described before (i.e., the most current operating frequencies in Sub-6 and mmWave, channel bandwidth, environment, connection density, coverage area, and characteristics of mobile data, including inter-arrival time, protocol type, and packet length distribution), while the $Y_j$ response values are the evaluation parameters (i.e., throughput, delay, packet loss ratio, jitter, channel spectral efficiency, throughput efficiency, FI, and SINR). Hence, for each given $X_i$ (5GPA factors as predictor variables), the 5GMLR calculates the values of $\theta_i$ (intercept and coefficients) and $Y_j$ (metrics of interest as response variables), so that the sign of $\theta_i$ reflects whether the impact is positive or negative, formulated as follows:

$$Y_j = \theta_0 + \sum_{i=1}^{n} \theta_i X_i \tag{5}$$

Validation is a crucial part of the overall model development process to ensure that the model performs as intended. Accordingly, validation of the 5GMLR prediction model is divided into two parts: an applicability assessment, which verifies whether multiple linear regression is applicable to the simulated dataset, and an accuracy assessment, which analyzes the accuracy of the predicted results, as follows.
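As an illustration of the regression step described above, the following is a minimal sketch of fitting a multivariate linear model with NumPy and reading the sign of each coefficient. It is not the authors' implementation: the toy data, the coefficient values, and the helper names `fit_mlr` and `predict` are hypothetical, and ordinary least squares via `numpy.linalg.lstsq` is assumed as the fitting method.

```python
import numpy as np

# Hypothetical, noise-free toy data (NOT taken from the paper's dataset).
# Columns: carrier frequency (GHz), number of UEs, UE-gNB distance (km).
X = np.array([
    [0.7, 5, 0.5],
    [0.7, 20, 1.0],
    [3.5, 10, 0.2],
    [3.5, 40, 0.8],
    [26.0, 15, 0.1],
    [26.0, 30, 0.3],
])

# Assumed "true" parameters [theta_0, theta_1, theta_2, theta_3] used only
# to generate the toy response (delay in seconds).
true_theta = np.array([0.02, -0.001, 0.004, 0.05])
y = true_theta[0] + X @ true_theta[1:]


def fit_mlr(X, y):
    """Ordinary least squares fit; returns [intercept, coefficients...]."""
    design = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    theta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return theta


def predict(theta, X):
    """Evaluate Y = theta_0 + sum_i theta_i * X_i for each row of X."""
    return theta[0] + X @ theta[1:]


theta = fit_mlr(X, y)
signs = np.sign(theta[1:])  # +1: factor increases delay, -1: decreases it

for name, coef, sign in zip(["frequency", "n_ues", "distance"], theta[1:], signs):
    direction = "positive" if sign > 0 else "negative"
    print(f"{name}: theta = {coef:.4f} ({direction} impact on delay)")
```

With real data, scikit-learn's `LinearRegression` would give the same least-squares fit; the plain NumPy version is used here only to keep the sketch short.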

Applicability assessment
In order to verify whether multiple linear regression is applicable to the simulated 5G dataset, the probability value (p-value) of the 5GMLR model is measured. The p-value represents the probability of obtaining findings at least as extreme as those observed purely by chance, that is, when no real relationship exists between the variables [41]. The p-value is compared to an alpha threshold, where a p-value below the alpha level indicates that the results are statistically significant and unlikely to be due to chance. As the p-value is a number between 0 and 1, a smaller value indicates stronger evidence of a relationship between the testing variables. To measure the p-value, we set the alpha threshold to 0.05 and conduct tests over the entire simulated dataset with the 5GPA factors and performance metrics as testing variables. The p-value results presented in Table 4 correspond to the connection density use cases.
The above results show that all of the p-values are less than alpha. Thorough testing of the 5G dataset confirms that the p-value is consistently lower than the alpha level, which supports the suitability of linear regression for the simulated dataset and the reliability of the 5GMLR model. Although p-values can indicate the presence of an effect, they cannot determine the magnitude of that effect. One commonly used method for this purpose is Cohen's d, which provides a numerical estimate of the magnitude of the effect [42]. Therefore, we run further tests and measure the effect size on a numeric scale using Cohen's d. By considering the variables in two separate groups, control and experimental, Cohen's d determines the effect size of $X_i$ on $Y$ relative to $X_j$, so that a larger effect size indicates a larger difference between the two groups. In this context, the values of 0.2, 0.5, 0.8, and 1.3 are considered small, medium, large, and very large effect sizes, respectively. For instance, we run tests in connection density use cases to predict 5G performance as the number of UEs increases in RMa 5G networks relative to UMa or UMi. Therefore, Cohen's d is measured with RMa in the control group and UMa and UMi in the experimental group. The results are provided in Table 5.
The above results predict that when the number of UEs increases in the 5G network operating on the n28 frequency band, the effect on delay for users located in RMa is close to that of users in UMa and UMi. This means that delay-sensitive applications are likely to run similarly in RMa, UMa, and UMi due to the small effect size. However, when 5G networks operate on n258, the effect size increases. The very large effect size (1.8) predicts that the transmission delay for users in UMa will be higher than for those in RMa, which will have a major impact on delay-sensitive applications. The results further predict that the delay experienced by users in UMi relative to RMa will be even higher than in UMa due to a higher effect size (2.6).
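The effect-size calculation can be sketched as follows. This is a minimal example that assumes the common pooled-standard-deviation form of Cohen's d; the paper does not state which variant it uses, and the delay samples below are hypothetical. Treating the listed values (0.2, 0.5, 0.8, 1.3) as lower bounds of each category is also an assumption made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(control, experimental):
    """Cohen's d with a pooled sample standard deviation (assumed variant)."""
    n_c, n_e = len(control), len(experimental)
    pooled_var = (
        (n_c - 1) * variance(control) + (n_e - 1) * variance(experimental)
    ) / (n_c + n_e - 2)
    return (mean(experimental) - mean(control)) / sqrt(pooled_var)


def effect_label(d):
    """Map |d| to a category, using the listed values as lower bounds."""
    magnitude = abs(d)
    if magnitude >= 1.3:
        return "very large"
    if magnitude >= 0.8:
        return "large"
    if magnitude >= 0.5:
        return "medium"
    if magnitude >= 0.2:
        return "small"
    return "negligible"


# Hypothetical delay samples (ms): RMa as control, UMa as experimental group.
rma_delay = [2.0, 4.0, 6.0]
uma_delay = [4.0, 6.0, 8.0]

d = cohens_d(rma_delay, uma_delay)
print(f"d = {d:.2f} ({effect_label(d)})")
```

Under this mapping, the effect sizes of 1.8 and 2.6 reported for n258 both fall in the "very large" category.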

Accuracy assessment
The next step is to evaluate the accuracy of the 5GMLR prediction model. The assessment is done by comparing the values predicted by the 5GMLR model with the corresponding actual observations in the dataset to determine their similarity and, hence, the accuracy of the results. To measure the accuracy of the model, we use common metrics, including the mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), and coefficient of determination ($R^2$) [43].
The mean absolute error (MAE) measures the average magnitude of errors in terms of the absolute differences between the actual and predicted values in the dataset. Therefore, a lower MAE represents a better match between the actual and predicted values and, hence, a higher accuracy of the prediction model. In this context, zero error would indicate a perfect model. The MAE of the 5GMLR prediction model is calculated as follows.

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|Y_i - \tilde{Y}_i\right| \tag{6}$$

where $\tilde{Y}_i$ is the predicted value, $Y_i$ is the actual value from the dataset, and $n$ is the total number of sample observations.
The mean squared error (MSE) measures the average magnitude of errors in terms of the squared differences between the actual and predicted values in the dataset. We can then calculate the square root of the MSE values to determine the root mean squared error (RMSE). The MSE and RMSE represent the error magnitude of the model, so that lower values reflect a higher accuracy of the model. The MSE and RMSE calculations of the 5GMLR model are provided in the following equation.

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \tilde{Y}_i\right)^2, \qquad \mathrm{RMSE} = \sqrt{\mathrm{MSE}} \tag{7}$$

where $\tilde{Y}_i$ is the predicted value, $Y_i$ is the actual value from the dataset, and $n$ is the total number of sample observations.
In addition to the error metrics mentioned above, the accuracy of the 5GMLR prediction model is further measured using the coefficient of determination (R-squared). This metric represents the proportion of the variation in the response variables that is explained by the predicted values, with a higher percentage indicating a higher level of accuracy. The accuracy of the 5GMLR model is calculated as follows.

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(Y_i - \tilde{Y}_i\right)^2}{\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2} \tag{8}$$

where $\tilde{Y}_i$ is the predicted value of $Y_i$, $Y_i$ is the actual value from the dataset, $\bar{Y}$ is the mean value of $Y_i$, and $n$ is the total number of sample observations.
We conduct various tests to measure the errors and accuracy of the 5GMLR prediction model. For the sake of simplicity, we present the errors and accuracy of the results using the following predictor variables: $X_1$ = {n28, n258}, $X_2$ = {RMa, UMa, UMi}, and $X_3$ ($N_{UEs}$), with delay as the response variable ($Y$ = Delay). The comparison between the actual and predicted values is provided in Fig. 2, while the error measurements and accuracy of the prediction model are provided in Table 6.
The obtained results indicate a low error rate, high accuracy, and a close match between the simulated and predicted results. These findings suggest that the 5GMLR prediction model is reliable and can effectively predict the performance experienced by users in 5G networks based on the predictor variables.
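The four accuracy metrics used in this section (MAE, MSE, RMSE, and R-squared) can be computed from paired actual and predicted values as in the short sketch below. It uses only the Python standard library; the function name `regression_metrics` and the sample values are hypothetical and do not reproduce the figures in Table 6.

```python
from math import sqrt


def regression_metrics(actual, predicted):
    """Return MAE, MSE, RMSE and R^2 for paired actual/predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = sqrt(mse)

    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - ss_res / ss_tot

    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Hypothetical delay values (s); not taken from the paper's dataset.
actual_delay = [1.0, 2.0, 3.0, 4.0]
predicted_delay = [1.5, 2.0, 2.5, 4.0]

metrics = regression_metrics(actual_delay, predicted_delay)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

scikit-learn offers equivalent functions (`mean_absolute_error`, `mean_squared_error`, `r2_score`) in `sklearn.metrics`; the explicit version above is meant only to make the arithmetic visible.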

Results and discussion
This section presents the results obtained from the implementation of the model to determine the effectiveness of the 5GPA key factors, their interdependency, the efficiency of their combination and interactions, and the optimal levels that actively contribute to 5G performance optimization.

Connection density
Due to the limited capacity of networks, an increasing number of users cannot be supported indefinitely; beyond a certain point, congestion and performance degradation result. We measure the impact of an increasing number of UEs on providing a satisfactory level of performance for end-users. These measurements further determine the limits of 5G scalability in terms of the maximum number of supported users. The results are provided in Fig. 3.
The above results reveal that the best performance in all three environments is achieved by the n78 band in terms of all the evaluation parameters. With regard to the RMa environment, the n258 mmWave band provides a comparable level of performance to n78 and is therefore suitable for use in dense rural areas. These bands can efficiently manage an increasing number of users without loss of performance, even with the maximum number of simultaneous connections. In contrast, the n28, n3, and n7 bands can only manage up to 10 active users efficiently, after which congestion severely affects performance, especially when using the n28 band.
Regarding 5G deployment in urban environments, UMa outperforms UMi using mmWave, while both perform similarly in the Sub-6 spectrum. As in RMa, the n78 in UMa provides the highest throughput, equal to the maximum radio data rate ($R_{DR}$ = 10 Mbps). However, once the number of users increases to 35, a reduction in throughput occurs. Unlike in RMa, the n258 delivers significantly less throughput in UMa environments. The comparison between UMa and UMi reveals some similarities and differences. The n78 and mmWave bands achieve the best and worst results, respectively, while an increase in the connection density affects UMi more than UMa in mmWave bands. As a result, communication between devices in UMi is slower than in UMa.
These results indicate that to ensure optimal 5G performance in terms of connection density in RMa, utilizing the n78 and n258 bands with up to 40 UEs meets the demands. Similarly, to optimize 5G deployments in both UMa and UMi, utilizing n78 with up to 40 concurrent users is the optimal condition. However, as 5G installations under these optimal conditions can be costly in terms of system resources, it is also essential to specify alternative deployment options that provide average performance, which may not be optimal but is still acceptable. The results suggest that to achieve an average performance in RMa, the connection density should be below 40 for n78 and n258, and 25 for the other frequency bands. For UMa and UMi, the connection density should be limited to a maximum of 25 UEs to maintain decent performance, except for the n78 band, for which the limit is 40 users. The throughput efficiency results are provided in Fig. 4. The model is implemented to further assess the fairness of distributing the required resources among users as they enter the 5G network. The fairness results are shown in Fig. 5.
The results show that fairness is higher in rural areas than in urban areas. Except for the n7 band, the fairness in RMa is constant, and it offers maximum fairness (FI = 1) regardless of the increasing number of users. The n7 band is also fully capable of managing resources as long as there are not more than 15 UEs. Once the number of UEs exceeds 15, the fairness decreases slightly, which is negligible. A similar conclusion can be drawn for the n7 band in UMa, as it offers lower fairness than the other Sub-6 bands. Regarding mmWave, it provides considerably less fairness than Sub-6 in urban environments. In UMa, the n258 gives a higher degree of fairness than the n260, for which the level of fairness decreases significantly as the number of users increases. Despite these differences, they both show comparable performance in UMi. As a result, the findings determine that as the number of users in 5G networks increases and the demand for available resources grows, the level of fairness provided by the Sub-6 bands remains superior to that offered by the mmWave bands in urban environments. In this regard, there are no significant differences in rural environments.
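To make the fairness values concrete, the sketch below computes a fairness index from per-UE throughputs. It assumes that FI is Jain's fairness index, which matches the stated range (FI = 1 for a perfectly even allocation) but is not confirmed in this section; the throughput samples are hypothetical.

```python
def jain_fairness_index(throughputs):
    """Jain's index: (sum x)^2 / (n * sum x^2), ranging from 1/n to 1."""
    n = len(throughputs)
    total = sum(throughputs)
    sum_of_squares = sum(x * x for x in throughputs)
    if n == 0 or sum_of_squares == 0:
        raise ValueError("need at least one non-zero throughput value")
    return (total * total) / (n * sum_of_squares)


# Hypothetical per-UE throughputs in Mbps.
even_allocation = [10.0, 10.0, 10.0, 10.0]    # every UE gets the same share
skewed_allocation = [10.0, 0.0, 0.0, 0.0]     # one UE gets everything

print(jain_fairness_index(even_allocation))    # perfectly fair
print(jain_fairness_index(skewed_allocation))  # least fair for four UEs
```

The lower bound of 1/n explains why fairness values well below 1 become possible only as the number of competing UEs grows.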
To provide better data visualization of the relationship between the variables in the 5GMLR model and extract the required information, a heatmap graph is drawn. The variables are weighted from −1 to +1, with values closer to 0 indicating no correlation, values closer to +1 indicating a stronger positive correlation (increasing one variable increases the other), and values closer to −1 indicating a stronger negative correlation (increasing one variable decreases the other). These insights represent the patterns and levels of dependency between the variables and facilitate proper factor selection for performance optimization. The heatmap graph for the $X_1$ ($C_{CF}$), $X_2$ (Env), and $X_3$ ($N_{UEs}$) variables is provided in Fig. 6.
Fig. 6 The correlation between the $X_1$ ($C_{CF}$), $X_2$ (Env), and $X_3$ ($N_{UEs}$) variables when $Y$ = Delay
According to the predicted results, there is always a positive correlation between $N_{UEs}$ and delay. They move in the same direction; hence, increasing $N_{UEs}$ causes more latency in 5G networks for all the operating frequencies and environments. The results also demonstrate that the least impact due to an increase in $N_{UEs}$ occurs when 5G networks operate in the n78 frequency band in all environments, with an average impact of about 60%.
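The correlation weights shown in such a heatmap can be illustrated with a small calculation. The sketch below assumes the Pearson correlation coefficient, which is the usual choice for a −1 to +1 heatmap but is not named explicitly in the text; the sample values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical samples: delay rises and throughput falls as UEs are added.
n_ues = [5, 10, 15, 20]
delay_s = [0.01, 0.02, 0.03, 0.04]
throughput_mbps = [8.0, 6.0, 4.0, 2.0]

print(pearson_r(n_ues, delay_s))          # close to +1: positive correlation
print(pearson_r(n_ues, throughput_mbps))  # close to -1: negative correlation
```

In practice, `pandas.DataFrame.corr()` produces the full matrix of such coefficients in one call, which can then be rendered as a heatmap.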

Coverage area
The distinct characteristics of 5G frequency bands result in different coverage areas for each band. This is particularly important when considering the diverse features of the environments in which they are deployed. Thus, identifying the supported coverage area is important to avoid ambiguous interpretations of users' locations and improve services. The model is implemented accordingly to establish the performance of 5G networks as a function of coverage area and determine the maximum as well as optimal ranges for data transmission. The results are presented in Fig. 7.
The results confirm that mmWave is more sensitive to link attenuation over long distances than Sub-6. The n28 band provides the greatest coverage in all environments, while the n78 band offers the best performance. According to Table 1, the testing areas for n78 in RMa are between 100 m (D1) and 1000 m (D7), and it provides the best performance for all of these distances. Therefore, optimizing 5G performance using n78 in RMa can be achieved for any distance between 100 and 1000 m without any noticeable performance loss. However, the maximum distance reached by the n78 in urban environments is much shorter than in rural environments, at 500 m for both UMa and UMi.
With regard to the n28 band, it supports the longest CAs, which are 3000, 1500, and 1800 m in RMa, UMa, and UMi, respectively. However, to optimize performance, effective distances are much shorter. For users to obtain the best performance in RMa, their maximum distance from the gNB should be up to 1500 m, as opposed to 700 m in UMa and UMi environments. After n28, n3 provides the longest distance in Sub-6. For n3 in RMa, the UEs can be up to 1800 m away from their gNB to receive signals, compared to 900 m and 1000 m in UMa and UMi, respectively. However, in order to optimize the 5G performance using the n3 band, any distance shorter than 1200 m is optimal for UEs in RMa, compared to 600 m and 500 m in UMa and UMi environments, respectively.
Following n78, the n7 band achieves the best performance in Sub-6, but not the longest coverage. According to the RMa findings, n7 improves 5G performance for distances less than 1000 m. As the distance increases further, performance degradation begins, and the maximum distance at which 5G users can receive signals is 1400 m. Regarding n7 coverage in urban environments, different findings are obtained. In UMa, 5G performance optimization is provided for distances less than 500 m, while the maximum achievable distance is 700 m. The n7 findings in UMi, on the other hand, show a significantly shorter effective range, with 400 m as the effective distance for performance optimization. Beyond that, the performance begins to deteriorate, with the maximum practical distance being 800 m, at which users experience poor performance.
The results also show that as the UEs move away from their associated gNB, the performance is negatively affected at a faster rate in the mmWave than in the Sub-6 spectrum. With regard to the n258 band in RMa, the highest coverage support is around 600 m, while 5G performance enhancement is provided for distances up to 200 m. In UMa and UMi, the n258 mmWave achieves relatively similar maximum coverage, which is 300 and 290 m, respectively. However, the performance reduction in UMi occurs at a considerably faster rate than in UMa. This provides an optimal distance of up to 100 m between UEs and the gNB for 5G users in UMa environments, compared to 50 m in UMi. With regard to the n260 mmWave band, the results show that the maximum distances at which the UEs receive gNB signals are 230 and 210 m in UMa and UMi, respectively. While UMa can optimize 5G performance for distances up to 130 m, the n260 in UMi is better for shorter ranges and performs poorly for any distance longer than 50 m. For these reasons, the mmWave results determine that n258 is more efficient than n260 in terms of better performance and longer coverage support.
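The distance limits reported above lend themselves to a simple lookup table for planning purposes. The sketch below collects the optimal and maximum distances stated in the text per band and environment; the table name, the function `classify_distance`, and the category labels are hypothetical. Combinations for which the text gives no pair of values are omitted, the 600 m and 200 m figures for n258 are read as applying to RMa, and treating each limit as inclusive is an assumption.

```python
# (optimal_m, maximum_m) per (band, environment), as reported in the text.
COVERAGE_M = {
    ("n28", "RMa"): (1500, 3000),
    ("n28", "UMa"): (700, 1500),
    ("n28", "UMi"): (700, 1800),
    ("n3", "RMa"): (1200, 1800),
    ("n3", "UMa"): (600, 900),
    ("n3", "UMi"): (500, 1000),
    ("n7", "RMa"): (1000, 1400),
    ("n7", "UMa"): (500, 700),
    ("n7", "UMi"): (400, 800),
    ("n78", "RMa"): (1000, 1000),
    ("n258", "RMa"): (200, 600),
    ("n258", "UMa"): (100, 300),
    ("n258", "UMi"): (50, 290),
    ("n260", "UMa"): (130, 230),
    ("n260", "UMi"): (50, 210),
}


def classify_distance(band, env, distance_m):
    """Classify a UE-gNB distance against the reported limits."""
    limits = COVERAGE_M.get((band, env))
    if limits is None:
        return "unknown"  # combination not reported in the text
    optimal_m, maximum_m = limits
    if distance_m <= optimal_m:
        return "optimal"
    if distance_m <= maximum_m:
        return "degraded"
    return "out of range"


print(classify_distance("n28", "RMa", 1200))   # within the optimal range
print(classify_distance("n258", "UMi", 100))   # reachable but degraded
print(classify_distance("n260", "UMa", 250))   # beyond maximum coverage
```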
To ensure a fast and seamless connection to 5G networks, even when end-users move away from their gNB, a good level of fairness needs to be ensured. Therefore, the model is implemented to measure the degree of fairness provided by 5G.
The results show that as the distance between the UEs and gNB increases, all the frequency bands except n258 and n28 can maintain a high degree of fairness. The n28 provides maximum fairness (FI = 1) for distances up to 1500 m in RMa. Beyond this distance, the fairness begins to decrease, so that at the maximum distance (D7, 3000 m), the fairness provided by n28 is about 0.83, which is still a considerably high value. In UMa and UMi, its fairness is also at the highest level for distances up to 700 m, after which the reduction occurs. In this case, fairness reaches 0.82 in UMa and 0.75 in UMi at the lowest level, which is still satisfactory. With regard to n258 mmWave, RMa provides high fairness at distances up to 550 m, but as the distance extends to 600 m, the fairness quickly drops to a considerably lower level, about 0.70. The same behavior is observed in UMa, where the n258 offers a high level of fairness at distances less than 280 m. However, once the distance increases to 300 m, fairness is severely affected and drops to around 0.71. The fairness reduction for n258 mmWave in UMi is more distance-dependent. In this regard, the resource allocation is fairly provided to users for distances shorter than 250 m. As the distance from the gNB increases to 290 m, the fairness level drops quickly to about 0.53. Therefore, the fairness results indicate that all the frequency bands can effectively manage the distribution of resources between users as they move away from their gNB and maintain a significantly high level of fairness, except for n28 and n258, which have lower fairness. These findings are also confirmed and predicted by the model, as shown in Fig. 9.
The results of the prediction model indicate that there is a positive correlation between delay and $CA_{gNB}$, regardless of the type of predictor variables used. As the distance between UEs and their associated base station increases across all environments, 5G networks experience less impact when utilizing the n78 frequency band. For other frequency bands, the results show a stronger correlation in UMa and UMi, meaning that users in these environments experience higher delays as distance increases compared to users in RMa environments.
Fig. 9 The correlation between the $X_1$ ($C_{CF}$), $X_2$ (Env), and $X_4$ ($CA_{gNB}$) variables when $Y$ = Delay

Traffic and network services
To support the diverse range of internet applications, 5G networks need to be capable of meeting the distinct requirements of relevant services. High-performance services require higher speed and are loss-sensitive, while real-time services are affected by latency and jitter. Fulfilling these requirements is particularly important for 5G to accommodate different types of services. To achieve this, multiple service-level factors are involved, among which are packet length ($P_{Len}$) and protocol type (TCP, UDP). Since different aspects of these factors can affect 5G performance in different ways, it is crucial to measure their impact and identify those that contribute to optimizing performance. The model is implemented based on the speed requirements of high-performance services, and the throughput results are shown in Fig. 10.
The results indicate that the n78 frequency band offers optimal performance for all packet lengths and environments over both TCP and UDP. With the exception of n28, all frequency bands provide the highest throughput when UDP services utilize smaller packets (64, 128, 256 B) in RMa. For n28, the throughput is only slightly higher for the smaller packets, and as the packet size increases, it becomes comparable to that of the other bands. Conversely, when UDP-based services use the medium (512, 1024 B) and large (2048, 4096 B) packets in 5G networks, the throughput decreases compared to the smaller packets and remains constant. This similar performance achieved by the medium and large packets can offer a great degree of flexibility for supporting different systems with specific packet requirements. While the shorter packets (64, 128, 256 B) are optimal for UDP services in RMa, this is only true for TCP services in the mmWave and n78 bands. All other frequency bands offer the same throughput for all packet sizes, ranging from small to large. Therefore, 5G optimization is achieved by smaller packets, and as the size of the packet increases, the throughput decreases to a comparable level for medium and large packets.
It is noteworthy that while mmWave in RMa can achieve the highest performance for both TCP and UDP services, the same does not hold in UMa and UMi, where the n260 mmWave band provides the lowest performance. For UDP services, there are no substantial differences between UMa and UMi, as the optimal packets are those with shorter lengths. For TCP services, smaller packets are only suitable when utilizing the n78 band. When 5G operates on other bands, there is no optimal packet size because all packets attain the same level of throughput. This eliminates the packet length limitations and allows significant flexibility for 5G to meet the requirements of systems that impose specific packet restrictions, such as IoT.
In addition to speed demands, high-performance services require reliable connections for packet forwarding, which can be affected by packet loss, presenting new challenges for these services. Accordingly, to determine the efficiency of 5G networks in meeting the reliability requirements of services, the model is implemented and the loss ratio results are provided in Fig. 11.
The above results confirm the throughput findings and indicate that the reason for the lower throughput of the n28 and n260 bands is their excessive data loss. The results in the RMa environment signify that UDP services with smaller packets (64, 128, 256 B) suffer from substantial data loss. However, increasing the packet size to medium (512, 1024 B) and large (2048, 4096 B) can improve the reliability accordingly. Moreover, the smaller packets in the RMa environment can severely disrupt the performance of TCP services in 5G networks using the n3 and n7 bands, while increasing the packet size to medium or large can mitigate the issue. Therefore, to improve the reliability of UDP services in the RMa environment, medium and larger packets are better choices for the n28 and n3 bands, whereas any packet size has the same impact on the other frequency bands. Similarly, larger packets improve the reliability of TCP services in the n3 and n7 bands, but the improvements in other bands have no dependency on the size of the packets. These findings lead to the conclusion that the optimization of 5G performance to meet the reliability requirements of diverse services in RMa environments is provided by the n78 and n258 mmWave bands. The same findings, namely higher reliability of 5G systems in the presence of medium and larger packets, are obtained in UMa and UMi environments. For 5G optimization, any increase in the size of packets can assist in enhancing 5G reliability.
Real-time services have distinct requirements that differ from those of high-performance services. They are highly time-dependent, and therefore, uncertain time variations in the network can significantly impact their performance. To evaluate the efficiency of 5G in addressing latency-related issues and enhancing the performance of real-time services, the model is implemented, and latency results are measured and presented in Fig. 12.
The results show that delay decreases as the packet size increases to 1024 B, after which it remains constant with no noticeable variation. Based on recommendations, the performance of time-sensitive services is considered good or average, with end-to-end latency values less than 0.15 s and 0.4 s, respectively [44]. On this basis, the n28 with $P_{Len}$ = 64 B in RMa environments achieves average performance for UDP services. However, as the packet length increases, it can reach a good level of delay and a performance comparable to that of the other bands. For this reason, when employing the n3 and n28 bands, 5G optimization to meet the needs of time-dependent UDP services in RMa is provided by packets larger than 64 B. However, it is independent of the packet size for the remaining bands. The RMa measurements for TCP services, on the other hand, show that while the n28 results do not show a noticeable impact, the smaller packets limit the performance of the n3 and n7 bands. Therefore, the medium and larger packets are best suited for optimizing the performance of 5G networks using the n3 and n7 bands. A comparable outcome is observed for the n258 mmWave and n78 frequencies. Their optimum performance is not affected by the packet length, and they provide the least delay for transmitting TCP-based data in RMa. This flexibility makes them ideal for a variety of applications where certain packet characteristics are required.
With regard to 5G performance in UMa and UMi environments, the results imply that the lowest performance for UDP services is achieved using the n260 mmWave band. In the UMa environment, the n258 provides a better response time for UDP services, whereas the n260 performs better for TCP services. In this regard, both the n258 and n260 mmWave bands deployed in UMi are better suited to meet the requirements of TCP services than UDP. Moreover, while all of the bands perform better with medium and large packets for UDP services, the n78 meets their criteria for all given packet lengths. It also provides the best performance for TCP services in UMa and UMi environments. Based on the results, it is concluded that using medium and larger packets can improve 5G performance by reducing the response time, which is critical for real-time services.
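The latency thresholds used in this discussion (good below 0.15 s, average below 0.4 s [44]) can be written as a small helper for labelling measured delays. The function name and the "poor" label for values at or above 0.4 s are assumptions added for illustration; the sample delays are hypothetical.

```python
GOOD_LATENCY_S = 0.15     # end-to-end latency below this is "good" [44]
AVERAGE_LATENCY_S = 0.4   # end-to-end latency below this is "average" [44]


def classify_latency(delay_s):
    """Label an end-to-end delay (seconds) using the thresholds above."""
    if delay_s < GOOD_LATENCY_S:
        return "good"
    if delay_s < AVERAGE_LATENCY_S:
        return "average"
    return "poor"  # label assumed here; the text names only good and average


# Hypothetical measured delays in seconds.
for delay in (0.05, 0.2, 0.5):
    print(f"{delay:.2f} s -> {classify_latency(delay)}")
```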
In addition to latency, its fluctuations, known as jitter, also affect the performance of real-time services and the perception of quality by the end-users. The model is implemented in this regard to measure and determine the jitter levels provided by 5G networks for the transmission of time-dependent services, and the findings are shown in Fig. 13.
Because jitter values of less than 0.01 s correspond to good performance [35], the results reveal that 5G networks successfully fulfill the jitter requirements of real-time services. As the packet size increases, the users experience higher jitter until the packet length reaches the medium range, after which jitter reduction occurs. Nonetheless, 5G networks can maintain a very low jitter value and meet the demands of various time-dependent services, regardless of the packet length. To determine the fair allocation of the available resources based on the type of services, the model is implemented, and the fairness results are presented in Fig. 14.
The above results indicate that optimal fairness for resource allocation is provided by 5G in RMa environments. In this case, the resource allocation is 100% fair, regardless of the packet size or type of service. However, different results are obtained in UMa and UMi environments. While 5G fairness in UMa and UMi environments is optimal (FI = 1) when using Sub-6 bands, it is considerably lower with the n258 and n260 mmWave bands. These findings apply to all types of services, whether based on TCP or UDP. In UMa environments, the n258 provides higher fairness for UDP services than the n260. In this case, the fairness level is low only for $P_{Len} \in \{64, 128\}$ B, and as the packet size increases to 256 B and above, the n258 fairness increases to the optimal value (FI = 1). With regard to the n260 fairness for UDP services in UMa, although it increases for larger packets, it is lower than n258 even at its highest level. For TCP services, the n258 and n260 mmWave bands provide similar results, except that the fairness achieved by the n258 is not as high as that of the Sub-6 bands, even for larger packets. In UMi environments, the fairness of resource allocation for UDP services using the n258 remains higher than that offered by the n260, except for 64 B packets. As the packet size increases, the fairness level increases accordingly, and it remains higher for the n258 than for the n260 band. With regard to TCP services in UMi, different results are obtained, which suggest that the n260 offers a higher level of fairness than the n258. Based on the results, to optimize the fairness of both TCP and UDP services, it is recommended to use Sub-6 bands with no limit on packet size and mmWave bands with large packets. These findings were also predicted by our model, which indicates a negative correlation between $P_{Len}$ and delay for both TCP and UDP services, as shown in Figs. 15 and 16, respectively.
Fig. 15 The correlation between the $X_1$ ($C_{CF}$), $X_2$ (Env), $X_5$ ($P_{Len}$), and $X_7$ (TP; TCP) variables when $Y$ = Delay

Channel width
Although wide channels can improve the speed and scalability of 5G networks, they also cause higher power consumption and interference issues. The issues can be mitigated by narrowband channels, but this comes at the cost of performance. Due to this direct effect, it is important to manage the width of radio channels based on the requirements to minimize communication overhead. Accordingly, the model supports all available 5G channel widths as listed in Table 2. In this context, to precisely determine the efficiency level of each bandwidth separately, the data inter-arrival time is reduced to $P_{IAT} = 1 \times 10^{-4}$ s, resulting in $R_{DR}$ = 50 Mbps. The results are presented in Fig. 17.
The above results show the immediate effect of changing the width of 5G channels on performance: as the channel width increases, 5G performance improves accordingly. The results confirm that using larger channels substantially improves the performance, to the point that there is a considerable difference even between two adjacent bandwidths. Optimal performance in RMa environments is achieved by combining higher frequencies and larger bandwidths. In this regard, the n258 mmWave band with a bandwidth of 400 MHz achieves its optimum performance, allowing the throughput to reach 50 Mbps, which is the maximum achievable level (equal to the radio data rate; $R_{DR}$ = 50 Mbps). While this combination in RMa improves performance, higher frequencies increase power consumption, which can be a problem for devices with limited capabilities.
On the other hand, the Sub-6 results suggest that these bands can provide high performance for devices with lower capabilities by offering performance similar to mmWave bands for similar channel bandwidths but at a lower frequency. Unlike in RMa, the mmWave bands do not provide significant improvements in UMa and UMi environments, even with a maximum bandwidth of 400 MHz. Although n258 outperforms n260, their low transmission speed and packet loss ratio of around 70% indicate that they do not meet the requirements for high-performance services. The Sub-6 bands in UMa and UMi environments, however, achieve a performance comparable to that in RMa, ensuring that the best performance is provided using the widest bandwidths. Furthermore, in order to determine the efficiency of each bandwidth in terms of the amount of information it can transmit, the spectral efficiency results are provided in Fig. 18.
Fig. 16 The correlation between the $X_1$ ($C_{CF}$), $X_2$ (Env), $X_5$ ($P_{Len}$), and $X_7$ (TP; UDP) variables when $Y$ = Delay
The results indicate that, except for 5 MHz channels, the Sub-6 bandwidths are used efficiently. Since the radio data rate is 50 Mbps, providing 5 Mbps/MHz over the 5 MHz bandwidth in all 5G Sub-6 frequency bands implies a low efficiency level. However, expanding the bandwidth to 10 MHz and above can considerably improve the spectral efficiency, up to a maximum of 40 Mbps/MHz. As a result, except for 5 MHz, the Sub-6 bands with all other bandwidths in RMa, UMa, and UMi environments can fulfill the aim of optimizing the performance of 5G networks through effective bandwidth utilization. For the mmWave bands, the opposite is observed: wider channels provide lower spectral efficiency. In the Sub-6 bands, increasing the bandwidth enhances both throughput and spectral efficiency, which implies that the bandwidths are used effectively. In the mmWave bands, however, expanding the bandwidth improves throughput but reduces spectral efficiency, which indicates that the wider channels are not used efficiently.
Increasing signal power is an alternative approach to increasing network capacity, but it causes interference with nearby devices and requires more energy, which is not ideal for battery-powered devices. Furthermore, a higher signal power does not guarantee good communication quality if the noise power is also high. To determine the data transmission quality provided by the 5G bandwidths, the quality of signals in terms of SINR is measured and shown in Fig. 19.
The above results confirm that for 5G in RMa, UMa, and UMi environments, the lower frequency bands and narrower channel bandwidths are less affected by noise. As a result, the combination of the lower Sub-6 frequency bands and smaller channels achieves the desired SINR levels. As the bandwidth increases, however, the SINR decreases, resulting in lower signal quality and limiting the system capacity. In contrast to Sub-6, the mmWave bands have the lowest signal quality. The n258 mmWave band provides better signal quality in RMa than in UMa and UMi environments, where its SINR values are lower. For the mmWave bands in UMa and UMi environments, slightly higher SINR values show that n258 outperforms n260. The negative SINR in UMi implies poor signal quality and inefficient use of wider bandwidths, because the signal power is lower than the noise level. The positive correlation obtained by the prediction model also confirms the direct effect of increasing C_BW on delay, as shown in Fig. 20.
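The drop in SINR with wider channels is consistent with thermal noise power growing in proportion to bandwidth. The following is a minimal sketch of that relationship, not the simulation used to produce Fig. 19; the received power, interference power, and noise figure are hypothetical values chosen only for illustration.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K
TEMP_K = 290.0            # reference noise temperature


def thermal_noise_dbm(bandwidth_hz, noise_figure_db=0.0):
    """Thermal noise power k*T*B in dBm, plus an optional receiver noise figure."""
    noise_w = BOLTZMANN * TEMP_K * bandwidth_hz
    return 10 * math.log10(noise_w * 1e3) + noise_figure_db


def sinr_db(signal_dbm, interference_dbm, noise_dbm):
    """SINR in dB; interference and noise are summed in linear milliwatts."""
    i_mw = 10 ** (interference_dbm / 10)
    n_mw = 10 ** (noise_dbm / 10)
    return signal_dbm - 10 * math.log10(i_mw + n_mw)


# Hypothetical link values (assumptions, not taken from the paper).
RX_POWER_DBM = -85.0
INTERFERENCE_DBM = -110.0
NOISE_FIGURE_DB = 7.0

sinr_by_bw = {}
for bw_mhz in (5, 20, 100, 400):
    noise = thermal_noise_dbm(bw_mhz * 1e6, NOISE_FIGURE_DB)
    sinr_by_bw[bw_mhz] = sinr_db(RX_POWER_DBM, INTERFERENCE_DBM, noise)
    print(f"{bw_mhz:>3} MHz: noise {noise:7.2f} dBm, SINR {sinr_by_bw[bw_mhz]:6.2f} dB")
```

With the received power held fixed, each increase in bandwidth raises the noise floor and lowers the SINR; once noise plus interference exceeds the received signal power, the SINR in dB becomes negative, as observed for the widest channels in UMi.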

Conclusions
The benefits offered by 5G rely on multiple interdependent factors, each with its own set of varying features, which further extends the complexity of 5G deployment. Realizing the full benefits of 5G depends on optimizing these factors, which requires careful planning, testing, and coordination during network deployment if 5G is to achieve its objectives. Accordingly, this work proposes a modular ML-based model to measure the correlation of multiple 5GPA factors and to prioritize the combinations that contribute to 5G performance optimization before actual deployment. The results indicate that the performance of 5G is significantly affected by an increasing number of active users, which leads to collisions and data loss. However, the n78 band in all environments and the n258 band in RMa environments are exceptions, as they can maintain optimal 5G performance even when the network is congested with a large number of users. For the other Sub-6 bands and the n260 mmWave band, limiting the number of end-users to below 15 can achieve the required performance. The findings on the coverage area supported by 5G show that providing a larger coverage area does not necessarily equate to higher performance levels in that area. In this regard, although the greatest distance is achieved by the n28 band in all environments, the best performance is achieved by the n78 band. These findings align with the traffic modeling results, which show that the n78 frequency band provides the optimal performance for both TCP and UDP services across all packet lengths and environments. The results further show that the UMa and UMi environments are more sensitive than RMa to changes in packet size and type of service. While the frequency bands that provide higher throughput with smaller packets can fulfill the speed requirements of high-performance services, they are not suitable for real-time services due to higher latency and loss rates. Furthermore, the findings imply that increasing the bandwidth of 5G improves performance, with Sub-6 bands showing a greater improvement than mmWave bands.

Fig. 1. Proposed model. This figure represents the proposed model for performance prediction and optimization of 5G networks.

Fig. 2. 5GMLR results: predicted delay vs. actual delay. This shows the accuracy assessment of the prediction model.

Fig. 3. 5G density measurements in terms of (a) Throughput, (b) Loss Ratio, (c) Jitter, and (d) Delay in RMa (top), UMa (middle), and UMi (bottom) environments. This determines the capacity of 5G networks by increasing the number of users.

Fig. 4. 5G throughput efficiency based on N_UEs in RMa (a), UMa (b), and UMi (c) environments. This determines the throughput efficiency of 5G as a function of the number of active users.

Fig. 7. 5G performance based on CA_gNB in terms of (a) Throughput, (b) Loss Ratio, (c) Jitter, and (d) Delay in RMa (top), UMa (middle), and UMi (bottom) environments. This determines the performance of end-users as they move away from the 5G cell.

Fig. 8. 5G fairness based on CA_gNB in RMa (a), UMa (b), and UMi (c) environments. This determines the capability of the 5G network to manage resources for the users as they move away from its cell.

Fig. 10. 5G TCP and UDP Throughput as a function of P_Len in RMa (a), UMa (b), and UMi (c) environments. This determines the 5G performance in terms of speed for high-performance applications with different packet size demands.

Fig. 11. 5G TCP and UDP Loss Ratio as a function of P_Len in RMa (a), UMa (b), and UMi (c) environments. This determines the reliability of 5G for diverse services with different protocols and packet sizes.

Fig. 12. 5G TCP and UDP Delay as a function of P_Len in RMa (a), UMa (b), and UMi (c) environments. This determines the performance of delay-sensitive applications in 5G networks.

Fig. 14. 5G TCP and UDP fairness as a function of P_Len in RMa (a), UMa (b), and UMi (c) environments. This determines the fairness provided by 5G for high-performance and delay-sensitive applications.

Fig. 17. 5G performance based on C_BW in terms of (a) Throughput, (b) Loss Ratio, (c) Jitter, and (d) Delay in RMa (top), UMa (middle), and UMi (bottom) environments. This determines the performance of end-users when different bandwidths are utilized in 5G networks.

Fig. 19. 5G SINR based on C_BW in RMa (a), UMa (b), and UMi (c) environments. This determines the SINR of the end-users when 5G networks use different bandwidths.

Table 2
5GPA factors for performance modeling

Table 3
Common 5G parameters used by the model
gNB = 4 × 4, mmWave: NA_gNB = 64 × 64 (massive MIMO)
UE Antenna Height: RMa, UMa: AH_UE = 1.5 m; UMi: AH_UE = 1 m
gNB Antenna Height: RMa: AH_gNB = 35 m; UMa: AH_gNB = 25 m; UMi: AH_gNB = 10 m

5G simulated dataset until the leaf node. Because the model aims to be as flexible and applicable as possible for a wide range of 5G use cases, the 5GPA as the root branches out so that the operating frequency and environment factors are evaluated for all other factors. This way, the performance is predicted based on the 5GPA factors so that the final decision depends on other decisions and the type of choices involved in the decision-making process. Therefore, given the dataset {Y_1, Y_2, …, Y_m, X_1, X_2, …, X_n} with n predictor values {X_1, X_2, …, X_n} and m response values {Y_1, Y_2, …, Y_m}, the 5GMLR makes a prediction on each Y_j based on multiple X_i, where i ≤ n. By considering the intercept (θ_0) and regression coefficients (θ_i where i ≥ 1), the 5GMLR is formed as
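Assuming the 5GMLR takes the standard multiple linear regression form, Y_j = θ_0 + θ_1 X_1 + … + θ_n X_n, the following minimal sketch fits the coefficients by ordinary least squares and reports the mean absolute error, mean squared error, and root mean squared error on held-out samples. The data are synthetic and made up for illustration, and the helper names are hypothetical; this is not the authors' implementation or dataset.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_mlr(xs, ys):
    """Ordinary least squares via the normal equations; returns [theta_0, ..., theta_n]."""
    rows = [[1.0] + list(x) for x in xs]  # leading 1.0 carries the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(theta, x):
    return theta[0] + sum(t * v for t, v in zip(theta[1:], x))


def error_metrics(actual, predicted):
    """Return (MAE, MSE, RMSE) between actual and predicted values."""
    errs = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errs) / len(errs)
    mse = sum(e * e for e in errs) / len(errs)
    return mae, mse, math.sqrt(mse)


# Synthetic data (assumption): two predictors and a delay-like response with small noise.
rng = random.Random(0)
TRUE_THETA = [0.02, 0.004, -0.001]
xs = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(200)]
ys = [predict(TRUE_THETA, x) + rng.gauss(0, 0.001) for x in xs]

theta = fit_mlr(xs[:150], ys[:150])  # fit on the first 150 samples
mae, mse, rmse = error_metrics(ys[150:], [predict(theta, x) for x in xs[150:]])
print("theta:", [round(t, 5) for t in theta])
print(f"MAE={mae:.2e} MSE={mse:.2e} RMSE={rmse:.2e}")
```

The sign of each fitted θ_i gives the direction of the relationship between a predictor and the response, which is how a positive coefficient on C_BW would correspond to delay increasing with bandwidth.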

Table 4
P-values for Y = Delay

Table 5
Cohen's d with RMa in the control group and Y = Delay

Table 6
The 5GMLR assessment for Y = Delay