Dynamic access control method for SDP-based network environments

With online work environments and other distributed computing systems (such as cloud technologies or Internet of Things systems) becoming increasingly popular due to the COVID-19 pandemic and general technological advances, keeping them secure has become a pressing concern. As companies have grown more dependent on online systems, cyberattacks have also been on the rise. To protect terminal devices, many companies have implemented a single boundary-defense model. This approach has been effective in securing the network against external threats, but it does not effectively protect the network from internal security threats. Given these vulnerabilities in internal network security, a dynamic access control method used with a zero-trust software-defined perimeter (SDP) security model could be a viable solution. This study proposes a dynamic access control method that uses an engine with a new reward and penalty point-based system (RP engine) and a dynamic task engine (DT engine) for a zero-trust SDP security model.


Introduction
Network security has always been an important aspect of the growing technology sector. With the constantly increasing use of computers and computer networks worldwide, it will continue to grow in importance [1][2][3]. This view is reinforced by the increasing intensity and frequency of cyberattacks, which are projected to worsen over time [4].
There are many ways to ensure network security and many aspects of it that can be studied. For internal network security, two prominent approaches are a virtual private network (VPN) [5][6][7] and a software-defined perimeter (SDP) [8,9].
With a VPN, a network connects a device to a specific server or servers. While VPNs can be useful in securing a network, they still have weaknesses [10][11][12]. VPNs can be troublesome to set up because they require complex network linkages, and they can be difficult to manage when multiple levels of access are needed. A VPN can also become a single point of failure: if an unwanted user or attacker gains access to it, they may gain access to the entire network.
An SDP is essentially a way to implement the zero-trust security model [13]. Its principle is to 'never trust and always verify'. In other words, SDPs pre-validate and authenticate users and their devices and create a specific network connection for the users in question. By vetting every user and device, this approach arguably provides enhanced security. Additionally, it reduces the number of potential attack points because, unlike a VPN, it has no standing connection that is accessible beforehand [14].
While this quick overview may make SDPs seem straightforward, many details must be considered when actually setting them up. For example, it is known that SDPs work by pre-validating and authenticating users and their devices before creating a network connection [15]; however, there is the question of how this is actually implemented. There is also the question of how to grant users specific network access depending on their needs, since one of the main principles of SDP is to grant users access only to what they specifically need.
In light of these challenges, this study proposes a dynamic access control method using a reward and penalty (RP) engine and a dynamic task (DT) engine. The RP engine grants access to users within a least-privilege framework: it assesses the access history of a user or device to determine in real time whether access should be granted. The DT engine, on the other hand, handles the dynamic aspect of granting users access to tasks, including non-predefined tasks.
To assess the effectiveness of the proposed method, a dynamic access control simulation is conducted. The assessment is done through a scenario analysis procedure, and the method's performance is compared against that of traditional access control methods.

SDP environment
To evaluate the performance of the dynamic access control system proposed in this paper, a theoretical SDP environment was set up for a simulation.
As Fig. 1 shows, there are multiple components involved in the proposed SDP with the RP and DT engines.
At the higher levels of the diagram are the DT, RP, and access decision (AD) engines. As mentioned earlier, the RP engine is the proposed engine that assigns reward or penalty points to anyone trying to access the network, judging from their past access history and general security status. The DT engine is the dynamic task engine, which helps define the scope of what a user or device is authorized to access in the network. The AD engine delivers the list of objects that are accessible to authenticated users, as determined by the DT and RP engines.
In the lower area of the diagram are the SDP controller and its associated SDP blocks. The SDP controller determines which SDP hosts can communicate with each other and delivers information about them to an external authentication service. The Initiating SDP Host (IH) block communicates with the SDP controller to request a list of Accepting SDP Hosts (AHs) to connect to. The SDP controller can also request further information about the hosts' software or hardware through the IH.
The other block is the Accepting SDP Host (AH). The AH rejects connections from all hosts and external networks except for the SDP controller itself and the IHs authorized by the controller.
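The default-deny behavior of the AH can be illustrated with a minimal Python sketch. The class name, method names, and string identifiers below are hypothetical and serve only as an illustration; they are not taken from the proposed system or from any SDP implementation.

```python
from dataclasses import dataclass, field


@dataclass
class AcceptingHost:
    """Hypothetical model of an Accepting SDP Host (AH) with a default-deny policy."""

    controller_id: str
    authorized_ihs: set = field(default_factory=set)

    def authorize(self, sender_id: str, ih_id: str) -> None:
        # Only the SDP controller may add an IH to the allow list.
        if sender_id != self.controller_id:
            raise PermissionError("only the SDP controller can authorize an IH")
        self.authorized_ihs.add(ih_id)

    def accepts(self, peer_id: str) -> bool:
        # Default deny: every peer other than the controller and the
        # controller-authorized IHs is rejected.
        return peer_id == self.controller_id or peer_id in self.authorized_ihs


ah = AcceptingHost(controller_id="controller")
ah.authorize("controller", "ih-1")
print(ah.accepts("ih-1"), ah.accepts("ih-2"))
```

In this sketch, an unknown peer is rejected unless the controller has explicitly authorized it, which mirrors the role the AH plays in the environment.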
Figure 2 gives an overview of how the proposed SDP environment works. In particular, it shows how the blocks interact with each other.
The following paragraphs explain each of the steps of Fig. 2.
Step 1: When a user attempts to log in, the IH sends a single packet authorization (SPA) packet to the SDP controller for identification, which starts the user login procedure.
Step 2: To initiate the authentication steps, an AH list is requested from the AD engine. The authentication steps verify whether the user attempting to log in can be found within the AH list.
Step 3: After the authentication steps are completed, the SDP controller requests the user information and the list of AHs that the terminal can access from the access control (AC) system.
Step 4: Upon receiving the request for an AH list from the SDP controller, the AD engine requests an AH list from the existing (traditional) AC system. The AC system responds with the list of AHs the user can access.
Steps 5 and 6: After receiving the AH list from the AC system, the AD engine sends the DT engine a dynamic task query to determine the list of AHs required. The DT engine responds with an AH list accordingly.
Steps 7 and 8: The AD engine then sends the RP engine a reward and penalty query to calculate the scores of the hosts on the AH list. The RP engine responds to the AD engine with the scores.
Step 9: Once the scores and the access scope have been determined by the RP and DT engines, the AD engine responds to the SDP controller's earlier request for an AH list.
Steps 10 and 11: The SDP controller relays this list to the Initiating SDP Host and Accepting SDP Host blocks.
Step 12: The IH accesses the authorized AHs through a mutual TLS (mTLS) tunnel.
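The role of the AD engine in Steps 4 to 9 can be sketched as a small orchestration function. This is a simplified illustration under stated assumptions: the function name, the callable interfaces for the DT and RP engines, and the score threshold are hypothetical, and the text does not specify how scores are turned into a final decision.

```python
from typing import Callable, Dict, List, Set


def decide_access(
    user: str,
    ac_list: Set[str],
    dt_query: Callable[[str, Set[str]], Set[str]],
    rp_query: Callable[[str, Set[str]], Dict[str, float]],
    threshold: float = 0.0,
) -> List[str]:
    """Return the AH list that is sent back to the SDP controller (Step 9).

    ac_list   : AHs returned by the traditional AC system (Step 4).
    dt_query  : stand-in for the DT engine query (Steps 5 and 6).
    rp_query  : stand-in for the RP engine query (Steps 7 and 8).
    threshold : hypothetical minimum score; no cutoff is given in the text.
    """
    required = dt_query(user, ac_list)   # AHs required for the user's task
    scores = rp_query(user, required)    # per-AH reward/penalty score
    # Assumed decision rule: keep an AH only if its score reaches the threshold.
    # AHs without a score are treated as not accessible.
    return sorted(
        ah for ah in required if scores.get(ah, float("-inf")) >= threshold
    )


# Hypothetical stand-ins for the two engines.
def example_dt(user: str, ac: Set[str]) -> Set[str]:
    return (ac | {"ah-3"}) - {"ah-2"}


def example_rp(user: str, ahs: Set[str]) -> Dict[str, float]:
    return {"ah-1": 5.0, "ah-3": -2.0}


print(decide_access("alice", {"ah-1", "ah-2"}, example_dt, example_rp))
```

Passing the engines as callables keeps the sketch independent of how each engine is actually realized.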

Reward and penalty engine
Having outlined how the proposed RP engine is used in an SDP environment, this section explains how the RP engine itself works.
As mentioned in previous sections, the RP engine dynamically calculates a score that indicates how safe a user is, based on their previous access results and general security status. To explain the RP engine, some categories and terms used in its scoring system must first be introduced. The details are shown in Table 1.
While the reward and penalty points used to score users are shown in Table 2, it should be noted that the factor of time is not accounted for in the table.
Consider a scenario with two users, A and B. Suppose user A was given ten points ten days ago and thirty points fifty days ago, while user B was given thirty points ten days ago and ten points fifty days ago. Although the totals for users A and B are the same (forty points each), user B should be considered more reliable than user A under the proposed engine, because user B's larger reward is the more recent one.
To implement an RP engine that also takes a user's score trend over time into account, weights need to be used in the scoring equation. To do so, more recent data points are given greater weights through the exponential moving average (EMA) formula, where α is the smoothing coefficient (0 ≤ α ≤ 1). If the smoothing coefficient is small, the effect of the EMA is smaller; if it is large, the effect of the EMA is greater, and more recent results have a greater influence on the EMA. While using an EMA does allow more recent data to influence scores more, simply multiplying recent data by the smoothing coefficient has its limits. To distribute weights over time effectively, it is necessary to adjust the EMA using the reward for result (RR), penalty for result (PR), reward for device (RD), and penalty for device (PD) values as shown in Fig. 2.
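The effect of the smoothing coefficient can be illustrated with the two-user scenario above. The sketch below assumes the textbook EMA recurrence, EMA_t = α·x_t + (1 − α)·EMA_{t−1}, seeded with the oldest value; this is an assumed form and not necessarily the exact equation used by the RP engine. The value α = 0.7 is likewise chosen only for illustration.

```python
from typing import Sequence


def ema(values: Sequence[float], alpha: float) -> float:
    """Exponential moving average of values ordered from oldest to newest.

    Assumed textbook form: EMA_t = alpha * x_t + (1 - alpha) * EMA_{t-1},
    seeded with the oldest value.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if not values:
        raise ValueError("values must not be empty")
    result = values[0]
    for x in values[1:]:
        result = alpha * x + (1.0 - alpha) * result
    return result


# Points listed oldest first: fifty days ago, then ten days ago.
user_a = [30.0, 10.0]
user_b = [10.0, 30.0]
alpha = 0.7  # hypothetical smoothing coefficient

score_a = ema(user_a, alpha)
score_b = ema(user_b, alpha)
print(score_a, score_b)
```

With these inputs, user B's score (about 24) exceeds user A's (about 16) even though both users accrued the same total, which is the behavior the time weighting is meant to produce.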
ERR: reward adjusted for the EMA results.
EPR: penalty adjusted for the EMA results.
ERD: reward for the device adjusted for the EMA results.
EPD: penalty for the device adjusted for the EMA results.
A correction factor is also applied to penalty points for past access results.
There is also the matter of calculating reward and penalty point values for historical interactions between the user and the AH. Two types of histories are used for this calculation. For the reward values, the reward value for past access results (PRV), the reward information value (RRI), and the history value for total compensation (SRV) are calculated. The PRV is the reward value assigned based on past access results from connections between the AH and the user. The RRI is assigned based on the user's [...] (1)
Here, δ is the PRV weight, and ε is the RRI weight. The PRV has a direct effect on the user's access to AH_i, while the RRI can be given a different weight and has an indirect effect. β is the correction factor of compensation values for past access results.
With the rewards calculated through the PRV, RRI, and SRV, the penalty points also need to be calculated through the PPV, RPI, and SPV values. These values are the penalty counterparts of the reward values, and their calculations are similar for each respective category.
Here, δ is the PPV weight, and ε is the RPI weight. The PPV has a direct effect on the user's access to AH_i, while the RPI can be given a different weight and has an indirect effect.
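As a rough illustration of how the weights δ and ε might combine the direct and indirect values, the sketch below assumes a simple weighted sum for each total and a net score equal to the reward total minus the penalty total. Both the weighted-sum form and the net score are assumptions made for this example; the correction factor β is not modelled, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    delta: float    # weight of the direct history value (PRV or PPV)
    epsilon: float  # weight of the indirect information value (RRI or RPI)


def weighted_total(direct: float, indirect: float, w: Weights) -> float:
    """Assumed combination: total = delta * direct + epsilon * indirect."""
    return w.delta * direct + w.epsilon * indirect


def net_score(prv: float, rri: float, ppv: float, rpi: float, w: Weights) -> float:
    """Assumed net score: reward total (SRV) minus penalty total (SPV)."""
    srv = weighted_total(prv, rri, w)
    spv = weighted_total(ppv, rpi, w)
    return srv - spv


# Hypothetical weights and values, chosen only for illustration.
w = Weights(delta=0.5, epsilon=0.25)
print(net_score(prv=8.0, rri=4.0, ppv=2.0, rpi=4.0, w=w))
```

Giving δ a larger value than ε in this sketch reflects the statement that the history values act directly on access while the information values act indirectly.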
Traditionally, access control was mostly limited to the relationship between a subject and an object.However, with modern trends, access control is now done through various devices.With that in mind, it is necessary to assess whether a single device can impact overall reliability and risk for a network of multiple devices.To conduct this assessment, the stability of a device can also be assessed with reward and penalty points through the two equations below.
Here, ERD and EPD are the reward and penalty values for the IH and AH. To prevent an inappropriate increase in reliability points when either reward or penalty points are missing, the weights ζ and η are applied as multipliers.
PRV(User, AH) = (ERR [...] (6)

Performance results and evaluation
To assess the effectiveness of the proposed RP and DT engines, a theoretical simulation was conducted. Reward and penalty points were obtained according to the results of user access to the AHs. Different events were simulated, and the values used for the study are shown in Tables 3, 4, and 5.
To assess the performance of the RP engine, the raw values calculated were accumulated for comparison in a series of charts, as illustrated in Fig. 3.
From the results in Fig. 3, it can be seen that the gap between penalty points and reward points widened and the difference became more negative over time. This means that the number of abnormal connections in the environment came to outweigh the number of normal connections. Additionally, because the simple traditional mechanism in SDP network environments does not consider the time factor, old rewards and penalties carry the same weight as new ones. Existing mechanisms in SDP network environments therefore cannot reflect recent changes in a user's behavior, which makes them ineffective for dynamic access control.
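The time-insensitivity of plain accumulation can be seen in a short sketch. The event values below are hypothetical and are not the values of Tables 3 to 5; the function name is likewise an assumption for this illustration.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def cumulative_gap(events: Sequence[Tuple[float, float]]) -> List[float]:
    """Running (reward - penalty) total with every event weighted equally."""
    return list(accumulate(reward - penalty for reward, penalty in events))


# Hypothetical (reward, penalty) pairs, oldest first.
events = [(10.0, 0.0), (10.0, 0.0), (0.0, 15.0), (0.0, 15.0), (0.0, 15.0)]

forward = cumulative_gap(events)
backward = cumulative_gap(events[::-1])
print(forward)
print(forward[-1] == backward[-1])
```

The final total is the same whether the penalties occurred recently or long ago, which is why a purely cumulative score cannot distinguish a user whose behavior has recently deteriorated from one whose problems lie in the past.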
After the test using a traditional SDP, another test was run with slightly modified parameters. Access restrictions were strengthened by observing previous access results and security management scores. As shown in Fig. 4, an EMA was used to increase the weight of reward and penalty points according to their chronological order. With the adjustments brought into effect through the EMA, access restrictions appeared to be strengthened. It should be noted, however, that the chart makes it difficult to predict the trend of access, because the difference between reward and penalty points fluctuated greatly in response to the more recent events in the simulation.
After the previous tests, a final test was run with the proposed RP engine. Figure 5 shows the difference between the PRV and PPV values, as well as the device reward history (SRVD) and the device penalty history (SPVD) between the user and the AH for the final test. When the graph falls to negative values, existing access rights are being revoked. The graph is slightly smoother than the graph in figure 3.2, meaning that the results were more consistent.

Conclusion
Current network connection methods can expose server information during remote access sessions and may even allow access to other network resources beyond the session. These are the kinds of severe security issues that are alleviated by the SDP environment proposed in this study. The SDP environment provides security by allowing only designated devices to join designated services.
Another issue tackled by the proposed SDP environment is that of authorizing specific scopes of access to users through a dynamic task engine. In a traditional access control system, the scopes of access are heavily predefined, so users cannot access parts of the network outside those scopes. With the proposed environment, however, users can be safely granted access through risk assessments and the dynamic access control mechanism.
Results from the study have also shown that the proposed solution of using RP and DT engines in an SDP environment can work effectively. The values calculated in the study indicated that the system's performance was enhanced, as the number of normal connections to the system was greater than the number of abnormal connections.

Fig. 1 Dynamic access control elements in the SDP environment

Table 1
Standard data definition
Category | Definition
RR (Reward for result) | Compensation points based on access history results
PR (Penalty for result) | Penalty points based on access history results
RD (Reward for device) | Compensation points based on the device
PD (Penalty for device) | Penalty points based on the device

Table 2
Reward and penalty points according to the access result

Table 3
Event history related to RPE test rewards/penalties

Table 4
Coefficients set in the experiment

Table 5
RPE experiment results