A wireless caching helper system with heterogeneous traffic and random availability

Multimedia content streaming from Internet-based sources has emerged as one of the services most demanded by wireless users. In order to alleviate excessive traffic due to multimedia content transmission, many architectures (e.g., small cells, femtocells, etc.) have been proposed to offload such traffic to the nearest (or strongest) access point, also called a “helper”. However, deploying more helpers is not necessarily beneficial, because additional helpers can increase interference. In this work, we evaluate a wireless system which can serve both cacheable and non-cacheable traffic. More specifically, we consider a general system in which a wireless user with limited cache storage requests cacheable content from a data center that can be directly accessed through a base station. The user can be assisted by a pair of wireless helpers that exchange non-cacheable content as well. Files not available from the helpers are transmitted by the base station. We analyze the system throughput and the delay experienced by the cached user and show, by means of numerical results, how these performance metrics are affected by the packet arrival rate at the source helper, the availability of the caching helpers, the caching parameters, and the user's request rate.

In this paper, we study a wireless system that serves heterogeneous traffic, which we distinguish between two non-overlapping classes: (i) cacheable and (ii) non-cacheable traffic. The former originates from content that is worth caching because it is frequently requested, e.g., popular movies, trending music tracks, static parts of web pages, etc. On the other hand, non-cacheable traffic consists of content that is unlikely to be frequently requested, such as chat messages or dynamic parts of web pages, and thus it is not sensible to cache. A user with limited cache storage requests cacheable content from a data center using a base station which has direct access to it through a backhaul link. Two wireless nodes within the proximity of the user exchange non-cacheable content and have limited cache storage. Therefore, they can act as caching helpers for the cached user by serving its requests for cacheable content when they do not exchange non-cacheable traffic. Files not available at the helpers can be fetched from the data center through the base station. Additionally, the source helper is equipped with a queue whose role is to save excessive packets of non-cacheable traffic with the intention of transmitting them to the destination helper in a subsequent time slot. Concerning caching, we assume the content placement is given and hierarchical.
Additionally, several different performance metrics have been considered. In earlier studies of wireless caching, cache hit probability (or ratio) [13], and the density of successful receptions or cache-server requests [6,13] have been commonly investigated as a means of evaluating the performance of wireless caching systems. Furthermore, there are several studies regarding energy efficiency or consumption of the different caching schemes [13-16] as well as taking into account the traffic load of the wireless links [17,18]. Methods that reduce traffic load by optimizing the offloading probability or gain can be found in [19-21].
More recently, a considerable amount of research analyzes wireless caching systems by considering throughput [7,8] and/or delay [22]. Regarding the latter, the majority of research works focus on mitigating backhaul or transmission delay under the assumption that traffic or requests are saturated. However, there are works that take into account stochastic arrivals of requests at different nodes [23,24].
Caching has been applied to several different network realizations, e.g., FemtoCaching [3], in which so-called femto base stations (FBSs) serve a group of dedicated users with random content requests while, simultaneously, non-dedicated users might be served with delay due to cache misses or FBS unavailability. The coded/uncoded cached contents are stored in multiple small cells, the so-called femtocells. Given the file request distribution and the cache size of each femtocell, the content placement is studied such that the downloading time is minimized.
The advent of vehicular networks necessitates the use of caches to reduce the latency of content streaming and increase the offered quality of service (QoS) [25,26]. Supporting vehicle-to-everything connections urges the exploration of alternative data routing protocols in order to avoid incurring excessive end-to-end delay and backhaul resource allocation. In contrast, moving computational and storage resources to the mobile edge appears promising [27-29]. This can be done, e.g., by employing a new paradigm known as local area data network [30], or other advances in radio access networks (RANs) for the Internet of Things (IoT) [31].
Many contemporary works jointly consider the problems of content caching (or placement), computing, and radio resource allocation. They usually treat and solve these issues separately by formulating the computation offloading or content caching as convex optimization problems with different metrics, e.g., service latency, network capacity, backhaul rate, etc. [32,33]. Works that address the aforementioned problems together and propose a joint optimization solution for the fog-enabled IoT or cloud RANs (C-RANs) can be found in [34,35], respectively.
For some applications, e.g., broadcast or multicast applications, single transmissions from the base station to more than one user are useful. The authors in [36] propose a content caching and distribution scheme for smart grid enabled heterogeneous networks, in which each popular file is stored in multiple service nodes with energy harvesting capabilities. The optimization of the total on-grid power consumption, the user association scheme, and the radio resource allocation improves the reliability and performance of the wireless access network. The evolution of 5G mobile networks is going to incorporate cloud computing technologies. The authors in [37] propose the concept of "Caching-as-a-Service" (CaaS) based on C-RANs as a means to cache anything, anytime, and anywhere in cloud-based 5G mobile networks with the intention of satisfying user demands from any service location with high QoS. Furthermore, they discuss the technical details of virtualization, optimization, applications, and services of CaaS in 5G mobile networks.
A key distinction among research papers on wireless caching is the assumption regarding the availability of caching helpers. Many papers assume that caching helpers can serve user requests whenever the requested file is cached, while others adopt the assumption that caching helpers might be unable to assist user requests when, for example, they serve other users [3,13]. To the best of our knowledge, the proposed wireless caching model has not been studied in the literature. For instance, [8] does not take hierarchical caching into account, even though it serves both types of traffic with the assistance of one caching helper.

Contribution
In this paper, we study a wireless system in which we distinguish traffic between cacheable and non-cacheable. When a cached user experiences a local cache miss, it requests cacheable content from a data center connected to a base station through a backhaul link. Two wireless nodes within the user's proximity exchange non-cacheable files and have limited cache storage. Therefore, they can act as caching helpers for the cached user by serving its requests for cacheable content when they do not exchange non-cacheable content for their own purposes. The source helper is equipped with an infinite queue whose role is to save packets of non-cacheable traffic for transmission to the destination helper in a subsequent time slot. Files not available at the helpers can be transmitted by the base station. We analyze the system throughput assuming that transmitting nodes have random access to the channel and, hence, the probabilities by which the caching helpers are available can be tuned. By adapting the availability of the caching helpers, we want to guarantee that the destination helper D will be served with non-cacheable traffic according to specific requirements, i.e., stability in our case. First, we characterize the system throughput for the cases in which the queue at the source helper is stable as well as unstable. Moreover, we formulate a mathematical optimization problem over the probabilities by which the helpers are available to assist the cached user, in order to maximize the system throughput. Subsequently, we characterize the average delay experienced by the user from the time of a local cache miss until it receives the requested cached file. Finally, we provide numerical results to show how the packet arrival rate of non-cacheable traffic at the source helper, the availability of the caching helpers, random access to the channel, the caching parameters, and the user's request rate affect the system throughput and the delay.

Organization of the paper
In Sect. 2, we present the system model comprising the network, caching, transmission, and physical layer models. Section 3 provides the analytical derivation of the throughput for the cases of a stable and an unstable queue at the source helper. The average delay performance is given in Sect. 4. In Sect. 5, we numerically evaluate the theoretical analysis of the previous sections and summarize the results. Finally, Sect. 6 concludes our research work.

Network model
We consider a network system with four wireless nodes: a pair of caching helpers S and D, a random user U within the coverage of the helpers, and a base station (BS) node connected to a data center (DC) through a backhaul link, as depicted in Fig. 1. We consider slotted time and that a packet transmission takes one time slot.
Helper S is equipped with an infinite queue Q whose packet arrivals follow a Bernoulli process with average arrival rate λ. It transmits these packets to the destination helper D. In each time slot, user U requests a file and first looks for it in its own cache. In case of a cache miss at U, which happens with probability q U , it requests the file from external resources, i.e., the caching helpers or the data center (through the BS). The data center stores the whole library and, hence, every file that U may request.
Requesting a file directly from the BS is not necessarily the best policy, since the link connecting the BS and U might be problematic. Consequently, limited throughput or increased delay might be experienced instead of fetching the file from one of the caching helpers. Moreover, the BS is not always available to help U; it is available with probability α in each time slot. Therefore, it is preferable for U to be served by the caching helpers.

Fig. 1 An example of our system model. Caching helpers S and D can be access points with storage capabilities. Node S has a dedicated connected user D, who is randomly generating requests for non-cacheable content. At the same time, there is a mobile device U within both helpers' proximity, which requests cached content from external resources with some probability in each time slot. Device U also has access to the data center DC through the BS, but the connection can be problematic, so it is preferred to be served by D or S, when possible. Please note that interference to the transmitters is not depicted

Cache placement and access
We assume the content placement is given and hierarchical, i.e., when the user node requests a file that is not among its own stored most popular files, it first probes the closest caching helper, which stores the next most popular files. If this probe fails, the second caching helper is probed for the requested file. If this probe also misses, the file can be fetched from the data center. Additionally, the source helper is equipped with a queue whose role is to save excessive non-cacheable traffic with the intention of transmitting it to the destination helper in a subsequent time slot.
Furthermore, the user device U and the caching helpers D and S have cache capacities of M U , M D , and M S files, respectively, and M U ≤ M D ≤ M S holds. We also consider the collaborative most popular content (CMPC) policy. According to CMPC, user U stores the M U most popular files in its own cache, helper D stores the next M D most popular files, and S stores the next M S most popular files. Following CMPC requires the exchange of information among devices, e.g., the cache size of each device and the content placement in each device. We assume that the overhead of this information exchange is negligible.
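The CMPC placement described above can be sketched in a few lines. The file library is assumed to be indexed in decreasing popularity order (index 0 is the most popular file); the library and cache sizes below are illustrative, not values prescribed by the analysis.

```python
# Sketch of the collaborative most popular content (CMPC) placement:
# U caches the first M_U most popular files, D the next M_D, S the next M_S.
# File indices are assumed sorted by decreasing popularity.

def cmpc_placement(n_files, m_u, m_d, m_s):
    """Return the (disjoint) sets of file indices cached at U, D, and S."""
    assert m_u <= m_d <= m_s <= n_files
    files = list(range(n_files))
    cache_u = set(files[:m_u])                       # first M_U most popular
    cache_d = set(files[m_u:m_u + m_d])              # next M_D most popular
    cache_s = set(files[m_u + m_d:m_u + m_d + m_s])  # next M_S most popular
    return cache_u, cache_d, cache_s

cache_u, cache_d, cache_s = cmpc_placement(5000, 200, 1000, 2000)
```

The disjointness of the three sets is what makes the hierarchical probing order (own cache, then D, then S, then the data center) well defined.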

Transmission model
In each time slot, S attempts a transmission of non-cacheable content to D with probability q S (if its queue is not empty) and is available for U with probability 1 − q S . We assume that the caching helpers assist U only when specific conditions apply: D attempts a transmission to U with probability q D , and S helps U only when it is not transmitting to D. When the source caching helper S is transmitting to helper D and user U requests a file from external resources, U can be served by D or by the DC. In that case, there are two parallel transmissions: one from S to D and one from D (or the DC, respectively) to U. If the caching helper S is available for U, then there are no parallel transmissions, since only one of S, D, or the DC can help U in the same time slot.
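The per-slot logic above can be illustrated with a minimal Monte Carlo sketch. The link success probabilities p_sd, p_du, p_su, p_dcu are illustrative placeholders (not values derived in the paper), and the interference coupling between parallel transmissions is deliberately ignored; the sketch only shows the order of events in a slot.

```python
import random

# Monte Carlo sketch of the slot-by-slot operation: Bernoulli arrivals at S,
# S transmitting to D with probability q_s, and U's request being served by
# D, then S (if S is silent), then the data center, per the model above.
# Link success probabilities are placeholders; interference is not modeled.

def simulate(slots, lam, q_u, q_s, q_c, q_d, alpha, p_hd, p_hs,
             p_su=0.9, p_du=0.9, p_dcu=0.6, p_sd=0.8, seed=0):
    rng = random.Random(seed)
    queue = 0
    served_d = served_u = 0
    for _ in range(slots):
        if rng.random() < lam:                 # Bernoulli arrival at S's queue
            queue += 1
        s_to_d = queue > 0 and rng.random() < q_s
        if s_to_d and rng.random() < p_sd:     # successful S -> D transmission
            queue -= 1
            served_d += 1
        if rng.random() < q_u:                 # U requests from external resources
            if rng.random() < q_d and rng.random() < p_hd:
                served_u += rng.random() < p_du        # served by helper D
            elif (not s_to_d) and rng.random() < q_c and rng.random() < p_hs:
                served_u += rng.random() < p_su        # served by helper S
            elif rng.random() < alpha:
                served_u += rng.random() < p_dcu       # served by the DC via BS
    return served_d / slots, served_u / slots
```

When the queue is stable (λ below the effective service rate), the measured S-to-D throughput approaches λ, in line with the throughput analysis of Sect. 3.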
Regarding the DC, we capture the fact that it is not always available to serve U (e.g., due to serving other users or due to failure) with an availability probability α. If the DC is always available to U, then α = 1. On the other hand, if the DC is never available to U, then α = 0.
We summarize the aforementioned events and notation in Table 1. Additionally, the operation of U, S, and D as flowcharts can be found in Figs. 2, 3, and 4.

Physical layer model
The wireless channel is modeled as a Rayleigh flat-fading channel with additive white Gaussian noise. A packet transmitted by i is successfully received by j if and only if the signal-to-interference-plus-noise ratio (SINR) between i and j exceeds a threshold θ. Let P tx (i) be the power measured at 1 m distance from the transmitting node i, and r(i, j) be the distance in m between i and j. Then, the power received by j when i transmits is P tx (i) r(i, j)^(−β), where β is the path loss exponent. Self-interference is modeled using the self-interference coefficient g ∈ [0, 1]. The success probability in link (i, j) is given by [39]:

P i/j = exp(−θ n j r(i, j)^β / P tx (i)) · (1 + θ g P tx (j) r(i, j)^β / P tx (i))^(−l) · ∏ k∈T∖{i,j} (1 + θ (P tx (k)/P tx (i)) (r(i, j)/r(k, j))^β)^(−1),

with T denoting the set of nodes transmitting at the same time, n j denoting the noise power at j, and l = 1 when j ∈ T and l = 0 otherwise.
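As a numerical companion, the following sketch evaluates a Rayleigh-fading success probability of this general form (exponential term for noise, one attenuation factor per interferer, and a self-interference factor when the receiver also transmits); the exact expression should be taken from [39], and all distances, powers, θ, β, and g values below are illustrative placeholders.

```python
import math

# Sketch of a Rayleigh-fading link success probability of the form above.
# All numeric inputs (distances, powers, theta, beta, n_j, g) are
# illustrative placeholders, not parameters from the paper.

def success_prob(i, j, transmitters, p_tx, r, theta, beta, n_j, g):
    """Approximate P(SINR_{i,j} >= theta) under Rayleigh fading."""
    prob = math.exp(-theta * n_j * r[(i, j)] ** beta / p_tx[i])
    if j in transmitters:  # receiver also transmits: self-interference (l = 1)
        prob /= 1 + theta * g * p_tx[j] * r[(i, j)] ** beta / p_tx[i]
    for k in transmitters:
        if k in (i, j):
            continue       # skip the intended transmitter and the receiver
        prob /= 1 + theta * (r[(i, j)] / r[(k, j)]) ** beta * p_tx[k] / p_tx[i]
    return prob
```

Adding an interferer can only shrink the success probability, which matches the intuition that parallel transmissions (e.g., S to D alongside D to U) degrade each link.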

Throughput analysis
In this section, we analyze the throughput of the system depicted in Fig. 1. We are interested in the weighted sum of the throughput that helper S provides to D along with the throughput realized by the cached user U. Denoting the former by T S and the latter by T U , the weighted sum throughput T w is given by:

T w = wT S + (1 − w)T U .

The average service rate µ of caching helper S is determined by the probability q S that S attempts a transmission and the success probability of the S–D link: As a corollary of Loynes' theorem [40], we obtain that if the arrival and the service processes of a queue are strictly jointly stationary and the queue's average arrival rate is less than the queue's average service rate, then the queue is stable. Thus, in our model, the queue at helper S is stable if and only if λ < µ. Finite queueing delay is a ramification of a stable queue and, hence, by adding the aforementioned constraint we can enforce finite queueing delay on our wireless system. Moreover, the stability at S also implies that packets arriving at the queue will eventually be transmitted [40]. The throughput from S to D, denoted by T S , depends on the stability of the queue Q at S and is T S = λ if the queue is stable or T S = µ otherwise. Thus:

T S = λ · 1(λ < µ) + µ · 1(λ ≥ µ),

with 1(.) denoting the indicator function.
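The relations above reduce to a few lines of code once λ and µ are supplied (here as externally given numbers; their closed forms come from the analysis):

```python
# T_S equals the arrival rate lambda when the queue is stable (lambda < mu)
# and saturates at the service rate mu otherwise.

def throughput_s(lam, mu):
    return lam if lam < mu else mu

# Weighted sum throughput T_w = w*T_S + (1 - w)*T_U.
def weighted_sum_throughput(w, t_s, t_u):
    return w * t_s + (1 - w) * t_u

assert throughput_s(0.3, 0.7) == 0.3  # stable queue: T_S = lambda
assert throughput_s(0.9, 0.7) == 0.7  # unstable queue: T_S = mu
```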
The throughput realized by U, denoted by T U , depends on whether the queue at S is empty or not. The former happens with probability P(Q = 0) and the latter with probability P(Q ≠ 0). Therefore:

• The queue at S is empty and U requests a file from external resources. In this case, U will be served: (i) by D with probability q D , or (ii) by S with probability q C in case of D's failure, or (iii) by the data center with probability α in case both helpers fail.

• If the queue at S is non-empty and U requests a file from external resources, then there are two cases: either (i) helper S attempts a transmission to the destination helper D (which happens with probability q S ) or (ii) helper S is available to serve U. In the first case, U will be served by D with probability q D or by the data center in case D fails to serve U. In the second case, U will be served by D with probability q D , or by S with probability q C in case D fails, or by the data center in case both helpers fail to serve U.
Considering all the details above, the throughput realized by user U is: where we have to differentiate the cases of a stable/unstable queue due to the different P(Q = 0) and P(Q ≠ 0) in each case. When the queue at S is stable, the probability that Q is not empty is given by: P(Q ≠ 0) = λ/µ. In case the average arrival rate is greater than the average service rate, i.e., λ > µ, the queue at S is unstable and can be considered saturated. Consequently, we can apply a packet dropping policy to stabilize the system, and the results for the stable queue remain valid.
If the queue at S is unstable, the throughput realized by U is: We formulate the following mathematical optimization problem to find the probabilities q S , q C , and q D that maximize the weighted sum throughput when the queue at helper S is stable: The first constraint ensures the stability of the queue at helper S and the second one defines the domain of the decision variables. To solve the aforementioned problem for the case in which the queue at S is unstable, we drop the first constraint and replace the expressions for λ and T U with the ones for µ and T ′ U , respectively. In Sect. 5, we provide results for maximizing the weighted sum throughput for some practical scenarios.
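The optimization over (q S , q C , q D ) can also be approximated by a simple grid search. The sketch below keeps the structure of the problem (objective plus the stability constraint λ < µ) but replaces the paper's closed-form expressions for µ and T U with illustrative placeholder models, so only the search procedure is meaningful here.

```python
import itertools

# Grid-search sketch of the (q_S, q_C, q_D) optimization. mu_fn and t_u_fn
# stand in for the paper's closed forms; they are illustrative placeholders.

def solve_grid(lam, w, mu_fn, t_u_fn, step=0.1):
    best, best_q = -1.0, None
    grid = [round(step * k, 10) for k in range(round(1 / step) + 1)]
    for q_s, q_c, q_d in itertools.product(grid, repeat=3):
        mu = mu_fn(q_s, q_c, q_d)
        if lam >= mu:                  # stability constraint: lambda < mu
            continue
        t_w = w * lam + (1 - w) * t_u_fn(q_s, q_c, q_d, lam, mu)
        if t_w > best:
            best, best_q = t_w, (q_s, q_c, q_d)
    return best, best_q

# Placeholder models (illustrative only): mu grows with q_S, while U's
# throughput trades off against S's transmissions to D.
mu_fn = lambda q_s, q_c, q_d: 0.8 * q_s
t_u_fn = lambda q_s, q_c, q_d, lam, mu: 0.5 * (q_d + (1 - q_s) * q_c) / 2
best, q_star = solve_grid(0.3, 0.5, mu_fn, t_u_fn)
```

With these placeholders the search picks the smallest q S that keeps the queue stable and sets q C = q D = 1, mirroring the tension between serving D and assisting U discussed in Sect. 5.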

Delay analysis
Delay experienced by users is another critical performance metric for wireless caching systems. In this section, we study the delay that user U experiences from the moment it requests cacheable content from external sources until that content is received. The average delay that user U experiences to receive a file from external resources is: where D S 1 is the delay to receive the file from S given that D misses it: and D S 2 is the delay to receive the file from S given that D caches it but does not attempt transmissions to U, i.e., q D = 0: We also need to compute the delay caused by the data center DC: Additionally, we need to calculate the following: and: As one can observe, (7)-(15) are recursively defined. After some basic manipulations, (10) becomes: Assuming that q C P S→U − q C ≠ 1, (8) becomes: Assuming that q C P S→U + (1 − q C )αP DC→U ≠ 0 and using (17), (11) and (12) become: Using (9), (13), (14) and applying the regenerative method [41], we get: Substituting (20) into (9), (13), and (14) yields expressions for D S 2 , D DC,0,D , and D DC,1,D , respectively, that are functions of the link success probabilities (see Tables 1 and 2) and the cache parameters (see Table 3) only.
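One special case of this analysis is easy to sanity-check numerically: when U can only be served by the data center, and the DC's availability (probability α) and the BS-to-U link success (probability P DC→U ) are assumed independent across slots, the delay is geometric with mean 1/(αP DC→U ) slots. The α and link success values below are illustrative.

```python
import random

# Monte Carlo sketch of the DC-only delay: each slot succeeds with
# probability alpha * p_dc_u, so the delay is geometric with mean
# 1 / (alpha * p_dc_u). Parameter values are illustrative placeholders.

def mean_dc_delay(alpha, p_dc_u, trials=200_000, seed=1):
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        slots = 1
        while not (rng.random() < alpha and rng.random() < p_dc_u):
            slots += 1                 # retry in the next time slot
        total += slots
    return total / trials

avg = mean_dc_delay(0.7, 0.6)          # close to 1 / (0.7 * 0.6) ≈ 2.38 slots
```

The full recursions (7)-(15) interleave such geometric retries across S, D, and the DC, which is why the closed forms require the regenerative method.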

Results and discussion
In this section, we present numerical evaluations of the analysis in the previous sections. The parameters we used for the wireless links between the wireless nodes can be found in Table 2. The helpers apply the CMPC policy as described in Sect. 2.2. We consider a finite content library of files, F = {f 1 , ..., f N }, to serve users' requests. For the sake of simplicity, we assume that all files have equal size and that access to cached files happens instantaneously. The i-th most popular file is denoted by f i , and the request probability of the i-th most popular file is given by p i = (1/i^δ)/Ω, where Ω = Σ_{j=1}^{N} 1/j^δ is the normalization factor and δ is the shape parameter of the Zipf law, which determines the skewness of user requests. Consequently, the probability that user U requests a file that is not located in its cache is: q U = Σ_{i=M U +1}^{N} p i . The cache hit probability at the caching helper D is given by: p hD = (Σ_{i=M U +1}^{M U +M D} p i )/q U , and the cache hit probability at the caching helper S is given by: p hS = (Σ_{i=M U +M D +1}^{M U +M D +M S} p i )/q U . In the following results, we study the maximum weighted sum throughput, which is defined as T w = wT S + (1 − w)T U or T ′ w = wT S + (1 − w)T ′ U when the queue at S is stable or unstable, respectively. The expressions for T S , T U , and T ′ U are given by (3)-(5) in Sect. 3. To maximize the weighted sum throughput, we solved the optimization problem (6a)-(6c) using the Gurobi optimization solver and report the results. To validate our theoretical results, we built a MATLAB-based behavioral simulator, which shows that the theoretical and simulation results coincide after 50,000 time slots.
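The Zipf-based quantities above can be computed directly. One modeling choice in this sketch is an assumption: p hD and p hS are normalized by q U , i.e., treated as hit probabilities conditioned on a cache miss at U under the CMPC placement.

```python
# Sketch of the Zipf request model: p_i = (1/i^delta) / Omega, with Omega
# the normalization factor, plus the miss probability q_U and the hit
# probabilities p_hD, p_hS under CMPC (conditioning on a miss at U is an
# assumption of this sketch).

def zipf_probs(n_files, delta):
    omega = sum(1 / i ** delta for i in range(1, n_files + 1))
    return [1 / i ** delta / omega for i in range(1, n_files + 1)]

def cache_probs(n_files, delta, m_u, m_d, m_s):
    p = zipf_probs(n_files, delta)                   # p[0] = most popular file
    q_u = sum(p[m_u:])                               # miss at U's cache
    p_hd = sum(p[m_u:m_u + m_d]) / q_u               # hit at D given miss at U
    p_hs = sum(p[m_u + m_d:m_u + m_d + m_s]) / q_u   # hit at S given misses at U, D
    return q_u, p_hd, p_hs
```

For a fixed M U , raising δ concentrates requests on the most popular files and therefore lowers q U , which is the effect invoked repeatedly in the discussion below.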

Maximum weighted sum throughput vs. average arrival rate
We consider a scenario where the wireless link parameters follow the values in Table 2. The cache sizes and cache hit probabilities are set as per Table 3 for two different values of the shape parameter δ of the standard Zipf law that the popularity distribution of the cached files follows. In Fig. 5, the maximum weighted sum throughput versus the average arrival rate λ at helper S is presented for three different values of w when the queue at S is stable. We chose: (i) w = 1/4 as a representative case in which T U is more important than T S , (ii) w = 2/4 to equalize the importance of T U and T S , and (iii) w = 3/4 to put more emphasis on the importance of T S versus T U .
In case w = 1/4, the maximum weighted sum throughput is a decreasing function of λ when δ = 0.5 (see Fig. 5a), but increasing when δ = 1.2 (see Fig. 5b). When w = 2/4, the maximum weighted sum throughput is almost constant for any value of λ.

Table 4 The values of q * S , q * C , q * D for which the weighted sum throughput is maximized and the queue at S is stable for α = 0.7, M U = 200, M D = 1000, and M S = 2000

Table 5 The values of q * S , q * C , q * D for which the weighted sum throughput is maximized and the queue at S is unstable for α = 0.7

Furthermore, it is observed that the maximum weighted sum throughput is achieved when q * C = 1 for any value of w when the queue at S is stable (see Table 4), but this is not the case when the queue is unstable, i.e., when the average arrival rate λ is greater than the average service rate µ (see Table 5 for different values of δ). When the queue at S is unstable, it is optimal for helper S to avoid transmissions to U, i.e., q * C = 0, when δ = 1.2 for any values of w and λ. When δ = 0.5, helper S must always attempt transmissions to U, i.e., q * C = 1, when w ∈ {1/4, 2/4} to maximize the weighted sum throughput.

Maximum weighted sum throughput vs. cache size M U
In this section, we study how the cache size M U affects the maximum weighted sum throughput. Recall that q U decreases as M U increases. We consider the same two values of δ as previously to examine how δ affects the maximum weighted sum throughput for different values of M U . In Fig. 6, the maximum weighted sum throughput versus M U is presented for α = 0.7, M D = 1000, M S = 2000, and λ = 0.4, for which the queue at S is stable. We observe that, as the cache size at U increases, the maximum weighted sum throughput remains almost constant when δ = 0.5 and slightly decreases when δ = 1.2. This is expected, since increasing the cache size at U results in fewer requests for files from external resources. Moreover, the maximum weighted sum throughput is higher when the value of δ is lower since, for a given cache size, e.g., M U = 200, the probability of requesting content from external resources decreases as δ is increased.
In Fig. 7, the maximum weighted sum throughput versus M U is presented for the same parameters as in Fig. 6 but with an unstable queue at S. The maximum weighted sum throughput is an increasing function of M U for every value of δ when w ∈ {2/4, 3/4}. This is expected since, for these values of w, the throughput achieved by D, i.e., T S , dominates T U , and T S increases due to the decrease of requests for external content by U (recall that as M U increases, q U decreases). When w = 1/4, the maximum weighted sum throughput is almost constant (δ = 1.2) or decreases (δ = 0.5) as M U increases. The latter decrease can be attributed to the fact that T U , i.e., the dominant term in the maximum weighted sum throughput, decreases as q U decreases (and M U increases).
The values of q * S , q * C , and q * D that achieve the maximum weighted sum throughput are given in Tables 6 and 7 when the queue at S is stable and unstable, respectively.
In case δ = 0.5 and the queue at S is stable, the maximum weighted sum throughput T w is achieved for q * C = 1 and q * D = 1 for every value of M U and w. This means that, for the aforementioned parameters, user U should always be assisted by both S and D to achieve the maximum weighted sum throughput T w . This is not the case for δ = 1.2 when the queue at S is stable and M U ≥ 400. For every value of w ∈ {1/4, 2/4, 3/4}, user U should only be assisted by S to achieve the maximum weighted sum throughput T w , since q * C = 1 and q * D = 0. We also observe that, in this case, S should assist D more frequently, since q * S almost always has a higher value compared to δ = 0.5.
In case the queue at S is unstable, the values of (q * S , q * C , q * D ) for which the maximum weighted sum throughput T ′ w is achieved can be found in Table 7. We observe that neither should helper S serve helper D (q * S = 0) nor should the latter assist U (q * D = 0) to maximize T ′ w when (i) δ = 0.5 and w = 1/4 for any cache size M U or (ii) δ = 0.5, w = 2/4, and user U's cache can hold M U = 200 files.
Moreover, when δ = 0.5, M U ≥ 400, and w ∈ {2/4, 3/4}, helper S should only serve helper D and the latter should assist user U, since (q * S , q * D ) = (1, 1). However, helper S should slightly assist U in some cases, e.g., when M U = 400 or 600. When δ = 1.2, helper S should only serve the destination helper D and the latter should assist user U for any value of M U and w. Additionally, helper S should not assist user U for any cache size M U except 400.
Furthermore, it should be noted that, for any value of M U , the maximum weighted sum throughput is decreasing as δ increases when w = 1/4 and increases as δ increases when w ∈ {2/4, 3/4}.

Maximum weighted sum throughput vs. average arrival rate when M D = 0
We consider a scenario where the system parameters are the same as in Sect. 5.1 (see Tables 2 and 3), but helper D cannot assist user U since its cache cannot hold any files, i.e., M D = 0. Consequently, q D = 0 and p hD = 0 as well. This scenario allows us to study the maximum weighted sum throughput versus λ when the less powerful of the two helpers is unable to satisfy U's needs for content from external resources.

Table 6 The values of (q * S , q * C , q * D ) that maximize the weighted sum throughput T w for different values of M U when α = 0.7 and the queue at S is stable

In Fig. 8, we plot the maximum weighted sum throughput versus λ when the queue at S is stable and M D = 0. We observe that, when δ = 0.5, the maximum weighted sum throughput (i) is a decreasing function of λ for w = 1/4, (ii) slightly decreases for w = 2/4, and (iii) increases for w = 3/4. Recall that, by definition, in the first case T U dominates T S , in the second case both throughput terms contribute equally, and in the third case T S dominates T U . When δ = 1.2, the maximum weighted sum throughput is increasing with λ. We observe that higher values of w yield steeper increases in the maximum weighted sum throughput.
When the queue at S is stable, the maximum weighted sum throughput is always achieved with q * C = 1 for any w, δ, and λ, using the system parameters quoted before. However, in case δ = 0.5, helper S should nearly always attempt transmissions to D, since q * S ∈ {0.977, 0.999}. When δ = 1.2, helper S should always attempt transmissions to D, as Table 8 depicts.
On the other hand, when the queue at S is unstable and δ = 0.5, helper S should only assist U when w ∈ {1/4, 2/4} and only assist D when w = 3/4 (see Table 9).

Table 7 The values of (q * S , q * C , q * D ) that maximize the weighted sum throughput T w for different values of M U when α = 0.7 and the queue at S is unstable

Table 8 The values of q * S and q * C for which the weighted sum throughput is maximized when the queue at S is stable, α = 0.7

Table 9 The values of q * S and q * C for which the weighted sum throughput is maximized when the queue at S is unstable, α = 0.7

This is expected since, in the latter case, T S dominates T U and, hence, it is preferable that S always serves D to maximize the contribution of T S . In this case, if user U requests content from external resources, it will only be served by the data center. Moreover, when δ = 1.2, it is optimal that helper S serves only D for any value of w.

Maximum weighted sum throughput vs. average arrival rate when M S = 0
Here, we study the maximum weighted sum throughput versus the average arrival rate λ when node S is not equipped with a cache, i.e., M S = 0, and, hence, q C = 0 and p hS = 0. The parameters of helper D's cache and the wireless links can be found in Tables 3 and 2, respectively.

Table 10 The values of q * S and q * D for which the weighted sum throughput is maximized when the queue at S is stable, α = 0.7, M U = 200, M D = 1000, and M S = 0

In Fig. 9, we plot the maximum weighted sum throughput versus λ for which the queue at S is stable when M S = 0 for different values of w. Regarding δ = 0.5, when w = 1/4, the maximum weighted sum throughput is decreasing with λ. When w ∈ {2/4, 3/4} and δ ∈ {0.5, 1.2}, or w = 1/4 and δ = 1.2, the maximum weighted sum throughput is an increasing function of λ.
In Table 10, we present the values of q * S and q * D that achieve the maximum weighted sum throughput when the queue at S is stable for different values of w. Recall that, in this specific scenario, q * C = 0, since helper S has no cache and, thus, cannot assist U. Therefore, S is only useful to helper D. We observe that the maximum weighted sum throughput is lower compared to the case when M D = 0 for δ = 0.5 and slightly higher for δ = 1.2 (compare with Table 8). Additionally, helper S should almost always serve D, and the latter should always assist user U to achieve the maximum weighted sum throughput.
In Table 11, we present the values of q * S and q * D that achieve the maximum weighted sum throughput when the queue at S is unstable for different values of w. In order to maximize the weighted sum throughput, helper S should always serve D for any values of δ and w apart from the case in which w = 1/4 and δ = 0.5, for which S should remain silent, since q * S = 0. Furthermore, helper D should always assist U's requests for every value of w and δ we used. The maximum weighted sum throughput is higher compared to the case in which M D = 0 (compare with Table 9) for every value of w and δ apart from the cases in which δ = 0.5 and w ∈ {1/4, 2/4}.

Average delay at user U
Here, we present the numerical results of the average delay experienced by user U to receive content from external sources. The delay analysis can be found in Sect. 4.

Table 11 The values of q * S and q * D for which the weighted sum throughput is maximized when the queue at S is unstable, α = 0.7

In the following plots, we study how the average arrival rate λ, the data center's random availability α, the probability q S that S attempts transmissions to D, the probability q D that D attempts transmissions to U, and the cache size M U at U affect the average delay at U. The wireless link characteristics can be found in Table 2. The cache sizes were set to hold M S = 2000 and M D = 1000 files at S and D, respectively, and we used two different values of δ to examine its effect on the realized average delay. Hence, the values of q U , p hD , and p hS were given by (21)-(23) depending on δ. Also, we set q C = 0.5.
In Fig. 10, the average delay versus the arrival rate λ at helper S is depicted for q_S = 0.9, q_D = 0.8, α = 0.7, and M_U = 200. We observe that the delay increases with the arrival rate, and the rate of increase is steeper when δ = 0.5 compared to δ = 1.2. As we explained in Sect. 2.2, higher values of δ concentrate more requests on a few most popular files. Therefore, for a given M_U, the higher the δ, the lower the q_U, i.e., user U requests files from external sources with lower probability, and the lower the cache hit probabilities p_hD and p_hS (for given M_D and M_S). Fewer requests for files from external sources require fewer transmissions to U and, hence, less interference is realized. Consequently, a lower average delay is experienced at U.
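The dependence of q_U, p_hD, and p_hS on δ described above can be made concrete with a small sketch. We assume, purely for illustration, a Zipf popularity law with exponent δ over a catalog of n_files, and a hypothetical hierarchical most-popular placement in which U holds the M_U most popular files, D the next M_D, and S the next M_S; the paper's actual values are given by (21)-(23).

```python
def zipf_probs(n_files, delta):
    # Zipf popularity: p_i proportional to 1 / i**delta, i = 1..n_files.
    weights = [1.0 / (i ** delta) for i in range(1, n_files + 1)]
    total = sum(weights)
    return [wgt / total for wgt in weights]

def placement_metrics(n_files, delta, m_u, m_d, m_s):
    # Hypothetical hierarchical placement (illustration only): U caches
    # the top m_u files, D the next m_d, and S the next m_s.
    p = zipf_probs(n_files, delta)
    q_u = sum(p[m_u:])                               # request misses U's cache
    p_hd = sum(p[m_u:m_u + m_d]) / q_u               # hit at D, given a miss at U
    p_hs = sum(p[m_u + m_d:m_u + m_d + m_s]) / q_u   # hit at S, given a miss at U
    return q_u, p_hd, p_hs
```

Under this placement, raising δ from 0.5 to 1.2 shrinks q_U, matching the trend discussed for Fig. 10.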
In Fig. 11, we present the average delay at U versus the data center's availability for two values of the arrival rate, λ = 0.2 and λ = 0.4. We observe that the delay is lower when λ = 0.2, since a higher average arrival rate is more likely to create a congested queue at S and, consequently, a higher delay. In the case λ = 0.2, the delay decreases as α increases, and the queue at helper S is stable for any α ∈ [0.2, 1]. Additionally, the decrease with α is steeper when δ = 1.2. When λ = 0.4 and δ = 0.5, the queue at S remains stable for α ∈ [0.2, 0.8] and the delay exhibits the non-monotonic behavior of Fig. 11b. For α ∈ [0.8, 1], the average delay starts decreasing with α and the queue at S is unstable. When δ = 1.2, the queue at S is stable for every value of α and the delay decreases with the increased availability of the data center.
In Fig. 12, we plot the average delay at U versus q_S for λ = 0.2 and λ = 0.4. We observe that, as long as the queue at S is unstable, the delay increases with q_S. This is expected since, as q_S increases, helper S attempts more transmissions to helper D; consequently, not only is S less likely to assist U, but U's probability of finding an available helper also decreases (since the S-D pair communicates more). Regarding the case in which the queue at S is stable, increasing q_S does not improve the delay. Moreover, a lower value of q_S is required to achieve queue stability at S when λ = 0.2 compared to λ = 0.4. This is expected since a higher average arrival rate requires a higher average service rate to maintain queue stability.
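The stability observation above can be sketched with a back-of-the-envelope check. Here p_success is a stand-in for the per-slot probability that a transmission from S to D succeeds (in the paper this depends on the link characteristics and interference); the queue at S is stable when the arrival rate λ is below the average service rate q_S · p_success.

```python
def service_rate(q_s, p_success):
    # Average departures per slot from S's queue: S attempts with
    # probability q_s, and each attempt succeeds with probability
    # p_success (a stand-in for the paper's link-success probability).
    return q_s * p_success

def min_qs_for_stability(lam, p_success):
    # The queue at S is stable when lam < q_s * p_success; return the
    # smallest attempt probability keeping it stable, or None if no
    # q_s in [0, 1] suffices.
    if lam >= p_success:
        return None
    return lam / p_success
```

Consistent with Fig. 12, doubling λ doubles the minimum q_S needed for stability, so the stable region shrinks.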
In Fig. 13, we demonstrate the average delay at U versus q_D for λ = 0.2 and λ = 0.4. In the former case, the delay decreases slightly as q_D increases. This can be attributed to helper D's increased assistance, which yields more transmissions to U and, hence, a potentially decreased delay. When λ = 0.4, the average delay decreases considerably with q_D when δ = 0.5 due to the increased assistance of helper D, but decreases only slightly when δ = 1.2. This is expected since, as we previously explained, higher values of δ concentrate requests on a few most popular files and, thus, U requests external content less often. As a result, the average delay at U is lower compared to lower δ values.

In Fig. 14, we show the average delay at U versus the cache size at U for λ = 0.2 and λ = 0.4. The cache size M_U affects the request probability for external content, q_U: as M_U increases, q_U decreases. In every case, the queue at S is stable. When λ = 0.2, the effect of M_U on the average delay at U is minor. However, when the average arrival rate is increased, increasing the cache size at U decreases the average delay, especially when δ is lower.

Summarized results
Here, we present a summary of the results in the previous parts of the manuscript. Denoting by T_w^* the maximum weighted sum throughput, the main observations are the following:

• If Q, i.e., the queue at S, is stable, T_w^* is achieved when the caching helper S is always available to serve the cached user U, i.e., q_C = 1, provided that every other parameter is fixed. Instead, if Q is unstable, it is not always optimal to set q_C = 1.
• T_w^* is almost constant or slightly decreasing with M_U, i.e., the cache size at U, when Q is stable, no matter what value of δ is assumed, given that every other parameter is fixed. This is not the case when Q is unstable, though. For instance, T_w^* decreases with M_U when w = 1/4 and δ = 0.5.

• When only one caching helper (either S or D) is available to assist U, T_w^* increases with the average arrival rate λ when δ = 1.2, provided that every other parameter is fixed. On the other hand, when δ = 0.5, the trend of T_w^* vs. λ varies with w.
Regarding D_U, i.e., the average delay realized by the cached user U given by (7), the main observations are the following:

• D_U increases with the average arrival rate λ, and the rate of increase is steeper when a lower δ is assumed, given that every other parameter is fixed.

• D_U decreases with α, i.e., the probability of the DC being available to U, when the average arrival rate is relatively low, e.g., λ = 0.2, provided that every other parameter is fixed. D_U will probably exhibit a different behavior with α for values that might cause an unstable queue at S.

• D_U increases with q_S, i.e., the probability of S being available for D, when the queue is unstable, given that every other parameter is fixed. On the other hand, when Q is stable, increasing q_S does not improve D_U.

• D_U decreases slightly with q_D, i.e., the probability of D being available for U, when the average arrival rate λ = 0.2, provided that every other parameter is fixed. For double λ, D_U decreases considerably with q_D when δ = 0.5 is assumed, and only slightly when δ = 1.2.
• D_U is not considerably affected when we vary the cache size M_U at U from 100 to 1000 and λ = 0.2, given that every other parameter is fixed. If the average arrival rate is doubled, then increasing M_U is beneficial in terms of D_U, especially when δ = 0.5.

Conclusion
In this paper, we studied the effect of multiple randomly available caching helpers on a wireless system that serves cacheable and non-cacheable traffic. We derived the throughput of a system consisting of a user requesting cacheable content from a pair of caching helpers within its proximity or from a data center. The helpers are assumed to exchange non-cacheable content as well as to assist the user's requests for cacheable content in a random manner. We optimized the probabilities with which the helpers assist the user's requests in order to maximize the system throughput. Moreover, we studied the average delay experienced by the user from the time it requests cacheable content until content reception. Our theoretical and numerical results provide insights into the throughput and delay behavior of wireless systems serving both cacheable and non-cacheable content with the assistance of multiple randomly available caching helpers.