RETRACTED ARTICLE: Support attack detection algorithm for recommendation system based on deep learning

Li, Xin; Wang, Zhixiao

doi:10.1186/s13638-023-02269-w

Research
Open access
Published: 20 July 2023

RETRACTED ARTICLE: Support attack detection algorithm for recommendation system based on deep learning

Xin Li^1,2 &
Zhixiao Wang¹

EURASIP Journal on Wireless Communications and Networking volume 2023, Article number: 63 (2023) Cite this article

1068 Accesses
1 Citations
Metrics details

This article was retracted on 14 May 2024

This article has been updated

Abstract

In order to enhance the performance of recommendation systems that support attack detection algorithms, we have designed a novel approach based on deep learning. Specifically, our algorithm focuses on improving convergence, user scoring accuracy, algorithm efficiency, and detection accuracy. To achieve this, we first construct a preliminary user rating matrix, which is optimised by incorporating the user preference word matrix and the weight of the preference word. Additionally, we adjust the size of the matrix using principal component analysis. Next, we construct a deep, bidirectional RNN model based on the deep learning network. This model is then combined with the user scoring matrix to identify the type of user. In the case of abnormal or false users, our algorithm is able to identify the recommendation system’s support attack through the detection results. The experimental results demonstrate the effectiveness of our algorithm. Specifically, our approach achieves fast convergence speeds, with the loss value remaining low throughout the process. Moreover, we achieve high average accuracy in user scoring, with a score of 97.14%. The detection time of the recommendation system support attack is also consistently lower than 0.7 s. Furthermore, our approach achieves an average accuracy of 98.09% in the detection of recommendation system support attacks. Overall, our algorithm shows promising results for practical applications in the field of recommendation systems.

1 Introduction

With the continuous improvement of Internet technology, the information in the network shows an explosive growth trend. In the face of such a large amount of information, how to save time and effort to find the information you need from this information has become a technical problem. To solve this problem effectively, the recommendation system was born and has been widely used in practise [1, 2]. Compared with traditional information search tools, this system can provide users with relevant and interesting information based on historical browsing data, which not only effectively improves the efficiency of information search but also presents the content that users are interested in, thus making many e-commerce enterprises profit [3]. Due to the strong participation of the recommendation system, the attacker will inject a large number of false user profiles into the system and never produce more favourable recommendation results. This attack mode is also called a trust attack. Some e-commerce companies officially recommend their products in these ways to gain more benefits, so detecting the recommendation system support attack and ensuring system security has become an important research topic in related fields.

To solve the above problems, reference [4] proposed a support attack detection algorithm based on mixed eigenvalues for recommendation systems. Crawler technology, the algorithm is mainly used to crawl in the recommendation system user data and background data, combined with the popular item card square estimate extraction with novelty item card recommendation system attack characteristics, according to the characteristics of the attributes of the collected data are classified, based on the results of classification, to distinguish between the normal and abnormal user and complete the recommendation system attack detection. But the convergence of the algorithm is poor, and the practical application effect is not good. Reference [5] proposes a support attack detection algorithm for recommendation systems based on improved K-means clustering. The improved K-means clustering algorithm is used to adaptively integrate recommendatory system data and extract prior knowledge of user characteristics and profiles. On this basis, a quadratic feature extraction method is used to extract the recommendation system support attack features, and the results are compressed so that the attack profile and the general profile are concentrated at both ends of the space to realise the recommendation system support attack detection. Data preparation and feature engineering are essential to constructing a network-based attack detection algorithm because they clean, standardise, and enrich the data with useful information. Data pre-treatment and feature engineering are crucial to developing the network-based attack detection method. We may increase the algorithm’s accuracy and reliability in recognising possible attacks by cleaning up consistent and relevant data and picking and creating valuable characteristics.

The categorization of support attacks and the score matrix in a recommender system have been shown to have an association with one another. Analysing user behaviour patterns in the score matrix is one method that may be used to detect support attacks. This method seeks to find variations in user behaviour that may be suggestive of manipulation. Algorithms based on machine learning may be used to categorise these patterns and determine the existence of possible support attacks. Utilising a score matrix that is more precise may help to increase the accuracy of the classification model, which in turn can be improved by adding deep learning methods and matrix resizing approaches. However, it is essential to keep in mind that support attacks may be intricate and challenging to identify and that the effectiveness of the classification model will be contingent on the quality and amount of the data that is readily accessible for examination. It is required to do further research and development in this area in order to increase the efficiency and precision of support attack detection systems.

A number of recent studies have been conducted on the topic of using deep learning algorithms to identify attacks on recommendation systems. To detect recommendation attacks, researchers have proposed many methods [6,7,8] by using traditional machine learning techniques, such as clustering techniques [9], hidden Markov models [10], and SVM [11,12,13]. Despite the effectiveness of these methods, a large number of them [14,15,16,17,18] are built based on hand-designed features, which is usually a challenging task to extract even for domain experts. The first deep learning-based detection method is built using a single convolutional neural network layer [16].

The authors of [19] provide a method that makes use of deep learning in order to identify fraudulent profiles that have been introduced into collaborative recommendation systems by dishonest users. In order to recognise shilling attacks against online recommender systems, the authors in [20] propose an adaptive RNN-based approach. A variety of machine learning methods and examples of their applicability to recommendation systems are discussed in detail in reference [21]. a technique based on deep learning that can identify many different kinds of cyberattacks on networks [22]. The research [23] offers a complete review and suggestions for feature assessment approaches for use in intrusion detection systems that make use of machine learning algorithms.

However, in practical application, it is found that the algorithm is difficult to reach expectations due to the low accuracy of attack detection in the recommendation system. Reference [24] proposes an integrated detection algorithm for support attacks based on a deep autoencoder for the recommender system. The wavelet transform method was used to grade the item’s popularity, and the popularity level matrix was constructed based on the user evaluation data. Combined with the matrix, the deep sparse autoencoder was used to extract the user rating features, and SVM was used to classify the user rating features to obtain the related support attack detection results of the recommendation system. However, the running efficiency of the algorithm is low, and the practical application effect is poor.

To effectively solve the problems of the above algorithms, a support attack detection algorithm based on deep learning was proposed, and the application effect of the algorithm was verified by experiments. Network attacks may affect a system’s performance. Let us consider that the support attack may flood the system’s servers with traffic, rendering the service inaccessible to users. Degraded performance or service interruption may ensue. This might lead to suggestions that don’t match the user’s tastes or are meant to influence them. We analyse user behaviour to protect the proposed recommendation system against network threats. Matrix factorization and matrix resizing have reduced the score matrix’s dimensionality and improved the recommendation system’s performance, making support attacks harder. Monitor the recommender system for unexpected trends and take security precautions to avoid attacks. This article presents the proposed methodology based on matrix factorization and resizing technique.

2 Methods

2.1 Support attack detection algorithm for recommendation system based on deep learning

2.1.1 Construction of user rating matrix

Suppose there are $M$ users $U = \left\{ {u_{1} ,u_{2} ,...,u_{m} } \right\}$, $N$ projects $I = \left\{ {i_{1} ,i_{2} ,...,i_{n} } \right\}$ in the system. $R \in r^{m \times n}$ is defined as the score matrix [25, 26], where the element $r_{ij}$ represents the score of the $i$-th user on the $j$-th item, and if there is no score, $r_{ij}$ is 0. The recommended system is shown in Fig. 1.

User score matrix $R$ can be expressed by the following formula:

$$R = \left[ {\begin{array}{*{20}c} {r_{11} } & {r_{12} } & {r_{13} } & \cdots & {r_{1n} } \\ {r_{21} } & {r_{22} } & {r_{23} } & \cdots & {r_{2n} } \\ {r_{31} } & {r_{32} } & {r_{33} } & \cdots & {r_{3n} } \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ {r_{m1} } & {r_{m2} } & {r_{m3} } & \cdots & {r_{mn} } \\ \end{array} } \right]$$

(1)

Due to the large amount of data in the recommendation system, the user score matrix preliminarily constructed cannot be used for accurate evaluation of users, so the matrix needs to be optimized [27].

It is assumed that each item has $t$ preference words in total, so preference words can be used to construct the recommendation system item preference word matrix, which is described as follows:

$$C = \left[ {\begin{array}{*{20}c} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 0 \\ \end{array} } \right]$$

(2)

In the above formula, 1 and 0, respectively, represent the existence or non-existence of the project preference word in the recommendation system, and $R$ and $C$ can be used to express the user’s interest in the preference word [28]. The specific formula is described as follows:

$$p_{ui} = \frac{1}{{\left| {I_{i} } \right|}}\sum\limits_{{j \in I_{i} }} {r_{uj} }$$

(3)

In the above formula, $p_{ui}$ represents the score of user $u$ on preference word $i$, $S_{i}$ represents the set of items including preference word $i$ reviewed by user $u$, and $r_{uj}$ represents the score of user $u$ on items in set $S_{i}$. The higher the score is, the more interested the user is in this kind of preference words [29]. Therefore, the matrix $P$ of user preference words can be calculated according to matrix $R$ and matrix $C$, as shown below:

$$P = \left[ {\begin{array}{*{20}c} {p_{11} } & {p_{12} } & {p_{13} } & \cdots & {p_{1t} } \\ {p_{21} } & {p_{22} } & {p_{23} } & \cdots & {p_{2t} } \\ {p_{31} } & {p_{32} } & {p_{33} } & \cdots & {p_{3t} } \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ {p_{m1} } & {p_{m2} } & {p_{m3} } & \cdots & {p_{mt} } \\ \end{array} } \right]$$

(4)

Through the above we can see that the user program rating matrix of $m$ row $n$ column into user preferences word $m$ line $t$ columns of the matrix, $t$ word number for project preference, far less than the number $n$ of the project, the user interest preference ratings can alleviate user project score matrix sparse sex [30, 31], so you need to need to calculate weight of the preference of word, the results are as follows:

$$\theta_{i} = \log \left( {N/N_{i} } \right)$$

(5)

In the above formula, $\theta_{i}$ represents the weight of preference word $i$ in each item, $N$ represents the total number of users in the recommendation system, and $N_{i}$ represents the number of users who have commented on preference word $i$.

Combined with the weight of the preference words, the user score matrix is optimized and the matrix $R^{^{\prime}}$ can accurately describe the user score is obtained. The score matrix of a recommender system is where all of the anticipated rating values for all of the users and products that are part of the system are stored. As a result of the fact that most users often only score a tiny portion of the objects in the system, the matrix is typically a sparse matrix, meaning that a vast percentage of the entries are zero. Collaborative filtering or other machine learning algorithms analyse user-item interaction data to discover trends and create predictions about user preferences. These algorithms are used to build the score matrix, which displays the user’s preferences. The anticipated ratings may be used to provide personalised suggestions for each user based on their previous actions, and this can be done using the data from the projected ratings. The score matrix is able to be changed in real time when fresh ratings data is gathered, which enables the recommender system to evolve and become more effective over time.

2.2 Matrix size adjustment

At present, principal component analysis (PCA), as a multivariable analysis method, has been widely applied in practice. This method can transform correlated high-dimensional index variables into low-dimensional unrelated index variables, and the transformed index variables are called principal components [32, 33]. When there are multiple variables that are connected with one another, principal component analysis (PCA) is used to both identify the most relevant characteristics or components from a dataset and to represent the data in a lower-dimensional space. PCA may be coupled with other methods, such as those for clustering or classification, either to pre-process the data or to increase the model’s accuracy. We have discovered that principal component analysis (PCA) is the method that works best for our studies taking into account the nature of the data, the purpose of the research, and the computing concerns involved.

Within the realm of recommender systems, there is a connection between deep learning approaches, matrix resizing, and the user score matrix. Analysing user-item interaction data and making educated guesses about user preferences may both be accomplished with the help of deep learning techniques. The score matrix that is produced as a consequence may be used to provide personalised suggestions for each user that are based on the user’s previous actions. Matrix downsizing methods may be used to lower the dimensionality of the score matrix without sacrificing any of the vital information it contains. This is necessary since the score matrix can be quite big and processing it can be very costly computationally. This has the potential to increase the speed and efficiency of the recommendation system. Matrix factorization methods, which use deep learning strategies to factorise the score matrix into lower-dimensional representations, are another tool that may be utilised to further enhance the recommender system’s performance. In general, a mix of these methodologies is used so that the performance of the recommender system may be improved, and consumers can get the most useful suggestions as a result.

Therefore, this paper uses this method to adjust the size of user scoring matrix so that the matrix is not applicable due to the large amount of data. The specific steps are as follows.

(1)
The original sample data were standardized with Z-Score. The original sample is an $m \times n$-order matrix $X$, $m$ are the sample number, and $n$ are the dimension of index variables. The sample matrix is normalized, and the transformation process is as follows:
$$Z_{ij} = \frac{{x_{ij} - \mu_{j} }}{{\sigma_{j} }},\quad i = 1,2, \ldots ,m;\quad j = 1,2, \ldots ,n$$
(6)

Then the following formula holds:

$$\mu_{j} = \overline{x}_{j} = \frac{{\sum\nolimits_{i = 1}^{m} {x_{ij} } }}{m}$$

(7)

$$\sigma_{j}^{2} = s_{j}^{2} = \frac{{\sum\nolimits_{i = 1}^{m} {\left( {x_{ij} - \overline{x}_{j} } \right)^{2} } }}{m - 1}$$

(8)

In the above formula, $\mu$ represents the average value of the indicator variable, and $\sigma$ represents the standard deviation of the indicator variable.

(2)
Find the covariance matrix. Sample data after standardized transformation is $m \times n$-order matrix $X$, $m$ are sample number, and $n$ are dimension of variables. Covariance matrix of correlation between variables is calculated [34], which is as follows:
$$C = \frac{1}{n - 1}X^{T} X$$
(9)
(3)
The covariance matrix is eigendecomposed. The covariance matrix is the $n \times n$-order matrix $C$ and $n$ are the dimensions of variables, and the eigenvalue decomposition of matrix $C$ is as follows:
$$C = U\sum {U^{T} }$$
(10)

In the above formula, diagonal matrix $\sum {}$ contains eigenvalues, and matrix $U$ contains corresponding eigenvectors [35].

(4)
The number of principal components is selected through cumulative contribution rate, as follows:
$$G\left( k \right) = \sum\limits_{i = 1}^{k} {\lambda \left( i \right)} /\sum\limits_{i = 1}^{n} {\lambda \left( i \right)}$$
(11)

In the above formula, $\lambda \left( i \right)$ represents the $i$-th largest eigenvalue, and $n$ represents the total number of eigenvalues. When the cumulative contribution rate is greater than 85%, the corresponding value of $k$ is the number of principal components selected.

(5)
Work out the principal component matrix. The principal component matrix S after dimensionality reduction is as follows:
$$S = XU$$
(12)

In the above formula, $X$ is the sample matrix of order $m \times n$, $U$ is the characteristic vector matrix corresponding to the principal component of order $n \times k$, $U$ is also known as the principal component load coefficient matrix, $m$ is the number of samples, $n$ is the dimension of variables, and $k$ is the number of principal components.

After standardization, singular value decomposition is performed on the scoring matrix, and then the user scoring matrix based on principal components is calculated, which can be adjusted according to different principal components. The derivation process is as follows:

$$X^{T} = U\sum {V^{T} }$$

(13)

$$C = \frac{{R^{\prime } }}{n - 1}X^{T} X = \frac{{R^{\prime } }}{n - 1}U\sum {V^{T} V\sum {U^{T} } } = \frac{{R^{\prime } }}{n - 1}U\sum {^{2} U^{T} }$$

(14)

In the above formula, $C$ is the covariance matrix and $X$ is the standardized scoring matrix. By singular value decomposition of $X$, user scoring matrix $U$ based on principal components can be obtained.

2.3 Deep learning network construction

Deep learning has been widely applied in various fields of society. It has very powerful data analysis ability and still performs well in complex situations. In order to improve user type recognition ability, recursive neural network is used as the deep learning network in this paper. The structure of recursive neural network is shown in Fig. 2:

In the preceding formula, $x_{t}$ represents the flow sequence entered in step $t$. $s$ represents the state of the hidden layer during the step, which is also the storage unit of RNN. The output layer is calculated by the following formula:

$$o_{t} = g\left( {Vs_{t} } \right)$$

(15)

The hidden layer can be calculated by the following formula:

$$s_{t} = f\left( {Ux_{t} + Ws_{t - 1} } \right)$$

(16)

If formula (16) is repeatedly substituted into formula (15), the following results can be obtained:

$$o_{t} = g\left( {Vs_{t} } \right) = Vf\left( {Ux_{t} + Wf\left( {Ux_{t - 1} + } \right.} \right.Wf\left( {Ux_{t - 2} + } \right.Wf\left( {Ux_{t - 3} + } \right. \cdots \left. {\left. {\left. {\left. {} \right)} \right)} \right)} \right)$$

(17)

To further improve the accuracy of user type recognition, a new reverse RNN is added to the original RNN to form a bidirectional RNN. In the forward calculation, the hidden layer values are related to $s_{t}$ and $s_{t - 1}$. The value of the hidden layer in the reverse calculation is related to $s{^{\prime}_{t}}$ and $s{^{\prime}_{t - 1}}$. So now the output is:

$$o_{t} = g\left( {Vs_{t} + V\prime s{^{\prime}_{t}} } \right)$$

(18)

$$s_{t} = f\left( {Ux_{t} + Ws_{t - 1} } \right)$$

(19)

$$s\prime_{t} = f\left( {U\prime^{{x_{t} }} + W\prime s\prime_{t - 1} } \right)$$

(20)

After adding more hidden layers, we arrive at a deep RNN model. In addition, we improve the hidden layer network to bidirectional hidden layer network. This is called depth bidirectional RNN. The depth bidirectional RNN model structure is shown in Fig. 3.

The calculation formula is updated as follows:

$$o_{t} = g\left( {V^{\left( i \right)} s_{t}^{\left( i \right)} + V\prime^{\left( i \right)} s\prime_{t}^{\left( i \right)} } \right)$$

(21)

$$s_{t}^{\left( i \right)} = f\left( {U^{\left( i \right)} s_{t}^{{\left( {i - 1} \right)}} + W^{\left( i \right)} s_{t - 1} } \right)$$

(22)

$$s^{\prime\left( i \right)}_{t} = f\left( {U{^\prime{\left( i \right)}} s^{\prime{\left( {i - 1} \right)}}_{t} + W^{\prime{\left( i \right)}} s^{\prime}_{t + 1} } \right)$$

(23)

$$s_{t}^{\left( 1 \right)} = f\left( {U^{\left( 1 \right)} x_{t} + W^{\left( 1 \right)} s_{t - 1} } \right)$$

(24)

$$s^{\prime\left( 1 \right)}_{t} = f\left( {U^{\prime{\left( 1 \right)}} x_{t} + W^{\prime{\left( 1 \right)}} s{^\prime}_{t + 1} } \right)$$

(25)

These additional hidden layers improve user type recognition. The additional hidden layer brings a more nuanced learning ability that can cope with the complexity of the environment in which the recommendation system is used.

2.4 Generation of attack detection algorithms

In this paper, user score matrix is used to identify users, including target users, real users and fake users [36], and the user with the highest similarity in user $i$’s friend set is selected as the associated user, so as to predict user $i$’s score on the target project. The user rating matrix and user relationship matrix are shown in Table 1 and Table 2, respectively.

Table 1 Sample user score matrix

Full size table

Table 2 Example user relationship matrix

Full size table

Table 2 is the user relationship matrix. When the value is 1, it means that the user is a friend; when the value is 0, it means that no friend relationship has been established.

In the above table, Alice refers to the target user, Useri refers to the real user in the system, and Attackeri refers to the fake user injected by the attacker. Take $K = 2$ as an example. For Alice, when $n$ false users are not injected, the associated user set that meets the conditions is normal user 1 and normal user $m$ in the system, because these two users have a high similarity with Alice and have established a relationship with Alice. After the fake user is injected by the attacker, since both fake user 1 and $p$ have social relations with Alice, and their score similarity is higher than other normal users who have social relations with Alice, Alice’s associated user set is false user 1 and false user $n$. Notice that although the score of real user 2 is relatively similar to that of Alice, it is still not included in Alice’s associated user set due to the fact that real user 2 is not associated with Alice. Because the recommendation result of social recommendation to Alice depends on the associated user set, Alice’s score will be completely manipulated by fake users.

In order to improve the convergence level of the recommendation system’s support attack detection algorithm, user scoring accuracy, algorithm operating efficiency and support attack detection accuracy, a support attack detection algorithm based on deep learning is designed for the recommendation system. The algorithm production process is shown in Fig. 4.

3 Application examples

The purpose of a recommendation system’s support attack detection algorithm can be to identify and thwart attempts at attacking the algorithm itself. For instance, some attackers may attempt to manipulate the system by submitting fabricated rating data or by establishing fictitious user profiles in order to slant the recommendations in favour of certain products. The recommendation system is able to recognise any abnormalities or suspicious behaviours in the rating data or user behaviour by using an algorithm for the identification of Support attacks, and then it may take the right measures to prevent the attacks from happening. This may include reporting questionable accounts for further investigation, screening out phoney ratings or reviews, or changing the recommendation algorithm to account for the bias produced by the attack. Alternatively, this may involve removing fraudulent ratings or reviews.

In this approach, the support attack detection method may contribute to improving the accuracy and fairness of the recommendation system, which in turn can improve the user experience and promote user engagement.

Example Scenario 1

Consider that the support attack may flood the system’s servers with traffic, rendering the service inaccessible to users. Degraded performance or service interruption may ensue. This might lead to suggestions that don’t match the user’s tastes or are meant to influence them. We analyse user behaviour to protect the proposed recommendation system against network threats. Matrix factorization and matrix resizing have reduced the score matrix’s dimensionality and improved the recommendation system’s performance, making support attacks harder. Monitor the recommender system for unexpected trends and take security precautions to avoid attacks. This article presents the proposed methodology based on matrix factorization and resizing technique.

Example Scenario 2

The purpose of a recommendation system’s support attack detection algorithm can be to identify and thwart attempts at attacking the algorithm itself. For instance, some attackers may attempt to manipulate the system by submitting fabricated rating data or by establishing fictitious user profiles in order to slant the recommendations in favour of certain products.The recommendation system is able to recognise any abnormalities or suspicious behaviours in the rating data or user behaviour by using an algorithm for the identification of Support attacks, and then it may take the right measures to prevent the attacks from happening. This may include reporting questionable accounts for further investigation, screening out phoney ratings or reviews, or changing the recommendation algorithm to account for the bias produced by the attack. Alternatively, this may involve removing fraudulent ratings or reviews.

4 Comparative experiments

In order to verify the experimental application effect of the deep-learning-based support attack detection algorithm for the recommendation system designed in this paper, an experimental test is conducted. The overall test environment is shown in Table 3.

Table 3 Test environment

Full size table

Data sets used in this experiment mainly include Movie Lens Movie evaluation data set, Netflix data set, Eachmovie data set, etc. The data set information is shown in Table 4.

Table 4 Data set information

Full size table

The supporting attack detection algorithm of recommendation system based on mixed eigenvalue designed in reference [4], the supporting attack detection algorithm of recommendation system based on improved K-means clustering designed in reference [5] and the supporting attack detection algorithm of recommendation system based on deep learning designed in this paper are taken as experimental comparison algorithms. The practical application effect of different algorithms is verified by comparing their convergence, user scoring accuracy, algorithm running efficiency and support attack detection accuracy.

4.1 Convergence of algorithm

The convergence of the detection algorithm based on mixed eigenvalue, the detection algorithm based on improved K-means clustering and the detection algorithm based on deep learning are compared, and the comparison results are shown in Fig. 5. It is worth mentioning that reaching quick convergence speed is a crucial aspect in the construction of any machine learning algorithm, as it may directly influence the efficiency and efficacy of the algorithm. This is something that should be kept in mind while developing any machine learning algorithm.

Analysis diagram 5 algorithm convergence is more the result shows that with the increase of the number of iterations, the detection algorithm based on mixed characteristic value of the loss value has remained at a high level, and still hasn’t been convergence in the process of 100 iterations, and the detection algorithm based on improved k-means clustering loss value with the increase of the number of iteration times of linear downward trend, It shows that the convergence of the algorithm is poor. Compared with these two algorithms, the detection algorithm based on deep learning achieves convergence in 20 iterations with fast convergence speed and a low loss value, indicating that the algorithm has good convergence.

4.2 User scoring accuracy

The user scoring accuracy of the detection algorithm based on mixed eigenvalue, the detection algorithm based on improved K-means clustering and the detection algorithm based on deep learning are compared, and the results are shown in Table 5.

Table 5 User scoring accuracy (unit: %)

Full size table

By analyzing the data in Table 5, it can be seen that the user scoring accuracy of different algorithms always fluctuates with the increase of the number of experiments. The maximum user scoring accuracy of the detection algorithm based on mixed eigenvalue is 85.62%, the minimum value is 80.26%, and the average value is 83.92%. The maximum user scoring accuracy of the detection algorithm based on improved K-means clustering is 86.52%, the minimum is 74.13%, and the average is 78.54%. The maximum user scoring accuracy of the detection algorithm based on deep learning is 98.55%, the minimum value is 96.35%, and the average value is 97.14%, indicating that the user scoring accuracy of the algorithm designed in this paper is higher than that of the detection algorithm based on mixed eigenvalue and the detection algorithm based on improved K-means clustering.

4.3 Efficiency of the algorithm

Algorithm efficiency is the most important index for speed measurement algorithm, which mainly use the testing time under different amount of data to reflect, therefore compared the detection algorithm based on mixed characteristic value, the detection algorithm based on improved k-means clustering, the detection algorithm based on depth study of the efficiency of the algorithm, the result is shown in Fig. 6.

Analysis of the data in Fig. 6 shows that the detection time of the detection algorithm based on mixed eigenvalues varies between 1.2 and 2.2 s, and that of the detection algorithm based on improved K-means clustering varies between 1.7 and 2.2 s. Compared with these two algorithms, the detection time of the detection algorithm based on deep learning is always lower than 0.7 s. It shows that this algorithm can quickly detect the recommendation system support attack, and the algorithm is more efficient.

4.4 Accuracy of support attack detection

The detection accuracy of recommendation system support attack based on the detection algorithm based on mixed eigenvalue, the detection algorithm based on improved K-means clustering and the detection algorithm based on deep learning are compared, and the results are shown in Table 6.

Table 6 Accuracy of support attack detection (unit: %)

Full size table

By analyzing the data in Table 6, it can be seen that the average accuracy of recommendation system support attack detection of the detection algorithm based on mixed eigenvalue is 79.84%, and that of the detection algorithm based on improved K-means clustering is 86.20%. Compared with these two algorithms, the average accuracy of the recommendation system support attack detection algorithm based on deep learning is 98.09%, which is much higher than the two algorithms, indicating that this algorithm can achieve accurate support attack detection in the recommendation system.

4.5 F-measure and average sparsity level

When evaluating the performance and robustness of a proposed algorithm for a support network attack recommendation system, it is important to use appropriate evaluation metrics to measure its effectiveness. We have now included the performance results in terms of average sparsity level and F-measure [37] for the datasets considered in this research work.

Due to the fact that customers often only evaluate a limited number of products, the sparsity of the rating data is typically rather high. Figure 7 shows the comparison of F-measure vs dimensions of rating matrix. The F-measure, is the harmonic mean of the recall and accuracy values. In order to analyse the detection results of the validation set and determine the ideal hyper parameters, we make use of the f-measure. The average sparsity level of the resize rating matrix is shown in Fig. 8, along with the detection results for various dimensions of the resized rating matrix. As can be seen in Fig. 8, the average sparsity level rises as the number of dimensions of any and all datasets that are employed in our tests becomes larger. The F-measure does not achieve its maximum value when the sparsity level is either too high or too low. The f-measure displays the greatest possible value when the dimension is set to the value 70.

Support attacks, in which malevolent users attempt to influence the recommendation system by submitting fraudulent ratings or reviews, have the potential to have a substantial impact on the system’s accuracy as well as its usefulness. Support attack detection algorithms are able to monitor user ratings and discover trends or abnormalities that point to an ongoing support attack. This enables the system to identify and get rid of any phoney reviews that have been posted.

There are a few different suggested approaches for the identification of support attacks in recommender systems. These include the creation of user profiles based on rating behaviour and the detection of deviations from predicted patterns of behaviour. By analysing user-item interaction data and creating more accurate predictions, deep learning approaches like matrix factorization have shown that they have the potential to improve the accuracy of recommender systems. In addition, matrix shrinking methods may be employed to decrease the dimensionality of the score matrix, which in turn makes the recommendation system more effective.

Although it is vital to keep in mind that support attack detection techniques may help enhance the accuracy of recommender systems, it is also important to keep in mind that these algorithms are not infallible and can be susceptible to attacks that are well devised.

5 Conclusion

The rapid growth and development of e-commerce has transformed people’s production and lifestyle. However, as e-commerce applications continue to expand, the occurrence of trust attacks has become a pressing social issue. Therefore, it is imperative to develop new trust attack detection algorithms for recommendation systems. Currently, traditional recommendation system support attack detection algorithms suffer from issues such as low precision, algorithm convergence, and efficiency in user ratings. To address these limitations, this paper proposes a novel recommendation system-based deep learning attack detection algorithm. Experimental results demonstrate that our algorithm has a fast convergence speed and consistently maintains a low loss value. Additionally, our algorithm achieves high accuracy in user scoring, with a maximum accuracy of 98.55%, a minimum of 96.35%, and an average of 97.14%. The detection time of our algorithm is always less than 0.7 s, and the average accuracy of recommendation system support attack detection is 98.09%. These results fully resolve the issues existing in traditional algorithms, ensuring the information security of e-commerce and promoting its further development. Overall, our proposed algorithm offers an effective solution to the problem of trust attacks in recommendation systems. It has significant implications for the development of e-commerce and represents a valuable contribution to the research in the field of information security.

Availability of data and materials

Data sharing not applicable to this article as no datasets were generated or analysed during the current study.

Change history

14 May 2024
This article has been retracted. Please see the Retraction Notice for more detail: https://doi.org/10.1186/s13638-024-02366-4

References

K.J. Lee, Y. Hwangbo, B. Jeong et al., Extrapolative collaborative filtering recommendation system with Word2Vec for purchased product for SMEs. Sustainability 13(13), 7156–7163 (2021)
Article Google Scholar
J. Anitha, M. Kalaiarasu, Optimized machine learning based collaborative filtering (OMLCF) recommendation system in e-commerce. J. Ambient. Intell. Humaniz. Comput. 12(4), 1–15 (2021)
Google Scholar
Q. Yang, A robust recommended system based on attack detection. Concurr. Comput. Pract. Exp. 31(12), e4660.1-e4660.9 (2019)
Article Google Scholar
M.N. Lei, A.L. Ding, X.M. Wang et al., Tuo attack detection algorithm based on mixed eigenvalues. Comput. Technol. Dev. 31(10), 87–92 (2021)
Google Scholar
J. Satellite, H. Li, Research on semi supervised detection of confusion trust attack for commodity recommendation system. Sci. Technol. Dev. 16(9), 1125–1133 (2020)
Google Scholar
P.A. Chirita, W. Nejdl, C. Zamfir, Preventing shilling attacks in online recommender systems, in Proceedings of the 7th annual ACM international workshop on web information and data management (2005), pp. 67–74
B. Mehta, T. Hofmann, P. Fankhauser, Lies and propaganda: detecting spam users in collaborative filtering, in Proceedings of the 12th international conference on intelligent user interfaces (2007), pp. 14–21
B. Mehta, W. Nejdl, Unsupervised strategies for shilling detection and robust collaborative filtering. User Model User-adapt Interact 19(1–2), 65–79 (2009)
Article Google Scholar
J.S. Lee, D. Zhu, Shilling attack detection-a new approach for a trustworthy recommender system. INFORMS J. Comput. 24(1), 117–131 (2012)
Article Google Scholar
Z. Zhang, Z. Zhang, P. Zhang, S. Wang, UD-HMM: an unsupervised method for shilling attack detection based on hidden markov model and hierarchical clustering. Knowl. Based Syst. 148, 146–166 (2018)
Article Google Scholar
R. Burke, B. Mobasher, C. Williams, R. Bhaumik, Classification features for attack detection in collaborative recommender systems, in Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining (2006), pp. 542–547
C.A. Williams, B. Mobasher, R. Burke, Defending recommender systems: detection of profile injection attacks. Server Orient. Comput. Appl. 1(3), 157–170 (2007)
Article Google Scholar
F. Zhang, Q. Zhou, HHTSVM: an online method for detecting profile injection attacks in collaborative recommender systems. Knowl. Based Syst. 65, 96–105 (2014)
Article Google Scholar
Z. Yang, L. Xu, Z. Cai, Re-scale adaboost for attack detection in collaborative filtering recommender systems. Knowl. Based Syst. 100, 74–88 (2016)
Article Google Scholar
W. Zhou, J. Wen, Q. Xiong, M. Gao, J. Zeng, SVM-TIA A shilling attack detection method based on SVM and target item analysis in recommender systems. Neurocomputing 210, 197–205 (2016)
Article Google Scholar
C. Tong, X. Yin, J. Li, T. Zhu, R. Lv, L. Sun et al., A shilling attack detector based on convolutional neural network for collaborative recommender system in social aware network. Comput. J. 61(7), 949–958 (2018)
Article Google Scholar
Z. Wu, J. Gao, B. Mao, Y. Wang, Semi-SAD: applying semi-supervised learning to shilling attack detection, in Proceedings of the 5th ACM conference on recommender systems (2011), pp. 289–92
Z. Wu, J. Wu, J. Cao, D. Tao, HySAD: a semi-supervised hybrid shilling attack detector for trustworthy product recommendation, in Proceedings of the ACM SIGKDD conference on knowledge discovery and data mining (2012), pp. 985–93
Q. Zhou, J. Wu, L. Duan, Recommendation attack detection based on deep learning. J. Inf. Secur. Appl. 52, 102493 (2020)
Google Scholar
A.B. Chopra, V.S. Dixit, An adaptive RNN algorithm to detect shilling attacks for online products in hybrid recommender system. J. Intell. Syst. 31, 1133–1149 (2022)
Google Scholar
H. Iqbal, Sarker, machine learning: algorithms, real-world applications and research directions. SN Comput. Sci. 2, 160 (2021)
Article Google Scholar
Y. Wu, D. Wei, J. Feng, Network attacks detection methods based on deep learning techniques: a survey. Secur. Commun. Netw. 8872923 (2020)
A. Binbusayyis, T. Vaiyapuri, Comprehensive analysis and recommendation of feature evaluation measures for intrusion detection. Heliyon 6(7), e04262 (2020)
Article Google Scholar
Y. Hao, F. Zhang, Integrated detection method of Tuo attack based on depth automatic encoder. Comput. Eng. Appl. 55(1), 9–22 (2019)
Google Scholar
A. Br, B. Jks, C. Np et al., Enriching “user item rating matrix” with resource description framework for improving the accuracy of recommendation in E-learning environment—ScienceDirect. Mater. Today Proc. 70(9), 1–12 (2020)
Google Scholar
Y. Zhang, C. Zhao, M. Chen et al., Integrating stacked sparse auto-encoder into matrix factorization for rating prediction. IEEE Access 9(1), 17641–17648 (2021)
Article Google Scholar
A. Kutlimuratov, A. Abdusalomov, T.K. Whangbo, Evolving hierarchical and tag information via the deeply enhanced weighted non-negative matrix factorization of rating predictions. Symmetry 12(11), 1930–1945 (2020)
Article Google Scholar
M. Behrisch, T. Schreck, H. Pfister, GUIRO: user-guided matrix reordering. IEEE Trans. Visual Comput. Graphics 26(1), 184–194 (2020)
Google Scholar
J.C. Duan, S. Li, Enhanced PD-implied ratings by targetingthe credit rating migration matrix. J. Finance Data Sci. 11(2), 1–12 (2021)
MathSciNet Google Scholar
I. Gupta, A.K. Singh, GUIM-SMD: guilty user identification model using summation matrix-based distribution. IET Inf. Secur. 14(6), 773–782 (2020)
Article Google Scholar
W. Zhang, X. Li, J. Li et al., A two-stage rating prediction approach based on matrix clustering on implicit information. IEEE Trans. Comput. Soc. Syst. 99, 1–19 (2020)
Google Scholar
F. Caridi, A.F. Mottese, M. Messina et al., Fatty acids evaluation by principal component analysis for the traceability of sicilian and calabrian olive oils. Curr. Nutr. Food Sci. 17(2), 16–23 (2021)
Google Scholar
W. Huang, H.J. Liu, Y.F. Ma, Drivability evaluation model using principal component analysis and optimized extreme learning machine. J. Vib. Control 25(16), 2274–2281 (2019)
Article Google Scholar
S. Salata, C. Grillenzoni, A spatial evaluation of multifunctional ecosystem service networks using principal component analysis: a case of study in Turin, Italy. Ecol. Ind. 127(1), 107758–107767 (2021)
Article Google Scholar
X. Ni, H. Wang, C. Che et al., Civil aviation safety evaluation based on deep belief network and principal component analysis—ScienceDirect. Saf. Sci. 112(1), 90–95 (2019)
Article Google Scholar
C.M. Wang, Mathematical modeling and Simulation of network attack detection based on incremental learning. Comput. Simul. 38(1), 273–276 (2021)
Google Scholar
D.J. Hand, R.J. Till, A simple generalisation of the area under the ROC curve for multiple class classification problems. Mach. Learn. 45(2), 171–186 (2001)
Article Google Scholar

Download references

Acknowledgements

Intelligent Transportation Application Research Based on Large Scale Vehicle Spacetime Track Data, Zhengzhou Shengda School of Economics and Management, Li Xin, Zhengzhou Municipal Bureau of Science and Technology (No. 232102211028). Henan Province Science and Technology Research Project, based on multi-source transportation big data driven regional transportation collaborative control algorithm, model, and software (No. 222102210110). Research on fraud attack detection method in collaborative filtering recommendation system, key scientific research project of colleges and universities in Henan Province (NO: 22B520048).

Author information

Authors and Affiliations

School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, Jiangsu, China
Xin Li & Zhixiao Wang
School of Information Engineering, Zhengzhou Shengda University, Zhengzhou, 451191, Henan, China
Xin Li

Authors

Xin Li
View author publications
You can also search for this author in PubMed Google Scholar
Zhixiao Wang
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

XL conceived of the study, and participated in its design and coordination and helped to draft the manuscript, ZXW participated in the design of the study and performed the statistical analysis, LA Modified the paper.

Corresponding author

Correspondence to Zhixiao Wang.

Ethics declarations

Competing interests

The author declares that there is nothing to declare.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article has been retracted. Please see the retraction notice for more detail:https://doi.org/10.1186/s13638-024-02366-4

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Li, X., Wang, Z. RETRACTED ARTICLE: Support attack detection algorithm for recommendation system based on deep learning. J Wireless Com Network 2023, 63 (2023). https://doi.org/10.1186/s13638-023-02269-w

Download citation

Received: 10 April 2023
Accepted: 20 June 2023
Published: 20 July 2023
DOI: https://doi.org/10.1186/s13638-023-02269-w

RETRACTED ARTICLE: Support attack detection algorithm for recommendation system based on deep learning

Abstract

1 Introduction

2 Methods

2.1 Support attack detection algorithm for recommendation system based on deep learning

2.1.1 Construction of user rating matrix

2.2 Matrix size adjustment

2.3 Deep learning network construction

2.4 Generation of attack detection algorithms

3 Application examples

Example Scenario 1

Example Scenario 2

4 Comparative experiments

4.1 Convergence of algorithm

4.2 User scoring accuracy

4.3 Efficiency of the algorithm

4.4 Accuracy of support attack detection

4.5 F-measure and average sparsity level

5 Conclusion

Availability of data and materials

Change history

14 May 2024

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords