BRScS: A Hybrid Recommendation Model Fusing Multi-source Heterogeneous Data

Recommendation systems are often used to solve the problem of information overload on the Internet. Many types of data can be used for recommendations, and fusing different types of data can make recommendations more accurate. However, most existing fusion recommendation models simply combine the recommendation results from different data sources instead of fully fusing multi-source heterogeneous data. Furthermore, users' choices are usually affected by the preferences of their direct and even indirect friends. This paper proposes a hybrid recommendation model named BRScS (an acronym for BPR-Review-Score-Social). It fully fuses social data, scores, and reviews, uses an improved BPR model to optimize the ranking, and trains them in a joint representation learning framework to obtain top-N recommendations. A user trust model is used to introduce social relationships into the rating and review data, the PV-DBOW model is used to process the review data, and a fully connected neural network is used to process the rating data. Experiments on the Yelp public dataset show that the proposed BRScS algorithm outperforms other recommendation algorithms such as BRSc, UserCF, and HRSc. The BRScS model is also scalable and can easily fuse new types of data.


Introduction
With the development of information technology, efficiently and quickly finding valuable information in massive data has become a major challenge for users. To solve the problem of Internet information overload and enable users to quickly obtain interesting information, recommendation systems came into being. A recommendation system essentially abstracts a user's interest characteristics from disorganized raw data and mines the user's preferences to recommend items or services [1]. Recommendation systems have been successfully applied in many fields, including social networks (Facebook, Twitter), e-commerce (Amazon, Alibaba, Netflix), and information retrieval (Google, Baidu, Yahoo) [2,3,4]. Owing to the good performance of deep learning in representation learning [8], research on data feature fusion has received extensive attention. A fusion model based on data features merges the features of heterogeneous data by means such as averaging or concatenation [9]. When new heterogeneous data are added to such a model, there is no need to redesign and retrain the original model, so the model is flexible and scalable.
Based on these observations, and considering the impact of users' direct and indirect friends on their decisions, this paper proposes a hybrid model based on social relationships that fuses multi-source heterogeneous data. The main contributions are as follows:
• This paper introduces the social network into the recommendation algorithm. Not only the influence of direct friends on users' decisions is considered, but also that of indirect friends.
• A joint representation learning model that fuses scores, reviews, and social relations is proposed. It fuses different types of data from the data source perspective rather than combining the recommendation results of different recommendation models.
• Experiments are performed to compare the proposed model BRScS with other recommendation algorithms such as BSc, UserCF, BRSc, and HRSc. BRScS performs better than the other models in terms of Precision, Recall, and HT.
The rest of this paper is organized as follows. Section 2 introduces related work on recommendation algorithms. Section 3 illustrates our hybrid recommendation model based on deep learning; the algorithms for the different types of data are analyzed, and the objective function fusing reviews, scores, and social information is derived. Section 4 proves the effectiveness of adding social network information and compares the proposed model with other recommendation algorithms through experiments. Section 5 concludes the paper and discusses future work.

Content-based Recommendation
A content-based recommendation algorithm analyzes users' preferences for items and recommends items whose features are similar to those of items the users like. Based on the deep structured semantic model (DSSM), Elkahky [10] proposed a multi-view deep neural network that learns the features of users and items separately; the recommended items are determined by calculating the similarity between users and items. Zheng [11] proposed a deep collaborative neural network (DeepCoNN) model that uses two parallel neural networks to learn features from reviews and then constructs an interaction layer on top of the two networks to predict the user's score. Also based on DSSM, Xu [12] proposed label-based item recommendation, which feeds user information and label-related item information into two deep neural networks respectively and makes recommendations by calculating the similarity of the abstract features of the user and the item. Seo [13] presented an attention-based convolutional neural network (CNN) model that combines reviews and ratings for product recommendation.
Deep learning can effectively alleviate the cold start problem for new items. At the same time, it can integrate feature extraction and recommendation into a unified framework. However, a content-based recommendation algorithm only recommends items similar to those in a user's history; it cannot recommend novel items of interest or make cross-category recommendations.

Collaborative Filtering Algorithm
Collaborative filtering is the most widely used class of algorithms in recommendation systems. The main idea is to find similarities between users or between items and use these similarities to make recommendations [14]. Collaborative filtering algorithms can be divided into user-based, item-based, and model-based algorithms. The user-based collaborative filtering algorithm is the earliest recommendation algorithm. It first calculates the similarity between users and selects the users most similar to the target user; it then recommends the items those similar users selected to the target user [15]. Currently, the item-based collaborative filtering algorithm is used broadly in industry; it recommends items similar to the ones a user already likes [16]. Model-based collaborative filtering algorithms mainly recommend items through machine learning and data mining models [17].
Salakhutdinov [18] first applied deep learning to the recommendation problem and proposed a collaborative filtering model based on the restricted Boltzmann machine (RBM). Sedhain [19] proposed an autoencoder-based collaborative filtering method that uses an encoding and a decoding process to produce an output and optimizes the model parameters by minimizing the reconstruction error. Wu [20] used a denoising autoencoder to solve the top-N recommendation problem and proposed a collaborative denoising autoencoder model; it makes recommendations by taking the user's rating vector as input and learning the user's low-dimensional vector representation. Covington [21] proposed a deep collaborative filtering model that first uses a deep candidate generation model to retrieve a candidate set and then uses a deep ranking model to sort the candidate videos; it is superior to matrix-factorization-based models.
The biggest advantage of deep-learning-based collaborative filtering is that it introduces nonlinear feature transformations into learning the implicit representations of users and items [22]. Compared with traditional collaborative filtering, it performs better. However, new items cannot be recommended because they have not yet been rated; the algorithm cannot solve the data sparsity and cold start problems.

Hybrid Model
A hybrid model combines different recommendation models to exploit their respective merits and avoid their disadvantages [23]. Commonly used hybrid recommendation algorithms include the weighted hybrid, cross-harmonic, and meta-model hybrid recommendation algorithms [24]. For example, Lee [25] learned semantic representations from the context of user conversations by combining recurrent and convolutional neural networks. Dai [26] proposed a dynamic recommendation algorithm that combines a convolutional neural network with a multivariate point process by learning a co-evolutionary model of the implicit features of users and commodities.
To sum up, although deep-learning-based recommendation on single- or dual-source data has achieved good results, the recommendation accuracy is still limited [27,28,29,30]. The reason is that most hybrid recommendation models utilize only a few kinds of heterogeneous data. With the development of the Internet, more and more data can be obtained, so using deep learning to fuse multiple kinds of heterogeneous data at the data source layer to improve recommendation accuracy is still worth studying [31,32,33,34].

Overview
This paper proposes a recommendation model based on deep learning that can process three kinds of multi-source heterogeneous data: scores, reviews, and social information.
For scores, the traditional matrix factorization method suffers from sparse data and low accuracy, so this paper adopts a neural network to transform scores into user/item representations. For reviews, the traditional topic model cannot accurately represent the characteristics of the text, so this paper utilizes the Distributed Bag of Words version of Paragraph Vector (PV-DBOW) algorithm to learn feature representations of reviews. PV-DBOW assumes that the words in a document are independent and unordered, and uses the document vector representation to predict the words with high accuracy. For social network data, this paper takes into account the impact of friends on users' selections, introduces a user trust model, and integrates the social relationship information into the pairwise learning method, which improves the accuracy of the recommendation results.

Recommendation process
Due to the heterogeneity of different data, the traditional hybrid recommendation model usually fuses data at the algorithm level [30], i.e., it makes the final recommendation by combining the results of algorithms based on different data. With the development of deep learning, multi-source heterogeneous data such as scores and reviews can be accurately represented through deep networks, which makes it possible to fully fuse multi-source heterogeneous data at the data source level [30]. The multi-source heterogeneous data recommendation model proposed in this paper combines ratings, reviews, and social network information to make more accurate recommendations. It has the advantages of high accuracy and strong scalability.
The recommendation process is shown in Figure 1. A score is a user's overall evaluation of an item and reflects the user's satisfaction with it; a multi-layer fully connected neural network is used to directly learn the feature vector representations of the user and the item. Reviews reflect users' evaluations of items in detail and contain rich information about users and items; the PV-DBOW algorithm is used to learn the feature representation of each paragraph and thereby obtain the feature vector representations of the user and the item. The social network reflects friendships between users; since the preferences of a user's friends indirectly affect the user's choices, the social network can be used to improve the prediction of a user's potential purchase behavior. The Bayesian Personalized Ranking (BPR) model is used to rank the nonlinear characteristics of users and items and further improve the accuracy of the recommendation results.

Recommendation Model
The recommendation model for multi-source heterogeneous data consists of four steps. First, construct the user-item triplet optimization model. Second, extract social relations from the social network and fuse the social relation data, review data, and scores together. Third, obtain the feature representations of users and items through deep learning. Finally, derive a top-N recommendation list from the feature representations of users and items. The model is described in detail below.

User Trust Model
Social networks can reflect the friendship between users.In real life, users are more likely to choose items that their friends buy or like.Thus, a user's behavior and preferences can be more precisely predicted based on the user's direct and indirect friend relationship.
The trust-based recommendation model assumes that users have preferences similar to those of the users they trust. In general, direct and indirect friends affect a user's decisions to different degrees, and indirect friends have less impact than direct friends. Following the six degrees of separation concept [35,36], the similarity between users can be defined as in (1), where a and b represent any two users, l_ab represents the distance between user a and user b (the distance of a direct friend is 1, and the distance of an indirect friend is 2, 3, 4, ...), and s(a, b) represents the similarity between the two users. Figure 2 shows the distance values between users.
The similarity between users can thus be calculated from the distances between them. We call a direct friend a first-degree friend, an indirect friend at distance 2 a second-degree friend, and so forth. We consider indirect friends up to distance 6 at most, so we name the model the 6-Degree Model. Algorithm 1 gives the model's pseudocode and shows how to calculate the similarities between users. After the similarities between users are obtained from the social network, the influence of different friends on a user's selection can be derived and then input into a unified joint representation learning framework together with the other types of data.
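The distance computation above can be sketched as a breadth-first search over the social graph. This is an illustrative sketch, not the paper's Algorithm 1; it assumes the common decaying form s(a, b) = 1 / l_ab for equation (1), truncated at six degrees, since the exact formula is not reproduced in the text.

```python
from collections import deque

def user_similarities(friends, source, max_depth=6):
    """BFS over the social graph from `source`; similarity decays
    with distance, assuming s(a, b) = 1 / l_ab (an assumption),
    truncated at six degrees of separation."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        a = queue.popleft()
        if dist[a] == max_depth:
            continue
        for b in friends.get(a, ()):
            if b not in dist:
                dist[b] = dist[a] + 1
                queue.append(b)
    return {b: 1.0 / l for b, l in dist.items() if l > 0}

# Toy graph: u1-u2, u2-u3, u1-u4 (hypothetical user IDs).
friends = {"u1": ["u2", "u4"], "u2": ["u1", "u3"],
           "u3": ["u2"], "u4": ["u1"]}
sims = user_similarities(friends, "u1")
# direct friends u2, u4 -> 1.0; second-degree friend u3 -> 0.5
```

First-degree friends receive similarity 1, second-degree friends 0.5, and so on, matching the intuition that indirect friends have less influence.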

Improved BPR Model
BPR is a pairwise learning model [37]. A triplet (u, i, j) is constructed based on the user's preferences and can represent three cases:
• User u purchases item i but does not purchase item j. This means user u prefers item i to item j.
• User u purchases neither item i nor item j. This means the user's preference cannot be determined.
• User u purchases both item i and item j. This means the user's preference cannot be distinguished.
Compared with pointwise learning, the BPR model has two advantages. The first is that both the items purchased by the user and the items not purchased are considered during learning, and the not-purchased items are the ones to be ranked in the future. The second is that the model can achieve good results even when only a small amount of data is selected for recommendation.
BPR is a ranking algorithm based on matrix factorization. Compared with algorithms such as FunkSVD, it optimizes not a global score but the ranking of each user's own item preferences, so its results are more accurate. Figure 3 shows the triplet generation process. Plus (+) means user u prefers item i over item j, minus (−) means user u prefers item j over item i, and a question mark (?) means that the user's preference cannot be determined.
However, the triplets constructed by the standard BPR model are randomly sampled [38], and the effect of social relationships on the sampling process is not considered. In real life, users prefer the items that their friends have selected, so the similarity between users and their friends can be applied to the sampling of the BPR model. By considering friends' influence on the user and adding social relation constraints to the sampling process, the triplets can reflect the user's preferences more precisely and thereby improve recommendation accuracy.
According to the user's purchase records and the friendships reflected by the social network, for each user u, an item purchased by the user is denoted i, an item the user has not purchased is denoted j, and an item purchased by the user's direct or indirect friends is denoted p. The set of all items in the system is denoted D, the set of items purchased by user u is denoted D_u, and the set of items purchased by the user's direct and indirect friends is denoted D_p. The items the user prefers most strongly are first those in D_u and second those in D_p \ D_u: given the influence of friends on the user's preferences, the user is likely to purchase an item in D_p \ D_u, i.e., one purchased by a direct or indirect friend but not yet by the user. Finally, the items the user is least likely to purchase are those in D \ (D_u ∪ D_p). A training set T of user-item triplets is constructed based on the social network information, where the triplet (u, i, j) represents that user u has a greater preference for item i than for item j: item i is purchased by the user or by the user's direct or indirect friends, and item j is purchased by neither the user nor his/her friends. In this way, user-item triplets based on social relations are constructed.
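The socially constrained sampling described above can be sketched as follows. This is a minimal illustration under the set definitions in the text, not the paper's exact sampler: i is drawn from the strong-preference items D_u ∪ D_p, and j from items purchased by neither the user nor any friend.

```python
import random

def sample_triplet(u, D, D_u, D_p, rng=None):
    """Sample one (u, i, j) triplet under the social constraint:
    i comes from items bought by the user or by friends (D_u with D_p),
    j from items bought by neither the user nor friends."""
    rng = rng or random.Random()
    preferred = sorted(D_u | D_p)          # strong-preference items
    unlikely = sorted(D - (D_u | D_p))     # least likely to be bought
    return (u, rng.choice(preferred), rng.choice(unlikely))

# Toy example: items 1-5; user bought {1}, friends bought {1, 2}.
u, i, j = sample_triplet("u1", D={1, 2, 3, 4, 5}, D_u={1}, D_p={1, 2},
                         rng=random.Random(0))
# i is drawn from {1, 2}; j is drawn from {3, 4, 5}
```

A fuller implementation might weight i toward D_u over D_p \ D_u, reflecting the "firstly D_u, secondly D_p \ D_u" ordering in the text.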
According to the Bayesian formula, to find the list of items to recommend it is necessary to maximize the posterior probability in (3), where (u, i, j) represents a constructed triplet with the user's preference and θ represents the parameters of the model. The model parameters are adjusted so that the triplets (u, i, j) have the highest probability of occurrence.
To simplify the formula, we assume that the item pairs (i, j) are independent of each other. According to the completeness and antisymmetry of pairwise learning, the formula can then be further simplified. To obtain the final ranking, a model needs to be constructed to calculate the recommendation probability of each item. The sigmoid function is used to model the probability that the user prefers item i over item j, where x_uij(θ) is an arbitrary parametric model describing the potential relationship between the user and the item; in other words, any model that describes the relationship between the user and the item can be used.
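The derivation above can be summarized as follows. This is a hedged reconstruction following the standard BPR formulation [37], since the displayed equations did not survive extraction; the symbols match those in the text.

```latex
% Posterior to maximize (equation (3) in the text):
\max_{\theta}\; p(\theta \mid >_u) \;\propto\; p(>_u \mid \theta)\, p(\theta)

% Assuming item pairs (i, j) are independent:
p(>_u \mid \theta) \;=\; \prod_{(u,i,j) \in T} p(i >_u j \mid \theta)

% Sigmoid model of the pairwise preference probability:
p(i >_u j \mid \theta) \;=\; \sigma\bigl(x_{uij}(\theta)\bigr),
\qquad \sigma(x) = \frac{1}{1 + e^{-x}}
```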
The improved BPR model is used to directly optimize the recommendation result based on the item ranking.

PV-DBOW Model
The PV-DBOW model is used to learn from the review data and obtain the feature representations of the corresponding users and items. As Figure 4 shows, the model samples a text window, then samples a random word from that window and forms a classification task given the paragraph vector [39]. PV-DBOW assumes that the words in a sentence are independent of each other and requires only a small amount of data to be stored.
In our model, paragraph vectors are used to predict words. Each review is mapped into a semantic space and then trained to predict its words. The probability that word w appears in review d is calculated with the softmax function.
Here, d_um denotes the review given by user u to item m, w denotes a word, and V denotes the vocabulary. To reduce the computational cost and improve efficiency, the negative sampling strategy is adopted in this model, and the following objective function is constructed.
In (8), f_{w,d_um} represents the frequency of the word-review pair, E_{w_N ∼ P_V} represents the expectation over the noise distribution P_V, and t represents the number of negative samples. According to (8), the review representation d_um can be obtained, and d_um corresponds to user u and item m. The feature representations of u and m can thus be obtained from the review data.
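The softmax prediction at the heart of PV-DBOW can be sketched numerically. This is an illustrative sketch with hypothetical dimensions, not the paper's implementation; in training, negative sampling replaces the full-vocabulary softmax shown here.

```python
import numpy as np

def softmax_word_prob(d_um, word_vecs, w_idx):
    """p(w | d_um): softmax over the vocabulary of the inner product
    between the review vector and each word vector."""
    scores = word_vecs @ d_um
    e = np.exp(scores - scores.max())   # numerically stable softmax
    return float(e[w_idx] / e.sum())

rng = np.random.default_rng(0)
V, k = 1000, 300                        # vocab size, feature dimension (assumed)
word_vecs = 0.01 * rng.normal(size=(V, k))
d_um = rng.normal(size=k)               # vector of the review by u on item m
p = softmax_word_prob(d_um, word_vecs, w_idx=42)
```

Normalizing over all V words is what makes the full softmax expensive; sampling t negative words approximates the same gradient at a fraction of the cost.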

Fully Connected Neural Network
Because neural networks can quickly find near-optimal solutions, a fully connected neural network is used to process the scoring data [30], from which the representations of users and items are obtained. In this experiment, two fully connected layers are used to fit the nonlinear correlation, where φ(·) is the ELU activation function and U_1, U_2, c_1, c_2 are parameters to be learned. The following objective function can then be obtained.
The goal of the scoring model is to make the difference between the predicted score and the true score as small as possible.
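The two-layer transformation can be sketched as below. The layer sizes and initialization are illustrative assumptions (the text does not specify them); only the structure h = φ(U_1 x + c_1), r = φ(U_2 h + c_2) follows the paper.

```python
import numpy as np

def elu(x, alpha=1.0):
    """ELU activation: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def score_features(x, U1, c1, U2, c2):
    """Two fully connected layers with ELU activation:
    h = elu(U1 @ x + c1); r = elu(U2 @ h + c2)."""
    h = elu(U1 @ x + c1)
    return elu(U2 @ h + c2)

rng = np.random.default_rng(1)
x = rng.normal(size=50)                          # a user's raw rating vector
U1, c1 = 0.1 * rng.normal(size=(128, 50)), np.zeros(128)
U2, c2 = 0.1 * rng.normal(size=(300, 128)), np.zeros(300)
r_u = score_features(x, U1, c1, U2, c2)          # learned user representation
```

The item representation r_m is obtained the same way from the item's rating column, and the squared difference between predicted and true scores drives the learning of U_1, U_2, c_1, c_2.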

BRScS Model
To fuse multi-source heterogeneous data for recommendation, we propose a model named BRScS (an acronym for BPR-Review-Score-Social). In this model, the improved BPR model is used to optimize the ranking, the user trust model is used to introduce social relationships into the rating and review data, the PV-DBOW model is used to process the review data, and the fully connected neural network is used to process the rating data. Finally, an integrated objective function is given for optimization.
The unified objective function for model optimization is given in (11). Here u represents the fused feature representation of the user, and i and j represent the fused feature representations of the items; according to the previous definition, user u has a greater preference for item i than for item j. g(·) is a loss function that combines user and item features; this paper defines g(·) through the sigmoid function to measure the user's different preferences for different items, i.e., g(u, i, j) = σ(u^T i − u^T j). L_1 is the objective function of the review data and L_2 is the objective function of the score data. When a new data source is added to the recommendation system, we only need to add the corresponding objective function to (11) instead of redesigning the model, so the proposed model has good scalability. W = {W_1, W_2} denotes the weight parameters of each model. In the review representation learning model, the weight parameter W_1 differs for each user's reviews and needs to be learned. In the score representation learning model, the features of users and items are obtained directly, so the weight parameter W_2 can be set to 1 and need not be updated through the objective function. θ represents the other parameters to be learned, such as the representations r_u and r_m. λ is the penalty parameter of each model, with values in the interval [0, 1]. The objective function L_2 of the score model is preceded by a negative sign because it should be minimized, while the overall objective function should be maximized. The stochastic gradient descent (SGD) method [40] can be used to optimize (11).
In the end, a recommendation list is obtained by multiplying the user feature representations and the item feature representations as in (12). The larger s is, the more likely the user is to select the item. A user's top-N recommendation list is obtained by sorting the scores from (12) in descending order. The training procedure is summarized as follows:
3: split the data into training (70%) and testing (30%) sets;
4: construct positive and negative sample triplets g(u, i, j) based on BPR;
5: learn the frequency of word-review pairs f_{w,d_um} and the expectation E_{w_N ∼ P_V};
6: get the review representation d_um;
7: learn U_1, U_2, c_1, c_2 from the score data;
8: get the score representations r_u, r_m;
9: get the distance l_ab between users;
10: calculate Σ_{u,i,j} g(u, i, j) + λ_1 L_1 − λ_2 L_2;
11: update θ = {θ_1, θ_2} with back propagation;
12: get the corresponding user and item representations U, M according to (11);
13: end for
14: compute s according to (12);
15: return recommendation list L.
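The final ranking step of (12) can be sketched as a dot product followed by a sort. The dimensions and the `bought` filter are illustrative assumptions; the core is s = u · m sorted in descending order.

```python
import numpy as np

def top_n(user_vec, item_vecs, n=5, bought=()):
    """Score each item as s = u . m, as in (12), and return the indices
    of the n highest-scoring items, skipping items already bought."""
    s = item_vecs @ user_vec
    ranked = [int(i) for i in np.argsort(-s) if int(i) not in set(bought)]
    return ranked[:n]

rng = np.random.default_rng(2)
u = rng.normal(size=8)                   # fused user representation
items = rng.normal(size=(20, 8))         # fused item representations
rec = top_n(u, items, n=5)               # top-5 recommendation list
```

Filtering out already-purchased items is a common practical choice for top-N lists, though the text does not state whether BRScS does so.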

Results and discussion
Two groups of experiments are performed on a single GeForce GTX 1080 Ti GPU running the Ubuntu 16.04 operating system. The programming environment consists of Python 3.6, igraph, TensorFlow 1.4, and IntelliJ IDEA.

Dataset
Yelp is a business directory service and crowd-sourced review forum in America. It covers businesses such as restaurants, shopping centers, hotels, and tourism. Users can rate businesses, submit comments, and exchange experiences on the Yelp website. This paper uses the Yelp public dataset, which can be obtained from the official Yelp website, for the experiments. The dataset is in JSON format and contains details of users and businesses: the IDs of users and businesses, users' comments and ratings on businesses, and the friendships between users. The social relationships between users are transformed into user-friend relationship pairs, and the comments and ratings are used to analyze users' preferences.
Since the dataset is very sparse, it is necessary to filter out users with few comments to verify the validity of the proposed model. Users with more than 20 comments are extracted from the Yelp dataset, and the new dataset is named New-Yelp. Table 1 shows the detailed statistics of the New-Yelp dataset.

Comparable Experiment
Four indicators are used to measure the experimental results:
Recall: the ratio of recommended items purchased by the user to all items purchased by the user.
Precision: the ratio of recommended items purchased by the user to the total number of recommended items.
NDCG: normalized discounted cumulative gain, which measures the ranking quality of the recommended items.
HT: the hit rate, indicating whether the user has purchased a recommended item; a purchase counts as a hit, otherwise a miss.
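The four indicators can be computed per user as follows. This is a minimal sketch assuming binary relevance (an item is relevant iff the user purchased it); the paper's exact averaging over users is not shown.

```python
import math

def precision_recall_ht(recommended, purchased):
    """Per-user Precision, Recall and HT for one recommendation list."""
    hits = len(set(recommended) & set(purchased))
    return (hits / len(recommended), hits / len(purchased),
            1.0 if hits else 0.0)

def ndcg(recommended, purchased):
    """Binary-relevance NDCG: DCG of the list divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(k + 2)
              for k, r in enumerate(recommended) if r in purchased)
    ideal = sum(1.0 / math.log2(k + 2)
                for k in range(min(len(purchased), len(recommended))))
    return dcg / ideal if ideal else 0.0

p, r, h = precision_recall_ht([1, 2, 3, 4, 5], {2, 9})
# one hit in five recommendations: precision 0.2, recall 0.5, HT 1.0
```

NDCG rewards placing relevant items near the top: a list with the hit in first position scores 1.0, while the same hit further down scores less.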

Experiment I
To select a model for processing text and to prove the positive effect of fusing social data on recommendation, Experiment I compares six models. HDC and SEL are the most commonly used models for processing text information in recommendation systems: the HRS model uses HDC to process reviews, and the SRS model uses SEL. BR, BRS, BRSc, and BRScS are models that use the BPR framework and the PV-DBOW algorithm to process reviews.
• The BR (BPR+Review) model is based on the BPR framework and uses reviews for recommendation.
The experimental results show that BRSc surpasses BSc and UserCF in terms of three indicators (Recall, Precision, and HT), which proves the positive effect of fusing reviews with scores. The model proposed by us, BRScS, performs best in terms of all the indicators. This proves that the fusion model based on deep learning outperforms the traditional collaborative filtering algorithm, owing to the merits of deep feature representation.
Our model can fully fuse multi-source heterogeneous data such as scores, reviews, and social network information at the data source level through deep learning, so as to make more accurate recommendations. Introducing social network information also alleviates the cold start and data sparsity problems, because the data of direct and indirect friends can be used to make recommendations.

Conclusions
To utilize heterogeneous data to improve recommendation accuracy and alleviate the cold start and data sparsity problems, we propose a deep-learning-based hybrid recommendation model, named BRScS, that fuses multi-source heterogeneous data such as scores, reviews, and social network information. Experiments comparing our model with other recommendation models show that it outperforms them in Recall, Precision, NDCG, and HT. Introducing the user trust model to fuse social data together with scores and reviews effectively alleviates the cold start and data sparsity problems, because friends' data can be used to make recommendations. The model is scalable and can easily fuse more types of heterogeneous data. However, since the proposed model is based on a neural network, it is difficult to explain the recommendation results, i.e., the interpretability of the model is weak.
In the future, we plan to introduce image data into our recommendation model because images contain rich semantic information.

Figures
Figure 1 Recommendation process.

Figure 2 Social relations between users.

Figure 3 Triplet generation process.

Figure 4 Distributed Bag of Words of Paragraph Vectors.

Figure 7 Indicators of Experiment II.

70% of the data is used for training and 30% for testing. The batch size is 64, the maximum number of training epochs is 40, the number of negative samples is 5, and the feature dimension is 300. BRScS is the hybrid model proposed by us. The top-5 recommendation results are shown in Table 2 and Figure 5, and the top-10 recommendation results in Table 3 and Figure 6. The best results are shown in bold. In the experiment, we compare BRS, HRS, and SRS, three recommendation models based on reviews and social information. The results show that the BRS model outstrips the HRS and SRS models significantly, which proves that utilizing the BPR framework in our model is a wise decision. Furthermore, the experiments show that the BRS model surpasses the BR model and the BRScS model surpasses the BRSc model, which proves that adding social networks can improve recommendation accuracy. Comparing all the models, the BRScS model proposed by us performs best in terms of Precision, Recall, NDCG, and HT.

Experiment II
Experiment II compares our model with broadly used classic recommendation models. The BSc model is based on the BPR framework and uses scores to make recommendations. UserCF (user-based collaborative filtering) is one of the most popular recommendation algorithms and uses scores as input. BRSc is a model combining reviews and scores for recommendation. BRScS is the model proposed by us. The experimental results are shown in Table 4 and Figure 7. The best indicator results are shown in bold.

Table 1
Detailed statistics of New-Yelp.

Table 2
Experiment I of top-5 Recommendation.

Table 3
Experiment I of top-10 Recommendation.

Table 4
Experiment II of Recommendation.