In this article by Suresh Kumar Gorakala, author of the book Building Recommendation Engines, we will learn how to build a basic recommender system using R and look at the various types of recommender systems in detail. The article explains neighborhood (similarity-based) recommendations, personalized recommendation engines, model-based recommender systems, and hybrid recommendation engines.
The following subtypes of recommender systems are covered in this article:
- Neighborhood-based recommendation engines
  - User-based collaborative filtering
  - Item-based collaborative filtering
- Personalized recommendation engines
  - Content-based recommendation engines
  - Context-aware recommendation engines
Neighborhood-based recommendation engines
As the name suggests, neighborhood-based recommender systems consider the preferences or likes of other users in the neighborhood before making suggestions or recommendations to the active user. Here, the active user is the person to whom the system is serving recommendations. To take the preferences or tastes of neighbors into account, we first calculate how similar the other users are to the active user, and then recommend new items from the most similar users. Since similarity calculations are involved, these recommender systems are also called similarity-based recommender systems. And since preferences or tastes are considered collaboratively from a pool of users, they are also called collaborative filtering recommender systems. In this type of system, the main actors are the users, the products, and the users' preference information, such as ratings, rankings, or likes for the products.
[Figure: an example from Amazon showing collaborative filtering]
Collaborative filtering systems come in two flavors:
- User-based collaborative filtering
- Item-based collaborative filtering
When we have only the users' interaction data for the products, such as ratings, like/unlike, or viewed/not viewed, and we have to recommend new products, the collaborative filtering approach is the one to choose.
User-based collaborative filtering
The basic intuition behind user-based collaborative filtering systems is that people with similar tastes in the past will like similar items in the future as well. For example, if user A and user B have very similar purchase histories and user A buys a new book that user B has not yet seen, we can suggest this book to user B, as they have similar tastes.
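The book example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ratings dictionary, the function names, and the choice of cosine similarity over co-rated items with a single nearest neighbor are all hypothetical, not the author's implementation.

```python
import math

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}
ratings = {
    "A": {"book1": 5, "book2": 4, "book3": 5},
    "B": {"book1": 5, "book2": 4},
    "C": {"book1": 1, "book2": 2, "book4": 3},
}

def cosine_similarity(u, v):
    """Cosine similarity computed over the items both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend_user_based(active, ratings, k=1):
    """Recommend items rated by the k most similar users but unseen by the active user."""
    others = [u for u in ratings if u != active]
    neighbors = sorted(
        others,
        key=lambda u: cosine_similarity(ratings[active], ratings[u]),
        reverse=True,
    )[:k]
    candidates = {}
    for neighbor in neighbors:
        for item, rating in ratings[neighbor].items():
            if item not in ratings[active]:
                # Keep the best rating any neighbor gave to this unseen item
                candidates[item] = max(candidates.get(item, 0), rating)
    return sorted(candidates, key=candidates.get, reverse=True)

print(recommend_user_based("B", ratings))
```

Here user B's nearest neighbor is user A, so the book that A rated and B has not seen is the one suggested. A production system would typically use more neighbors and normalize ratings per user.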
Item-based collaborative filtering
In this type of recommender system, unlike user-based collaborative filtering, we use the similarity between items instead of the similarity between users. The basic intuition for item-based recommender systems is that if a user liked item A in the past, they might like item B, which is similar to item A.
In this approach, instead of calculating the similarity between users, we calculate the similarity between items or products. The most common similarity measure used for this approach is cosine similarity. As in the user-based collaborative approach, we project the data into a vector space, and the similarity between items is calculated using the cosine of the angle between the item vectors.
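For reference, the cosine similarity between two item vectors can be written as follows. The symbols are introduced here only for illustration: a and b are the rating vectors of two items, and the index u runs over the users who rated them.

```latex
\[
\operatorname{sim}(\mathbf{a}, \mathbf{b})
  = \cos\theta
  = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
  = \frac{\sum_{u} a_u b_u}{\sqrt{\sum_{u} a_u^{2}} \, \sqrt{\sum_{u} b_u^{2}}}
\]
```

A value close to 1 means the two items were rated in a very similar way; for non-negative ratings, a value close to 0 means the rating patterns have little in common.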
As with user-based collaborative filtering, the item-based approach has two steps:
- Calculating the similarity between items.
- Predicting the ratings of the unrated items for an active user by making use of the previous ratings given to other similar items.
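The two steps above can be sketched as follows. This is a minimal illustration, assuming a hypothetical ratings dictionary, cosine similarity computed over the users who rated both items, and a similarity-weighted average as the prediction rule; other weighting schemes are equally possible.

```python
import math

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}
ratings = {
    "u1": {"A": 5, "B": 5, "C": 1},
    "u2": {"A": 4, "B": 5, "C": 2},
    "u3": {"A": 1, "B": 2, "C": 5},
    "u4": {"A": 5, "C": 1},  # u4 has not rated item B yet
}

def item_vector(item, ratings):
    """Ratings received by an item, keyed by user."""
    return {user: r[item] for user, r in ratings.items() if item in r}

def cosine_similarity(a, b):
    """Step 1: cosine similarity over the users who rated both items."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = math.sqrt(sum(a[u] ** 2 for u in common))
    norm_b = math.sqrt(sum(b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_rating(user, target, ratings):
    """Step 2: similarity-weighted average of the user's ratings on other items."""
    target_vec = item_vector(target, ratings)
    numerator = denominator = 0.0
    for item, rating in ratings[user].items():
        sim = cosine_similarity(target_vec, item_vector(item, ratings))
        numerator += sim * rating
        denominator += abs(sim)
    return numerator / denominator if denominator else None

print(predict_rating("u4", "B", ratings))
```

In this toy data, item B is rated much more like item A than like item C, so u4's high rating for A dominates the predicted rating for B.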
Advantages of user-based collaborative filtering
- Easy to implement
- Neither the content information of the products nor the users' profile information is required for building recommendations
- New items are recommended to users, giving them a surprise factor
Disadvantages of user-based collaborative filtering
- This approach is computationally expensive, as all the user, product, and rating information is loaded into memory for the similarity calculations.
- This approach fails for new users, about whom we do not have any information. This problem is called the cold-start problem.
- This approach performs very poorly if we have little data.
- Since we do not have content information about users or products, we cannot generate accurate recommendations based on rating information alone.
Content-based recommender systems
In collaborative filtering, the recommendations are generated by considering only the rating or interaction information of the products by the users; that is, new items are suggested to the active user based on the ratings given to those items by users similar to the active user.
Assume a person has given a 4-star rating to a movie. In a collaborative filtering approach, we consider only this rating information for generating recommendations. In reality, a person rates a movie based on its features or content, such as its genre, actors, director, story, and screenplay. The person also watches a movie based on their personal choices. When we are building a recommendation engine to target users at a personal level, the recommendations should not be based on the tastes of other similar people but on the individual user's tastes and the content of the products.
Recommender systems that are targeted at the personal level and that consider individual preferences and the content of the products for generating recommendations are called content-based recommender systems.
Another motivation for building content-based recommendation engines is that they solve the cold-start problem that new users face in the collaborative filtering approach. When a new user arrives, we can suggest new items matching their tastes, based on that person's preferences.
Building content-based recommender systems involves three main steps:
- Generating content information for the products.
- Generating a user profile, that is, the user's preferences with respect to the features of the products.
- Generating recommendations, that is, predicting a list of items that the user might like.
Let us discuss each step in detail:
- Content extraction: In this step, we extract the features that represent the product. Most commonly, the content of the products is represented in the vector space model, with product names as rows and features as columns.
- User profile generation: In this step, we build the user profile, or preference matrix, as a vector space model that matches the product content.
- Generating recommendations: Now that we have generated the product content and the user profile, the next step is to generate the recommendations.
Recommender systems that use machine learning or any other mathematical or statistical models to generate recommendations are called model-based systems.
In the cosine similarity approach, we first represent the user profiles and product content as vectors and then take the cosine of the angle between each pair of vectors. The product that forms the smallest angle with the user profile is considered the most preferable item for the user. This is a standard approach when using the neighborhood approach for content-based recommendations. Empirical studies have shown that this approach gives more accurate results compared to other similarity measures.
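As a rough sketch of this idea, the snippet below ranks products by the cosine between a user profile vector and each product's feature vector. The feature names, product vectors, and profile weights are hypothetical values chosen only for illustration.

```python
import math

# Hypothetical feature space shared by products and the user profile
features = ["action", "comedy", "romance"]

# Hypothetical product content: one row per product, one column per feature
products = {
    "Movie1": [1, 0, 0],
    "Movie2": [0, 1, 1],
    "Movie3": [1, 1, 0],
}

# Hypothetical user profile: preference weight for each feature
user_profile = [0.9, 0.2, 0.0]

def cosine(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_products(profile, products):
    """Order products from the smallest angle (highest cosine) to the largest."""
    return sorted(products, key=lambda name: cosine(profile, products[name]), reverse=True)

print(rank_products(user_profile, products))
```

The product whose vector points in nearly the same direction as the profile comes first, which corresponds to the smallest angle described above.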
Classification-based approaches fall under model-based recommender systems. In this approach, we first build a machine learning model using historical information, with the user profile, expressed in the same form as the product content, as input and the like/dislike of the product as the output response classes. Supervised classification methods such as logistic regression, KNN classification, probabilistic methods, and so on can be used.
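To make the classification idea concrete, here is a minimal KNN sketch written from scratch. It assumes a hypothetical history of product feature vectors labeled like or dislike for one user, and it uses Euclidean distance with a majority vote; a real system would more likely rely on an established machine learning library.

```python
import math
from collections import Counter

# Hypothetical history for one user: (product features, response class)
# Feature order: [action, comedy, romance]
history = [
    ([1, 0, 0], "like"),
    ([1, 1, 0], "like"),
    ([0, 1, 1], "dislike"),
    ([0, 0, 1], "dislike"),
    ([1, 0, 1], "like"),
]

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(x, history, k=3):
    """Predict like/dislike by majority vote among the k closest rated products."""
    nearest = sorted(history, key=lambda row: euclidean(x, row[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(knn_predict([1, 0, 0], history))
print(knn_predict([0, 1, 1], history))
```

The predicted class can then be used to decide whether a new product is worth recommending to this user.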
Advantages of content-based recommender systems
- Content-based recommender systems target users at the individual level
- Recommendations are generated using the user's preferences alone, rather than those of the user community as in collaborative filtering
- These approaches can be employed in real time, as the recommendation model doesn't need to load the entire dataset for processing or generating recommendations
- Accuracy is high compared to collaborative approaches, as they deal with the content of the products instead of rating information alone
- The cold-start problem can be easily handled
Disadvantages of content-based recommender systems
- As more and more user information comes into the system, it becomes more personalized and the generated recommendations narrow down to the user's preferences only
- As a result, no new products unrelated to the user's preferences will be shown to the user
- The user will not be able to see what is happening or trending around them
Context-aware recommender systems
Over the years, recommender systems have evolved from neighborhood approaches to personalized recommender systems that are targeted at individual users. These personalized recommender systems have become a huge success, as they are useful at the end-user level, and for organizations they act as catalysts to increase business. Personalized recommender systems, also called content-based recommender systems, are in turn evolving into context-aware recommender systems.
Though personalized recommender systems are targeted at the individual user level and cater recommendations to the personal preferences of the users, there is still scope to improve or refine them. The same person might have different requirements at different places. Likewise, the same person has different requirements at different times.
Our intelligent recommender systems should be evolved enough to cater to the needs of users at different places and at different times. A recommender system should be robust enough to suggest cotton shirts to a person during summer and a leather jacket in winter. Similarly, based on the time of day, suggesting good restaurants that serve a person's preferred breakfast or dinner would be very helpful. Recommender systems that consider location, time, mood, and other factors that define the context of the user, and that suggest personalized recommendations accordingly, are called context-aware recommender systems.
At a broad level, context-aware recommender systems are content-based recommenders with the inclusion of a new dimension called context. In context-aware systems, recommendations are generated in two steps:
- Generating a list of product recommendations for each user based on the user's preferences, that is, content-based recommendations.
- Filtering the recommendations down to those that are specific to the current context.
For example, based on past transaction history, interaction information, browsing patterns, and ratings information in an e-wallet mobile app, assume that user A is a movie lover, a sports lover, and a fitness freak. Using this information, the content-based recommender system generates recommendations for products such as movie tickets, a 4G data offer for watching football matches, and discount offers at a gym. Now, if the GPS coordinates of the mobile show that user A is at a 10K run marathon, the context-aware recommendation engine takes this location information as the context, filters the offers down to those relevant to the current context, and recommends the gym discount offers to user A.
The most common approaches for building context-aware recommender systems are:
- Pre-filtering approaches
- Post-filtering approaches
Pre-filtering approaches
In the pre-filtering approach, context information is applied to the user profile and product content. This step filters out all the non-relevant features, and the final personalized recommendations are generated on the remaining feature set. Since the features are filtered before the personalized recommendations are generated, these are called pre-filtering approaches.
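A possible sketch of pre-filtering is shown below. It assumes a hypothetical mapping from a context to the features that matter in it, plus made-up offer and profile weights; the context is applied first to drop irrelevant features, and only then are the offers scored with cosine similarity.

```python
import math

# Hypothetical product content: offer -> {feature: weight}
offers = {
    "Movie tickets": {"movies": 1.0},
    "4G data offer for football matches": {"sports": 1.0, "movies": 0.2},
    "Gym discount": {"fitness": 1.0, "sports": 0.3},
}

# Hypothetical user profile over the same features
user_profile = {"movies": 0.9, "sports": 0.7, "fitness": 0.6}

# Hypothetical mapping from a context to the features relevant in it
context_features = {"marathon": {"sports", "fitness"}}

def restrict(vector, keep):
    """Drop every feature that is not relevant to the current context."""
    return {f: w for f, w in vector.items() if f in keep}

def cosine(a, b):
    """Cosine similarity between two sparse feature vectors."""
    dot = sum(a[f] * b[f] for f in set(a) & set(b))
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend_prefilter(profile, offers, context):
    """Filter features by context first, then rank offers on what remains."""
    keep = context_features[context]
    filtered_profile = restrict(profile, keep)
    scores = {
        name: cosine(filtered_profile, restrict(content, keep))
        for name, content in offers.items()
    }
    relevant = [name for name in scores if scores[name] > 0]
    return sorted(relevant, key=scores.get, reverse=True)

print(recommend_prefilter(user_profile, offers, "marathon"))
```

Offers with no features left after filtering score zero and drop out, so only context-relevant offers are ranked.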
Post-filtering approaches
In post-filtering, personalized recommendations are first generated based on the user profile and product catalogue; the context information is then applied to select the products relevant to the user in the current context.
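The e-wallet example from earlier can be used to sketch post-filtering. The scores and context tags below are hypothetical; the point is only that the personalized list is produced first and the context is applied afterwards.

```python
# Hypothetical output of a content-based step: (offer, personalized score),
# computed before any context is taken into account
personalized = [
    ("Movie tickets", 0.9),
    ("4G data offer for football matches", 0.7),
    ("Gym discount", 0.6),
]

# Hypothetical tags naming the contexts in which each offer is relevant
offer_contexts = {
    "Movie tickets": {"cinema", "mall"},
    "4G data offer for football matches": {"stadium", "home"},
    "Gym discount": {"marathon", "gym"},
}

def post_filter(recommendations, offer_contexts, context):
    """Keep only the offers tagged for the current context, best score first."""
    kept = [
        (offer, score)
        for offer, score in recommendations
        if context in offer_contexts.get(offer, set())
    ]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [offer for offer, _ in kept]

print(post_filter(personalized, offer_contexts, "marathon"))
```

With the location context set to the marathon, only the gym discount survives the filter, even though it had the lowest personalized score.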
Advantages of context-aware recommender systems
- Context-aware systems are much more advanced than personalized content-based recommenders, as they stay constantly in sync with user movements and generate recommendations for the current context.
- These systems are more real-time in nature.
Disadvantages of context-aware recommender systems
- As in other personalized recommenders, the serendipity or surprise factor is missing in this type of recommendation as well.
In this article, we learned about popular recommendation engine techniques such as collaborative filtering, content-based recommendations, context-aware systems, hybrid recommendations, and model-based recommendation systems, along with their advantages and disadvantages. We also covered different similarity methods, such as cosine similarity, Euclidean distance, and the Pearson coefficient, and explained the subcategories within each type of recommendation.