Recommender systems are common these days. You may not have noticed, but you are probably already a user of one somewhere. Most successful e-commerce platforms use recommendation systems to recommend items to their users. When Amazon recommends a book to you based on your earlier preferences, purchases, and browsing history, it is using such a recommendation system. Similarly, Netflix uses its recommendation system to suggest movies for you.
A recommender, or recommendation system, recommends a product or piece of information, often based on user characteristics, preferences, history, and so on. In that sense, a recommendation is personalized.
Until recently, it was not so easy or straightforward to build a recommender, but Azure ML makes it really easy to build one as long as you have your data ready.
This article introduces you to the concept of recommendation systems and also the model available in ML Studio for you to build your own recommender system. It then walks you through the process of building a recommendation system with a simple example.
The Matchbox recommender
Microsoft has developed a large-scale recommender system called Matchbox, which is based on a Bayesian probabilistic model. This model learns about a user’s preferences by observing how they rate items, such as movies, content, or other products. Based on those observations, it recommends new items to users when requested.
Matchbox uses the available data for each user in the most efficient way possible. The learning algorithm it uses is designed specifically for big data. However, its main feature is that Matchbox takes advantage of metadata available for both users and items. This means that the things it learns about one user or item can be transferred across to other users or items.
You can find more information about the Matchbox model on the Microsoft Research project page.
Kinds of recommendations
The Matchbox recommender supports four kinds of recommendations, which cover most scenarios:
- Rating Prediction: This predicts the rating a given user would give a given item. For example, when a new movie is released, the system predicts how you would rate it on a scale of 1 to 5.
- Item Recommendation: This recommends items to a given user. For example, Amazon suggests books to you, and YouTube suggests videos on its home page (especially when you are logged in).
- Related Users: This finds users that are related to a given user. For example, LinkedIn suggests people you can connect with, and Facebook suggests friends.
- Related Items: This finds items related to a given item. For example, a blog site suggests related posts while you are reading a post.
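To make these kinds of recommendation concrete, here is a minimal sketch in Python covering three of them (Rating Prediction, Item Recommendation, and Related Users). It does not use Matchbox at all: it swaps in a simple similarity-weighted average over other users' ratings, and the user names, item IDs, ratings, and function names are all hypothetical.

```python
from math import sqrt

# Hypothetical toy ratings: user -> {item: rating on a 1-5 scale}
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob":   {"m1": 4, "m2": 2, "m4": 5},
    "carol": {"m2": 5, "m3": 1, "m4": 2},
}

def cosine(a, b):
    """Cosine similarity between two sparse rating vectors (dicts)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def related_users(user, k=1):
    """Related Users: the k users whose ratings are most similar."""
    others = [(cosine(ratings[user], ratings[u]), u) for u in ratings if u != user]
    return [u for _, u in sorted(others, reverse=True)[:k]]

def predict_rating(user, item):
    """Rating Prediction: similarity-weighted average of other users' ratings."""
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        weight = cosine(ratings[user], their)
        num += weight * their[item]
        den += weight
    return num / den if den else None

def recommend_items(user, k=1):
    """Item Recommendation: unrated items ranked by predicted rating."""
    seen = set(ratings[user])
    candidates = {i for r in ratings.values() for i in r} - seen
    scored = [(predict_rating(user, i), i) for i in candidates]
    scored = [(s, i) for s, i in scored if s is not None]
    return [i for _, i in sorted(scored, reverse=True)[:k]]
```

With this toy data, `related_users("alice")` returns `["bob"]` and `recommend_items("alice")` returns `["m4"]`, the only item Alice has not rated.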
Understanding the recommender modules
The Matchbox recommender comes with three modules: as you might have guessed, one each to train, score, and evaluate. The modules are described as follows.
The train Matchbox recommender
This module contains the algorithm and produces the trained model, as shown in the following screenshot:
This module takes the values for the following two parameters.
The number of traits
This value decides how many implicit features (traits) the algorithm learns for every user and item. A higher value generally makes the predictions more precise, though training takes longer. Typically, it takes a value in the range of 2 to 20.
The number of recommendation algorithm iterations
This is the number of times the algorithm iterates over the data. A higher value generally gives better predictions, again at the cost of training time. Typically, it takes a value in the range of 1 to 10.
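To make the two parameters concrete, here is a minimal sketch in Python. It is not the Matchbox algorithm, which is Bayesian and runs inside ML Studio; it swaps in plain matrix factorization trained with stochastic gradient descent, only to show what a trait vector and an iteration are. The function names, learning rate, regularization value, and toy ratings are all assumptions made for illustration.

```python
import random

def train_factors(ratings, num_traits=2, num_iterations=10,
                  lr=0.05, reg=0.02, seed=0):
    """Learn one trait vector per user and per item with plain SGD.

    ratings: list of (user, item, rating) tuples.
    """
    rng = random.Random(seed)
    # Sort the IDs so that initialization is reproducible across runs.
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    P = {u: [rng.uniform(-0.1, 0.1) for _ in range(num_traits)] for u in users}
    Q = {i: [rng.uniform(-0.1, 0.1) for _ in range(num_traits)] for i in items}

    for _ in range(num_iterations):        # one iteration = one pass over the data
        for u, i, r in ratings:
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for t in range(num_traits):    # update each trait of the user and item
                pu, qi = P[u][t], Q[i][t]
                P[u][t] += lr * (err * qi - reg * pu)
                Q[i][t] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, user, item):
    """Predicted rating: dot product of the user and item trait vectors."""
    return sum(pu * qi for pu, qi in zip(P[user], Q[item]))

def rmse(P, Q, ratings):
    """Root mean squared error of the predictions on the given ratings."""
    se = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5

# Hypothetical ratings on a 0-2 scale: (userID, placeID, rating)
toy = [("U1", "R1", 2), ("U1", "R2", 0), ("U2", "R1", 1),
       ("U2", "R3", 2), ("U3", "R2", 1), ("U3", "R3", 0)]

P, Q = train_factors(toy, num_traits=2, num_iterations=200)
print(round(rmse(P, Q, toy), 3))
```

Calling `train_factors` with different `num_traits` and `num_iterations` values and comparing `rmse` gives a feel for the trade-off. In this sketch the error is measured on the training data itself, so it says nothing about how well the model generalizes.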
The score Matchbox recommender
This module lets you specify the kind of recommendation and corresponding parameters you want:
- Rating Prediction
- Item Recommendation
- Related Users
- Related Items
Let’s take a look at the following screenshot:
The ML Studio help page for the module provides details of all the corresponding parameters.
The evaluate recommender
This module takes a test and a scored dataset and generates evaluation metrics, as shown in the following screenshot:
Like the score module, it also lets you specify the kind of recommendation and the corresponding parameters.
Building a recommendation system
Now it is time to build one yourself. We will build a simple recommender system that recommends restaurants to a given user.
ML Studio includes three sample datasets, described as follows:
- Restaurant customer data: This is a set of metadata about customers, including demographics and preferences, for example, latitude, longitude, interest, and personality.
- Restaurant feature data: This is a set of metadata about restaurants and their features, such as food type, dining style, and location, for example, placeID, latitude, longitude, price.
- Restaurant ratings: This contains the ratings given by users to restaurants on a scale of 0 to 2. It contains the columns: userID, placeID, and rating.
Now, we will build a recommender that recommends a given number of restaurants to a user (userID). To build the recommender, perform the following steps:
- Create a new experiment. In the Search box in the modules palette, type Restaurant. The three datasets described earlier are listed. Drag them all to the canvas one after another.
- Drag a Split module and connect it to the output port of the Restaurant ratings module. In the properties section on the right, choose Recommender Split as the Splitting mode. Leave the other parameters at their default values.
- Drag a Project Columns module to the canvas, connect it to the Restaurant customer data module, and select the columns: userID, latitude, longitude, interest, and personality.
- Similarly, drag another Project Columns module, connect it to the Restaurant feature data module, and select the columns: placeID, latitude, longitude, price, the_geom_meter, address, and zip.
- Drag a Train Matchbox Recommender module to the canvas and make connections to the three input ports, as shown in the following screenshot:
- Drag a Score Matchbox Recommender module to the canvas, make connections to the three input ports, and set the property values, as shown in the following screenshot:
- Run the experiment. When it completes, right-click on the output of the Score Matchbox Recommender module and click on Visualize to explore the scored data.
You can note the different restaurants (IDs) recommended as items for a user from the test dataset. The next step is to evaluate the scored prediction. Drag the Evaluate Recommender module to the canvas and connect the second output of the Split module to its first input port and connect the output of the Score Matchbox Recommender module to its second input. Leave the module at its default properties.
Run the experiment again and when finished, right-click on the output port of the Evaluate Recommender module and click on Visualize to find the evaluation metric.
The evaluation metric, Normalized Discounted Cumulative Gain (NDCG), is estimated from the ground truth ratings given in the test set. Its value ranges from 0.0 to 1.0, where 1.0 represents the ideal ranking of the entities.
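As a rough illustration of how NDCG behaves, the following Python sketch computes it for a single ranked list. It uses one common formulation (exponential gain with a logarithmic position discount); the exact variant that ML Studio computes may differ, and the function names and example ratings are assumptions.

```python
from math import log2

def dcg(relevances):
    """Discounted cumulative gain of ratings listed in recommended order."""
    return sum((2 ** rel - 1) / log2(rank + 2)
               for rank, rel in enumerate(relevances))

def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Ground truth ratings (0-2 scale) of three restaurants, in the order
# the recommender ranked them. These values are hypothetical.
print(ndcg([2, 1, 0]))   # best-rated restaurant ranked first
print(ndcg([0, 1, 2]))   # best-rated restaurant ranked last
```

The first list is already in the ideal order, so its NDCG is 1.0; the second ranks the best restaurant last and scores lower.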
You started by learning the basics of recommender systems. You then looked at the Matchbox recommender that comes with ML Studio, along with its components, and explored the different kinds of recommendations it can make. Finally, you built a simple recommendation system to recommend restaurants to a given user.