Recommendation Systems — A walk through

Chaitanya Belhekar
6 min read · Aug 27, 2020

Recommender systems can feel like a jungle when you first try to understand them, so this article lays out a path for studying them.

A recommender system is an application of machine learning that provides recommendations to users on what they might like based on their historical preferences. It can be further defined as a system that produces individualized recommendations as output or has the effect of guiding the user in a personalized way to interesting objects in a larger space of possible options. Examples:

  • Offering news articles to readers, based on their interests.
  • Offering customers suggestions about what they might like to buy, based on their history of purchases and searches.

“A lot of times, people don’t know what they want until you show it to them.” Steve Jobs

There are three major types of recommender systems:

  • Collaborative filtering: Collaborative recommender systems aggregate ratings or recommendations of objects, recognize commonalities between users on the basis of their ratings, and generate new recommendations based on inter-user comparisons. They work well for complex objects where variations in taste are responsible for much of the variation in preferences. Collaborative filtering assumes that people who agreed in the past will agree in the future, and that they will like the same kinds of objects they liked in the past.
  • Content-based: A content-based recommender learns a profile of the user's interests from the features of the objects the user has rated. It is essentially a keyword-based recommender, where keywords are used to describe the items. A content-based system therefore recommends items similar to those the user has liked in the past or is currently examining.
  • Hybrid: Combining both approaches in a way that suits a particular industry gives a hybrid recommender system. Netflix is a good example of a hybrid system. It makes recommendations by comparing the watching and searching habits of similar users (collaborative filtering) as well as by offering movies that share characteristics with films that a user has rated highly (content-based filtering).
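One simple way to combine the two approaches is a weighted blend of their scores. The sketch below is only an illustration of that idea, not how Netflix or any particular product does it; the function name `hybrid_scores`, the weight `alpha`, and the score dictionaries are all made up for the example.

```python
def hybrid_scores(cf_scores, cb_scores, alpha=0.5):
    """Blend collaborative (cf) and content-based (cb) scores per item.

    alpha is a hypothetical mixing weight: 1.0 means collaborative only,
    0.0 means content-based only. Items missing from one source score 0.0 there.
    """
    items = set(cf_scores) | set(cb_scores)
    return {
        item: alpha * cf_scores.get(item, 0.0) + (1 - alpha) * cb_scores.get(item, 0.0)
        for item in items
    }


# Made-up scores in the range 0..1 from two separate recommenders.
cf_scores = {"Movie A": 0.9, "Movie B": 0.2}
cb_scores = {"Movie B": 0.8, "Movie C": 0.6}

blended = hybrid_scores(cf_scores, cb_scores, alpha=0.5)
ranked = sorted(blended, key=blended.get, reverse=True)
print(ranked)
```

The blend assumes both recommenders produce scores on a comparable scale; in practice the scores would usually need normalizing first.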

Why are recommender systems necessary?

The long tail: physical institutions can only provide what is popular, while online institutions can make everything available.

The long tail phenomenon makes recommendation systems necessary. The distinction between the physical and online worlds has been called the long tail phenomenon, as shown in the figure. The vertical axis represents popularity, and the items are laid out along the horizontal axis. Physical institutions provide only the most popular items, to the left of the vertical line, while online institutions provide the entire range of items: the tail as well as the popular items. Because an online institution cannot show every item to every user, the long tail phenomenon forces it to recommend items to individual users.

Content-Based Recommender System

Content-based systems recommend items to a customer that are similar to items the customer has previously rated highly. These systems focus on the properties of items: the similarity of two items is determined by measuring the similarity of their properties. In a content-based system, we must construct a profile for each item, which is a record or collection of records representing important characteristics of that item, e.g. actors, director, year, and genre in the case of movies. A user profile is inferred from the profiles of the items the user has rated, and that user profile is then matched against a catalog of item profiles to recommend other items. An item profile is a set of features (a vector), which can be boolean or numerical.
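The flow described above (item profiles, an inferred user profile, then matching against the catalog) can be sketched in a few lines of NumPy. This is a minimal sketch under assumptions: the movies, the features, and the ratings are invented, and the rating-weighted average and cosine similarity are just one common choice for building and comparing profiles.

```python
import numpy as np

# Hypothetical catalog: one boolean feature vector (item profile) per movie.
features = ["action", "comedy", "sci-fi", "romance"]
titles = ["Movie A", "Movie B", "Movie C", "Movie D"]
item_profiles = np.array([
    [1, 0, 1, 0],  # Movie A: action, sci-fi
    [0, 1, 0, 1],  # Movie B: comedy, romance
    [1, 0, 0, 0],  # Movie C: action
    [0, 1, 0, 0],  # Movie D: comedy
], dtype=float)


def build_user_profile(item_profiles, ratings):
    """Infer a user profile as the rating-weighted average of rated item profiles."""
    rated = list(ratings)
    weights = np.array([ratings[i] for i in rated], dtype=float)
    return weights @ item_profiles[rated] / weights.sum()


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def recommend(item_profiles, ratings, top_n=2):
    """Rank unrated items by similarity between their profile and the user profile."""
    user = build_user_profile(item_profiles, ratings)
    scores = {
        i: cosine(user, item_profiles[i])
        for i in range(len(item_profiles))
        if i not in ratings
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# The user rated Movie A (index 0) 5 stars and Movie B (index 1) 1 star.
ratings = {0: 5.0, 1: 1.0}
ranked = recommend(item_profiles, ratings)
print([titles[i] for i in ranked])
```

Because the user rated the action and sci-fi movie highly, the action movie ranks above the comedy among the unrated items.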

Content-Based Recommendations

Collaborative Filtering Recommender System

Collaborative filtering is based on the concept that similar people like similar things. It predicts which item a user will like based on the item preferences of other similar users. Collaborative filtering uses a user-item (utility) matrix to generate recommendations. This matrix is populated with values that indicate a user’s degree of preference towards a given item. These values can represent either explicit feedback (direct user ratings) or implicit feedback (indirect user behavior such as listening, purchasing, watching).
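To make the utility matrix concrete, here is a small user-based sketch. The matrix values are invented, 0 is assumed to mean "not rated", and cosine similarity over co-rated items with a similarity-weighted average is one of several reasonable ways to compare users and predict a rating.

```python
import numpy as np

# Hypothetical utility matrix: rows are users, columns are items, 0 = not rated.
R = np.array([
    [5, 4, 0, 1],  # user 0 has not rated item 2
    [4, 5, 5, 1],
    [1, 1, 2, 5],
], dtype=float)


def user_similarity(u, v):
    """Cosine similarity between two users, computed only on items both have rated."""
    both = (u > 0) & (v > 0)
    if not both.any():
        return 0.0
    a, b = u[both], v[both]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def predict(R, user, item):
    """Predict a rating as the similarity-weighted average of other users' ratings."""
    sims, ratings = [], []
    for other in range(R.shape[0]):
        if other != user and R[other, item] > 0:
            sims.append(user_similarity(R[user], R[other]))
            ratings.append(R[other, item])
    sims, ratings = np.array(sims), np.array(ratings)
    if sims.sum() == 0:
        return 0.0
    return float(sims @ ratings / sims.sum())


pred = predict(R, user=0, item=2)
print(round(pred, 2))
```

User 0 rates items much like user 1 does, so the prediction for item 2 lands closer to user 1's rating of 5 than to user 2's rating of 2.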

Content-Based Recommendations

Algorithms that can be used for matrix factorization: Alternating Least Squares (ALS), Stochastic Gradient Descent (SGD), Singular Value Decomposition (SVD)
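As an illustration of matrix factorization, the sketch below learns two small factor matrices with plain SGD so that their product approximates the observed entries of a utility matrix. Everything here is an assumption made for the example: the ratings, the function name `factorize`, and the hyperparameters (`k`, `steps`, `lr`, `reg`) are illustrative, not tuned values.

```python
import numpy as np

# Hypothetical utility matrix: rows are users, columns are items, 0 = not rated.
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
], dtype=float)


def factorize(R, k=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Learn user factors P and item factors Q so that P @ Q.T fits observed ratings."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = rng.normal(scale=0.1, size=(n_users, k))
    Q = rng.normal(scale=0.1, size=(n_items, k))
    observed = [(u, i) for u in range(n_users) for i in range(n_items) if R[u, i] > 0]
    for _ in range(steps):
        for u, i in observed:
            err = R[u, i] - P[u] @ Q[i]
            p_old = P[u].copy()  # keep the old user factors for the item update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return P, Q


P, Q = factorize(R)
predictions = P @ Q.T  # includes estimates for the unrated (zero) cells

# Root mean squared error on the entries that were actually observed.
mask = R > 0
rmse = float(np.sqrt(np.mean((R[mask] - predictions[mask]) ** 2)))
print(round(rmse, 3))
```

The zero cells are skipped during training, and the filled-in values in `predictions` are what a recommender would rank to suggest new items.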

So far we have looked at user-user collaborative filtering. There is another point of view: item-item collaborative filtering. Here, we start with an item, find its similar items (neighborhood items), and estimate the user's rating for the item from the ratings the user gave to those neighborhood items.
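A minimal item-item sketch looks like this. The utility matrix is invented, 0 is assumed to mean "not rated", and plain cosine similarity between item columns with a neighborhood size `k` of 2 are illustrative choices rather than the method from any specific paper.

```python
import numpy as np

# Hypothetical utility matrix: rows are users, columns are items, 0 = not rated.
R = np.array([
    [5, 4, 0, 1],  # user 0 has not rated item 2
    [4, 5, 5, 1],
    [1, 1, 2, 5],
    [5, 5, 4, 0],
], dtype=float)


def item_similarity(R):
    """Cosine similarity between item columns (unrated cells count as 0)."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0  # avoid dividing by zero for items nobody rated
    S = (R.T @ R) / np.outer(norms, norms)
    np.fill_diagonal(S, 0.0)  # an item is not its own neighbor
    return S


def predict(R, S, user, item, k=2):
    """Estimate a rating from the user's ratings of the k most similar items."""
    rated = np.flatnonzero(R[user] > 0)
    rated = rated[rated != item]
    if rated.size == 0:
        return 0.0
    neighbors = rated[np.argsort(S[item, rated])[::-1][:k]]
    weights = S[item, neighbors]
    if weights.sum() == 0:
        return 0.0
    return float(weights @ R[user, neighbors] / weights.sum())


S = item_similarity(R)
pred = predict(R, S, user=0, item=2)
print(round(pred, 2))
```

Item 2 is most similar to items 0 and 1, which user 0 rated 5 and 4, so the estimate falls between those two ratings.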

Item-based Collaborative Filtering Recommendation Algorithms — A paper explaining item-based collaborative filtering recommendation algorithms.

Item-to-item collaborative filtering — Amazon — The recommendation system implemented at Amazon

Various Implementations of Collaborative Filtering

Collaborative filtering techniques come in two types: the memory-based approach and the model-based approach.

The model based approach can be further divided into: Matrix Factorization, Clustering, Deep Learning.

Read about various implementations of collaborative filtering here:

Evaluating Recommender Systems

Precision@K is a popular evaluation metric for recommender systems. It looks at the top K recommendations and calculates what proportion of them are actually relevant to the user.

Recall@K is the proportion of all the items relevant to the user that appear in the top K recommendations.
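Both metrics are short enough to write by hand. The sketch below assumes a ranked list of recommended item ids and a set of items known to be relevant to the user; the function names and the sample data are made up for the example.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    if k <= 0:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)


# Hypothetical ranked recommendations and ground-truth relevant items.
recommended = ["a", "b", "c", "d", "e"]
relevant = {"a", "c", "f", "g"}

p = precision_at_k(recommended, relevant, k=3)
r = recall_at_k(recommended, relevant, k=3)
print(p, r)
```

Here two of the top three recommendations are relevant, so precision is 2/3, while only two of the four relevant items were surfaced, so recall is 2/4.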

You can use the recmetrics library (a Python library of evaluation metrics and diagnostic tools for recommender systems).

Case Study: Netflix’s Recommendation System

As we all know, Netflix is a streaming site for films and TV series. The most valued asset of Netflix is its recommendation system. Let’s have a look at it.

The front page of Netflix is constructed as a panel containing rows with subjects such as Top Picks, My List, and Popular on Netflix. The top row is dedicated to what’s on my list (watchlist). Then they have their own series — Netflix Originals. The recommendations are in the charts — Trending Now chart, Top Picks chart, Popular on Netflix chart. Read about different recommendations provided by Netflix here.

The recommendation system has definitions as follows:

Netflix Personalization: How Netflix personalized the movie watching experience for its users.

The article explains the hybrid recommender system used by Netflix — popularity and predicted rating.

Implementing recommender systems in Python

Further Reading:

  • Practical Recommender Systems, a book by Kim Falk. You can read the live book here. I would suggest reading Part 2 of the book, which covers recommender algorithms in detail.
