Recommendation Systems (RS) suggest items that are likely to be of use to a user. The suggestions are aimed at supporting users in various decision-making processes, such as what items to buy, what music to listen to, or what news to read. “Item” is the general term used to denote what the system recommends to users.
Recommendations are ubiquitous. I’m sure you have interacted with many of them, even though you may not have recognized it. An e-commerce site is one example: when you open the page of an item, you see other items recommended either for purchase together with it or as alternatives you may want to look at. Social networking sites suggest people you should connect to. A news or media site recommends articles that you should read. And in general, the whole advertising infrastructure revolves around sending you ads that are recommended for you, that you are therefore likely to click on, and whose items you are likely to purchase. The basic theory behind recommending a product to a user is that we are more likely to buy what our friends suggest, or something similar to an item we have used in the past, because we tend to trust these choices more than we trust the seller.
Recommendation Systems in e-commerce:
Recommendation systems find rich application in the e-commerce field, where they can be used to give users a personalized experience and to increase the retailer’s share of each customer’s wallet, which has a positive impact on both the top and bottom lines.
Personalization of e-commerce:
One use of Recommendation Systems is to make the whole experience of an e-commerce site more personalized, so that a customer feels more important and valued. This can be done using the products the user has previously browsed or the user’s profile information. An item can be recommended to a user based on their interests, which the system learns from profile information, past transactions, search and browse history, etc. Personalization can also be done using the collaborative filtering approach: recommending to the user items that other users with the same taste, i.e. like-minded users, also like. A personalized home page is one example.
Recommendations can also be used to cross-sell or up-sell items to a user:
Cross-sell: Cross-selling is the practice of selling an additional product or service to an existing customer. A Recommendation System (RS) can be used to cross-sell a product to a user, either to increase the revenue from that user or to protect the relationship with the client. Examples are attractive bundles of inter-related items pushed on the product detail page, or attractive add-on options offered in the shopping cart.
Social Recommendations: Related items are pushed on the product detail page based on the social activities of other users. Social ratings and reviews of the current product are also used.
Up-sell: Up-selling means selling something that is more profitable or otherwise preferable for the seller instead of, or in addition to, the original sale. An RS can be used to up-sell a product to a user, increasing profit by introducing an option that the customer may not have considered. For example, high-end products in the same category are pushed on the product detail page to increase up-selling opportunities.
Recommendation systems are also an excellent means of selling accessories related to the product the user is buying. A user may not have considered buying accessories, but may do so if they are recommended, which helps the retailer sell more and earn more. Related accessories are pushed on the product detail page to increase cross-selling opportunities.
Market Basket Analysis:
The shopping carts of all previous sales at the retail site are analyzed to find associations among sets of items. The items in the current shopping cart are then matched against these associations to suggest other relevant items.
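The matching step described above can be sketched in a few lines of Python. This is a minimal illustration rather than a production method: it only considers pairs of items, the function names `pair_rules` and `suggest`, the thresholds, and the sample baskets are all assumptions made for this example, and real systems typically use a full frequent-itemset algorithm.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.3):
    """Return {(a, b): confidence of 'a implies b'} for frequent item pairs."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates inside one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    rules = {}
    for (a, b), count in pair_counts.items():
        if count / n >= min_support:  # support: share of baskets with both items
            rules[(a, b)] = count / item_counts[a]  # confidence of a -> b
            rules[(b, a)] = count / item_counts[b]  # confidence of b -> a
    return rules


def suggest(cart, rules, min_confidence=0.6):
    """Suggest items missing from the cart, best rule confidence first."""
    scores = {}
    for (a, b), confidence in rules.items():
        if a in cart and b not in cart and confidence >= min_confidence:
            scores[b] = max(scores.get(b, 0.0), confidence)
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical sales history, invented for illustration.
past_baskets = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["phone", "charger"],
    ["laptop", "mouse"],
    ["laptop", "mouse", "case"],
]
rules = pair_rules(past_baskets)
suggestions = suggest({"phone"}, rules)
print(suggestions)
```

Support filters out pairs that were rarely bought together, and confidence ranks the remaining suggestions by how often the second item accompanied the first.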
Understanding the Nuts and Bolts of Recommendation Systems:
Recommendations can be thought of as the next step after search. As data grows at an exponential rate, search is becoming time-consuming and unsatisfactory. Faced with an overwhelming range of products on a website, the user has a problem of information overload, which leads to confusion and complexity in decision making. An RS helps the user make a decision by offering a recommendation.
Approaches to Recommendation Systems:
Collaborative Filtering: This approach recommends to the user items liked by other users who have similar taste. The similarity among users is calculated by comparing their rating histories.
Content Based: The system recommends to the active user items that are similar to the ones that user has liked in the past.
Knowledge Based: These systems recommend items to users based on specific domain knowledge. They take into account the features of the items, how these features meet the user’s needs and preferences, and ultimately how useful the item is to the user.
Hybrid: These systems combine the above-mentioned techniques, using one technique to overcome the disadvantages of another.
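As an illustration of the collaborative filtering approach listed above, here is a minimal user-based sketch. It assumes ratings are stored as a dictionary per user and uses cosine similarity as the similarity measure; the function names, the data layout, and the sample ratings are hypothetical, and other similarity measures would work equally well.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two rating dicts; unrated items count as 0."""
    dot = sum(u[item] * v[item] for item in set(u) & set(v))
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(ratings, user, top_n=3):
    """Rank items the user has not rated by similarity-weighted ratings."""
    totals, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine(ratings[user], other_ratings)
        if similarity <= 0:
            continue  # users with nothing in common carry no signal
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                totals[item] = totals.get(item, 0.0) + similarity * rating
                weights[item] = weights.get(item, 0.0) + similarity
    predicted = {item: totals[item] / weights[item] for item in totals}
    return sorted(predicted, key=lambda item: (-predicted[item], item))[:top_n]


# Hypothetical ratings on a 1-5 scale, invented for illustration.
ratings = {
    "ann": {"book_a": 5, "book_b": 4},
    "bob": {"book_a": 5, "book_b": 4, "book_c": 5, "book_d": 1},
    "cat": {"book_a": 1, "book_d": 5},
}
print(recommend(ratings, "ann"))
```

Here "bob" rates the shared books much as "ann" does, so his opinions dominate the predictions for the books she has not yet rated.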
Data mining in Recommendation Systems:
The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Recommendation Systems (RS) typically apply techniques and methodologies from neighboring areas such as Human Computer Interaction (HCI) or Information Retrieval (IR). However, most of these systems have at their core an algorithm that can be understood as a particular instance of a Data Mining (DM) technique. The process of data mining mainly consists of three steps:
Data Pre-Processing: To prepare the data for analysis we need to pre-process it, i.e. clean, filter and transform it for further use. Pre-processing consists of noise reduction, sampling and dimensionality reduction.
Noise Reduction: Noise consists of low-level data errors resulting from imperfect data collection, or of data objects that are weakly relevant or irrelevant. There are various techniques to reduce noise, such as approaches based on the local outlier factor and hyperclique-based approaches.
Sampling: Sampling is the main technique used in DM for selecting a subset of relevant data from a large data set. It is used both in the pre-processing and final data interpretation steps. Sampling may be used because processing the entire data set is computationally too expensive. The schemes used are random sampling, with or without replacement, and stratified sampling, with or without replacement.
Dimensionality Reduction: It is common in RS to have not only a data set with features that define a high-dimensional space, but also very sparse information in that space, i.e. there are values for only a limited number of features per object. The two most relevant dimensionality reduction algorithms in the context of RS are Principal Component Analysis (PCA) and Singular Value Decomposition (SVD).
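Of the pre-processing steps above, sampling is the easiest to show in a few lines. The sketch below uses only Python's standard `random` module; the helper `stratified_sample`, the record layout, and the sample data are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Draw the same fraction from every stratum, without replacement."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for group in strata.values():
        size = max(1, round(len(group) * fraction))  # at least one per stratum
        sample.extend(rng.sample(group, size))
    return sample


# Hypothetical records: (user id, product category); one in four is "music".
records = [("user%d" % i, "books" if i % 4 else "music") for i in range(100)]

rng = random.Random(42)
without_replacement = rng.sample(records, 10)  # no record is drawn twice
with_replacement = rng.choices(records, k=10)  # the same record may repeat
stratified = stratified_sample(records, key=lambda r: r[1], fraction=0.2)
```

Stratified sampling keeps the category proportions of the full data set, which plain random sampling only does on average.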
Data Analysis: The pre-processed data is analyzed to provide meaningful insights that can be used in recommendation systems. Data analysis for RS typically uses the classification approach. There are many kinds of classifiers, but only supervised and unsupervised classifiers are covered here.
Supervised classifiers: In supervised classification, the labels into which you want to sort the data are defined in advance, and algorithms are run to classify the data accordingly. Commonly used classification algorithms include kNN (k-Nearest Neighbors) with a cosine distance measure, Naïve Bayes classifiers, decision trees, artificial neural networks, and support vector machines.
Unsupervised classifiers: Unsupervised classification is known as clustering. The labels or categories are not known in advance, and the task is to organize the elements at hand suitably (according to some criteria). K-means is one of the most popular clustering algorithms.
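To make the clustering idea concrete, here is a minimal k-means sketch (Lloyd's algorithm) in plain Python. Seeding the centroids with the first k points is an assumption chosen to keep the example deterministic; practical implementations usually pick initial centroids at random or with a smarter seeding scheme. The user features are hypothetical.

```python
def kmeans(points, k, iterations=20):
    """Cluster points into k groups; returns (centroids, labels)."""
    centroids = [tuple(p) for p in points[:k]]  # assumption: first k points as seeds
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Hypothetical user features: (average rating given, purchases per month).
users = [(1.0, 1.0), (4.5, 9.0), (1.5, 2.0), (5.0, 8.0), (1.0, 1.5), (4.0, 9.5)]
centroids, labels = kmeans(users, k=2)
print(labels)
```

In a recommender, the resulting groups could be used, for example, to recommend to a user what others in the same cluster liked.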
Result interpretation: Identifying any visible patterns or insights that can be leveraged to choose the best feasible responses and actions.
Crowd-Sourcing to come up with a Suitable Recommendation System Algorithm:
Some firms crowd-source the data science involved in coming up with a suitable recommendation system algorithm, i.e. they use a crowd of people to do some of the data analysis. For example, in 2006 Netflix released a data set with one hundred million anonymized movie ratings from its customers and put up a challenge to the data science community: use this data set to design a recommendation algorithm that beats Netflix’s existing algorithm by at least 10 percent. The challenge carried a grand prize of one million dollars for the winning team. Overall, more than five thousand teams from all across the world competed over the following three years to win the bounty. Platforms such as Kaggle have made such data science competitions, in which teams take on specific challenges, almost ubiquitous. The largest prize, for example, is the Heritage Health Prize, where teams competed to win three million dollars by trying to identify, from historical claims data, patients who will be admitted to a hospital within the next year.