
Real-time recommendation models - part I.

Introduction to session-based approach

  • Lukáš Matějka
  • 12 min read

Nowadays, all e-commerce market players are constantly striving to provide the best possible customer experience. One way to achieve that is to recommend products to customers in a tailored way, ideally in real time. In the previous article, we discussed why individual recommendation is a MUST.

It has been shown that using the most recent information from the user’s session (i.e. what users are looking at now, clicking on, etc.) can significantly improve the resulting product recommendation (see e.g. the research paper Evaluation of Session-based Recommendation Algorithms). Moreover, thanks to this approach, recommendations adapt gradually to the customer’s current behavior on the website. Last but not least, long-term data is often missing entirely (for example, for a first-time visitor), which can ruin models that rely only on that perspective. In short, short-term customer intent is a very important input for the recommendation.

Therefore, in this article, we will focus on models that can effectively use information from the current user’s session and compare different algorithms. We will also show an example where a model with better accuracy metrics is not necessarily the winner. The article is divided into two parts. Let’s get into it!

  • Part 1: The first part covers the basics of recommendation models and explains the session-based approach using examples of several algorithms.
  • Part 2: The second part will present concrete results from experiments on real-world data.

Quick flight through the world of recommendations

To understand the session-based approach and the outcomes of the experiments we are going to show, it is important to first outline the basic concepts of recommendation systems. Feel free to skip this part if you are already a recommendation specialist.

What is the recommendation task? I bet you have already interacted with a recommendation system. It does not matter whether you listen to music on Spotify, watch movies on Netflix or buy things on Amazon. You have surely noticed phrases like: “Content for you”, “You may also like this” or “Other customers buy”. They aim to serve you exactly the products you would like the most. How do they do that? All these companies have their own recommendation engine, which is composed of one or more recommendation models.

A definition of such a recommendation system (model), or RS for short, might sound like this:

“RS is a system designed to recommend things to a user to help him or her find the most relevant items of interest using many different features. It usually tries to know the user better than the user knows him- or herself.”

An RS learns from a huge amount of data, and the recommendation task is then to predict a list of items. It can be viewed in a similar way to a classification task: the model is successful if the user likes that list and unsuccessful otherwise.

The art of Recommendation systems

There are many more or less popular ways to deal with recommendation tasks. You can serve products to customers according to:

1. popular items,

2. customer’s last seen/purchased items,

3. interactions made by other customers who are somehow similar* to this customer,

4. items’ similarity, also known as the content-based filtering approach: it shows products similar to the ones the user has already seen/bought,

5. sequence-based/session-aware/session-based approach (which we are focusing on here),

6. hybrid of any of these.

*similarity can be measured in various ways, e.g. by similar interactions (then it is known as the collaborative filtering approach), by customer’s profile data, …
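To give a feeling for how simple the first option in the list above can be, here is a minimal sketch of a popularity-based recommender. The function name `most_popular` and the toy sessions are made up for illustration; a real system would count interactions over a time window and probably per category.

```python
from collections import Counter


def most_popular(sessions, k=3, exclude=()):
    """Return the k most frequently seen items, skipping anything in `exclude`.

    `sessions` is a list of item-id lists (a hypothetical input format).
    Ties are broken alphabetically so the output is deterministic.
    """
    counts = Counter(item for session in sessions for item in session)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked if item not in exclude][:k]


# Hypothetical toy data, not taken from the article's pictures.
toy_sessions = [["a", "b"], ["a", "c"], ["a", "b", "d"]]
print(most_popular(toy_sessions, k=2))                 # ['a', 'b']
print(most_popular(toy_sessions, k=2, exclude={"a"}))  # ['b', 'c']
```

The `exclude` argument is where the second option (the customer’s last seen/purchased items) typically comes into play: items the customer has already seen can be filtered out of the popular list.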

The first two methods are easy both to understand and to implement. The main difference between 3. and 4. can be seen in the following picture:

Collaborative filtering vs Content-based filtering

Collaborative filtering needs only the actions of other users to build the model; content-based filtering, on the other hand, requires only the features of items.

To make things less simple, there are plenty of techniques to tackle each recommendation method above, such as (deep) factorization machines, (un)supervised learning methods, wide or deep learning models, etc.

How to measure if a recommendation is good?

In ML prediction tasks, it is usually straightforward to select the best model: depending on the underlying data, one can choose from well-known metrics such as accuracy, AUC, prediction error, F1 score, etc. However, for recommendation models there is no single metric that lets you say that your model is good or that it will give a good recommendation to the customer in 95% of cases.

Fortunately, many well-known metrics applicable to recommendation tasks have already been introduced. You may have heard, for example, of hit rate or NDCG.

Setting aside the fact that anyone can create their own metric or modify an existing one, we can divide them into two main categories described below: accuracy and beyond-accuracy metrics, where @k means the metric is measured for the first k recommendations:


Accuracy metrics:

  • Hit rate (hit_rate@k): how many times the model recommended at least one item correctly
  • Cumulative hit rate (cum_hit_rate@k): how many items were recommended correctly in total
  • Average reciprocal hit rate (arhr@k): how many items were recommended correctly in total, weighted by how good their positions were
  • Mean average precision (map@k): how many of the recommended items were relevant (on average)
  • Mean reciprocal rank (mrr@k): like arhr, but only cares about the first relevant recommended item
  • Normalized discounted cumulative gain (ndcg@k): how close the ordering of the relevant recommended items is to the ideal ordering
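To make a few of these definitions concrete, here is a minimal sketch of hit rate, reciprocal rank and NDCG for a single recommendation list. The function names and the binary-relevance assumption (an item is either relevant or not) are ours; the values would then be averaged over all evaluated sessions to obtain hit_rate@k, mrr@k and ndcg@k.

```python
import math


def hit_rate_at_k(recommended, relevant, k):
    """1.0 if at least one of the first k recommended items is relevant."""
    return float(any(item in relevant for item in recommended[:k]))


def reciprocal_rank_at_k(recommended, relevant, k):
    """1 / position of the first relevant item within the top k, else 0.0."""
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG: DCG of the list divided by the DCG of an ideal list."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(recommended[:k], start=1)
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical example: the one relevant item "b" sits at position 2.
recs, truth = ["a", "b", "c"], {"b"}
print(hit_rate_at_k(recs, truth, k=3))         # 1.0
print(reciprocal_rank_at_k(recs, truth, k=3))  # 0.5
print(round(ndcg_at_k(recs, truth, k=3), 4))   # 0.6309
```

Note how the position matters only for the last two: moving "b" to the first position leaves the hit rate at 1.0 but raises both the reciprocal rank and NDCG to 1.0.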


Beyond-accuracy metrics:

  • Coverage (coverage@k): how many different items from the item set the model recommends
  • Brand diversity (brand_diversity_n@k): how many recommendations had more than n unique brands
  • Minimal price ratio (min_price_ration_n@k): how many recommendations had a large (bigger than n) ratio between the min_price of the most expensive recommended item and the min_price of the cheapest recommended item
  • Average popularity (avg_popularity@k): how popular the recommended items are
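Two of these can be sketched in a few lines. The code below is only one possible reading of the definitions above: coverage is expressed as a share of the catalog, brand diversity as a share of recommendation lists, and the `brand_of` mapping from item to brand is a hypothetical input.

```python
def coverage_at_k(recommendation_lists, catalog, k):
    """Share of catalog items that appear in at least one top-k list."""
    catalog = set(catalog)
    recommended = {item for recs in recommendation_lists for item in recs[:k]}
    return len(recommended & catalog) / len(catalog)


def brand_diversity_at_k(recommendation_lists, brand_of, n, k):
    """Share of top-k lists that contain more than n unique brands."""
    diverse = sum(
        1
        for recs in recommendation_lists
        if len({brand_of[item] for item in recs[:k]}) > n
    )
    return diverse / len(recommendation_lists)


# Hypothetical toy data.
lists = [["a", "b"], ["a", "c"]]
catalog = ["a", "b", "c", "d"]
brands = {"a": "X", "b": "X", "c": "Y", "d": "Y"}
print(coverage_at_k(lists, catalog, k=2))              # 0.75
print(brand_diversity_at_k(lists, brands, n=1, k=2))   # 0.5
```

A model that always recommends the same bestsellers can score well on accuracy metrics and still have very low coverage, which is exactly why these metrics are tracked separately.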

We have been using more than twenty metrics for our evaluation purposes. You can read more details in one of the previous articles.

Are these metrics all you need to say that the recommendation model is great? The answer is no. Apart from these numerical metrics, there is one more, no less important factor that should always be considered: the quality of the recommended items, in other words, whether every customer will be satisfied with the recommendation all the time. Even though it may not look that way, it is almost impossible to measure whether every person likes the recommendation list, because how can you calculate each person’s level of satisfaction?

The fact that the user clicks on or buys an item that was recommended to him does not necessarily mean that he liked it. There may be an item that he would have liked much more, but because it was never recommended or shown to him, we can never know.

Last but not least, we can measure the performance of recommendation systems in two different settings: offline or via a real A/B test (with other metrics like revenue per session, CTR, average order value or other business KPIs). Both have their pros and cons. What has to be mentioned here is the following: when developing such a system (and finding the most suitable model), you usually have no choice but to rely on offline evaluation. However, even the very best model according to the offline metrics mentioned above does not mean victory until you find out its real impact on real customers.

And now, is this already all?

If you are creating a model for a specific scenario (for example for one client), then usually yes. But in general, this is still not enough: this kind of measurement should be evaluated on many kinds of datasets.

There is, for example, the option to compare the results of your model on several public RS datasets published on competition platforms like Kaggle or DrivenData.

You can also consider creating pre-specified data segments in order to understand the behavior of your model. Unfortunately, this is often neglected in research. You may find out that users who viewed fewer items in their session place different demands on the model. In that case, probably the only solution is to create a hybrid model.

Also, performance requirements (such as memory consumption, training time, etc.) might be further parameters to take into account when deploying a model to a client’s webpage.

And that should finally be all. So, let’s dive into the most important part of this article that you all have been waiting for.

Session-based models in a nutshell

Theory behind

The session-based approach belongs to the recommendation techniques that rely solely on the user’s actions within an ongoing session and adapt their recommendations to those actions.

In contrast to the traditional methods described above, it takes the ordering of past events into account when predicting the next ones.

In recent years, increased interest in session-based recommendation scenarios has been observed. Many algorithms that can be used for this task were developed quite a long time ago. As a result, they often have much simpler designs, but despite that they can beat today’s more complex approaches based on deep neural networks.

Someone might be overwhelmed by three very similar-sounding terms used in this context: sequence-aware, session-aware and session-based. The sequence-aware approach learns from historical sequences of user interactions and tries to predict the next one without regard to sessions. Both session-based and session-aware approaches, as their names suggest, treat the session as a key concept and are subclasses of the sequence-aware approach. The difference is that when we have interactions from the user’s previous sessions, the recommendations can be personalized according to the user’s long-term preferences; we call these session-aware recommendations. In the session-based approach, only the user’s current session is used for the recommendation.

Let’s look at the picture below, which should clear up any remaining confusion:

Source: Malte Ludewig, Advances in Session-Based and Session-Aware Recommendation, doctoral dissertation (Doctor of Natural Sciences), Faculty of Computer Science, TU Dortmund University, 2020

Why are they good?/Can they beat others?

To sum up, the session-based approach has many advantages:

  • Easy to understand, no black-box.
  • Easy to implement.
  • Very good performance: in specific use cases it can beat the currently popular deep neural networks.
  • Does not make any assumptions about the underlying data.

Session-based algorithms

In the rest of the article, we will talk about the session-based approach but the same also applies to the session-aware one.

Intro Example

Imagine user John is currently viewing Apple Watch Series 6 on the web page. He also saw Apple iPhone 13 five minutes ago and Samsung Galaxy S20 FE ten minutes ago. So, the sequence of his items looks like this:

Items view order

For simplicity, the training data will be very small, consisting of 3 users and their sessions (items are sorted chronologically):

Users and sessions

The main goal, as you would expect, is to find the best next item for John, the one that will interest him the most. There are many algorithms that can be used for the session-based approach.

Rules Algorithms

Association rules (AR)

The association rules algorithm (also known as market-basket analysis) is a technique for finding hidden associations between items that are frequently bought/seen together. Put simply, it learns rules like “Customers who bought … also bought …”. These rules and their corresponding importance are “learned” by counting how often items A and B occurred together in a session of any user. They do not necessarily extract user preferences, but rather work in a collaborative-filtering way.

In our session-based approach, we use only rules of size two. The score of such a rule (e.g. {A, B}) is derived from the number of co-occurrences of items A and B in all sessions within the whole dataset. The more often they occur together, the higher the score. The final recommendation list of length k for, say, item C then consists of the k rules that contain C, sorted by score in descending order.

Using our example, we would learn these rules and corresponding scores via AR:

  • {Apple Watch Series 6, Apple Watch Series 7}: 3
  • {Apple iPhone 13, Apple iPhone 13 mini}: 1
  • {Apple Watch Series 6, Apple Watch Nike Series}: 1
  • {Samsung Galaxy S21, Samsung Galaxy S20 FE}: 1
  • {Apple Watch Series 6, Apple iPhone 13 mini}: 2

Because the last item John saw was Apple Watch Series 6, the recommendation list for him would be: Apple Watch Series 7 in the first position, Apple iPhone 13 mini in the second position and Apple Watch Nike Series in the third position.

Notice that even though we make recommendations based only on the last item in the session, the AR algorithm is able to recommend items from different categories: not only other watches but also mobile phones from the same brand. That happens thanks to learning from several sessions of various users.
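The counting behind AR can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions: the function names are made up, each pair is counted at most once per session, and the toy sessions below are hypothetical (they are not the sessions from the picture above, so the scores are not meant to reproduce the full rule list).

```python
from collections import defaultdict
from itertools import combinations


def fit_association_rules(sessions):
    """Count, for every pair of items, in how many sessions they co-occur."""
    scores = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        # set() so that a pair is counted once per session; order is ignored.
        for a, b in combinations(set(session), 2):
            scores[a][b] += 1
            scores[b][a] += 1
    return scores


def recommend(scores, last_item, k=3):
    """Top-k items co-occurring with `last_item`, best score first."""
    candidates = scores.get(last_item, {})
    ranked = sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


# Hypothetical sessions, items in chronological order.
toy_sessions = [
    ["watch_6", "watch_7", "iphone_13_mini"],
    ["iphone_13", "watch_6", "watch_nike", "watch_7"],
    ["galaxy_s21", "galaxy_s20_fe", "watch_6", "iphone_13_mini", "watch_7"],
]
rules = fit_association_rules(toy_sessions)
print(recommend(rules, "watch_6", k=2))  # ['watch_7', 'iphone_13_mini']
```

Because the model is just a table of pair counts, it can be updated incrementally as new sessions arrive, which is part of why this family of algorithms is attractive for real-time use.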

Sequential rules

The sequential rules (SR) algorithm is partly similar to AR: it is also designed to find hidden rules of frequently co-occurring items, but here the order of items in the session also matters. It basically learns frequent sequences, from which the algorithm’s name is derived.

We create a rule when item A appeared after item B in a session, even when other events (viewing/buying items) happened between A and B. In our example with John, we now learn these rules and corresponding scores via SR:

  • {Apple Watch Series 6 -> Apple Watch Series 7}: 1
  • {Apple iPhone 13 -> Apple iPhone 13 mini}: 1
  • {Apple Watch Series 6 -> Apple Watch Nike Series}: 0.9*
  • {Samsung Galaxy S21 -> Samsung Galaxy S20 FE}: 1
  • {Apple Watch Series 6 -> Apple iPhone 13 mini}: 2

* The score is a bit less than one because Apple Watch Nike Series was not the item immediately following Apple Watch Series 6.

We would recommend to John: Apple iPhone 13 mini in the first position, Apple Watch Series 7 in the second position and Apple Watch Nike Series in the third position. The first two positions are swapped compared to the previous AR algorithm. By the way, which recommendation list do you think is better for John? We will talk about it later on.

Note: A special case of SR is the Markov chain algorithm, which works in a stricter way: rules are created based on how often users viewed/bought item A immediately after viewing/buying item B.
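A minimal sketch of SR, again with made-up names and toy data. The decay used here, a weight of 1/gap where gap is the number of steps between the two items, is an assumption on our side; the 0.9 score in the example above suggests a gentler decay, and any decreasing function can be plugged in instead. Setting `max_gap=1` keeps only immediately following items, which gives the Markov chain variant from the note above.

```python
from collections import defaultdict


def fit_sequential_rules(sessions, max_gap=10):
    """Score rules antecedent -> consequent, weighted by 1 / gap (assumed decay)."""
    scores = defaultdict(lambda: defaultdict(float))
    for session in sessions:
        for i, antecedent in enumerate(session):
            following = session[i + 1 : i + 1 + max_gap]
            for gap, consequent in enumerate(following, start=1):
                if consequent != antecedent:
                    scores[antecedent][consequent] += 1.0 / gap
    return scores


def recommend_next(scores, last_item, k=3):
    """Top-k consequents of `last_item`, best score first."""
    candidates = scores.get(last_item, {})
    ranked = sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


# Hypothetical sessions, items in chronological order.
toy_sessions = [["a", "b", "c"], ["a", "c"], ["b", "a"]]

sr = fit_sequential_rules(toy_sessions)
print(dict(sr["a"]))             # {'b': 1.0, 'c': 1.5}
print(recommend_next(sr, "a"))   # ['c', 'b']

markov = fit_sequential_rules(toy_sessions, max_gap=1)
print(dict(markov["a"]))         # {'b': 1.0, 'c': 1.0}
```

Unlike AR, the rule a -> b and the rule b -> a now have separate scores, so an accessory that is usually viewed after a phone is not automatically recommended before it.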


K-nearest neighbors (KNN)

KNN is a non-parametric supervised learning method that can also be used very well for session-aware (or collaborative-filtering) recommendation. It finds the K most similar items to a particular item based on a given distance metric (usually cosine similarity, Euclidean distance, …) and item features.

There are various adaptations of the KNN algorithm to the needs of the session-based approach:

Item-KNN (i-KNN)

This version only considers the last element in a given session and then returns as recommendations those items that are most similar to it in terms of their co-occurrence in other sessions.

Technically, each item is encoded as a binary vector over sessions (see the picture below): e.g. Apple Watch Series 6 would have the vector (0, 1, 1) because it was seen in two sessions (Session 2 and Session 3).

Now, suppose we want to recommend an item to a user whose last viewed item is Xiaomi Mi Watch Lite. Since the vector of Apple Watch Series 6 is the most similar to the vector of Xiaomi Mi Watch Lite, and because this user has not seen Apple Watch Series 6 yet, we would recommend exactly that item to him.

I-KNN input matrix example: 1 means the item occurred in the session
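The idea can be sketched as follows. This is an illustration only: cosine similarity is one of several possible choices, the function names are ours, and the three training sessions are hypothetical (they merely mimic the situation described above, with Apple Watch Series 6 encoded as (0, 1, 1)).

```python
import math


def item_vectors(sessions):
    """Encode each item as a binary vector: 1 if it occurred in the session."""
    items = sorted({item for session in sessions for item in session})
    return {item: [1 if item in s else 0 for s in sessions] for item in items}


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def item_knn(sessions, last_item, seen=(), k=3):
    """Items most similar to `last_item`, skipping already seen ones."""
    vectors = item_vectors(sessions)
    target = vectors[last_item]
    scored = [
        (cosine(target, vec), item)
        for item, vec in vectors.items()
        if item != last_item and item not in seen
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for sim, item in scored[:k] if sim > 0]


# Hypothetical training sessions.
train = [
    ["xiaomi_mi_watch_lite", "iphone_12"],
    ["apple_watch_6", "galaxy_s21"],
    ["xiaomi_mi_watch_lite", "iphone_12", "apple_watch_6"],
]
print(item_vectors(train)["apple_watch_6"])  # [0, 1, 1]
print(item_knn(train, "xiaomi_mi_watch_lite", seen={"iphone_12"}, k=2))
# ['apple_watch_6']
```

In practice the item-to-item similarities are precomputed offline, so serving a recommendation is just a lookup for the last item in the session.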

Session-KNN (s-KNN)

Instead of considering only the last event in the current session, the s-KNN method compares the entire current session with past sessions of other users in the training data to determine the items to be recommended.

Given a current session, we obtain the recommendations in the following steps:

1. Determine K most similar sessions (neighbors) by applying a suitable session similarity measure.

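2. Calculate the score for all items based on these K sessions as (in the standard s-KNN formulation):

$$\mathrm{score}(i, s) = \sum_{n \in N_s} \mathrm{sim}(s, n) \cdot 1_n(i)$$

where s is the current session, N_s is the set of its K nearest neighbor sessions, sim(s, n) is the similarity between sessions s and n, and 1_n(i) equals 1 if session n contains item i and 0 otherwise.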

It may look complicated, but it basically means that items occurring in many similar sessions get a high score, and an even higher one if those sessions are the most similar to the session for which we are making the recommendation.

Now, suppose we want to recommend an item to a user who has so far viewed Xiaomi Mi Watch Lite and Apple iPhone 12 128 GB (see Session 1). Since Session 3 is the most similar to the user’s Session 1, and because this user has not yet seen the item occurring in Session 3, Apple Watch Series 6, s-KNN would recommend exactly that item.

s-KNN example
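The two steps above can be sketched as follows. The similarity measure (cosine similarity between the sessions viewed as item sets), the function names and the toy sessions are all assumptions made for illustration; the item score is the sum of the similarities of those neighbor sessions that contain the item, in line with the verbal description above.

```python
import math
from collections import defaultdict


def session_similarity(a, b):
    """Cosine similarity between two sessions viewed as sets of items."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def session_knn(train_sessions, current, k_neighbors=2, top_n=3):
    # Step 1: find the K most similar past sessions (the neighbors).
    sims = [
        (session_similarity(current, s), idx)
        for idx, s in enumerate(train_sessions)
    ]
    neighbors = sorted(
        (pair for pair in sims if pair[0] > 0),
        key=lambda pair: (-pair[0], pair[1]),
    )[:k_neighbors]

    # Step 2: each neighbor votes for its items with its similarity as weight.
    seen = set(current)
    scores = defaultdict(float)
    for sim, idx in neighbors:
        for item in set(train_sessions[idx]) - seen:
            scores[item] += sim

    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical training sessions and current session.
train = [
    ["xiaomi_mi_watch_lite", "galaxy_s21"],
    ["apple_watch_6", "galaxy_s21"],
    ["xiaomi_mi_watch_lite", "iphone_12", "apple_watch_6"],
]
current = ["xiaomi_mi_watch_lite", "iphone_12"]
print(session_knn(train, current))  # ['apple_watch_6', 'galaxy_s21']
```

Here the third training session shares two items with the current one and therefore has the biggest say, so its unseen item ends up first.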

We can also think of other modifications of these two, for example putting more weight on the items that were seen most recently, and many others.

Recurrent NN (RNN)

Recurrent neural networks, which are capable of learning things from sequentially ordered data, are a “natural choice” for this problem.

Again, several adaptations exist here, for example the GRU4REC approach, which was specifically designed for session-based recommendations. It models the user session with the goal of predicting the probability of subsequent events using an RNN with so-called Gated Recurrent Units. Computationally, it is a more complex approach and requires quite a high volume of data.

Other neural algorithms include BERT4Rec, SASRec and BST.


These are currently the most popular algorithms used for the session-based (or session-aware) recommendation approach, which has been shown to be an important part of any recommendation system. We have also outlined the metrics with which the success of such a model can be measured, together with the thought that they alone are not enough. And now the question is: which of all these session-based models should I use for my purpose? Which is the best one? You can look forward to that in the next part.

Co-author Simona Navrátilová https://www.linkedin.com/in/simona-navr%C3%A1tilov%C3%A1-7876a621b/
