Table of Contents
- Dimension 1: Architecture
- Dimension 2: Type of model
- Links
- Notes from papers
Dimension 1: Architecture
- Collaborative filtering and matrix/tensor factorisation
- Model as probability of interaction (click)
For probability of interaction, we can treat this as a standard binary classification
problem: for every user and a preselected set of items, we predict the probability of
a click. The preselection comes from L1-1 ranking (recall phase).
- Model as pairwise ranking (compare alternatives and make the model select one)
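
As a rough illustration of the last two formulations, the sketch below trains a tiny logistic regression to predict click probability and then evaluates a pairwise loss on the same scores. Everything here is an assumption made for illustration: the feature layout, the toy data, the helper names (`click_probability`, `train_pointwise`, `pairwise_loss`) and the choice of a logistic loss on the score difference are not taken from any of the linked sources.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(weights, features):
    """Linear score of one (user, item) feature vector."""
    return sum(w * x for w, x in zip(weights, features))


def click_probability(weights, features):
    """Pointwise view: probability that the user clicks the item."""
    return sigmoid(score(weights, features))


def train_pointwise(rows, labels, lr=0.5, epochs=200):
    """Plain SGD on the binary cross-entropy loss (logistic regression)."""
    weights = [0.0] * len(rows[0])
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            error = click_probability(weights, x) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
    return weights


def pairwise_loss(weights, preferred, other):
    """Pairwise view: -log(sigmoid(score(preferred) - score(other)))."""
    margin = score(weights, preferred) - score(weights, other)
    # Written as softplus(-margin) to avoid taking log of a tiny number.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical toy data: [bias, user_likes_genre, item_is_popular].
rows = [[1.0, 1.0, 1.0], [1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
labels = [1, 1, 0, 0]  # in this toy set, clicks follow the genre match only

weights = train_pointwise(rows, labels)
p_match = click_probability(weights, [1.0, 1.0, 0.0])
p_miss = click_probability(weights, [1.0, 0.0, 1.0])
print(f"genre match: {p_match:.3f}, no match: {p_miss:.3f}")
```

The pointwise model scores each candidate independently, while the pairwise loss only cares that the preferred item scores higher than the alternative; the same scoring function can be trained with either objective.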
Dimension 2: Type of model
Either formulation, probability of click or pairwise ranking, can use any
classification model such as logistic regression, XGBoost or neural nets.
Neural net approaches
- Wide and deep: The wide part is a linear model over sparse cross features, which it
memorises (deliberately overfitting to specific feature combinations). The deep part
is a sequence of MLP layers or other carefully crafted layers that let the model
generalise better. The wide part handles exceptions and the deep part handles
generalisations.
- Two tower model: One tower for user embeddings and meta features and another
tower for the item. The output dimensions of the two towers match, so we simply
take the dot product of the two outputs and apply a sigmoid to compute the probability.
- Sequence model: User actions are first classified into different buckets, and each
bucket gets an action embedding. Users also have a general user embedding. We then
concatenate these and learn a sequence model such as an LSTM or a transformer that
predicts the action at the next time step. This apparently was a game-changer for Netflix.
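
The forward passes of the first two architectures above can be sketched in a few lines. This is a minimal, untrained illustration with hand-set weights; the layer sizes, the single hidden layer, and the function names (`wide_and_deep_probability`, `two_tower_probability`) are assumptions chosen for readability, not the configuration used in the linked blog or papers.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def linear(vector, weights, bias):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [dot(row, vector) + b for row, b in zip(weights, bias)]


def relu(vector):
    return [max(0.0, v) for v in vector]


def wide_and_deep_probability(cross_features, dense_features, wide_weights,
                              hidden_layer, deep_out_weights):
    """Sum the wide (linear) logit and the deep (MLP) logit, then sigmoid."""
    wide_logit = dot(wide_weights, cross_features)
    hidden = relu(linear(dense_features, *hidden_layer))
    deep_logit = dot(deep_out_weights, hidden)
    return sigmoid(wide_logit + deep_logit)


def two_tower_probability(user_features, item_features, user_tower, item_tower):
    """Project user and item separately, then dot product and sigmoid."""
    user_vec = linear(user_features, *user_tower)
    item_vec = linear(item_features, *item_tower)
    assert len(user_vec) == len(item_vec), "tower outputs must match in size"
    return sigmoid(dot(user_vec, item_vec))


# Hand-set, hypothetical weights so the arithmetic is easy to follow.
p_wide_deep = wide_and_deep_probability(
    cross_features=[1.0, 0.0],          # sparse cross-feature indicators
    dense_features=[0.5, -0.5],
    wide_weights=[2.0, -1.0],
    hidden_layer=([[1.0, 1.0], [1.0, -1.0]], [0.0, 0.0]),
    deep_out_weights=[0.3, -1.0],
)

p_two_tower = two_tower_probability(
    user_features=[1.0, -1.0],
    item_features=[2.0, 0.0, 1.0],
    user_tower=([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),            # 2 -> 2
    item_tower=([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.0]),  # 3 -> 2
)
print(p_wide_deep, p_two_tower)
```

One practical consequence of the two-tower shape is that item vectors do not depend on the user, so they can be computed once and reused across users. The sequence model is left out here because a faithful LSTM or transformer sketch would not fit in a short example.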
Links
- Wide and deep blog: blog
- YouTube paper on deep recommendations: paper
- Netflix recommendation case study: paper
Notes from papers
Deep Learning for Recommender Systems: A Netflix Case Study