2. Spotify is a global audio subscription service
By the numbers (2019):
● 232M monthly active users
● 108M premium subscribers
● 79 markets
● 50M+ tracks
● 450k+ podcast titles
3. What’s at stake on the Homepage?
The Homepage is the first thing you see when you open the app. It is many things: a discovery tool, a personal music assistant, and a marketplace for artists and their fans.
Spotify’s mission is to unlock the potential of human creativity —
by giving a million creative artists the opportunity to live off their art
and billions of fans the opportunity to enjoy and be inspired by it.
Personalization is powerful in this challenging content space, with its vast volume and variety.
4. Talk outline
01 More on the Spotify Homepage
02 Overview of the ranking algorithm and the bandit policy
03 Sanity checks used in practice for policy debiasing and model behavior
5. Homepage organization
The Homepage is made up of cards: podcast shows or episodes, albums, playlists, radio stations, artist pages, etc.
Cards are organized into shelves.
[Slide figure: a Homepage screenshot annotated with Shelf A and Shelf B]
6. Each user is eligible for hundreds of candidate shelves, which can be editorially or programmatically curated. Shelves pull from a pool of millions of cards.
All shelf candidates and their respective cards are ranked in real time when you load Home.
[Slide figure: example shelves. Programmatic curation: "Made for X", "Your Favorite Albums", "Similar to Y", "Recommended for Today". Editorial curation: "Iconic 80s Soundtracks", "Discovered in Greenwich Village"]
7. [Slide figure: the recommendation funnel: an embedding network feeding a ranking stage]
8. Homepage ranking as an end-to-end ML problem
Learn to rank the Homepage based on logged feedback data.
[Slide figure: feedback loop. The ranking algorithm serves recommendations; user feedback such as clicks, likes, and streams is logged; the ranking algorithm is then trained on that logged feedback]
9. Consequences of Feedback Loops
Without randomization in the feedback loop, you risk:
● Homogenized user behavior (Chaney et al. 2018)
● Diminishing diversity over time (Nguyen et al. 2014)
● Poor representation of the long tail (Mehrotra et al. 2018)
Continuous exploration and content-pool expansion help (Jiang et al. 2019).
10. Introduce exploration
[Slide figure: the same feedback loop, now with an exploration step added]
11. Introduce exploration
[Slide figure: the feedback loop with an exploration policy that introduces randomness; the logged feedback now also records the policy's propensities]
12. Ways to introduce exploration
● Fully randomized experiment: randomize the Homepage for a small fraction of users.
● Random data collection: randomize the Homepage for a small fraction of requests.
● Bandit policy: explore/exploit as the Homepage is assembled (McInerney et al., 2018).
Bandit approaches are becoming popular:
● Artwork personalization at Netflix (Amat et al. 2018)
● News article recommendation at Yahoo (Chu et al. 2012)
● Personalization at Amazon Music (ICML 2019)
● The REVEAL ’19 workshop here
13. Explore/Exploit on the Homepage
An example of an epsilon-greedy policy for ranking the Spotify Homepage. Three card candidates have predicted stream rates of 0.8, 0.7, and 0.2.
14. First position: with probability 1 − 𝝐 the policy exploits and places the highest-scoring card (0.8); with probability 𝝐 it explores uniformly among the three candidates. The propensity of the 0.8 card landing first is 𝜋 = (1 − 𝝐) + 𝝐/3.
15. Second position: suppose exploration picks the 0.2 card from the two remaining candidates; its propensity is 𝜋 = 𝝐/2.
16. Last position: only the 0.7 card remains, so it is placed deterministically with 𝜋 = 1.
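The epsilon-greedy walkthrough above can be sketched in code. This is a minimal illustration, not Spotify's production policy: it ranks a list of card candidates position by position, exploiting with probability 1 − 𝜖 and exploring uniformly otherwise, and logs the propensity of each pick exactly as in the slides.

```python
import random

def epsilon_greedy_rank(cards, scores, epsilon=0.1, rng=random):
    """Rank cards with an epsilon-greedy policy, logging propensities.

    cards: card identifiers; scores: predicted stream rates (same order).
    Returns a list of (card, propensity) in ranked order.
    """
    remaining = list(zip(cards, scores))
    ranked = []
    while remaining:
        n = len(remaining)
        best_idx = max(range(n), key=lambda i: remaining[i][1])
        if n == 1:
            # Forced pick: the last remaining card has propensity 1.
            idx, prop = 0, 1.0
        elif rng.random() < epsilon:
            # Explore: pick uniformly among the remaining candidates.
            idx = rng.randrange(n)
            # The best card can also be reached by exploiting, so its
            # marginal propensity includes the exploit mass.
            prop = epsilon / n + ((1 - epsilon) if idx == best_idx else 0.0)
        else:
            # Exploit: pick the highest-scoring remaining card.
            idx = best_idx
            prop = (1 - epsilon) + epsilon / n
        card, _ = remaining.pop(idx)
        ranked.append((card, prop))
    return ranked
```

With three candidates, the top pick's propensity is (1 − 𝜖) + 𝜖/3, an explored non-best pick from two remaining gets 𝜖/2, and the forced last pick gets 1, matching the slides.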
17. Training the reward model*
Counterfactual inference for model parameters
* Explore, Exploit, Explain: Personalizing Explainable Recommendations with Bandits. J. McInerney, B. Lacker, S. Hansen, K. Higley, H. Bouchard, A. Gruson & R. Mehrotra. RecSys 2018.
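One common way to use logged propensities for counterfactual training of a reward model is inverse propensity scoring (IPS): weight each example's loss by the inverse of the probability the logging policy assigned to the action. The sketch below shows one IPS-weighted SGD step for a logistic reward model; the clipping constant and learning rate are illustrative assumptions, not values from the talk.

```python
import numpy as np

def ips_logistic_update(w, x, reward, propensity, lr=0.01, clip=10.0):
    """One IPS-weighted SGD step for a logistic reward model.

    Weighting the log-loss gradient by 1/propensity (clipped to control
    variance) corrects for the logging policy's action distribution.
    """
    weight = min(1.0 / propensity, clip)          # clipped IPS weight
    p = 1.0 / (1.0 + np.exp(-x @ w))              # predicted stream prob.
    grad = weight * (p - reward) * x              # weighted log-loss gradient
    return w - lr * grad
```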
18. Research Directions & Practical Challenges
Many research directions we work on:
● Designing better reward models (REVEAL, talk by Mounia Lalmas)
● Optimizing for the marketplace (Marketplaces tutorial, Rishabh and Ben)
● Careful feature engineering to mitigate feedback loop side effects and better
rank new content
● Creating a more representative Homepage (Henriette Cramer in Responsible
Recommendation Panel)
But we also need something like integration tests, so that we are confident we have got the basics right.
20. Sanity Checks for policy debiasing
We need a way to validate that policy debiasing yields roughly unbiased training data.
Method:
● Remove position bias by using training data from the top position only.
● Train a linear model with a single feature (shelf_name) to predict a metric that is observable online (CTR).
● Compare the debiased model's prediction to the outcome observed during exploration in that position.
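The three steps above can be sketched as follows. A per-shelf IPS-weighted mean is equivalent to the linear model with shelf_name as its only feature; the field names (shelf_name, position, clicked, propensity, explore) are an illustrative schema, not Spotify's actual logging format.

```python
from collections import defaultdict

def shelf_ctr_check(logs):
    """Compare IPS-debiased per-shelf CTR estimates against the CTR
    observed under exploration, using only top-position impressions
    to remove position bias.

    logs: iterable of dicts with keys shelf_name, position,
    clicked (0/1), propensity, explore (bool).
    """
    num, den = defaultdict(float), defaultdict(float)
    clicks, views = defaultdict(int), defaultdict(int)
    for row in logs:
        if row["position"] != 0:
            continue  # top position only
        w = 1.0 / row["propensity"]
        num[row["shelf_name"]] += w * row["clicked"]   # IPS-weighted clicks
        den[row["shelf_name"]] += w                    # IPS-weighted views
        if row["explore"]:
            clicks[row["shelf_name"]] += row["clicked"]
            views[row["shelf_name"]] += 1
    return {s: {"debiased": num[s] / den[s],
                "observed": clicks[s] / views[s] if views[s] else None}
            for s in den}
```

A large gap between the debiased estimate and the exploration-only CTR for some shelf is a red flag that debiasing is not working for that shelf.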
21. Sanity Checks for policy debiasing
[Slide figure: per-shelf predictions vs. observed outcomes, with and without importance sampling]
22. Sanity Checks for problem-specific model behavior
Aggregate ranking metrics (e.g. NDCG) have low resolution and offer little visibility into model behavior. But stakeholders in the product strategy (artists, curators, users) have expectations about what the model should do in specific situations. We build trust in the model, internally and externally, by creating metrics around these expectations and using them as sanity checks.
23. Favorite Shelf Position Sanity Check
Music has repetitive consumption patterns, and users have habitual behavior on Home. If a user has a clear preference for a specific shelf, models should rank that shelf high on the page, regardless of what it is.
A user has a "favorite" shelf if a significant amount of their consumption can be attributed to that shelf. Measure the average row where that shelf is placed for those users.
[Slide figure: average favorite-shelf position for shelfX, shelfY, and shelfZ under modelA vs. modelB]
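The favorite-shelf check can be sketched as a small aggregation. The session schema (user, shelf, streams, row) and the 50% consumption-share threshold are assumptions for illustration; the talk only says "a significant amount."

```python
from collections import defaultdict

def favorite_shelf_positions(sessions, share_threshold=0.5):
    """For each user whose consumption is dominated by one shelf
    (>= share_threshold of their streams), report the average row
    at which the model placed that favorite shelf.

    sessions: iterable of dicts with keys user, shelf, streams, row.
    """
    streams = defaultdict(lambda: defaultdict(int))
    rows = defaultdict(lambda: defaultdict(list))
    for s in sessions:
        streams[s["user"]][s["shelf"]] += s["streams"]
        rows[s["user"]][s["shelf"]].append(s["row"])
    positions = []
    for user, per_shelf in streams.items():
        total = sum(per_shelf.values())
        fav, fav_streams = max(per_shelf.items(), key=lambda kv: kv[1])
        if total and fav_streams / total >= share_threshold:
            shelf_rows = rows[user][fav]
            positions.append(sum(shelf_rows) / len(shelf_rows))
    # Average row of the favorite shelf across qualifying users
    # (lower is better: row 0 is the top of the page).
    return sum(positions) / len(positions) if positions else None
```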
24. Daily & Hourly Patterns Sanity Check
"Why don't I see 'Peaceful Piano' on top of my homepage every night?"
● Zoom into repetitive consumption patterns and habitual behavior.
● Measure whether the row position is higher at the right time of day, when applicable.
[Slide figure: stream rate by hour of day]
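One simple way to quantify this check: find a shelf's peak consumption hour and compare its average row at that hour against all other hours. The impression schema (hour, row, streamed) is an illustrative assumption.

```python
from collections import defaultdict

def time_of_day_position_gap(impressions):
    """For one shelf (e.g. "Peaceful Piano"), compare its average row
    during its peak streaming hour against the rest of the day. A
    well-behaved model places it higher (smaller row) at the right time.

    impressions: iterable of dicts with keys hour (0-23), row, streamed (0/1).
    """
    stream_by_hour = defaultdict(int)
    rows_by_hour = defaultdict(list)
    for imp in impressions:
        stream_by_hour[imp["hour"]] += imp["streamed"]
        rows_by_hour[imp["hour"]].append(imp["row"])
    peak = max(stream_by_hour, key=stream_by_hour.get)  # peak-consumption hour
    peak_rows = rows_by_hour[peak]
    other_rows = [r for h, rs in rows_by_hour.items() if h != peak for r in rs]
    avg_peak = sum(peak_rows) / len(peak_rows)
    avg_other = sum(other_rows) / len(other_rows) if other_rows else avg_peak
    return avg_peak, avg_other
```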
25. Conclusions
01 Motivation for exploration when collecting training data
02 Methods for collection policies, with an epsilon-greedy example
03 Three simple sanity checks we use in production while navigating the complex ecosystem of Homepage personalization
26. Thank you!
References:
[1] Lihong Li, Wei Chu, John Langford, and Robert E. Schapire. A Contextual-Bandit Approach to Personalized News Article Recommendation. arXiv preprint arXiv:1003.0146.
[2] Rishabh Mehrotra, James McInerney, Hugues Bouchard, Mounia Lalmas, and Fernando Diaz. 2018. Towards a Fair Marketplace: Counterfactual Evaluation of the Trade-off between Relevance, Fairness & Satisfaction in Recommendation Systems. CIKM '18. ACM, New York, NY, USA, 2243-2251.
[3] Allison J. B. Chaney, Brandon Stewart, and Barbara Engelhardt. 2017. How Algorithmic Confounding in Recommendation Systems Increases Homogeneity and Decreases Utility. arXiv preprint arXiv:1710.11214.
[4] J. McInerney, B. Lacker, S. Hansen, K. Higley, H. Bouchard, A. Gruson, and R. Mehrotra. Explore, Exploit, Explain: Personalizing Explainable Recommendations with Bandits. In ACM Conference on Recommender Systems (RecSys), October 2018.
[5] Ray Jiang, Silvia Chiappa, Tor Lattimore, Andras Gyorgy, and Pushmeet Kohli. 2019. Degenerate Feedback Loops in Recommender Systems. arXiv preprint arXiv:1902.10730.
[6] Thorsten Joachims, Adith Swaminathan, and Tobias Schnabel. Unbiased Learning from Biased User Feedback. arXiv preprint arXiv:1608.04468.
[7] Fernando Amat, Ashok Chandrashekar, Tony Jebara, and Justin Basilico. 2018. Artwork Personalization at Netflix. In Proceedings of the 12th ACM Conference on Recommender Systems (RecSys '18).
https://www.spotifyjobs.com