SlideShare uma empresa Scribd logo
1 de 19
Baixar para ler offline
Machine Learning @ Netflix
(and some lessons learned)
Yves Raimond (@moustaki)
Research/Engineering Manager
Search & Recommendations
Algorithm Engineering
Netflix evolution
Netflix scale
● > 69M members
● > 50 countries
● > 1000 device types
● > 3B hours/month
● 36% of peak US downstream traffic
Recommendations @ Netflix
● Goal: Help members find content
to watch and enjoy to maximize
satisfaction and retention
● Over 80% of what people watch
comes from our recommendations
● Top Picks, Because you Watched,
Trending Now, Row Ordering,
Evidence, Search, Search
Recommendations, Personalized
Genre Rows, ...
▪ Regression (Linear, logistic, elastic net)
▪ SVD and other Matrix Factorizations
▪ Factorization Machines
▪ Restricted Boltzmann Machines
▪ Deep Neural Networks
▪ Markov Models and Graph Algorithms
▪ Clustering
▪ Latent Dirichlet Allocation
▪ Gradient Boosted Decision Trees/Random Forests
▪ Gaussian Processes
▪ …
Models & Algorithms
Some lessons learned
Build the offline experimentation
framework first
When tackling a new problem
● What offline metrics can we compute that capture what online improvements we’
re actually trying to achieve?
● How should the input data to that evaluation be constructed (train, validation,
test)?
● How fast and easy is it to run a full cycle of offline experimentations?
○ Minimize time to first metric
● How replicable is the evaluation? How shareable are the results?
○ Provenance (see Dagobah)
○ Notebooks (see Jupyter, Zeppelin, Spark Notebook)
When tackling an old problem
● Same…
○ Were the metrics designed when first running experimentation in that space still appropriate now?
Think about distribution from the
outermost layers
1. For each combination of hyper-parameter
(e.g. grid search, random search, gaussian processes…)
2. For each subset of the training data
a. Multi-core learning (e.g. HogWild)
b. Distributed learning (e.g. ADMM, distributed L-BFGS, …)
When to use distributed learning?
● The impact of communication overhead when building distributed ML
algorithms is non-trivial
● Is your data big enough that the distribution offsets the communication overhead?
Example: Uncollapsed Gibbs sampler for LDA
(more details here)
Design production code to be
experimentation-friendly
Idea Data
Offline
Modeling
(R, Python,
MATLAB, …)
Iterate
Implement in
production
system (Java,
C++, …)
Missing post-
processing logic
Performance
issues
Actual
outputProduction environment
(A/B test) Code
discrepancies
Final
model
Data
discrepancies
Example development process
Avoid dual implementations
Shared Engine
Experiment
code
Production
code
ProductionExperiment
To be continued...
We’re hiring!
Yves Raimond (@moustaki)

Mais conteúdo relacionado

Mais procurados

10 Steps to Becoming Self Made Millionaire by Rhett Power
10 Steps to Becoming Self Made Millionaire by Rhett Power10 Steps to Becoming Self Made Millionaire by Rhett Power
10 Steps to Becoming Self Made Millionaire by Rhett Power24Slides
 
The Secret Sauce of Successful Teams
The Secret Sauce of Successful TeamsThe Secret Sauce of Successful Teams
The Secret Sauce of Successful TeamsSven Peters
 
Startups are Hard. Like, Really Hard. @luketucker
Startups are Hard. Like, Really Hard. @luketuckerStartups are Hard. Like, Really Hard. @luketucker
Startups are Hard. Like, Really Hard. @luketuckerEmpowered Presentations
 
100 growth hacks 100 days | 1 to 10
100 growth hacks 100 days | 1 to 10100 growth hacks 100 days | 1 to 10
100 growth hacks 100 days | 1 to 10Robin Yjord
 
Good Guide to a Great Podcast
Good Guide to a Great PodcastGood Guide to a Great Podcast
Good Guide to a Great PodcastSeth Price
 
The battle for attention
The battle for attentionThe battle for attention
The battle for attentionNewsworks
 
Q1 2023 DBX Investor Presentation
Q1 2023 DBX Investor PresentationQ1 2023 DBX Investor Presentation
Q1 2023 DBX Investor PresentationDropbox
 
How People Are Leveraging ChatGPT
How People Are Leveraging ChatGPTHow People Are Leveraging ChatGPT
How People Are Leveraging ChatGPTRoy Ahuja
 
The Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The FutureThe Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The FutureArturo Pelayo
 
ChatGPT Evaluation for NLP
ChatGPT Evaluation for NLPChatGPT Evaluation for NLP
ChatGPT Evaluation for NLPXiachongFeng
 
Discover The Top 10 Types Of Colleagues Around You
Discover The Top 10 Types Of Colleagues Around YouDiscover The Top 10 Types Of Colleagues Around You
Discover The Top 10 Types Of Colleagues Around YouAnkur Tandon
 
Class Introduction: Digital Product Management
Class Introduction: Digital Product ManagementClass Introduction: Digital Product Management
Class Introduction: Digital Product ManagementAlex Cowan
 
Strategic Storytelling | Business Presentation Techniques
Strategic Storytelling | Business Presentation TechniquesStrategic Storytelling | Business Presentation Techniques
Strategic Storytelling | Business Presentation TechniquesJeremey Donovan
 
WTF - Why the Future Is Up to Us - pptx version
WTF - Why the Future Is Up to Us - pptx versionWTF - Why the Future Is Up to Us - pptx version
WTF - Why the Future Is Up to Us - pptx versionTim O'Reilly
 
Unlocking the Power of ChatGPT and AI in Testing - NextSteps, presented by Ap...
Unlocking the Power of ChatGPT and AI in Testing - NextSteps, presented by Ap...Unlocking the Power of ChatGPT and AI in Testing - NextSteps, presented by Ap...
Unlocking the Power of ChatGPT and AI in Testing - NextSteps, presented by Ap...Applitools
 
The power of creative collaboration
The power of creative collaborationThe power of creative collaboration
The power of creative collaborationTable19
 
Dropbox Startup Lessons Learned
Dropbox Startup Lessons LearnedDropbox Startup Lessons Learned
Dropbox Startup Lessons Learnedgueste94e4c
 

Mais procurados (20)

10 Steps to Becoming Self Made Millionaire by Rhett Power
10 Steps to Becoming Self Made Millionaire by Rhett Power10 Steps to Becoming Self Made Millionaire by Rhett Power
10 Steps to Becoming Self Made Millionaire by Rhett Power
 
The Secret Sauce of Successful Teams
The Secret Sauce of Successful TeamsThe Secret Sauce of Successful Teams
The Secret Sauce of Successful Teams
 
Startups are Hard. Like, Really Hard. @luketucker
Startups are Hard. Like, Really Hard. @luketuckerStartups are Hard. Like, Really Hard. @luketucker
Startups are Hard. Like, Really Hard. @luketucker
 
100 growth hacks 100 days | 1 to 10
100 growth hacks 100 days | 1 to 10100 growth hacks 100 days | 1 to 10
100 growth hacks 100 days | 1 to 10
 
Good Guide to a Great Podcast
Good Guide to a Great PodcastGood Guide to a Great Podcast
Good Guide to a Great Podcast
 
The battle for attention
The battle for attentionThe battle for attention
The battle for attention
 
Q1 2023 DBX Investor Presentation
Q1 2023 DBX Investor PresentationQ1 2023 DBX Investor Presentation
Q1 2023 DBX Investor Presentation
 
How People Are Leveraging ChatGPT
How People Are Leveraging ChatGPTHow People Are Leveraging ChatGPT
How People Are Leveraging ChatGPT
 
The Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The FutureThe Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The Future
 
ChatGPT Evaluation for NLP
ChatGPT Evaluation for NLPChatGPT Evaluation for NLP
ChatGPT Evaluation for NLP
 
Discover The Top 10 Types Of Colleagues Around You
Discover The Top 10 Types Of Colleagues Around YouDiscover The Top 10 Types Of Colleagues Around You
Discover The Top 10 Types Of Colleagues Around You
 
Slides That Rock
Slides That RockSlides That Rock
Slides That Rock
 
Class Introduction: Digital Product Management
Class Introduction: Digital Product ManagementClass Introduction: Digital Product Management
Class Introduction: Digital Product Management
 
Webinar on ChatGPT.pptx
Webinar on ChatGPT.pptxWebinar on ChatGPT.pptx
Webinar on ChatGPT.pptx
 
Chatgpt.pptx
Chatgpt.pptxChatgpt.pptx
Chatgpt.pptx
 
Strategic Storytelling | Business Presentation Techniques
Strategic Storytelling | Business Presentation TechniquesStrategic Storytelling | Business Presentation Techniques
Strategic Storytelling | Business Presentation Techniques
 
WTF - Why the Future Is Up to Us - pptx version
WTF - Why the Future Is Up to Us - pptx versionWTF - Why the Future Is Up to Us - pptx version
WTF - Why the Future Is Up to Us - pptx version
 
Unlocking the Power of ChatGPT and AI in Testing - NextSteps, presented by Ap...
Unlocking the Power of ChatGPT and AI in Testing - NextSteps, presented by Ap...Unlocking the Power of ChatGPT and AI in Testing - NextSteps, presented by Ap...
Unlocking the Power of ChatGPT and AI in Testing - NextSteps, presented by Ap...
 
The power of creative collaboration
The power of creative collaborationThe power of creative collaboration
The power of creative collaboration
 
Dropbox Startup Lessons Learned
Dropbox Startup Lessons LearnedDropbox Startup Lessons Learned
Dropbox Startup Lessons Learned
 

Destaque

Oracle Sql Tuning
Oracle Sql TuningOracle Sql Tuning
Oracle Sql TuningChris Adkin
 
Metaprogramming JavaScript
Metaprogramming  JavaScriptMetaprogramming  JavaScript
Metaprogramming JavaScriptdanwrong
 
Principles and Practices in Continuous Deployment at Etsy
Principles and Practices in Continuous Deployment at EtsyPrinciples and Practices in Continuous Deployment at Etsy
Principles and Practices in Continuous Deployment at EtsyMike Brittain
 
C the basic concepts
C the basic conceptsC the basic concepts
C the basic conceptsAbhinav Vatsa
 
Why Project Managers (Understandably) Hate the CMMI -- and What to Do About It
Why Project Managers (Understandably) Hate the CMMI -- and What to Do About ItWhy Project Managers (Understandably) Hate the CMMI -- and What to Do About It
Why Project Managers (Understandably) Hate the CMMI -- and What to Do About ItLeading Edge Process Consultants LLC
 
Project Management With Scrum
Project Management With ScrumProject Management With Scrum
Project Management With ScrumTommy Norman
 
A Simple Introduction To CMMI For Beginer
A Simple Introduction To CMMI For BeginerA Simple Introduction To CMMI For Beginer
A Simple Introduction To CMMI For BeginerManas Das
 
Organizational communication
Organizational communicationOrganizational communication
Organizational communicationNingsih SM
 
Capability Maturity Model
Capability Maturity ModelCapability Maturity Model
Capability Maturity ModelUzair Akram
 
Capability maturity model cmm lecture 8
Capability maturity model cmm lecture 8Capability maturity model cmm lecture 8
Capability maturity model cmm lecture 8Abdul Basit
 
Gear Cutting Presentation for Polytechnic College Students of India
Gear Cutting Presentation for Polytechnic College Students of IndiaGear Cutting Presentation for Polytechnic College Students of India
Gear Cutting Presentation for Polytechnic College Students of Indiakichu
 
Root cause analysis - tools and process
Root cause analysis - tools and processRoot cause analysis - tools and process
Root cause analysis - tools and processCharles Cotter, PhD
 
Introduction to Cyber Security
Introduction to Cyber SecurityIntroduction to Cyber Security
Introduction to Cyber SecurityStephen Lahanas
 
Object Oriented Analysis and Design
Object Oriented Analysis and DesignObject Oriented Analysis and Design
Object Oriented Analysis and DesignHaitham El-Ghareeb
 
Agile Transformation and Cultural Change
 Agile Transformation and Cultural Change Agile Transformation and Cultural Change
Agile Transformation and Cultural ChangeJohnny Ordóñez
 
Evolution of Microsoft windows operating systems
Evolution of Microsoft windows operating systemsEvolution of Microsoft windows operating systems
Evolution of Microsoft windows operating systemsSai praveen Seva
 
An Overview of User Acceptance Testing (UAT)
An Overview of User Acceptance Testing (UAT)An Overview of User Acceptance Testing (UAT)
An Overview of User Acceptance Testing (UAT)Usersnap
 

Destaque (20)

Oracle Sql Tuning
Oracle Sql TuningOracle Sql Tuning
Oracle Sql Tuning
 
Metaprogramming JavaScript
Metaprogramming  JavaScriptMetaprogramming  JavaScript
Metaprogramming JavaScript
 
Capability maturity model
Capability maturity modelCapability maturity model
Capability maturity model
 
Organizational Communication
Organizational CommunicationOrganizational Communication
Organizational Communication
 
Principles and Practices in Continuous Deployment at Etsy
Principles and Practices in Continuous Deployment at EtsyPrinciples and Practices in Continuous Deployment at Etsy
Principles and Practices in Continuous Deployment at Etsy
 
C the basic concepts
C the basic conceptsC the basic concepts
C the basic concepts
 
Why Project Managers (Understandably) Hate the CMMI -- and What to Do About It
Why Project Managers (Understandably) Hate the CMMI -- and What to Do About ItWhy Project Managers (Understandably) Hate the CMMI -- and What to Do About It
Why Project Managers (Understandably) Hate the CMMI -- and What to Do About It
 
Project Management With Scrum
Project Management With ScrumProject Management With Scrum
Project Management With Scrum
 
A Simple Introduction To CMMI For Beginer
A Simple Introduction To CMMI For BeginerA Simple Introduction To CMMI For Beginer
A Simple Introduction To CMMI For Beginer
 
Organizational communication
Organizational communicationOrganizational communication
Organizational communication
 
Capability Maturity Model
Capability Maturity ModelCapability Maturity Model
Capability Maturity Model
 
Capability maturity model cmm lecture 8
Capability maturity model cmm lecture 8Capability maturity model cmm lecture 8
Capability maturity model cmm lecture 8
 
Gear Cutting Presentation for Polytechnic College Students of India
Gear Cutting Presentation for Polytechnic College Students of IndiaGear Cutting Presentation for Polytechnic College Students of India
Gear Cutting Presentation for Polytechnic College Students of India
 
6 Thinking Hats
6 Thinking Hats6 Thinking Hats
6 Thinking Hats
 
Root cause analysis - tools and process
Root cause analysis - tools and processRoot cause analysis - tools and process
Root cause analysis - tools and process
 
Introduction to Cyber Security
Introduction to Cyber SecurityIntroduction to Cyber Security
Introduction to Cyber Security
 
Object Oriented Analysis and Design
Object Oriented Analysis and DesignObject Oriented Analysis and Design
Object Oriented Analysis and Design
 
Agile Transformation and Cultural Change
 Agile Transformation and Cultural Change Agile Transformation and Cultural Change
Agile Transformation and Cultural Change
 
Evolution of Microsoft windows operating systems
Evolution of Microsoft windows operating systemsEvolution of Microsoft windows operating systems
Evolution of Microsoft windows operating systems
 
An Overview of User Acceptance Testing (UAT)
An Overview of User Acceptance Testing (UAT)An Overview of User Acceptance Testing (UAT)
An Overview of User Acceptance Testing (UAT)
 

Semelhante a Paris ML meetup

Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018HJ van Veen
 
Joker'14 Java as a fundamental working tool of the Data Scientist
Joker'14 Java as a fundamental working tool of the Data ScientistJoker'14 Java as a fundamental working tool of the Data Scientist
Joker'14 Java as a fundamental working tool of the Data ScientistAlexey Zinoviev
 
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...Dataiku
 
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...Infoshare
 
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...Brocade
 
Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflowCharmi Chokshi
 
A step towards machine learning at accionlabs
A step towards machine learning at accionlabsA step towards machine learning at accionlabs
A step towards machine learning at accionlabsChetan Khatri
 
Machine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackboxMachine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackboxIvo Andreev
 
Production-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to heroProduction-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to heroDaniel Marcous
 
Lessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixLessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixJustin Basilico
 
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...Ali Alkan
 
AI hype or reality
AI  hype or realityAI  hype or reality
AI hype or realityAwantik Das
 

Semelhante a Paris ML meetup (20)

Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018
 
Joker'14 Java as a fundamental working tool of the Data Scientist
Joker'14 Java as a fundamental working tool of the Data ScientistJoker'14 Java as a fundamental working tool of the Data Scientist
Joker'14 Java as a fundamental working tool of the Data Scientist
 
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
JavaScript and Artificial Intelligence by Aatman & Sagar - AhmedabadJS
JavaScript and Artificial Intelligence by Aatman & Sagar - AhmedabadJSJavaScript and Artificial Intelligence by Aatman & Sagar - AhmedabadJS
JavaScript and Artificial Intelligence by Aatman & Sagar - AhmedabadJS
 
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
 
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
 
Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflow
 
Unit no_1.pptx
Unit no_1.pptxUnit no_1.pptx
Unit no_1.pptx
 
tensorflow.pptx
tensorflow.pptxtensorflow.pptx
tensorflow.pptx
 
Apache Mahout
Apache MahoutApache Mahout
Apache Mahout
 
A step towards machine learning at accionlabs
A step towards machine learning at accionlabsA step towards machine learning at accionlabs
A step towards machine learning at accionlabs
 
Machine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackboxMachine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackbox
 
Production-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to heroProduction-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to hero
 
Lessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixLessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at Netflix
 
Data science
Data scienceData science
Data science
 
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
 
AI hype or reality
AI  hype or realityAI  hype or reality
AI hype or reality
 
A Kaggle Talk
A Kaggle TalkA Kaggle Talk
A Kaggle Talk
 
machine learning
machine learningmachine learning
machine learning
 

Mais de Yves Raimond

Time, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender SystemsTime, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender SystemsYves Raimond
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender SystemsYves Raimond
 
(Some) pitfalls of distributed learning
(Some) pitfalls of distributed learning(Some) pitfalls of distributed learning
(Some) pitfalls of distributed learningYves Raimond
 
Recommending for the World
Recommending for the WorldRecommending for the World
Recommending for the WorldYves Raimond
 
Spark Meetup @ Netflix, 05/19/2015
Spark Meetup @ Netflix, 05/19/2015Spark Meetup @ Netflix, 05/19/2015
Spark Meetup @ Netflix, 05/19/2015Yves Raimond
 
Utilisation du Web Semantique pour les sites de la BBC
Utilisation du Web Semantique pour les sites de la BBCUtilisation du Web Semantique pour les sites de la BBC
Utilisation du Web Semantique pour les sites de la BBCYves Raimond
 
Linked Data on the BBC
Linked Data on the BBCLinked Data on the BBC
Linked Data on the BBCYves Raimond
 
Publishing and interlinking music-related data on the Web
Publishing and interlinking music-related data on the WebPublishing and interlinking music-related data on the Web
Publishing and interlinking music-related data on the WebYves Raimond
 
Linked data and applications
Linked data and applicationsLinked data and applications
Linked data and applicationsYves Raimond
 
Towards a musical Semantic Web
Towards a musical Semantic WebTowards a musical Semantic Web
Towards a musical Semantic WebYves Raimond
 

Mais de Yves Raimond (11)

Time, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender SystemsTime, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender Systems
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
(Some) pitfalls of distributed learning
(Some) pitfalls of distributed learning(Some) pitfalls of distributed learning
(Some) pitfalls of distributed learning
 
Recommending for the World
Recommending for the WorldRecommending for the World
Recommending for the World
 
Spark Meetup @ Netflix, 05/19/2015
Spark Meetup @ Netflix, 05/19/2015Spark Meetup @ Netflix, 05/19/2015
Spark Meetup @ Netflix, 05/19/2015
 
Utilisation du Web Semantique pour les sites de la BBC
Utilisation du Web Semantique pour les sites de la BBCUtilisation du Web Semantique pour les sites de la BBC
Utilisation du Web Semantique pour les sites de la BBC
 
Linked Data on the BBC
Linked Data on the BBCLinked Data on the BBC
Linked Data on the BBC
 
Publishing and interlinking music-related data on the Web
Publishing and interlinking music-related data on the WebPublishing and interlinking music-related data on the Web
Publishing and interlinking music-related data on the Web
 
Linked data and applications
Linked data and applicationsLinked data and applications
Linked data and applications
 
Web of data
Web of dataWeb of data
Web of data
 
Towards a musical Semantic Web
Towards a musical Semantic WebTowards a musical Semantic Web
Towards a musical Semantic Web
 

Último

KCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitosKCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitosVictor Morales
 
Curve setting (Basic Mine Surveying)_MI10412MI.pptx
Curve setting (Basic Mine Surveying)_MI10412MI.pptxCurve setting (Basic Mine Surveying)_MI10412MI.pptx
Curve setting (Basic Mine Surveying)_MI10412MI.pptxRomil Mishra
 
A brief look at visionOS - How to develop app on Apple's Vision Pro
A brief look at visionOS - How to develop app on Apple's Vision ProA brief look at visionOS - How to develop app on Apple's Vision Pro
A brief look at visionOS - How to develop app on Apple's Vision ProRay Yuan Liu
 
11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdf11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdfHafizMudaserAhmad
 
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONTHE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONjhunlian
 
Input Output Management in Operating System
Input Output Management in Operating SystemInput Output Management in Operating System
Input Output Management in Operating SystemRashmi Bhat
 
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithmComputer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithmDeepika Walanjkar
 
Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________Romil Mishra
 
70 POWER PLANT IAE V2500 technical training
70 POWER PLANT IAE V2500 technical training70 POWER PLANT IAE V2500 technical training
70 POWER PLANT IAE V2500 technical trainingGladiatorsKasper
 
Theory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdfTheory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdfShreyas Pandit
 
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptx
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptxTriangulation survey (Basic Mine Surveying)_MI10412MI.pptx
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptxRomil Mishra
 
Novel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending ActuatorsNovel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending ActuatorsResearcher Researcher
 
SOFTWARE ESTIMATION COCOMO AND FP CALCULATION
SOFTWARE ESTIMATION COCOMO AND FP CALCULATIONSOFTWARE ESTIMATION COCOMO AND FP CALCULATION
SOFTWARE ESTIMATION COCOMO AND FP CALCULATIONSneha Padhiar
 
US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionMebane Rash
 
STATE TRANSITION DIAGRAM in psoc subject
STATE TRANSITION DIAGRAM in psoc subjectSTATE TRANSITION DIAGRAM in psoc subject
STATE TRANSITION DIAGRAM in psoc subjectGayathriM270621
 
CS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdfCS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdfBalamuruganV28
 
"Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ..."Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ...Erbil Polytechnic University
 
Secure Key Crypto - Tech Paper JET Tech Labs
Secure Key Crypto - Tech Paper JET Tech LabsSecure Key Crypto - Tech Paper JET Tech Labs
Secure Key Crypto - Tech Paper JET Tech Labsamber724300
 
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENTFUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENTSneha Padhiar
 
Mine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxMine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxRomil Mishra
 

Último (20)

KCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitosKCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitos
 
Curve setting (Basic Mine Surveying)_MI10412MI.pptx
Curve setting (Basic Mine Surveying)_MI10412MI.pptxCurve setting (Basic Mine Surveying)_MI10412MI.pptx
Curve setting (Basic Mine Surveying)_MI10412MI.pptx
 
A brief look at visionOS - How to develop app on Apple's Vision Pro
A brief look at visionOS - How to develop app on Apple's Vision ProA brief look at visionOS - How to develop app on Apple's Vision Pro
A brief look at visionOS - How to develop app on Apple's Vision Pro
 
11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdf11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdf
 
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONTHE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
 
Input Output Management in Operating System
Input Output Management in Operating SystemInput Output Management in Operating System
Input Output Management in Operating System
 
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithmComputer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithm
 
Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________
 
70 POWER PLANT IAE V2500 technical training
70 POWER PLANT IAE V2500 technical training70 POWER PLANT IAE V2500 technical training
70 POWER PLANT IAE V2500 technical training
 
Theory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdfTheory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdf
 
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptx
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptxTriangulation survey (Basic Mine Surveying)_MI10412MI.pptx
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptx
 
Novel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending ActuatorsNovel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending Actuators
 
SOFTWARE ESTIMATION COCOMO AND FP CALCULATION
SOFTWARE ESTIMATION COCOMO AND FP CALCULATIONSOFTWARE ESTIMATION COCOMO AND FP CALCULATION
SOFTWARE ESTIMATION COCOMO AND FP CALCULATION
 
US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of Action
 
STATE TRANSITION DIAGRAM in psoc subject
STATE TRANSITION DIAGRAM in psoc subjectSTATE TRANSITION DIAGRAM in psoc subject
STATE TRANSITION DIAGRAM in psoc subject
 
CS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdfCS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdf
 
"Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ..."Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ...
 
Secure Key Crypto - Tech Paper JET Tech Labs
Secure Key Crypto - Tech Paper JET Tech LabsSecure Key Crypto - Tech Paper JET Tech Labs
Secure Key Crypto - Tech Paper JET Tech Labs
 
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENTFUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
 
Mine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxMine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptx
 

Paris ML meetup

  • 1.
  • 2. Machine Learning @ Netflix (and some lessons learned) Yves Raimond (@moustaki) Research/Engineering Manager Search & Recommendations Algorithm Engineering
  • 4. Netflix scale ● > 69M members ● > 50 countries ● > 1000 device types ● > 3B hours/month ● 36% of peak US downstream traffic
  • 5. Recommendations @ Netflix ● Goal: Help members find content to watch and enjoy to maximize satisfaction and retention ● Over 80% of what people watch comes from our recommendations ● Top Picks, Because you Watched, Trending Now, Row Ordering, Evidence, Search, Search Recommendations, Personalized Genre Rows, ...
  • 6. ▪ Regression (Linear, logistic, elastic net) ▪ SVD and other Matrix Factorizations ▪ Factorization Machines ▪ Restricted Boltzmann Machines ▪ Deep Neural Networks ▪ Markov Models and Graph Algorithms ▪ Clustering ▪ Latent Dirichlet Allocation ▪ Gradient Boosted Decision Trees/Random Forests ▪ Gaussian Processes ▪ … Models & Algorithms
  • 8. Build the offline experimentation framework first
  • 9. When tackling a new problem ● What offline metrics can we compute that capture what online improvements we’ re actually trying to achieve? ● How should the input data to that evaluation be constructed (train, validation, test)? ● How fast and easy is it to run a full cycle of offline experimentations? ○ Minimize time to first metric ● How replicable is the evaluation? How shareable are the results? ○ Provenance (see Dagobah) ○ Notebooks (see Jupyter, Zeppelin, Spark Notebook)
  • 10. When tackling an old problem ● Same… ○ Were the metrics designed when first running experimentation in that space still appropriate now?
  • 11. Think about distribution from the outermost layers
  • 12. 1. For each combination of hyper-parameter (e.g. grid search, random search, gaussian processes…) 2. For each subset of the training data a. Multi-core learning (e.g. HogWild) b. Distributed learning (e.g. ADMM, distributed L-BFGS, …)
  • 13. When to use distributed learning? ● The impact of communication overhead when building distributed ML algorithms is non-trivial ● Is your data big enough that the distribution offsets the communication overhead?
  • 14. Example: Uncollapsed Gibbs sampler for LDA (more details here)
  • 15. Design production code to be experimentation-friendly
  • 16. Idea Data Offline Modeling (R, Python, MATLAB, …) Iterate Implement in production system (Java, C++, …) Missing post- processing logic Performance issues Actual outputProduction environment (A/B test) Code discrepancies Final model Data discrepancies Example development process
  • 17. Avoid dual implementations Shared Engine Experiment code Production code ProductionExperiment