Learning To Rank For Graph Search
Junfeng He, Search Quality and Ranking
Joint work with Cristina, Rajat, Hieu, Allan, Maxime, Jiayan,
Ethan, Scott, Kittpat, Alessandro, etc.
10/21/2013
Motivation of This Talk – Halloween!
The scariest Halloween gift:
You have 10 seconds to leave this room. If you choose to stay,
you are fully responsible for all possible consequences.
Outline
Ranking model for graph search
Learn the ranking model for graph search
Experiments and Discussions
Browse Queries
– photo vertical example
Photos of Rajat, Photos by my friends
Photos liked by Cristina, Photos commented on by me
Photos in Hawaii, photos before 2010, recent photos, 

Photos of my friends in Hawaii this year
My recommended photos of Girish Kumar's co-workers taken in
United States this year liked by Tom Stocky that are commented
on by friends of Lars Eilstrup Rasmussen
The Concept of Ranking Model -- typeahead
f1: FEATURE_SHARED_MUTUAL_FRIEN
f2: FEATURE_DISTANCE_GRAPH
f1: 30, f2: 1 → score: 0.86
f1: 8, f2: 2 → score: 0.4
A Toy User Scoring Model with Two Features
Ranking Model -- Browse
Result 1: FEATURE_BROWSE_PHOTO_OF_USER: 1, FEATURE_CONSTRAINTS_RATIO: 1, FEATURE_PHOTO_AGE_IN_DAYS_INV: 0.0196, FEATURE_PHOTO_COEFFICIENT_MAX: 0.2817745947658, FEATURE_PHOTO_HAS_FACE: 1, 
 → Score: 0.96
Result 2: FEATURE_BROWSE_PHOTO_OF_USER: 1, FEATURE_CONSTRAINTS_RATIO: 1, FEATURE_PHOTO_AGE_IN_DAYS_INV: 0.0268, FEATURE_PHOTO_COEFFICIENT_MAX: 2.2E-308, FEATURE_PHOTO_HAS_FACE: 1, 
 → Score: 0.82
Ranking Model
Intuitively, for searcher a’s request b, what should be the score for result/document c?
‱ score: a real number
‱ Results with larger scores should be ranked higher
Mathematically, score = Ranking_model(feature_vector)
feature_vector: features that contain info about the searcher, the query, and the result.
Ranking_model: a function that maps a set of features (i.e., a feature vector) to a score.
Ranking Features
Total number of features for all verticals today: ~1200
Each vertical contains a subset of features
Photo browse: about 100 features. Examples:
‱ FEATURE_BROWSE_PHOTO_IN: whether the query is “photos in some place”
‱ FEATURE_PHOTO_HAS_FACE: whether this photo has a face
‱ FEATURE_PHOTO_NUM_FRIENDS_LIKED: how many of the searcher’s friends liked this photo
‱ 

‱ In sum, features contain info about the searcher, the query, and the result (document)
Bucket Ranker
For continuous features: piecewise linear (score s as a function of the feature value)
‱ One bucket: the region between two borders
For discrete features: step-wise
‱ One bucket: every border is one bucket
Can approximate any nonlinear function if we create enough buckets
Bucket Ranker
One example:
{ "features" : {"FEATURE_SHARED_MUTUAL_FRIENDS" :
[0,1,11,240]},
"weights" : [ 0.0206305, 0.0555313, 0.284588, 0]},
Suppose x is the value of one data point on feature FEATURE_SHARED_MUTUAL_FRIENDS:
If x <= 0, score = 0.0206305
If x == 1, score = 0.0206305 + 0.0555313
If 1 <= x <= 11, score = 0.0206305 + 0.0555313 + 0.284588 * (x-1)/(11-1)
If 11 <= x <= 240, score = 0.0206305 + 0.0555313 + 0.284588 + 0 * (x-11)/(240-11)
If x > 240, score = 0.0206305 + 0.0555313 + 0.284588 + 0
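To make the bucket math concrete, here is a minimal Python sketch (an illustrative re-implementation, not the production scorer) of the piecewise-linear rule above, using the borders and weights from the JSON example; the helper name bucket_score is ours.

def bucket_score(x, borders, weights):
    """Piecewise-linear bucket score for one feature.
    borders: bucket borders, e.g. [0, 1, 11, 240]
    weights: score added per bucket, e.g. [0.0206305, 0.0555313, 0.284588, 0]
    """
    score = weights[0]                         # weight of the first bucket (x <= borders[0])
    if x <= borders[0]:
        return score
    for lo, hi, w in zip(borders[:-1], borders[1:], weights[1:]):
        if x >= hi:
            score += w                         # bucket fully passed: add its whole weight
        else:
            score += w * (x - lo) / (hi - lo)  # inside this bucket: linear interpolation
            break
    return score

borders = [0, 1, 11, 240]
weights = [0.0206305, 0.0555313, 0.284588, 0]
print(bucket_score(1, borders, weights))    # 0.0206305 + 0.0555313
print(bucket_score(23, borders, weights))   # 0.0206305 + 0.0555313 + 0.284588 + 0 * (23-11)/(240-11)
print(bucket_score(300, borders, weights))  # 0.0206305 + 0.0555313 + 0.284588 + 0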
Bucket Ranker
Output of the scoring model: the sum of the bucket scores over all features
s = R(f) = \sum_{j \in \text{Features}} BR_j(f_j)
f1: 30 → s1: 0.36, f2: 1 → s2: 0.5, score: 0.86
f1: 8 → s1: 0.3, f2: 2 → s2: 0.1, score: 0.4
Conditioned Bucket ranker, i.e., ranking tree
Condition 1: “photo of user”; Condition 2: “photo in 
”
[Diagram: under each condition, one bucket curve (score s vs. feature value) per feature, e.g., a face-feature curve and an outdoor-feature curve.]

s = R(f) = \sum_{i \in \text{Conditions}} \sum_{j \in \text{Features}} BR_{ij}(f_j)
For different query intents: the face feature is a very important positive feature for the “photo of user” query, but is not important for “photos in some place”.

s = R(f) = \sum_{i \in \text{Conditions}} \sum_{j \in \text{Features}} BR_{ij}(f_j)
Example: Condition 1: “photo of user”; Condition 2: “photo in some place”. For one query “photos of user in some place”, a result gets scores from buckets under both condition 1 and condition 2.
[Diagram: the same ranking tree with per-condition bucket curves, e.g., face-feature and outdoor-feature curves under each condition.]
Examples: “photos of my friends” (condition 1 only), “photos taken in Hawaii” (condition 2 only), “photos of my friends taken in Hawaii” (both conditions).
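A hedged sketch of the conditioned bucket ranker described above: the total score is the sum, over the query’s matched conditions and over the features, of the per-bucket scores. The condition names, features, borders, and weights below are made-up illustrations, not the production configuration; bucket_score is the piecewise-linear helper from the earlier sketch.

# Hypothetical per-condition bucket configurations: model[condition][feature] = (borders, weights)
model = {
    "photo_of_user": {   # Condition 1
        "FEATURE_PHOTO_HAS_FACE": ([0, 1], [0.0, 0.30]),
        "FEATURE_PHOTO_OUTDOOR":  ([0, 1], [0.0, 0.05]),
    },
    "photo_in_place": {  # Condition 2
        "FEATURE_PHOTO_HAS_FACE": ([0, 1], [0.0, 0.02]),
        "FEATURE_PHOTO_OUTDOOR":  ([0, 1], [0.0, 0.40]),
    },
}

def conditioned_score(matched_conditions, features):
    # s = R(f) = sum over matched conditions i, sum over features j, of BR_ij(f_j)
    s = 0.0
    for cond in matched_conditions:
        for feat, (borders, weights) in model[cond].items():
            s += bucket_score(features.get(feat, 0.0), borders, weights)
    return s

# “photos of my friends taken in Hawaii” matches both conditions, so the result
# is scored with the bucket curves from condition 1 and condition 2.
print(conditioned_score(["photo_of_user", "photo_in_place"],
                        {"FEATURE_PHOTO_HAS_FACE": 1, "FEATURE_PHOTO_OUTDOOR": 1}))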
A Defense for the Bucket Ranker Model
Question: Why Bucket Ranker, why not Linear, Random Forests,
Boosted Decision Trees, LambdaMART, Bayesian Graph Model (yet)?
Answer:
A White Box model: Good interpretation/debug ability
We still have lots of problems on features, labeling, data logging, etc., so our
data may not be ready to train a black box model
We often need to manually modify the model (e.g., support new queries, hot fix
for an important bug, cover corner cases when training data is not good
enough, etc.,)
Complex enough to guarantee good ranking quality
A Defense for the Bucket Ranker Model
Bucket Ranker gives us a good tradeoff between interpretation/debug ability and ranking quality.
Interpretation ability: linear > Bucket Ranker > Black Box Models
Ranking quality (up to now): linear < Bucket Ranker < Black Box Models
What if black box models are significantly better?
A brilliant idea: use the scores from the black box model as labels to train a white box model
Engineers in our search ranking team used to manually tune the model, sometimes consisting of hundreds of curves (i.e., piecewise linear functions)
‱ Tedious!
‱ Unproductive! We don’t know what the weights/curves should be
Machine learning to the rescue!
Outline
Ranking model for graph search
Learn the ranking model for graph search
Training Data
The Workflow to Learn Conditioned Bucket Ranker
Learning To Rank Techniques
Experiments and Discussions
Training Data
Results shown to users
Random results/samples from the
same search session, but not
shown to users, collected in
indexing servers
Basic Labeling
All random samples are labeled as
negative data -1
Results with target actions (e.g., click, friending, etc.) are labeled as positive data +1
Results without target actions are labeled as negative data -1
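A minimal sketch of this basic labeling rule; the field names (is_random_sample, target_action) are hypothetical, since the real logging schema is not shown in the deck.

def basic_label(row):
    # Random samples collected from the indexing servers are negatives.
    if row["is_random_sample"]:
        return -1
    # Shown results with a target action (click, friending, ...) are positives.
    if row["target_action"]:
        return +1
    # Shown results without a target action are negatives.
    return -1

print(basic_label({"is_random_sample": False, "target_action": "click"}))  # +1
print(basic_label({"is_random_sample": True,  "target_action": None}))     # -1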
More Labeling Strategies
typeahead_balanced: ignore negative results ranked below the positive results
positive_only: ignore all (shown) negative results; only the random samples are used as negative data


Basic labeling is usually the best, or good enough
Outline
Ranking model for graph search
Learn the ranking model for graph search
Training Data
The Workflow to Learn Conditioned Bucket Ranker
Learning To Rank Techniques
Example Results on photo search
How to Choose the Conditions and Features
Conditions: manually chosen up to now, to incorporate human domain knowledge
Features: usually need to remove obviously meaningless features like Doc_id
Some ongoing work on suggesting conditions and features automatically, e.g., frequent pattern mining on queries
[Diagram: Condition 1 “photo of user” and Condition 2 “photo in 
”, each with Feature 1 and Feature 2.]
Create Bucket Borders
For continuous features, or discrete features with many possible values:
‱ Percentile borders
‱ make sure each bucket contains the same number of data points
For discrete features with few possible values (like binary features), or categorical features such as user locale:
‱ each feature value is one bucket
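A minimal sketch (an assumed implementation, not the production pipeline) of percentile bucket borders: each bucket holds roughly the same number of training points, which is also what makes the bucketized features scale-invariant later on.

import numpy as np

def percentile_borders(values, num_buckets):
    qs = np.linspace(0, 100, num_buckets + 1)
    borders = np.percentile(values, qs)
    return np.unique(borders)   # collapse duplicate borders for discrete/skewed features

feature_values = np.random.lognormal(mean=2.0, sigma=1.0, size=10000)  # a skewed feature
print(percentile_borders(feature_values, num_buckets=4))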
Create Bucket Borders
Discrete features with a skewed distribution:
‱ Iterative percentile
‱ Histogram
Learn the Bucket Ranker
[Diagram: the ranking tree again, with Condition 1 “photo of user”, Condition 2 “photo in 
”, and one bucket curve per feature under each condition.]
We have the conditions, the features, and the bucket borders now; the only thing left to learn is the weight of each bucket.
Feature transformation -- Bucketization
f_{ij}: the feature value for feature j under condition i, a scalar, e.g., f_{ij} = 23
f'_{ij}: the feature vector for feature j under condition i after bucketization, a vector whose dimension equals the number of buckets, e.g., f'_{ij} = [1, 1, ..., 1, 0.8, 0, ..., 0]
Scale invariant with percentile buckets
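A hedged sketch of bucketization: the scalar f_{ij} becomes a vector with one entry per bucket, where fully-passed buckets get 1, the bucket containing the value gets the interpolation fraction, and later buckets get 0. The exact encoding is an assumption, chosen so that the dot product with the bucket weights reproduces bucket_score from the earlier sketch.

def bucketize(x, borders):
    if x <= borders[0]:
        return [1.0] + [0.0] * (len(borders) - 1)
    vec = [1.0]                                   # first bucket is fully passed
    for lo, hi in zip(borders[:-1], borders[1:]):
        if x >= hi:
            vec.append(1.0)                       # bucket fully passed
        elif x > lo:
            vec.append((x - lo) / (hi - lo))      # partially filled bucket
        else:
            vec.append(0.0)                       # bucket not reached
    return vec

print(bucketize(23, [0, 1, 11, 240]))   # [1.0, 1.0, 1.0, ~0.05], same shape as [1, 1, ..., 0.8, 0, ...]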
Feature transformation -- Linearization

s = R(f) = \sum_{i \in \text{Conditions}} \sum_{j \in \text{Features}} h_{ij}(f'_{ij}) = \langle w, x \rangle

x, one condition satisfied: [0, ..., 0, f'_{ij}, 0, ..., 0]
x, multiple conditions satisfied: [0, ..., f'_{ij}, ..., 0, f'_{i'j}, ..., 0]

[Diagram: the same ranking tree with Condition 1 “photo of user” and Condition 2 “photo in 
”, one bucket curve per feature under each condition.]

f: original features; x: features after transformation
Dimension of x: number of conditions * number of original features * number of buckets
Learning the whole tree simultaneously == learning a linear function s = R(x) = <w, x>
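A hedged sketch of linearization: stacking the bucketized vectors of every (condition, feature) slot into one long vector x turns the whole conditioned bucket ranker into a single dot product s = <w, x>. It assumes the hypothetical model dict and the bucketize helper from the sketches above are in scope.

import numpy as np

def linearize(matched_conditions, features, model):
    x_parts, w_parts = [], []
    for cond, feats in model.items():             # fixed slot order over all (condition, feature) pairs
        for feat, (borders, weights) in feats.items():
            w_parts.append(np.asarray(weights, dtype=float))
            if cond in matched_conditions:
                x_parts.append(np.asarray(bucketize(features.get(feat, 0.0), borders)))
            else:
                x_parts.append(np.zeros(len(weights)))   # unmatched condition: all-zero block
    return np.concatenate(x_parts), np.concatenate(w_parts)

x, w = linearize(["photo_of_user", "photo_in_place"],
                 {"FEATURE_PHOTO_HAS_FACE": 1, "FEATURE_PHOTO_OUTDOOR": 1}, model)
print(float(np.dot(w, x)))   # same value as conditioned_score(...) in the earlier sketch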
Outline
Ranking model for graph search
Learn the ranking model for graph search
Training Data
The Workflow to Learn Conditioned Bucket Ranker
Learning To Rank Techniques
Example Results on photo search
Learning to Rank
Given lots of training data (feature_vector x_i, score s_i), i = 1, 
, n
Learn a linear ranking function
‱ score = Ranking_model(feature_vector) = <w, x>
Learning to Rank -- History
A good summary: http://en.wikipedia.org/wiki/Learning_to_rank
Three main categories of methods
‱ Pointwise learning
‱ Pairwise learning
‱ Listwise learning
Learning to Rank -- History
Pointwise Learning
‱ For one point (x_i, s_i), the cost function is to make sure R(x_i) ≈ s_i
‱ i.e., if one result gets clicked, its score should be close to 1; otherwise, its score should be close to 0
‱ Every supervised regression or classification method is applicable. Examples:
‱ Linear, Logistic Regression, SVM, etc.
‱ FB Ads and Newsfeed ranking teams are using methods in this category
‱ One possible problem: the label is not session specific
X: feature_vector after transformation
s: score
R: ranking model
Learning to Rank -- History
(Session Specific) Pairwise Learning
‱ For two points (x_i, s_i), (x_j, s_j) from the same search session, the cost function is to make sure R(x_i) > R(x_j) if s_i > s_j
‱ In other words, if results i and j come from the same search session, and i is clicked but j is not, then the score of i should be higher than the score of j.
‱ Examples: RankNet, Ranking SVM, RankBoost, LambdaRank, CRR, etc.
(Session Specific) Listwise Learning
‱ For all points from the same search session, make sure R(x_i) has the same order as s_i
‱ Structured SVM, LambdaMART, etc.
X: feature_vector after transformation
s: score
R: ranking model
CRR: Combined Regression and Ranking
The CRR objective combines a pointwise term, a pairwise term, and a regularization term.
E.g., f: linear, l: hinge loss, t: sign()
or f: logistic function, l: a log(a) + (1-a) log(1-a), t: (1+y)/2
Learn (Conditioned) Bucket ranker
Solve the problem by Stochastic Gradient Descent (SGD)
methods
‱ Machine learning toolbox: sofia-ML
Can train on 10M data points within 1 hour on a single machine
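To make the training step concrete, here is a hedged sketch of a combined regression + ranking objective trained with SGD, in the spirit of what sofia-ML is used for here. It is an illustrative re-implementation under assumed losses (hinge for both terms) and hyperparameters (alpha, lam, lr), not the actual sofia-ML code or the production trainer.

import numpy as np

def train_crr(X, y, sessions, alpha=0.5, lam=1e-4, lr=0.01, steps=100_000, seed=0):
    """X: transformed feature vectors (n x d), y: labels in {+1, -1}, sessions: session id per row."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    by_session = {}
    for idx, sid in enumerate(sessions):
        by_session.setdefault(sid, []).append(idx)
    session_ids = list(by_session)
    for _ in range(steps):
        if rng.random() < alpha:                          # pointwise step: hinge loss on one labeled point
            i = rng.integers(len(y))
            if y[i] * np.dot(w, X[i]) < 1:
                w += lr * y[i] * X[i]
        else:                                             # pairwise step: rank a positive above a negative
            sid = session_ids[rng.integers(len(session_ids))]
            idxs = by_session[sid]
            if len(idxs) < 2:
                continue
            i, j = rng.choice(idxs, size=2, replace=False)
            if y[i] == y[j]:
                continue
            pos, neg = (i, j) if y[i] > y[j] else (j, i)
            if np.dot(w, X[pos] - X[neg]) < 1:            # hinge loss on the within-session score margin
                w += lr * (X[pos] - X[neg])
        w -= lr * lam * w                                 # L2 regularization
    return w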
Outline
Ranking model for graph search
Learn the ranking model for graph search
Experiments and Discussions
Data Logging
Logging of the result features at the backend
Logging of the click & conversion actions at the frontend
‱ Tables from frontend and backend are joined to create the
ultimate HIVE data table: search_learning_data
‱ Tables are populated once per day
Train Scoring Models with Machine Learning Pipelines
Run two commands, e.g.,
‱ search/ranking/util/collect_data_ta.sh '2013-08-21' '2013-08-27' -type users --output /home/jfh/data.txt
‱ search/ranking/util/train_model.sh /home/jfh/data.txt /home/jfh/bm_test
Wait for 1-2 hours, obtain your model!
Evaluation and A/B test
â–Ș More details:
https://our.intern.facebook.com/intern/wiki/index.php/Training_models_using_hive_data
Results
ML Trained models for typeahead verticals (except group)
are deployed in production
‱ Metrics are usually slightly better, compared to the hand-tuned model
Two Trained Models for Browse deployed to production
‱ Ragtime: improved clickers by 2.47% and actioners by 3.77%
‱ Photos: more details in following slides
Basic metrics of the previous hand-tuned production model for photo search
(# clicked sessions: 30.8% , # clicks per clicked session: 3.04)
# clicks per session: 0.936,
# likes per session: ~0.07
# comments: < 0.01
A/B test experiments compared to the previous hand-tuned production model
Improvement compared to the previous hand-tuned production model:
Target Actions        | #clicks per session | #likes per session | #photo queries per day | #photo search users per day
click                 | 5%                  | -5%                | 3%                     | ~1%
like                  | 4%                  | >20%               | 0-1%                   | ~0%
click, like, comment  | 5%                  | 7%                 | 3%                     | ~1%
‱ Run A/B tests for the hand-tuned model and 3 machine-trained models
‱ The three machine-trained models have target actions of “click”, “like”, and “click or like or comment”, respectively
‱ Run A/B tests for 3 weeks
Deployed as our current production model.
Case Study – “photos of mark zuckerberg”
[Screenshots: hand-tuned model vs. machine-trained model]
One possible reason: improper feature weights
In the hand-tuned model, the weight of the feature FEATURE_CONSTRAINTS_RATIO (i.e., how many people are tagged) is not as high as that of other features, e.g., FEATURE_PHOTO_LIKES
Case Study – “photos of Girish Kumar’s friends”
[Screenshots: hand-tuned model vs. machine-trained model]
One possible reason: “counter”-intuitive features
In the hand-tuned model, the older the photo, the lower the score
In the machine-trained model, old photos get low scores, but very old photos (e.g., >10 years old) get the highest scores!
[Screenshots: photos taken today vs. photos taken >10 years ago]
More analysis on photo age
photos of my friends, photos of a user
photos of a product/company, movie, public figure, etc.
photos liked by a user
photos in some place: almost zero weight
More examples
“Photos liked by Lars Eilstrup Rasmussen”
[Screenshots: hand-tuned model vs. machine-trained model]
“Photos of Barack Obama”
More examples
“Photos by National Geographic ”
[Screenshots: hand-tuned model vs. machine-trained model]
“Photos in Beijing China”
Goal At the End of H2
Train models for all the use cases for all verticals
Continuous training
Cover corner cases well
Offline feature extraction, evaluation
Try different machine learning methods
Make it easy to analyze and debug the machine learning pipeline and models
The Ultimate Goal
Our life now vs. our life in the future
