SlideShare uma empresa Scribd logo
1 de 47
Baixar para ler offline
Архитектура поискового кластера

            Яндекс
        Den Raskovalov
  denplusplus@yandex-team.ru
          30.09.2011
О чем это?

Несмотря на то, что автор сведущ лишь в
том, как работает поиск Яндекса, слышал о
том, как работают другие поисковые
машины, он имеет наглость полагать, что
приемы, которые накоплены в поисковой
индустрии будут полезны любому
высоконагруженному сервису обработки
данных.
Что умеет Яндекс?
●   Зарабатывать деньги (реклама в интернете)
●   Делать надежные высоконагруженные
    сложные сервисы
●   Увязывать тысячи источников данных в один
    продукт
●   Решать задачи обработки данных
Requirements
●   Indexation
    ●   ~1012 known URL's
    ●   ~1010 indexed documents
    ●   ~1010 searchable documents
    ●   Real-time indexation
●   Real-time query processing
    ●   107 user queries per hour in peak
●   Development
    ●   400+ creative search developers
    ●   A lot of experiments on full data (pages, links, clicks)
    ●   Large-scale machine learning
    ●   Many experiments require full-sized copy of Yandex.Search
Hardware
●   Server's cost:
    ●   Hardware (~$2500 for ~24 computing cores, ~48Gb RAM)
    ●   Place in data center (~$1000 per year)
    ●   Electricity for computation and cooling (~$1000 per year)
    ●   Maintenance (~$300 per year)
●   High-end hardware costs too much.
●   Low-end hardware needs same electricity and place
    and maintenance as high-end.
●   There is optimum point.
●   Now Yandex.Search works on ~105 servers.
Distribution of Resources
Indexation                Real-time serving          Development


 ●   Main design principles
     ●   Sharding (splitting into almost independent parts)
     ●   Replication (each piece of data is copied several
         times for performance and failover)
     ●   Full isolation between development and production
         (no read/write access for developer to production
         nodes)
Content System



Content system – system for crawling internet
urls, building indexes (inverted index for search,
link index, pagerank, ...) and deployment
indexes for real-time query processing.
Content System
●   Content system phases:
    ●   Crawling (building a list of URLs)
    ●   Selection (we can't fetch everything, we need to predict quality
        before fetching)
    ●   Fetching document data from Internet
    ●   Building search representation for documents (inverted indices
        with lemmatization; binding of external host/url information, link
        information)
    ●   Spam detection
    ●   Duplicates detection
    ●   Selection (we can't search on everything we fetch)
    ●   Documents deployment to system of real-time query processing
Content System
●   In fact we have two content systems now
    ●   Batch (for processing ~1012 URLs)
    ●   Real-time (for processing ~109 URLs we need for
        freshness)
●   Different update periods inflects architecture
    ●   Disc-efficient (sort/join) algorithms for batch
    ●   Almost everything in memory for real-time
Batch Content System
●   We split internet in n parts based on 32-bit hash
    of a hostname. We have ~104 computers, a
    computer for an interval of hash values; it
    crawls “it's” hosts and builds search indexes.
●   We represent “knowledge” of internet as
    relational tables (table of urls, table of links,
    table of hosts).
Batch Content System
             Internet


                                                      Fetcher cluster
                                   h                                                Fe
                                                                                      tch
                              fetc




                                                                  Fetc
                                                     tch
                                                                                         ed
                                        Fet
                                        Fet
                                        Fet                                         ch
                         or
                    rls f
                                            ch
                                            ch
                                            ch                                  fet                da




                                                   or fe
                                 ed
                                 ed
                                 ed
                                                                                                        ta
                                                                              r
                   U




                                                                      h
                           ata
                           ata
                           ata      d
                                    d
                                    d                                                         fo




                                                                    ed d
                                                                                         Urls




                                                      f
                                                 Urls




                                                                        ata
     Indexation node                              Indexation node                        Indexation node



       Search node                                    Search node                          Search node

●   Once per day each indexation node forms a list of urls for fetch.
●   Once per day each indexation node receives fetched data and updates “knowledge” about internet
    (update process needs only sorting and joins of big tables on disk).
●   Once per three days each indexation node push indexes on search nodes.
Batch Content System
●   Each indexation node needs “external” data
    ●   Links from another indexation nodes
        –   data passing is asynchronous
        –   death of source or target indexation nodes isn't critical (in
            case of failure data will be received next time)
    ●   Aggregated clicks on URLs from MapReduce
    ●   External data about hosts and pages
Real-time Content System
●   Main requirement for real-time content system is small time interval
    between availability of information about document and ability to
    search on document's content.
●   It implies that data can be stored only in RAM. HDD can only be used
    for eventually consistent backup.
●   You have to switch from relational design of data processing to
    services and message passing algorithms.
●   You need big and fast key-value storage in RAM.
●   Algorithms with smaller guarantees without consistency invariants
    (because any network packet can be lose).
Real-time Content System
●   We run each set of services on a node:
    ●   Link database
    ●   Local pagerank calculator
    ●   Global dup-detector
    ●   Anti-spam checker
    ●   Indexer
●   We have library of maintaing key-value storage in
    memory with eventual backup/restore to/from HDD.
●   A lot of network traffic between nodes of real-time
    content system. Almost no data locality. Nodes of real-
    time content system need to be in one datacenter.
MapReduce in Yandex.Search
●   MapReduce approach is simple, clear data storage
    combined with data processing, which hides complexity
    of parallelizing algorithms on several computation nodes
●   We use MR for
    ●   logs (views, clicks) processing
    ●   development experiments, which need a lot of data or a lot of
        CPU
●   We have several independent MR-clusters with a total of
    ~103 nodes (some of them are “production”, some of
    them for data uploading and code execution)
HPC in Search Quality
●   We use 'pools' -- conveniently represented or
    aggregated parts of web data.
●   Types of 'pools':
    ●   LETOR data
    ●   n-grams frequencies
    ●   text and HTML samples
    ●   ....
    ●   pages with song lyrics from web
    ●   ....
    ●   any crazy idea? (you can check it in hours)
HPC in Search Quality
●   May be the most important task in search quality is relevance
    prediction. Assume that you have URL-query pairs. We have
    statistics about url (pagerank, text length), query (frequency,
    commerciality) and pair {url, query} (text relevance, link
    relevance); which we call relevance features. We want to
    predict relevance of URL on search query by some feature-
    based algorithm.
●   We need to collect relevance features for thousands queries for
    millions of documents from thousands computers for
    training/evaluation of relevance prediction models.
HPC in Search Quality
  ●   Sample of a relevance pool
Query       URL                 Label        PageRank   Text        Link
  ●
                                                        Relevance   Relevance

yandex      Yandex.ru           Relevant     1          0.5         1


yandex      Hdsjdsfs.com        Irrelevant   0.1        0.5         0.1


wikipedia   Wikipedia.org       Relevant     0.9        1           1


wikipedia   wiki.livejournal.co Irrelevant   0.01       1           0.01
            m
HPC in Search Quality
●   One of the most famous formulas in information
    retrieval (TF*IDF):
    TextRelevance(Query, Document) =
    Sum(TextFrequency(Word)/CollectionFrequenc
    y(Word)).
●   We need to calculate frequency of every word
    in our document collection for evaluation of
    TF*IDF text relevance.
HPC in Search Quality
●   MatrixNet is the name of our machine learning
    algorithm.
●   MatrixNet can be run on arbitrary number of
    core processing units.
●   It's important because it allows to decrease time
    for experiment from a week to several hours (if
    you have 100 computers :) ).
HPC in Search Quality
●   Many experiments need analysis of difference between Yandex and
    slightly modified Yandex (beta version).
●   Any developer needs the ability to build betas in minutes.
●   Requirements:
    ●   strong system of deployment (you always need to know all parts of system
        with versions and how you can put it on any node)
    ●   strong code management (branches, tags, code review)
    ●   dynamic resource management (you need only several betas at one time)
    ●   system of automatic search quality evaluation
    ●   a lot of resources
HPC in Search Quality
●   Problems:
    ●   You need full web data and even more
    ●   Any computer can be unavailable at any moment
    ●   You need robust data and code distribution tools
    ●   You have to execute code near data
    ●   You need some coordination between different users on development
        nodes
    ●   You cannot predict load of development node
●   But your progress speed determined by development speed
●   Solution: data replication, scheduling of tasks, convenient and
    robust APIs for developers
●   But..
HPC in Search Quality

No final solution.

They are developers. They
can eat all resources they
can reach.
Overview
●   Structure of the cluster
●   Architectures:
    ●   Data-centric
        –   Batch
        –   Real-time
    ●   Development-centric
        –   MapReduce
        –   Ad-hoc
    ●   Real-time query processing
What is this about


 ●
     Computational model of query processing
 ●
     Problems and their solutions in terms of this
     model
 ●
     Evolution of Search architecture
 ●
     Evolution of Search components
How does Search work
                  IPVS/Balancer


 Frontends


hash(req) %
n              Cachers
                                  
                                      Merging top relevant
                                       documents
                                  
                                      Two-stage query



      Indeces/Snippets
Problems

      
          Connectivity
      
          Load balancing
             
                 Static (precalculated)
             
                 Dynamic (run-time)
      
          Fault tolerance
      
          Latency
Network

 
     Unpredictability of latency
 
     Bottleneck of the system
 
     Throughput grows slower than CPU
 
     Heterogeneous topology
        
            1 datacenter => many datacenters
        
            Many datacenters => datacenters in many
             countries
Load Balancing


    Machines and indexes are different
       
           3 generations of hardware

    Uncontrollable processes at nodes
       
           OS decided to clear buffers
       
           Monitoring system daemon writes core dump
       
           RAID Rebuilding
       
           Whatever
Fault tolerance


    Machines fail
       
           1% of resources is always unavailable

    Packets are lost in the network

    Datacenters get turned off
       
           Summer, heat, cooling doesn’t cope with workload
       
           Forgot to pay for energy
       
           Drunk worker cut the cable

    Cross-country links fail
Latency


    Different system parts reply with different speed
       
           3 generations of machines
       
           Indexes are different

    Random latency
       
           Packets are lost in the network
       
           Scheduler in the OS doesn’t work
       
           Disk is slow (200 S/PS)

    Slow “tail” of queries
15 years ago


           
               Everything is on 1 machine
               
                   Actually, 3, with different DNS names
           
               Problems are obvious
                    
                        Query volume is growing
   Index
                    
                        Base is growing
                    
                        Fault Intolerance
10 years ago

 Frontend/Cacher
                       1 datacenter, 10 base searchers – the
                          base got distributed
                       Can grow query volume and base size
                       Got clients and money – delays are
                         unacceptable
                               Machine fails – system doesn’t live
           …..
                               There are too few machines – subtle
                                  things are not noticable yet
Base Searchers(Indeces)

  ONE Data Center
7 years ago

      Frontends

                        
                            300 machines, 1 datacenter,
                             10 frontends, 20 cachers,
         Cachers             250 base searches
                        
                            Problem of machine failures
                             is solved
                               
                                   IPVS over frontends
          X
                               
                                   Re-Asks at all levels

 Indeces — 2 replicas
7 years ago - problems





    Latency
       
           Wait for the last source
       
           Slow “tail”

    Limit of growth in one datacenter
       
           Base is growing
       
           Delay because of datacenter failure is unacceptable
4 years ago
         DNS/IPVS
                      
                          “Live” in several datacenters
                      
                          Don’t wait for the slowest
                           nodes
                             
                                 Latency is better
                             
                                 Partial replies
                             
                                 Cache updates after reply
                                  to the user
          X


 Many Symmetrical Data Centers
4 years ago - problems

 
     Datacenters are different, the base is growing,
      one datacenter can’t hold the whole base
 
     Need more network even inside the datacenter
 
     Manual configuration
 
     “No-reply”s – the scariest word
 
     Different behavior for different users
Scary Story 2008

                   ●
                       Dynamic Load
                       Balancing
                   ●
                       Hard to implement
                       properly
                       ●
                           Many sources
                       ●
                           Fast changes
                   ●
                       Converge
                       guarantee?
What now?

                                     10^5 machines, 10^1
                                       datacenters
                                     Problem of growth in different
                                       datacenters solved
                                            Integrators – less traffic, can
                                               cross DC boundaries
                                     Cluster structure optimization
                                            “Annealing”
                                            «cost» → min with fixed RPS
                                               and base



Many Datacenters, working as single cluster
Frontier

     
         New Markets
            
                Distributed Data Centers
            
                Slow channels
            
                Database upload
            
                Independent DCs again?
     
         Latency is “not modern”
            
                Ask 2 replicas simultaneously
            
                Wait only for the best
Resume


   
       Succession of architectures
   
       Linear horizontal scaling
          
              By queries
          
              By base
   
       Growth limits are years ahead
   
       Robustness of computational model
Base Search

 
     “Conservative” component
        
            Stable external interface
        
            Complication/optimization of ranking
 
     Filtering iterator
 
     Pruning
 
     Early query termination
 
     Selection Rank
Cacher/metasearch


    Synchronous -> asynchronous
       
           Message passing
       
           Early decision making

    Merge tree

    Rerankings
       
           Metasearch statistics
       
           Data enrichment
Balancer

     
         DNS
     
         IPVS
           
               Only inside VLAN
           
               Distribution by IP
     
         HTTP balancer
           
               Keepalive
           
               Ssl
           
               Honest frontend balancing
           
               Early robots elimination
Also...
                                      Cache
 
     Tiers                            r
        
             Golden       GOLD
                                  COMMON
        
             Common                                 GARBAGE
        
             Garbage
 
     Index update frequency
        
             Main – slow updates (days)
        
             Fast – fast updates (hours, minutes)
        
             Real-time – ultra-fast updates (seconds)
Further Reading
●   Yet Another MapReduce
    http://prezi.com/p4uipwixfd4p/yet-another-mapredu
●   Оптимизация алгоритмов ранжирования
    методами машинного обучения
    http://romip.ru/romip2009/15_yandex.pdf
●   Как работают поисковые системы
    http://download.yandex.ru/company/iworld-3.pdf
Thank you.

Questions?

Mais conteúdo relacionado

Destaque

清真食用品驗證方案 美妝品V3.1
清真食用品驗證方案 美妝品V3.1清真食用品驗證方案 美妝品V3.1
清真食用品驗證方案 美妝品V3.1Chen Terry
 
정치가 평전읽기
정치가 평전읽기정치가 평전읽기
정치가 평전읽기형근 김
 
It driftsperson fra mekaniker til kartleser og sjåfør
It driftsperson   fra mekaniker til kartleser og sjåførIt driftsperson   fra mekaniker til kartleser og sjåfør
It driftsperson fra mekaniker til kartleser og sjåførSimen Sommerfeldt
 
1 Sustainable Making - Esempi di un Mondo che Cambia
1 Sustainable Making  - Esempi di un Mondo che Cambia1 Sustainable Making  - Esempi di un Mondo che Cambia
1 Sustainable Making - Esempi di un Mondo che CambiaDaniele Bucci
 
Kun koulutus ei riitä - ja ajatuksia siitä mitä oppiminen työssä vaatii
Kun koulutus ei riitä - ja ajatuksia siitä mitä oppiminen työssä vaatiiKun koulutus ei riitä - ja ajatuksia siitä mitä oppiminen työssä vaatii
Kun koulutus ei riitä - ja ajatuksia siitä mitä oppiminen työssä vaatiiOuti Sivonen
 
Afro investors group
Afro investors groupAfro investors group
Afro investors groupMelke AB
 
Проект "Декларатор"
Проект "Декларатор"Проект "Декларатор"
Проект "Декларатор"Alona Glazkova
 
Электронный бюллетень «Юнисон Групп» за первый квартал 2016 года.
Электронный бюллетень «Юнисон Групп» за первый квартал 2016 года.Электронный бюллетень «Юнисон Групп» за первый квартал 2016 года.
Электронный бюллетень «Юнисон Групп» за первый квартал 2016 года.Unison Group
 
FAQมากิพลัส
FAQมากิพลัสFAQมากิพลัส
FAQมากิพลัสwattana072
 
PMO TI - Projeto de Implantação numa Agência Pública Federal - o caso Finep
PMO TI - Projeto de  Implantação numa Agência  Pública Federal - o caso FinepPMO TI - Projeto de  Implantação numa Agência  Pública Federal - o caso Finep
PMO TI - Projeto de Implantação numa Agência Pública Federal - o caso FinepRoberto Chiacchio
 
портфоліо кер. гуртка 1
портфоліо кер. гуртка  1портфоліо кер. гуртка  1
портфоліо кер. гуртка 1Olena Pyzaenko
 
Avant-projet de loi Travail El Khomri
Avant-projet de loi Travail El KhomriAvant-projet de loi Travail El Khomri
Avant-projet de loi Travail El Khomrimariesautier
 
KAİROS TEKNOLOJİ SUNUM 2016
KAİROS TEKNOLOJİ SUNUM 2016KAİROS TEKNOLOJİ SUNUM 2016
KAİROS TEKNOLOJİ SUNUM 2016Ayça Özen
 
Slik bruker vi sosiale medier når vi investerer
Slik bruker vi sosiale medier når vi investererSlik bruker vi sosiale medier når vi investerer
Slik bruker vi sosiale medier når vi investererenerWE
 

Destaque (18)

清真食用品驗證方案 美妝品V3.1
清真食用品驗證方案 美妝品V3.1清真食用品驗證方案 美妝品V3.1
清真食用品驗證方案 美妝品V3.1
 
Nonprofit investing fees - end the waste
Nonprofit investing fees - end the waste  Nonprofit investing fees - end the waste
Nonprofit investing fees - end the waste
 
SmarterMoney Review 5 Winter 2016
SmarterMoney Review 5 Winter 2016SmarterMoney Review 5 Winter 2016
SmarterMoney Review 5 Winter 2016
 
정치가 평전읽기
정치가 평전읽기정치가 평전읽기
정치가 평전읽기
 
It driftsperson fra mekaniker til kartleser og sjåfør
It driftsperson   fra mekaniker til kartleser og sjåførIt driftsperson   fra mekaniker til kartleser og sjåfør
It driftsperson fra mekaniker til kartleser og sjåfør
 
1 Sustainable Making - Esempi di un Mondo che Cambia
1 Sustainable Making  - Esempi di un Mondo che Cambia1 Sustainable Making  - Esempi di un Mondo che Cambia
1 Sustainable Making - Esempi di un Mondo che Cambia
 
Kun koulutus ei riitä - ja ajatuksia siitä mitä oppiminen työssä vaatii
Kun koulutus ei riitä - ja ajatuksia siitä mitä oppiminen työssä vaatiiKun koulutus ei riitä - ja ajatuksia siitä mitä oppiminen työssä vaatii
Kun koulutus ei riitä - ja ajatuksia siitä mitä oppiminen työssä vaatii
 
Afro investors group
Afro investors groupAfro investors group
Afro investors group
 
踏切マジック
踏切マジック踏切マジック
踏切マジック
 
Anúncios criativos ambiente e videos net
Anúncios criativos ambiente e videos netAnúncios criativos ambiente e videos net
Anúncios criativos ambiente e videos net
 
Проект "Декларатор"
Проект "Декларатор"Проект "Декларатор"
Проект "Декларатор"
 
Электронный бюллетень «Юнисон Групп» за первый квартал 2016 года.
Электронный бюллетень «Юнисон Групп» за первый квартал 2016 года.Электронный бюллетень «Юнисон Групп» за первый квартал 2016 года.
Электронный бюллетень «Юнисон Групп» за первый квартал 2016 года.
 
FAQมากิพลัส
FAQมากิพลัสFAQมากิพลัส
FAQมากิพลัส
 
PMO TI - Projeto de Implantação numa Agência Pública Federal - o caso Finep
PMO TI - Projeto de  Implantação numa Agência  Pública Federal - o caso FinepPMO TI - Projeto de  Implantação numa Agência  Pública Federal - o caso Finep
PMO TI - Projeto de Implantação numa Agência Pública Federal - o caso Finep
 
портфоліо кер. гуртка 1
портфоліо кер. гуртка  1портфоліо кер. гуртка  1
портфоліо кер. гуртка 1
 
Avant-projet de loi Travail El Khomri
Avant-projet de loi Travail El KhomriAvant-projet de loi Travail El Khomri
Avant-projet de loi Travail El Khomri
 
KAİROS TEKNOLOJİ SUNUM 2016
KAİROS TEKNOLOJİ SUNUM 2016KAİROS TEKNOLOJİ SUNUM 2016
KAİROS TEKNOLOJİ SUNUM 2016
 
Slik bruker vi sosiale medier når vi investerer
Slik bruker vi sosiale medier når vi investererSlik bruker vi sosiale medier når vi investerer
Slik bruker vi sosiale medier når vi investerer
 

Semelhante a 20110930 information retrieval raskovalov_lecture1

Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data AnalyticsSupersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analyticsmason_s
 
Apache Hive for modern DBAs
Apache Hive for modern DBAsApache Hive for modern DBAs
Apache Hive for modern DBAsLuis Marques
 
Introduction to Structured Data Processing with Spark SQL
Introduction to Structured Data Processing with Spark SQLIntroduction to Structured Data Processing with Spark SQL
Introduction to Structured Data Processing with Spark SQLdatamantra
 
Code as Data workshop: Using source{d} Engine to extract insights from git re...
Code as Data workshop: Using source{d} Engine to extract insights from git re...Code as Data workshop: Using source{d} Engine to extract insights from git re...
Code as Data workshop: Using source{d} Engine to extract insights from git re...source{d}
 
Big data for the rest of us with hadoop
Big data for the rest of us with hadoopBig data for the rest of us with hadoop
Big data for the rest of us with hadoopDhaval Anjaria
 
The Evolution of the Hadoop Ecosystem
The Evolution of the Hadoop EcosystemThe Evolution of the Hadoop Ecosystem
The Evolution of the Hadoop EcosystemCloudera, Inc.
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataGiorgos Santipantakis
 
How a search engine works slide
How a search engine works slideHow a search engine works slide
How a search engine works slideSovan Misra
 
CityLABS Workshop: Working with large tables
CityLABS Workshop: Working with large tablesCityLABS Workshop: Working with large tables
CityLABS Workshop: Working with large tablesEnrico Daga
 
SDEC2011 Replacing legacy Telco DB/DW to Hadoop and Hive
SDEC2011 Replacing legacy Telco DB/DW to Hadoop and HiveSDEC2011 Replacing legacy Telco DB/DW to Hadoop and Hive
SDEC2011 Replacing legacy Telco DB/DW to Hadoop and HiveKorea Sdec
 
Apache Arrow: In Theory, In Practice
Apache Arrow: In Theory, In PracticeApache Arrow: In Theory, In Practice
Apache Arrow: In Theory, In PracticeDremio Corporation
 

Semelhante a 20110930 information retrieval raskovalov_lecture1 (20)

Introducing Datawave
Introducing DatawaveIntroducing Datawave
Introducing Datawave
 
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data AnalyticsSupersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
 
Handout: 'Open Source Tools & Resources'
Handout: 'Open Source Tools & Resources'Handout: 'Open Source Tools & Resources'
Handout: 'Open Source Tools & Resources'
 
InternReport
InternReportInternReport
InternReport
 
Publishing Linked Data using Schema.org
Publishing Linked Data using Schema.orgPublishing Linked Data using Schema.org
Publishing Linked Data using Schema.org
 
Presto
PrestoPresto
Presto
 
Apache Hive for modern DBAs
Apache Hive for modern DBAsApache Hive for modern DBAs
Apache Hive for modern DBAs
 
Introduction to Structured Data Processing with Spark SQL
Introduction to Structured Data Processing with Spark SQLIntroduction to Structured Data Processing with Spark SQL
Introduction to Structured Data Processing with Spark SQL
 
Code as Data workshop: Using source{d} Engine to extract insights from git re...
Code as Data workshop: Using source{d} Engine to extract insights from git re...Code as Data workshop: Using source{d} Engine to extract insights from git re...
Code as Data workshop: Using source{d} Engine to extract insights from git re...
 
Oai Workshop Extended
Oai Workshop ExtendedOai Workshop Extended
Oai Workshop Extended
 
Web technology Unit I Part C
Web technology Unit I  Part CWeb technology Unit I  Part C
Web technology Unit I Part C
 
Big data for the rest of us with hadoop
Big data for the rest of us with hadoopBig data for the rest of us with hadoop
Big data for the rest of us with hadoop
 
The Evolution of the Hadoop Ecosystem
The Evolution of the Hadoop EcosystemThe Evolution of the Hadoop Ecosystem
The Evolution of the Hadoop Ecosystem
 
Spark mhug2
Spark mhug2Spark mhug2
Spark mhug2
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival data
 
Introduction to HTML
Introduction to HTMLIntroduction to HTML
Introduction to HTML
 
How a search engine works slide
How a search engine works slideHow a search engine works slide
How a search engine works slide
 
CityLABS Workshop: Working with large tables
CityLABS Workshop: Working with large tablesCityLABS Workshop: Working with large tables
CityLABS Workshop: Working with large tables
 
SDEC2011 Replacing legacy Telco DB/DW to Hadoop and Hive
SDEC2011 Replacing legacy Telco DB/DW to Hadoop and HiveSDEC2011 Replacing legacy Telco DB/DW to Hadoop and Hive
SDEC2011 Replacing legacy Telco DB/DW to Hadoop and Hive
 
Apache Arrow: In Theory, In Practice
Apache Arrow: In Theory, In PracticeApache Arrow: In Theory, In Practice
Apache Arrow: In Theory, In Practice
 

Mais de Computer Science Club

20140531 serebryany lecture01_fantastic_cpp_bugs
20140531 serebryany lecture01_fantastic_cpp_bugs20140531 serebryany lecture01_fantastic_cpp_bugs
20140531 serebryany lecture01_fantastic_cpp_bugsComputer Science Club
 
20140531 serebryany lecture02_find_scary_cpp_bugs
20140531 serebryany lecture02_find_scary_cpp_bugs20140531 serebryany lecture02_find_scary_cpp_bugs
20140531 serebryany lecture02_find_scary_cpp_bugsComputer Science Club
 
20140531 serebryany lecture01_fantastic_cpp_bugs
20140531 serebryany lecture01_fantastic_cpp_bugs20140531 serebryany lecture01_fantastic_cpp_bugs
20140531 serebryany lecture01_fantastic_cpp_bugsComputer Science Club
 
20140511 parallel programming_kalishenko_lecture12
20140511 parallel programming_kalishenko_lecture1220140511 parallel programming_kalishenko_lecture12
20140511 parallel programming_kalishenko_lecture12Computer Science Club
 
20140427 parallel programming_zlobin_lecture11
20140427 parallel programming_zlobin_lecture1120140427 parallel programming_zlobin_lecture11
20140427 parallel programming_zlobin_lecture11Computer Science Club
 
20140420 parallel programming_kalishenko_lecture10
20140420 parallel programming_kalishenko_lecture1020140420 parallel programming_kalishenko_lecture10
20140420 parallel programming_kalishenko_lecture10Computer Science Club
 
20140413 parallel programming_kalishenko_lecture09
20140413 parallel programming_kalishenko_lecture0920140413 parallel programming_kalishenko_lecture09
20140413 parallel programming_kalishenko_lecture09Computer Science Club
 
20140329 graph drawing_dainiak_lecture02
20140329 graph drawing_dainiak_lecture0220140329 graph drawing_dainiak_lecture02
20140329 graph drawing_dainiak_lecture02Computer Science Club
 
20140329 graph drawing_dainiak_lecture01
20140329 graph drawing_dainiak_lecture0120140329 graph drawing_dainiak_lecture01
20140329 graph drawing_dainiak_lecture01Computer Science Club
 
20140310 parallel programming_kalishenko_lecture03-04
20140310 parallel programming_kalishenko_lecture03-0420140310 parallel programming_kalishenko_lecture03-04
20140310 parallel programming_kalishenko_lecture03-04Computer Science Club
 
20140216 parallel programming_kalishenko_lecture01
20140216 parallel programming_kalishenko_lecture0120140216 parallel programming_kalishenko_lecture01
20140216 parallel programming_kalishenko_lecture01Computer Science Club
 

Mais de Computer Science Club (20)

20141223 kuznetsov distributed
20141223 kuznetsov distributed20141223 kuznetsov distributed
20141223 kuznetsov distributed
 
Computer Vision
Computer VisionComputer Vision
Computer Vision
 
20140531 serebryany lecture01_fantastic_cpp_bugs
20140531 serebryany lecture01_fantastic_cpp_bugs20140531 serebryany lecture01_fantastic_cpp_bugs
20140531 serebryany lecture01_fantastic_cpp_bugs
 
20140531 serebryany lecture02_find_scary_cpp_bugs
20140531 serebryany lecture02_find_scary_cpp_bugs20140531 serebryany lecture02_find_scary_cpp_bugs
20140531 serebryany lecture02_find_scary_cpp_bugs
 
20140531 serebryany lecture01_fantastic_cpp_bugs
20140531 serebryany lecture01_fantastic_cpp_bugs20140531 serebryany lecture01_fantastic_cpp_bugs
20140531 serebryany lecture01_fantastic_cpp_bugs
 
20140511 parallel programming_kalishenko_lecture12
20140511 parallel programming_kalishenko_lecture1220140511 parallel programming_kalishenko_lecture12
20140511 parallel programming_kalishenko_lecture12
 
20140427 parallel programming_zlobin_lecture11
20140427 parallel programming_zlobin_lecture1120140427 parallel programming_zlobin_lecture11
20140427 parallel programming_zlobin_lecture11
 
20140420 parallel programming_kalishenko_lecture10
20140420 parallel programming_kalishenko_lecture1020140420 parallel programming_kalishenko_lecture10
20140420 parallel programming_kalishenko_lecture10
 
20140413 parallel programming_kalishenko_lecture09
20140413 parallel programming_kalishenko_lecture0920140413 parallel programming_kalishenko_lecture09
20140413 parallel programming_kalishenko_lecture09
 
20140329 graph drawing_dainiak_lecture02
20140329 graph drawing_dainiak_lecture0220140329 graph drawing_dainiak_lecture02
20140329 graph drawing_dainiak_lecture02
 
20140329 graph drawing_dainiak_lecture01
20140329 graph drawing_dainiak_lecture0120140329 graph drawing_dainiak_lecture01
20140329 graph drawing_dainiak_lecture01
 
20140310 parallel programming_kalishenko_lecture03-04
20140310 parallel programming_kalishenko_lecture03-0420140310 parallel programming_kalishenko_lecture03-04
20140310 parallel programming_kalishenko_lecture03-04
 
20140223-SuffixTrees-lecture01-03
20140223-SuffixTrees-lecture01-0320140223-SuffixTrees-lecture01-03
20140223-SuffixTrees-lecture01-03
 
20140216 parallel programming_kalishenko_lecture01
20140216 parallel programming_kalishenko_lecture0120140216 parallel programming_kalishenko_lecture01
20140216 parallel programming_kalishenko_lecture01
 
20131106 h10 lecture6_matiyasevich
20131106 h10 lecture6_matiyasevich20131106 h10 lecture6_matiyasevich
20131106 h10 lecture6_matiyasevich
 
20131027 h10 lecture5_matiyasevich
20131027 h10 lecture5_matiyasevich20131027 h10 lecture5_matiyasevich
20131027 h10 lecture5_matiyasevich
 
20131027 h10 lecture5_matiyasevich
20131027 h10 lecture5_matiyasevich20131027 h10 lecture5_matiyasevich
20131027 h10 lecture5_matiyasevich
 
20131013 h10 lecture4_matiyasevich
20131013 h10 lecture4_matiyasevich20131013 h10 lecture4_matiyasevich
20131013 h10 lecture4_matiyasevich
 
20131006 h10 lecture3_matiyasevich
20131006 h10 lecture3_matiyasevich20131006 h10 lecture3_matiyasevich
20131006 h10 lecture3_matiyasevich
 
20131006 h10 lecture3_matiyasevich
20131006 h10 lecture3_matiyasevich20131006 h10 lecture3_matiyasevich
20131006 h10 lecture3_matiyasevich
 

Último

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 

Último (20)

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 

20110930 information retrieval raskovalov_lecture1

  • 1. Архитектура поискового кластера Яндекс Den Raskovalov denplusplus@yandex-team.ru 30.09.2011
  • 2. О чем это? Несмотря на то, что автор сведущ лишь в том, как работает поиск Яндекса, слышал о том, как работают другие поисковые машины, он имеет наглость полагать, что приемы, которые накоплены в поисковой индустрии будут полезны любому высоконагруженному сервису обработки данных.
  • 3. Что умеет Яндекс? ● Зарабатывать деньги (реклама в интернете) ● Делать надежные высоконагруженные сложные сервисы ● Увязывать тысячи источников данных в один продукт ● Решать задачи обработки данных
  • 4. Requirements ● Indexation ● ~1012 known URL's ● ~1010 indexed documents ● ~1010 searchable documents ● Real-time indexation ● Real-time query processing ● 107 user queries per hour in peak ● Development ● 400+ creative search developers ● A lot of experiments on full data (pages, links, clicks) ● Large-scale machine learning ● Many experiments require full-sized copy of Yandex.Search
  • 5. Hardware ● Server's cost: ● Hardware (~$2500 for ~24 computing cores, ~48Gb RAM) ● Place in data center (~$1000 per year) ● Electricity for computation and cooling (~$1000 per year) ● Maintenance (~$300 per year) ● High-end hardware costs too much. ● Low-end hardware needs same electricity and place and maintenance as high-end. ● There is optimum point. ● Now Yandex.Search works on ~105 servers.
  • 6. Distribution of Resources Indexation Real-time serving Development ● Main design principles ● Sharding (splitting into almost independent parts) ● Replication (each piece of data is copied several times for performance and failover) ● Full isolation between development and production (no read/write access for developer to production nodes)
  • 7. Content System Content system – system for crawling internet urls, building indexes (inverted index for search, link index, pagerank, ...) and deployment indexes for real-time query processing.
  • 8. Content System ● Content system phases: ● Crawling (building a list of URLs) ● Selection (we can't fetch everything, we need to predict quality before fetching) ● Fetching document data from Internet ● Building search representation for documents (inverted indices with lemmatization; binding of external host/url information, link information) ● Spam detection ● Duplicates detection ● Selection (we can't search on everything we fetch) ● Documents deployment to system of real-time query processing
  • 9. Content System ● In fact we have two content systems now ● Batch (for processing ~1012 URLs) ● Real-time (for processing ~109 URLs we need for freshness) ● Different update periods inflects architecture ● Disc-efficient (sort/join) algorithms for batch ● Almost everything in memory for real-time
  • 10. Batch Content System ● We split internet in n parts based on 32-bit hash of a hostname. We have ~104 computers, a computer for an interval of hash values; it crawls “it's” hosts and builds search indexes. ● We represent “knowledge” of internet as relational tables (table of urls, table of links, table of hosts).
  • 11. Batch Content System Internet Fetcher cluster h Fe tch fetc Fetc tch ed Fet Fet Fet ch or rls f ch ch ch fet da or fe ed ed ed ta r U h ata ata ata d d d fo ed d Urls f Urls ata Indexation node Indexation node Indexation node Search node Search node Search node ● Once per day each indexation node forms a list of urls for fetch. ● Once per day each indexation node receives fetched data and updates “knowledge” about internet (update process needs only sorting and joins of big tables on disk). ● Once per three days each indexation node push indexes on search nodes.
  • 12. Batch Content System ● Each indexation node needs “external” data ● Links from another indexation nodes – data passing is asynchronous – death of source or target indexation nodes isn't critical (in case of failure data will be received next time) ● Aggregated clicks on URLs from MapReduce ● External data about hosts and pages
  • 13. Real-time Content System ● Main requirement for real-time content system is small time interval between availability of information about document and ability to search on document's content. ● It implies that data can be stored only in RAM. HDD can only be used for eventually consistent backup. ● You have to switch from relational design of data processing to services and message passing algorithms. ● You need big and fast key-value storage in RAM. ● Algorithms with smaller guarantees without consistency invariants (because any network packet can be lose).
  • 14. Real-time Content System ● We run each set of services on a node: ● Link database ● Local pagerank calculator ● Global dup-detector ● Anti-spam checker ● Indexer ● We have library of maintaing key-value storage in memory with eventual backup/restore to/from HDD. ● A lot of network traffic between nodes of real-time content system. Almost no data locality. Nodes of real- time content system need to be in one datacenter.
  • 15. MapReduce in Yandex.Search ● MapReduce approach is simple, clear data storage combined with data processing, which hides complexity of parallelizing algorithms on several computation nodes ● We use MR for ● logs (views, clicks) processing ● development experiments, which need a lot of data or a lot of CPU ● We have several independent MR-clusters with a total of ~103 nodes (some of them are “production”, some of them for data uploading and code execution)
  • 16. HPC in Search Quality ● We use 'pools' -- conveniently represented or aggregated parts of web data. ● Types of 'pools': ● LETOR data ● n-grams frequencies ● text and HTML samples ● .... ● pages with song lyrics from web ● .... ● any crazy idea? (you can check it in hours)
  • 17. HPC in Search Quality ● May be the most important task in search quality is relevance prediction. Assume that you have URL-query pairs. We have statistics about url (pagerank, text length), query (frequency, commerciality) and pair {url, query} (text relevance, link relevance); which we call relevance features. We want to predict relevance of URL on search query by some feature- based algorithm. ● We need to collect relevance features for thousands queries for millions of documents from thousands computers for training/evaluation of relevance prediction models.
  • 18. HPC in Search Quality ● Sample of a relevance pool Query URL Label PageRank Text Link ● Relevance Relevance yandex Yandex.ru Relevant 1 0.5 1 yandex Hdsjdsfs.com Irrelevant 0.1 0.5 0.1 wikipedia Wikipedia.org Relevant 0.9 1 1 wikipedia wiki.livejournal.co Irrelevant 0.01 1 0.01 m
  • 19. HPC in Search Quality ● One of the most famous formulas in information retrieval (TF*IDF): TextRelevance(Query, Document) = Sum(TextFrequency(Word)/CollectionFrequenc y(Word)). ● We need to calculate frequency of every word in our document collection for evaluation of TF*IDF text relevance.
  • 20. HPC in Search Quality ● MatrixNet is the name of our machine learning algorithm. ● MatrixNet can be run on arbitrary number of core processing units. ● It's important because it allows to decrease time for experiment from a week to several hours (if you have 100 computers :) ).
  • 21. HPC in Search Quality ● Many experiments need analysis of difference between Yandex and slightly modified Yandex (beta version). ● Any developer needs the ability to build betas in minutes. ● Requirements: ● strong system of deployment (you always need to know all parts of system with versions and how you can put it on any node) ● strong code management (branches, tags, code review) ● dynamic resource management (you need only several betas at one time) ● system of automatic search quality evaluation ● a lot of resources
  • 22. HPC in Search Quality ● Problems: ● You need full web data and even more ● Any computer can be unavailable at any moment ● You need robust data and code distribution tools ● You have to execute code near data ● You need some coordination between different users on development nodes ● You cannot predict load of development node ● But your progress speed determined by development speed ● Solution: data replication, scheduling of tasks, convenient and robust APIs for developers ● But..
  • 23. HPC in Search Quality No final solution. They are developers. They can eat all resources they can reach.
  • 24. Overview ● Structure of the cluster ● Architectures: ● Data-centric – Batch – Real-time ● Development-centric – MapReduce – Ad-hoc ● Real-time query processing
  • 25. What is this about ● Computational model of query processing ● Problems and their solutions in terms of this model ● Evolution of Search architecture ● Evolution of Search components
  • 26. How does Search work IPVS/Balancer Frontends hash(req) % n Cachers  Merging top relevant documents  Two-stage query Indeces/Snippets
  • 27. Problems  Connectivity  Load balancing  Static (precalculated)  Dynamic (run-time)  Fault tolerance  Latency
  • 28. Network  Unpredictability of latency  Bottleneck of the system  Throughput grows slower than CPU  Heterogeneous topology  1 datacenter => many datacenters  Many datacenters => datacenters in many countries
  • 29. Load Balancing  Machines and indexes are different  3 generations of hardware  Uncontrollable processes at nodes  OS decided to clear buffers  Monitoring system daemon writes core dump  RAID Rebuilding  Whatever
  • 30. Fault tolerance  Machines fail  1% of resources is always unavailable  Packets are lost in the network  Datacenters get turned off  Summer, heat, cooling doesn’t cope with workload  Forgot to pay for energy  Drunk worker cut the cable  Cross-country links fail
  • 31. Latency  Different system parts reply with different speed  3 generations of machines  Indexes are different  Random latency  Packets are lost in the network  Scheduler in the OS doesn’t work  Disk is slow (200 S/PS)  Slow “tail” of queries
  • 32. 15 years ago  Everything is on 1 machine  Actually, 3, with different DNS names  Problems are obvious  Query volume is growing Index  Base is growing  Fault Intolerance
  • 33. 10 years ago Frontend/Cacher  1 datacenter, 10 base searchers – the base got distributed  Can grow query volume and base size  Got clients and money – delays are unacceptable  Machine fails – system doesn’t live …..  There are too few machines – subtle things are not noticable yet Base Searchers(Indeces) ONE Data Center
  • 34. 7 years ago Frontends  300 machines, 1 datacenter, 10 frontends, 20 cachers, Cachers 250 base searches  Problem of machine failures is solved  IPVS over frontends X  Re-Asks at all levels Indeces — 2 replicas
  • 35. 7 years ago - problems  Latency  Wait for the last source  Slow “tail”  Limit of growth in one datacenter  Base is growing  Delay because of datacenter failure is unacceptable
  • 36. 4 years ago DNS/IPVS  “Live” in several datacenters  Don’t wait for the slowest nodes  Latency is better  Partial replies  Cache updates after reply to the user X Many Symmetrical Data Centers
  • 37. 4 years ago - problems  Datacenters are different, the base is growing, one datacenter can’t hold the whole base  Need more network even inside the datacenter  Manual configuration  “No-reply”s – the scariest word  Different behavior for different users
  • 38. Scary Story 2008 ● Dynamic Load Balancing ● Hard to implement properly ● Many sources ● Fast changes ● Converge guarantee?
  • 39. What now?  10^5 machines, 10^1 datacenters  Problem of growth in different datacenters solved  Integrators – less traffic, can cross DC boundaries  Cluster structure optimization  “Annealing”  «cost» → min with fixed RPS and base Many Datacenters, working as single cluster
  • 40. Frontier  New Markets  Distributed Data Centers  Slow channels  Database upload  Independent DCs again?  Latency is “not modern”  Ask 2 replicas simultaneously  Wait only for the best
  • 41. Resume  Succession of architectures  Linear horizontal scaling  By queries  By base  Growth limits are years ahead  Robustness of computational model
  • 42. Base Search  “Conservative” component  Stable external interface  Complication/optimization of ranking  Filtering iterator  Pruning  Early query termination  Selection Rank
  • 43. Cacher/metasearch  Synchronous -> asynchronous  Message passing  Early decision making  Merge tree  Rerankings  Metasearch statistics  Data enrichment
  • 44. Balancer  DNS  IPVS  Only inside VLAN  Distribution by IP  HTTP balancer  Keepalive  Ssl  Honest frontend balancing  Early robots elimination
  • 45. Also... Cache  Tiers r  Golden GOLD COMMON  Common GARBAGE  Garbage  Index update frequency  Main – slow updates (days)  Fast – fast updates (hours, minutes)  Real-time – ultra-fast updates (seconds)
  • 46. Further Reading ● Yet Another MapReduce http://prezi.com/p4uipwixfd4p/yet-another-mapredu ● Оптимизация алгоритмов ранжирования методами машинного обучения http://romip.ru/romip2009/15_yandex.pdf ● Как работают поисковые системы http://download.yandex.ru/company/iworld-3.pdf