Social networking architectures: what can we learn?
An enthusiast's view
How do sites with a social networking angle figure globally?
As ranked by Alexa:

Site            Global ranking
Facebook        2
YouTube         3
Yahoo           4
Windows Live    5
Blogger         7
Wikipedia       8
Twitter         10
3 Principles
3 common principles:
- Fast feature delivery is key
- Cache everything everywhere
- Relational data is dead
Interesting stats
- Facebook: serves 120 million queries per second without a single join
- 37signals: developed a production application serving over 4 million items using only 579 lines of code
- Flickr: 2 billion photos served without using relational databases
How did they do it?
- Nobody thought this was possible
- Unencumbered by history or restrictive rules
- Had to be creative in solving problems that nobody had experienced, using very little capital outlay
Fast feature delivery is key
- Choose an appropriate language
- Speed of development is more important than speed of execution
- Languages like PHP and Ruby are commonly used for rapid development and deployment
Language is not religion
Cache everything everywhere
- You need a really good reason not to cache data for reading
- Local caching is a good start, but more than one server means:
  - duplicating the cache
  - no group invalidation
  - memory limited to the spare RAM on each server
- Most social networks use a distributed cache like memcached
Cache everything everywhere
- Check if the information is in the cache. If so, use it.
- If not, query the database and put the result in the cache.
- On update, delete from the cache. The next user goes to the database.

    function get_foo(int userid) {
        result = memcached_fetch("userrow:" + userid);
        if (!result) {
            result = db_select("SELECT * FROM users WHERE userid = ?", userid);
            memcached_add("userrow:" + userid, result);
        }
        return result;
    }
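The three steps on this slide (read from cache, fall back to the database, delete on update) can be sketched in Python as follows. This is a minimal illustration, not the code any of the sites mentioned actually run: `FakeCache` and `FakeDb` are hypothetical in-memory stand-ins for a memcached client and a database, and the function names are assumptions made for the example.

```python
class FakeCache:
    """Hypothetical in-memory stand-in for a memcached client."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


class FakeDb:
    """Hypothetical stand-in for a users table; counts reads to show cache hits."""

    def __init__(self, rows):
        self.rows = {userid: dict(row) for userid, row in rows.items()}
        self.reads = 0

    def select_user(self, userid):
        self.reads += 1
        row = self.rows.get(userid)
        return dict(row) if row is not None else None

    def update_user(self, userid, **fields):
        self.rows[userid].update(fields)


def get_user(cache, db, userid):
    """Return the user row, going to the database only on a cache miss."""
    key = f"userrow:{userid}"
    row = cache.get(key)
    if row is None:
        row = db.select_user(userid)
        if row is not None:
            cache.set(key, row)
    return row


def update_user(cache, db, userid, **fields):
    """Write to the database, then delete the cached copy so the next read refills it."""
    db.update_user(userid, **fields)
    cache.delete(f"userrow:{userid}")
```

Calling `get_user` twice for the same id reads the database once; after `update_user`, the next `get_user` goes back to the database and sees the new value.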
Responsiveness is key
Everybody wants to use a database
Relational issue No 1 - Normalisation
Relational databases do not scale well because of normalisation
Why normalise?
- reduce storage space
- reduce anomalies
Today
- storage is cheap
- as data gets larger, joins are expensive
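To make the trade-off concrete, here is a small sketch using Python's built-in `sqlite3` module. The table and column names (`users`, `addresses`, `user_profiles`) are assumptions chosen for illustration. The normalised read needs a join across two tables; the denormalised read is a single-table lookup, at the cost of storing the name and city a second time and keeping the copies in sync in application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Normalised: each fact is stored once, in its own table.
    CREATE TABLE users (userid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE addresses (userid INTEGER, city TEXT);
    -- Denormalised: the same facts duplicated into one read-friendly row.
    CREATE TABLE user_profiles (userid INTEGER PRIMARY KEY, name TEXT, city TEXT);

    INSERT INTO users VALUES (1, 'Ann');
    INSERT INTO addresses VALUES (1, 'Dublin');
    INSERT INTO user_profiles VALUES (1, 'Ann', 'Dublin');
    """
)


def profile_normalised(userid):
    """Bring the person and their address back together with a join."""
    return conn.execute(
        "SELECT u.name, a.city FROM users u "
        "JOIN addresses a ON a.userid = u.userid WHERE u.userid = ?",
        (userid,),
    ).fetchone()


def profile_denormalised(userid):
    """Read the same data from one row, with no join."""
    return conn.execute(
        "SELECT name, city FROM user_profiles WHERE userid = ?",
        (userid,),
    ).fetchone()
```

Both functions return the same row for user 1; the difference only starts to matter as the joined tables grow very large.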
Relational issue No 2 - Transactions
- ACID principles govern transactions
- Relational databases do not scale well because of transactions
After relational
- Use BASE (basically available, soft state, eventually consistent)
- Shard data
- Favour name-value pair stores over relational databases
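As a rough illustration of sharding a name-value store, the sketch below spreads keys over several shards by hashing the key. `ShardedStore` is a hypothetical class written for this example, and each shard is just a Python dict standing in for a separate server. Simple modulo placement is an assumption made for brevity: it moves most keys when the shard count changes, which is why real deployments often use consistent hashing instead.

```python
import hashlib


class ShardedStore:
    """Hypothetical name-value store split across several in-memory shards."""

    def __init__(self, shard_count):
        # Each dict stands in for one storage server.
        self.shards = [{} for _ in range(shard_count)]

    def shard_index(self, key):
        """Map a key to a shard using a stable hash (not Python's salted hash())."""
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % len(self.shards)

    def put(self, key, value):
        self.shards[self.shard_index(key)][key] = value

    def get(self, key, default=None):
        return self.shards[self.shard_index(key)].get(key, default)
```

Because the shard is derived from the key alone, any application server can find a value without a central lookup, and no query ever needs to touch more than one shard.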
Lessons for enterprise
- The answer to a software design question should always be "it depends". Test your most basic assumptions.
- Dynamic languages and frameworks may be suitable to deliver a feature quickly
- You don't need an RDBMS for everything, especially if you need huge scale
- You should always cache data for read (unless you shouldn't)
Fresh ideas always welcome
Find me here: MarkGreville@itarc.ie
Or find me here


Social networking architectures: 3 principles for scaling without relational databases


Editor's Notes

  1. Looked at the top 10 sites on the web and found 7 with social networking aspects. The others: Google (1), Baidu (6), QQ.com (9).
  2. Decided to look at the traffic and found some very interesting stats. Facebook: 200 million active users and 50 billion page views per month. YouTube: over 1 billion views per day. Basecamp: 2 million active accounts and 1.3 million projects managed. Twitter: 1 million+ users and 3 million tweets per day.
  3. It should be noted that neither is the most efficient language, as they are not compiled (both are interpreted: they are not executed directly by the CPU but by an interpreter). Sites like Twitter and Yellowpages.com are written using Ruby on Rails. Ta-da List has so much built into the framework that a full production app can be developed with very little code.
  4. Some treat language as a religion. It's OK to try something different; it doesn't define you as a person.
  5. Duplicating the cache is a waste of memory. No group invalidation means you either need to notify all of your servers that they need to refresh their cache or rely solely on cache timeouts. Memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load. Memcached is used by Facebook, YouTube, Wikipedia, LiveJournal, Digg, Twitter and SourceForge. Most site founders said that the biggest gain was from implementing a caching layer.
  6. There is a significant penalty in going to disk for every read as opposed to reading from the cache. Implementing a cache is extremely easy, as shown by the code on the slide. Great for reading data, but you still have to write data.
  7. All about responsiveness. Users won't tolerate long waits on social networks. They are now expecting this behaviour from all software.
  8. To prevent anomalies we don't duplicate data. We split everything up so it is stored once. The price of normalisation is that when we want a person's address we have to go find the person and their address and bring the data together again. This is called a join. Joins are relatively slow, especially over very large data sets, and not just for reads (caching takes care of those) but for create, update and delete (CUD) operations. Flickr decided to denormalise because it took 13 selects for each insert, delete or update.
  9. eBay does not use transactions; they have so much data that distributed transactions would harm responsiveness. Referential integrity and sorting are done in application code. Atomicity: all parts of a transaction succeed or none of them succeed. Consistency: the database will be in a consistent state when the transaction begins and ends. Isolation: the transaction will behave as if it is the only operation being performed upon the database. Durability: upon completion of the transaction, the operation will not be reversed. Facebook has 4,500 database servers.
  10. All solutions are slightly different. The same challenge in 5 years may have a totally different solution (hardware/software changes).
  11. Need fresh ideas, otherwise we'll copy the mistakes of others.