SlideShare uma empresa Scribd logo
1 de 36
Introduction to  NoSQL Databases San Diego NoSQL Meetup – Aug 2010 By Derek Stainer http://nosqldatabases.com
Agenda Introduction Objective Explore NoSQL Databases Conclusion
Introduction UCSD Graduate in Computer Science Java Developer for 10 years Creator of http://nosqldatabases.com Curator of NoSQL information
Objective Deeper dive into each type of NoSQL database Discuss 1-2 NoSQL databases  in each family of databases
NoSQL Taxonomy Key/Value Document Column Graph Others Geospatial File System Object
Key/Value Databases Global collection of Key/Value pairs Inspired by Amazon’s Dynamo and Distributed Hashtables Designed to handle massive load Multiple Types In memory i.e. Memcache On Disk i.e. Redis, SimpleDB Eventually Consistent i.e. Dynamo, Voldemort
Key/Value: Voldemort Created by LinkedIn, now open source Inspired by Amazon’s Dynamo Written in Java Pluggable Storage BerkeleyDB, In Memory, MySQL Pluggable Serialization JSON, Thrift, Protocol Buffers, etc. Cluster Rebalancing
Key/Value: Voldemort Versioning, based on Vector Clocks Reconciliation occurs on reads. Partitioning and Replication based on Dynamo Consistent Hashing Virtual Nodes Gossip
Other Key/Value Stores Other Key/Value Stores Amazon’s Dynamo Riak Redis Memcache SimpleDB
Document Databases Similar to a Key/Value database but with a major difference, value is a document Inspired by Lotus Notes Flexible Schema Any number of fields can be added Documents stored in JSON or BSON formats Examples: CouchDB, MongoDB
Sample Document {      "day": [ 2010, 01, 23 ],      "products": {          "apple": { "price": 10 "quantity": 6 },          "kiwi": { "price": 20 "quantity": 2 }      },      "checkout": 100  }
Document: CouchDB Development began ~ 2005 by Damien Katz former Lotus Notes Developer Couch – Cluster Of Unreliable Commodity Hardware Top level Apache Project Commercially supported by CouchIO Licensed under Apache License Written in Erlang Documents are stored in JSON
Document: CouchDB [cont’d] B-Tree Storage Engine MVCC model, no locking  No joins, primary key or foreign key (UUIDs are auto assigned)  Built bi-directional replication Can even run offline, come back and sync back changes Custom persistent views using MapReduce REST API
Document: MongoDB Development started in 2007 Commercially supported and developed by 10Gen Stores documents using BSON Supports AdHoc queries Can query against embedded objects and arrays Support multiples types of indexing
Document: MongoDB [cont’d] Officially supported drivers available for multiple languages C, C++, Java, Javascript, Perl, PHP, Python and Ruby Community supported drivers include: Scala, Node.js, Haskell, Erlang, Smalltalk Replication uses a master/slave model Scales horizontally via sharding Written C++
Column Family Databases Each key is associated with multiple attributes (i.e. Columns) Hybrid row/column stores Inspired by Google BigTable Examples: HBase, Cassandra
Column: HBase Based on Google’s BigTable Apache Project TLP Cloudera (certifications, EC2 AMI’s, etc.) Layered over HDFS (Hadoop Distributed File System) Input/Output for MapReduce Jobs APIs Thrift, REST
Column: Hbase [cont’d] Automatic partitioning Automatic re-balancing/re-partitioning Fault tolerant HDFS  Multiple Replicas Highly distributed
Column: Hbase [cont’d] Lars George
Column: Cassandra Created at Facebook for Inbox search Facebook -> Google Code -> ASF Commercial Support available from Riptano Features taken from both Dynamo and BigTable Dynamo – Consistent hashing, Partitioning, Replication Big Table – Column Familes, MemTables, SSTables
Column: Cassandra [cont’d] Symmetric nodes No single point of failure Linearly scalable Ease of administration Flexible/Automated Provisioning Flexible Replica Replacement High Availability Eventual Consistency However, consistency is tuneable
Column: Cassandra [cont’d] Partitioning Random Good distribution of data between nodes Range scans not possible Order Preserving Can lead to unbalanced nodes Range scans, Natural Order Custom Extremely fast reads/writes (low latency) Thrift API
Column: Cassandra [cont’d] Column Basic unit of storage Column Family Collection of like records Record level atomicity Indexed Keyspace Top level namespace Usually one per application
Column: Cassandra [cont’d] Eric Evans
Column: Cassandra [cont’d] Column Details Name byte[] Queried against Determines sort order Value byte[] Opaque to Cassandra Timestamp long Conflict resolution (last write wins)
Graph Databases Inspired by Euler Graph Theory, G=(E,V) Focused on modeling the structure of the data Property Graph Data Model Examples: Neo4j, InfiniteGraph
Sample Property Graph[] Todd Hoff
Graph: Neo4j Data Model: Property Graph Nodes – Person, Place, Thing, etc. Relationships – Lives, Likes, Owns, etc. Properties on Both Primary operation is graph traversal between nodes Written in Java Embedded database
Graph: Neo4j [cont’d] Disk-based Graph stored in custom binary format Transactional JTA/JTS, XA, 2PC, MVCC Scales Billions of nodes/relationships/properties per JVM Robust 6+ years in 24/7 production
Graph: Neo4j [cont’d] Multiple language binds Jython, Cpython Jruby (including RESTful API) Clojure Scala (including RESTful API) Uses Social Graph i.e. Facebook Recommendation Engines Financial Audit
Graph: Neo4j [cont’d] Licensed under AGPLv3 Dual Commercial License Available First server is free Second server Inexpensive Commercial support provided by Neo Technologies
Other Graph Databases Other graph databases InfiniteGraph HyperGraphDB sones
Conclusion
Thank You!
References NoSQL Databases - Part 1 – Landscape, Vineet Guptahttp://www.vineetgupta.com/2010/01/nosql-databases-part-1-landscape.html NoSQL for Dummies, Tobias Ivarssonhttp://www.slideshare.net/thobe/nosql-for-dummies NoSQL Databases, Marin Dimitrovhttp://www.slideshare.net/marin_dimitrov/nosql-databases-3584443 CouchDB vs. MongoDB, Gabriele Lanahttp://www.slideshare.net/gabriele.lana/couchdb-vs-mongodb-2982288 Hbase, Ryan Rawsonhttp://www.slideshare.net/adorepump/hbase-nosql Introduction to Cassandra, Gary Dusbabekhttp://www.slideshare.net/gdusbabek/introduction-to-cassandra-june-2010 Cassandra Explained, Eric Evanshttp://www.slideshare.net/jericevans/cassandra-explained Towards Robust Distributed Systems, Eric Brewerhttp://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf Cassandra - A Decentralized Structured Storage System, Lakshman, Ladishttp://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf
References [cont’d] Bigtable: A Distributed Storage System for Structured Data, Google Inc.http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/en/us/papers/bigtable-osdi06.pdf Dynamo: Amazon’s Highly Available Key-value Store, Amazon Inc.http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf HBase Architecture 101 – Storage, Lars Georgehttp://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html BASE: An ACID Alternative, Dan Pritchett

Mais conteúdo relacionado

Mais procurados

The Basics of MongoDB
The Basics of MongoDBThe Basics of MongoDB
The Basics of MongoDBvaluebound
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBRavi Teja
 
Non relational databases-no sql
Non relational databases-no sqlNon relational databases-no sql
Non relational databases-no sqlRam kumar
 
NOSQL Databases types and Uses
NOSQL Databases types and UsesNOSQL Databases types and Uses
NOSQL Databases types and UsesSuvradeep Rudra
 
Mongodb basics and architecture
Mongodb basics and architectureMongodb basics and architecture
Mongodb basics and architectureBishal Khanal
 
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...Databricks
 
Apache Kafka With Spark Structured Streaming With Emma Liu, Nitin Saksena, Ra...
Apache Kafka With Spark Structured Streaming With Emma Liu, Nitin Saksena, Ra...Apache Kafka With Spark Structured Streaming With Emma Liu, Nitin Saksena, Ra...
Apache Kafka With Spark Structured Streaming With Emma Liu, Nitin Saksena, Ra...HostedbyConfluent
 
Siligong.Data - May 2021 - Transforming your analytics workflow with dbt
Siligong.Data - May 2021 - Transforming your analytics workflow with dbtSiligong.Data - May 2021 - Transforming your analytics workflow with dbt
Siligong.Data - May 2021 - Transforming your analytics workflow with dbtJon Su
 
MongoDB presentation
MongoDB presentationMongoDB presentation
MongoDB presentationHyphen Call
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake OverviewJames Serra
 

Mais procurados (20)

NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
The Basics of MongoDB
The Basics of MongoDBThe Basics of MongoDB
The Basics of MongoDB
 
MongoDB
MongoDBMongoDB
MongoDB
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Non relational databases-no sql
Non relational databases-no sqlNon relational databases-no sql
Non relational databases-no sql
 
Key-Value NoSQL Database
Key-Value NoSQL DatabaseKey-Value NoSQL Database
Key-Value NoSQL Database
 
SQL & NoSQL
SQL & NoSQLSQL & NoSQL
SQL & NoSQL
 
Mongo db intro.pptx
Mongo db intro.pptxMongo db intro.pptx
Mongo db intro.pptx
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
NOSQL Databases types and Uses
NOSQL Databases types and UsesNOSQL Databases types and Uses
NOSQL Databases types and Uses
 
Mongodb basics and architecture
Mongodb basics and architectureMongodb basics and architecture
Mongodb basics and architecture
 
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...
 
Apache Kafka With Spark Structured Streaming With Emma Liu, Nitin Saksena, Ra...
Apache Kafka With Spark Structured Streaming With Emma Liu, Nitin Saksena, Ra...Apache Kafka With Spark Structured Streaming With Emma Liu, Nitin Saksena, Ra...
Apache Kafka With Spark Structured Streaming With Emma Liu, Nitin Saksena, Ra...
 
Siligong.Data - May 2021 - Transforming your analytics workflow with dbt
Siligong.Data - May 2021 - Transforming your analytics workflow with dbtSiligong.Data - May 2021 - Transforming your analytics workflow with dbt
Siligong.Data - May 2021 - Transforming your analytics workflow with dbt
 
NoSql
NoSqlNoSql
NoSql
 
Apache HBase™
Apache HBase™Apache HBase™
Apache HBase™
 
MongoDB presentation
MongoDB presentationMongoDB presentation
MongoDB presentation
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake Overview
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
Mongo DB
Mongo DB Mongo DB
Mongo DB
 

Destaque

Sql vs NoSQL
Sql vs NoSQLSql vs NoSQL
Sql vs NoSQLRTigger
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLRamakant Soni
 
NoSQL Databases
NoSQL DatabasesNoSQL Databases
NoSQL DatabasesBADR
 
An Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDBAn Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDBWilliam LaForest
 
NoSQL databases pros and cons
NoSQL databases pros and consNoSQL databases pros and cons
NoSQL databases pros and consFabio Fumarola
 
A Beginners Guide to noSQL
A Beginners Guide to noSQLA Beginners Guide to noSQL
A Beginners Guide to noSQLMike Crabb
 
Big Data Standards - Workshop, ExpBio, Boston, 2015
Big Data Standards - Workshop, ExpBio, Boston, 2015Big Data Standards - Workshop, ExpBio, Boston, 2015
Big Data Standards - Workshop, ExpBio, Boston, 2015Susanna-Assunta Sansone
 
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Stefan Lipp
 
mini MAXI art exhibition
mini MAXI art exhibitionmini MAXI art exhibition
mini MAXI art exhibitionAnna Casey
 
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetup
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics MeetupIntroduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetup
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetupiwrigley
 
Cloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera, Inc.
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenLorenzo Alberton
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsJonas Bonér
 
Enabling the Industry 4.0 vision: Hype? Real Opportunity!
Enabling the Industry 4.0 vision: Hype? Real Opportunity!Enabling the Industry 4.0 vision: Hype? Real Opportunity!
Enabling the Industry 4.0 vision: Hype? Real Opportunity!Boris Otto
 

Destaque (14)

Sql vs NoSQL
Sql vs NoSQLSql vs NoSQL
Sql vs NoSQL
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQL
 
NoSQL Databases
NoSQL DatabasesNoSQL Databases
NoSQL Databases
 
An Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDBAn Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDB
 
NoSQL databases pros and cons
NoSQL databases pros and consNoSQL databases pros and cons
NoSQL databases pros and cons
 
A Beginners Guide to noSQL
A Beginners Guide to noSQLA Beginners Guide to noSQL
A Beginners Guide to noSQL
 
Big Data Standards - Workshop, ExpBio, Boston, 2015
Big Data Standards - Workshop, ExpBio, Boston, 2015Big Data Standards - Workshop, ExpBio, Boston, 2015
Big Data Standards - Workshop, ExpBio, Boston, 2015
 
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
 
mini MAXI art exhibition
mini MAXI art exhibitionmini MAXI art exhibition
mini MAXI art exhibition
 
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetup
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics MeetupIntroduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetup
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetup
 
Cloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for Hadoop
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and when
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability Patterns
 
Enabling the Industry 4.0 vision: Hype? Real Opportunity!
Enabling the Industry 4.0 vision: Hype? Real Opportunity!Enabling the Industry 4.0 vision: Hype? Real Opportunity!
Enabling the Industry 4.0 vision: Hype? Real Opportunity!
 

Semelhante a Introduction to NoSQL Databases

NoSQL: Why, When, and How
NoSQL: Why, When, and HowNoSQL: Why, When, and How
NoSQL: Why, When, and HowBigBlueHat
 
Couchbase - Yet Another Introduction
Couchbase - Yet Another IntroductionCouchbase - Yet Another Introduction
Couchbase - Yet Another IntroductionKelum Senanayake
 
Spring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_dataSpring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_dataRoger Xia
 
About "Apache Cassandra"
About "Apache Cassandra"About "Apache Cassandra"
About "Apache Cassandra"Jihyun Ahn
 
DynamoDB Gluecon 2012
DynamoDB Gluecon 2012DynamoDB Gluecon 2012
DynamoDB Gluecon 2012Appirio
 
Gluecon 2012 - DynamoDB
Gluecon 2012 - DynamoDBGluecon 2012 - DynamoDB
Gluecon 2012 - DynamoDBJeff Douglas
 
Mongodb - NoSql Database
Mongodb - NoSql DatabaseMongodb - NoSql Database
Mongodb - NoSql DatabasePrashant Gupta
 
JS App Architecture
JS App ArchitectureJS App Architecture
JS App ArchitectureCorey Butler
 
Microsoft Azure e Open Source
Microsoft Azure e Open SourceMicrosoft Azure e Open Source
Microsoft Azure e Open SourceDanilo Bordini
 
Benchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDB
Benchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDBBenchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDB
Benchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDBAthiq Ahamed
 
Software development - the java perspective
Software development - the java perspectiveSoftware development - the java perspective
Software development - the java perspectiveAlin Pandichi
 
MongoDB - A next-generation database that lets you create applications never ...
MongoDB - A next-generation database that lets you create applications never ...MongoDB - A next-generation database that lets you create applications never ...
MongoDB - A next-generation database that lets you create applications never ...Ram Murat Sharma
 
20150716 introduction to apache spark v3
20150716 introduction to apache spark v3 20150716 introduction to apache spark v3
20150716 introduction to apache spark v3 Andrey Vykhodtsev
 
What you need to know about ceph
What you need to know about cephWhat you need to know about ceph
What you need to know about cephEmma Haruka Iwao
 
03 net saturday anton samarskyy ''document oriented databases for the .net pl...
03 net saturday anton samarskyy ''document oriented databases for the .net pl...03 net saturday anton samarskyy ''document oriented databases for the .net pl...
03 net saturday anton samarskyy ''document oriented databases for the .net pl...DneprCiklumEvents
 

Semelhante a Introduction to NoSQL Databases (20)

NoSQL: Why, When, and How
NoSQL: Why, When, and HowNoSQL: Why, When, and How
NoSQL: Why, When, and How
 
Intro to RavenDB
Intro to RavenDBIntro to RavenDB
Intro to RavenDB
 
Nosql seminar
Nosql seminarNosql seminar
Nosql seminar
 
No sql databases
No sql databasesNo sql databases
No sql databases
 
Couchbase - Yet Another Introduction
Couchbase - Yet Another IntroductionCouchbase - Yet Another Introduction
Couchbase - Yet Another Introduction
 
Spring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_dataSpring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_data
 
About "Apache Cassandra"
About "Apache Cassandra"About "Apache Cassandra"
About "Apache Cassandra"
 
DynamoDB Gluecon 2012
DynamoDB Gluecon 2012DynamoDB Gluecon 2012
DynamoDB Gluecon 2012
 
Gluecon 2012 - DynamoDB
Gluecon 2012 - DynamoDBGluecon 2012 - DynamoDB
Gluecon 2012 - DynamoDB
 
Mongodb - NoSql Database
Mongodb - NoSql DatabaseMongodb - NoSql Database
Mongodb - NoSql Database
 
JS App Architecture
JS App ArchitectureJS App Architecture
JS App Architecture
 
MongoDB is the MashupDB
MongoDB is the MashupDBMongoDB is the MashupDB
MongoDB is the MashupDB
 
Microsoft Azure e Open Source
Microsoft Azure e Open SourceMicrosoft Azure e Open Source
Microsoft Azure e Open Source
 
Benchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDB
Benchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDBBenchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDB
Benchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDB
 
No sq lv2
No sq lv2No sq lv2
No sq lv2
 
Software development - the java perspective
Software development - the java perspectiveSoftware development - the java perspective
Software development - the java perspective
 
MongoDB - A next-generation database that lets you create applications never ...
MongoDB - A next-generation database that lets you create applications never ...MongoDB - A next-generation database that lets you create applications never ...
MongoDB - A next-generation database that lets you create applications never ...
 
20150716 introduction to apache spark v3
20150716 introduction to apache spark v3 20150716 introduction to apache spark v3
20150716 introduction to apache spark v3
 
What you need to know about ceph
What you need to know about cephWhat you need to know about ceph
What you need to know about ceph
 
03 net saturday anton samarskyy ''document oriented databases for the .net pl...
03 net saturday anton samarskyy ''document oriented databases for the .net pl...03 net saturday anton samarskyy ''document oriented databases for the .net pl...
03 net saturday anton samarskyy ''document oriented databases for the .net pl...
 

Último

Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 

Último (20)

Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 

Introduction to NoSQL Databases

  • 1. Introduction to NoSQL Databases San Diego NoSQL Meetup – Aug 2010 By Derek Stainer http://nosqldatabases.com
  • 2. Agenda Introduction Objective Explore NoSQL Databases Conclusion
  • 3. Introduction UCSD Graduate in Computer Science Java Developer for 10 years Creator of http://nosqldatabases.com Curator of NoSQL information
  • 4. Objective Deeper dive into each type of NoSQL database Discuss 1-2 NoSQL databases in each family of databases
  • 5. NoSQL Taxonomy Key/Value Document Column Graph Others Geospatial File System Object
  • 6. Key/Value Databases Global collection of Key/Value pairs Inspired by Amazon’s Dynamo and Distributed Hashtables Designed to handle massive load Multiple Types In memory i.e. Memcache On Disk i.e. Redis, SimpleDB Eventually Consistent i.e. Dynamo, Voldemort
  • 7. Key/Value: Voldemort Created by LinkedIn, now open source Inspired by Amazon’s Dynamo Written in Java Pluggable Storage BerkeleyDB, In Memory, MySQL Pluggable Serialization JSON, Thrift, Protocol Buffers, etc. Cluster Rebalancing
  • 8. Key/Value: Voldemort Versioning, based on Vector Clocks Reconciliation occurs on reads. Partitioning and Replication based on Dynamo Consistent Hashing Virtual Nodes Gossip
  • 9. Other Key/Value Stores Other Key/Value Stores Amazon’s Dynamo Riak Redis Memcache SimpleDB
  • 10. Document Databases Similar to a Key/Value database but with a major difference, value is a document Inspired by Lotus Notes Flexible Schema Any number of fields can be added Documents stored in JSON or BSON formats Examples: CouchDB, MongoDB
  • 11. Sample Document { "day": [ 2010, 01, 23 ], "products": { "apple": { "price": 10 "quantity": 6 }, "kiwi": { "price": 20 "quantity": 2 } }, "checkout": 100 }
  • 12. Document: CouchDB Development began ~ 2005 by Damien Katz former Lotus Notes Developer Couch – Cluster Of Unreliable Commodity Hardware Top level Apache Project Commercially supported by CouchIO Licensed under Apache License Written in Erlang Documents are stored in JSON
  • 13. Document: CouchDB [cont’d] B-Tree Storage Engine MVCC model, no locking No joins, primary key or foreign key (UUIDs are auto assigned) Built bi-directional replication Can even run offline, come back and sync back changes Custom persistent views using MapReduce REST API
  • 14. Document: MongoDB Development started in 2007 Commercially supported and developed by 10Gen Stores documents using BSON Supports AdHoc queries Can query against embedded objects and arrays Support multiples types of indexing
  • 15. Document: MongoDB [cont’d] Officially supported drivers available for multiple languages C, C++, Java, Javascript, Perl, PHP, Python and Ruby Community supported drivers include: Scala, Node.js, Haskell, Erlang, Smalltalk Replication uses a master/slave model Scales horizontally via sharding Written C++
  • 16. Column Family Databases Each key is associated with multiple attributes (i.e. Columns) Hybrid row/column stores Inspired by Google BigTable Examples: HBase, Cassandra
  • 17. Column: HBase Based on Google’s BigTable Apache Project TLP Cloudera (certifications, EC2 AMI’s, etc.) Layered over HDFS (Hadoop Distributed File System) Input/Output for MapReduce Jobs APIs Thrift, REST
  • 18. Column: Hbase [cont’d] Automatic partitioning Automatic re-balancing/re-partitioning Fault tolerant HDFS Multiple Replicas Highly distributed
  • 20. Column: Cassandra Created at Facebook for Inbox search Facebook -> Google Code -> ASF Commercial Support available from Riptano Features taken from both Dynamo and BigTable Dynamo – Consistent hashing, Partitioning, Replication Big Table – Column Familes, MemTables, SSTables
  • 21. Column: Cassandra [cont’d] Symmetric nodes No single point of failure Linearly scalable Ease of administration Flexible/Automated Provisioning Flexible Replica Replacement High Availability Eventual Consistency However, consistency is tuneable
  • 22. Column: Cassandra [cont’d] Partitioning Random Good distribution of data between nodes Range scans not possible Order Preserving Can lead to unbalanced nodes Range scans, Natural Order Custom Extremely fast reads/writes (low latency) Thrift API
  • 23. Column: Cassandra [cont’d] Column Basic unit of storage Column Family Collection of like records Record level atomicity Indexed Keyspace Top level namespace Usually one per application
  • 25. Column: Cassandra [cont’d] Column Details Name byte[] Queried against Determines sort order Value byte[] Opaque to Cassandra Timestamp long Conflict resolution (last write wins)
  • 26. Graph Databases Inspired by Euler Graph Theory, G=(E,V) Focused on modeling the structure of the data Property Graph Data Model Examples: Neo4j, InfiniteGraph
  • 28. Graph: Neo4j Data Model: Property Graph Nodes – Person, Place, Thing, etc. Relationships – Lives, Likes, Owns, etc. Properties on Both Primary operation is graph traversal between nodes Written in Java Embedded database
  • 29. Graph: Neo4j [cont’d] Disk-based Graph stored in custom binary format Transactional JTA/JTS, XA, 2PC, MVCC Scales Billions of nodes/relationships/properties per JVM Robust 6+ years in 24/7 production
  • 30. Graph: Neo4j [cont’d] Multiple language binds Jython, Cpython Jruby (including RESTful API) Clojure Scala (including RESTful API) Uses Social Graph i.e. Facebook Recommendation Engines Financial Audit
  • 31. Graph: Neo4j [cont’d] Licensed under AGPLv3 Dual Commercial License Available First server is free Second server Inexpensive Commercial support provided by Neo Technologies
  • 32. Other Graph Databases Other graph databases InfiniteGraph HyperGraphDB sones
  • 35. References NoSQL Databases - Part 1 – Landscape, Vineet Guptahttp://www.vineetgupta.com/2010/01/nosql-databases-part-1-landscape.html NoSQL for Dummies, Tobias Ivarssonhttp://www.slideshare.net/thobe/nosql-for-dummies NoSQL Databases, Marin Dimitrovhttp://www.slideshare.net/marin_dimitrov/nosql-databases-3584443 CouchDB vs. MongoDB, Gabriele Lanahttp://www.slideshare.net/gabriele.lana/couchdb-vs-mongodb-2982288 Hbase, Ryan Rawsonhttp://www.slideshare.net/adorepump/hbase-nosql Introduction to Cassandra, Gary Dusbabekhttp://www.slideshare.net/gdusbabek/introduction-to-cassandra-june-2010 Cassandra Explained, Eric Evanshttp://www.slideshare.net/jericevans/cassandra-explained Towards Robust Distributed Systems, Eric Brewerhttp://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf Cassandra - A Decentralized Structured Storage System, Lakshman, Ladishttp://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf
  • 36. References [cont’d] Bigtable: A Distributed Storage System for Structured Data, Google Inc.http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/en/us/papers/bigtable-osdi06.pdf Dynamo: Amazon’s Highly Available Key-value Store, Amazon Inc.http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf HBase Architecture 101 – Storage, Lars Georgehttp://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html BASE: An ACID Alternative, Dan Pritchett

Notas do Editor

  1. Surveying the NoSQL Landscape, By Derek Stainer
  2. Indexing types include, single-key, compound, unique, non-unique, and geospatial
  3. Surveying the NoSQL Landscape, By Derek Stainer
  4. Surveying the NoSQL Landscape, By Derek Stainer