Welcome
BIG DATA
Architectures and Approaches
David Elliman & Ashok Subramanian
Luke Barrett
1971-2014
http://upload.wikimedia.org/wikipedia/commons/f/f0/DARPA_Big_Data.jpg
BIG DATA
https://www.flickr.com/photos/katerha/8380451137/
1944
https://www.flickr.com/photos/timetrax/376152628/sizes/l
[Timeline: 1940 to 2015]
1961
1971
1996
https://www.flickr.com/photos/epsos/8336691931
…storage becomes more cost-effective for storing data…
1996
1998
1998
https://www.usenix.org/conference/1999-usenix-annual-technical-conference/big-data-and-next-wave-infrastress-problems
2004
2006
2008
2010
2013
"alottabytes"
2015
https://www.flickr.com/photos/will-lion/2595830716/
https://www.flickr.com/photos/taedc/6998468974
http://blogs.gartner.com/doug-laney/batman-on-big-data/
https://www.flickr.com/photos/10ch/3347658610/
THE OPPORTUNITY
[Figure: DATA and INSIGHT compared across three eras: before 1990, 1990s to 2000, and 2000 onward]
Key Takeaways
• This isn’t a new problem
• The problem isn’t going away
• Remember to focus on the
VALUE
https://www.flickr.com/photos/djwtwo/8331524425/
Where do we…
https://www.flickr.com/photos/ekosystem/4334671818/
https://www.flickr.com/photos/libraryacu/7695938410/
[Figure axes: Complexity, Value]
Descriptive Analytics: What happened?
Diagnostic Analytics: Why did it happen?
Predictive Analytics: What will happen?
Prescriptive Analytics: How can we make it happen?
Analytics - Goals
https://www.flickr.com/photos/lopetz/3912416793/
REAL TIME BATCH
Volume
Velocity
REAL TIME BATCH
https://www.flickr.com/photos/ingythewingy/5510406450/
THINK BIG
ACT SMALL
Small
is the
New Big
(Seth Godin)
https://www.flickr.com/photos/pauldineen/4529216647/
“80% of the work in any data project is in cleaning the data” – D J Patil
https://www.flickr.com/photos/desideratum/8595251348/
https://www.flickr.com/photos/22280677@N07/2504310138/
https://www.flickr.com/photos/jm3/4814208649/
SQL
https://www.flickr.com/photos/marc_smith/6793088143/
Key Takeaways
• Start small
• Start with the ?
• Iteratively follow the value
• Using freely available tooling
• Volume vs Velocity
https://www.flickr.com/photos/djwtwo/8331524425/
Scaling the Solution
https://www.flickr.com/photos/auntiep/4310240/
https://www.flickr.com/photos/111692634@N04/11407095913/
“Amdahl’s law is used to find the maximum expected improvement to an overall system when only part of the system is improved.”
– attributed to Gene Amdahl, 1967
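The bound the quote describes can be worked through numerically. The sketch below is a minimal illustration of Amdahl's law, speedup = 1 / ((1 - p) + p / n); the function names and the 95% parallel fraction are assumptions chosen for illustration and do not come from the slides.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Overall speedup when `parallel_fraction` of the work is spread over `workers`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time regardless of how many workers are added.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def max_speedup(parallel_fraction: float) -> float:
    """Upper bound on speedup as the number of workers grows without limit."""
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0 else 1.0 / serial_fraction


if __name__ == "__main__":
    # Hypothetical workload: 95% of the job can run in parallel.
    for workers in (1, 4, 16, 256):
        print(workers, round(amdahl_speedup(0.95, workers), 2))
    print("limit", round(max_speedup(0.95), 2))
```

Under these assumptions, adding nodes gives diminishing returns: the serial 5% caps the achievable speedup at roughly 20 times, however large the cluster becomes.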
https://twitter.com/PieCalculus/status/459485747842523136/photo/1
https://www.flickr.com/photos/rofi/2097239111/
Batch
Speed
Serving Query
query = function(all data)
All Data
Lambda Architecture
Scaled Data
Store
Event
Processing
Network
Query
All Data
Lambda Architecture
Batch View
Realtime View
Batch
Write
Random
Write
Batch
Speed
Serving Query
query = function(all data)
All Data
Lambda Architecture
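The idea that a query is a function of all data, answered by merging a batch view with a realtime view, can be sketched in a few lines. This is a toy, in-memory illustration only: the page-view events, the `build_batch_view`, `RealtimeView`, and `query` names, and the integer timestamps are all hypothetical and are not taken from the slides or from any particular framework.

```python
from collections import Counter
from typing import Iterable, Tuple

Event = Tuple[str, int]  # (page, timestamp); a hypothetical event shape


def build_batch_view(master_dataset: Iterable[Event], cutoff: int) -> Counter:
    """Batch layer: recompute page-view counts from all data up to `cutoff`."""
    return Counter(page for page, timestamp in master_dataset if timestamp <= cutoff)


class RealtimeView:
    """Speed layer: incrementally count only events newer than the batch cutoff."""

    def __init__(self, batch_cutoff: int) -> None:
        self.batch_cutoff = batch_cutoff
        self.counts: Counter = Counter()

    def update(self, event: Event) -> None:
        page, timestamp = event
        if timestamp > self.batch_cutoff:
            self.counts[page] += 1


def query(page: str, batch_view: Counter, realtime_view: RealtimeView) -> int:
    """Serving layer: merge the batch view and the realtime view."""
    return batch_view[page] + realtime_view.counts[page]


if __name__ == "__main__":
    events = [("home", 1), ("home", 2), ("about", 3), ("home", 4)]
    cutoff = 2  # the last timestamp covered by the most recent batch run
    batch_view = build_batch_view(events, cutoff)
    realtime_view = RealtimeView(cutoff)
    for event in events:
        realtime_view.update(event)
    print(query("home", batch_view, realtime_view))
```

The design point the sketch tries to show is that the realtime view only has to cover the gap since the last batch run, so it can be discarded and rebuilt each time the batch view catches up.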
[Diagram: a Client contacts the Master Node (JobTracker and Name Node). The Name Node serves metadata operations to get block info; the JobTracker handles job assignment to the cluster. Four Slave Nodes each run a Task Tracker (Map, Reduce) and a Data Node. Numbered blocks (1 3 1 2 1 5 6 4) illustrate data replication on multiple nodes; DataWrite and DataRead paths are also shown.]
Batch - Hadoop (MR1)
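Replicating each data block on several nodes is what lets the cluster keep serving reads when a node fails. The sketch below uses a simple round-robin placement as a stand-in; it is not HDFS's actual placement policy, and the `place_replicas` and `readable_blocks` helpers, the node names, and the replication factor of 3 used here are assumptions for illustration.

```python
from typing import Dict, List


def place_replicas(block_ids: List[int], nodes: List[str],
                   replication: int = 3) -> Dict[int, List[str]]:
    """Assign each block to `replication` distinct nodes, round-robin."""
    if replication > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    placement: Dict[int, List[str]] = {}
    for index, block in enumerate(block_ids):
        # Consecutive node slots guarantee the replicas land on distinct nodes.
        placement[block] = [nodes[(index + r) % len(nodes)] for r in range(replication)]
    return placement


def readable_blocks(placement: Dict[int, List[str]], failed_node: str) -> List[int]:
    """Blocks that still have at least one replica after `failed_node` is lost."""
    return [block for block, holders in placement.items()
            if any(node != failed_node for node in holders)]


if __name__ == "__main__":
    nodes = ["node-1", "node-2", "node-3", "node-4"]  # hypothetical Data Nodes
    placement = place_replicas([1, 2, 3, 4, 5, 6], nodes)
    for block, holders in placement.items():
        print(block, holders)
    print(readable_blocks(placement, "node-1"))
```

With every block held on three distinct nodes, losing any single node leaves all blocks readable in this toy model.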
Batch - MapReduce
Map Shuffle Reduce
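The three phases can be illustrated with a word count run in a single process. This is a teaching sketch in plain Python, not Hadoop's API: the `map_phase`, `shuffle_phase`, `reduce_phase`, and `word_count` names are hypothetical, and a real cluster would run the map and reduce steps on many nodes in parallel.

```python
from collections import defaultdict
from itertools import chain
from typing import Dict, Iterable, List, Tuple


def map_phase(line: str) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big insight"]))
```

Because each map call sees only one line and each reduce call sees only one key, both phases can be split across machines; the shuffle is the step that moves data between them.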
Batch - Cascading
Batch - Spark
Segment
Servers
Query processing
and data storage
Network
Interconnect
Master
Servers
Query planning &
dispatch
External Sources
Loading, streaming,
etc.
SQL or
MapReduce
Batch - MPP database
Batch
Speed
Serving Query
query = function(all data)
All Data
Lambda Architecture
Speed - Storm
CEP
Batch
Speed
Serving Query
query = function(all data)
All Data
Lambda Architecture
Lambda Architecture - Serving
http://www.wallzhq.com/wp-content/uploads/2014/02/matrix_binary-wide.jpg
Pull-based
Batch Loads
Enterprise
Data Models
Complex ETL
Logic
Poorly
Suited to
Non-Relational Data
Emergent design is difficult
Conventional Architectures
Pivotal Business Data Lake Architecture
http://www.gopivotal.com/sites/default/files/Pivotal-Business-Data-Lake-Technical_Brochure_WEB.PDF
DATA CORE
RAW FACTUAL DATA
HISTORIZED EVENTS
RETAIN BUSINESS KEY
DATA LINEAGE
DATA INGESTION
EVENT DRIVEN
MESSAGE QUEUE
TRICKLE FEED
BATCH LOAD
INFORMATION PUBLISHING
TOPICAL QUEUES
POST PROCESSING
INFORMATION TIER
PURPOSE BUILT
DATA SUBSETS
TRANSFORMATION
DATA GOVERNANCE
MDM CONCERNS
POST PROCESSING
PRESENTATION TIER
BUSINESS VALUE
APPLICATIONS
DATA SERVICES
AD HOC QUERYING
WRITE BACK?
Transformation
Logic
Data
Post Processing
Near Real Time
Feed
Emergent Design
&
Agile Delivery
Apache Kafka
Apache Storm
Micro-data-services
Drive Towards In Memory Processing
https://www.tele-task.de/archive/lecture/overview/5721/
Remember
https://www.flickr.com/photos/anjin/695894443/
Data Structures
Algorithms
https://www.flickr.com/photos/herrolsen/7645876896/
Raw Data
Data
Structure
Algorithm Insight
Key Takeaways
• Embrace the cloud
• Fit the Architecture to the
problem
• Remember Knuth
https://www.flickr.com/photos/djwtwo/8331524425/
https://www.flickr.com/photos/tim_norris/2789759648/
SUMMARY
http://www.datameer.com/blog/uncategorized/the-hadoop-ecosystem-visualized-in-datameer.html
[Bar chart, axis 0 to 63; bar values: 48, 30, 26, 22, 18, 18, 16, 15, 15, 15, 13, 13, 13, 13, 12; bar labels not recoverable]
Hadoop Ecosystem
https://www.flickr.com/photos/classblog/5136926303/
Commercial Open Source
https://blog.cloudera.com/blog/2011/10/the-community-effect/
https://www.flickr.com/photos/ctsi-global/6556284907/
https://www.flickr.com/photos/will-lion/2597608152/
https://www.flickr.com/photos/jurvetson/14105339228/
Open Questions
http://talkmarketing.co.uk/wp-content/uploads/2013/07/Open-Ended-Questions.jpg
https://www.flickr.com/photos/typoatelier/5615759848/
https://www.flickr.com/photos/rembcc/3802038945/
https://www.flickr.com/photos/sidelong/246816211/
No matter how much you speed up
the computers or the way you put
computers together, the real issues
are at the DATA LEVEL
https://www.flickr.com/photos/opensourceway/5556249000/
Enterprise Master
Data Management
Localised Formats
Single System of
Record
SoR is a process not
a place
Database Integration
(by another name)
http://www.bain.com/infographics/big-data/
Organisational Models

Mais conteúdo relacionado

Mais procurados

How to become a Data Scientist?
How to become a Data Scientist? How to become a Data Scientist?
How to become a Data Scientist? HackerEarth
 
Creating knowledge out of interlinked data
Creating knowledge out of interlinked dataCreating knowledge out of interlinked data
Creating knowledge out of interlinked dataSören Auer
 
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...BigMine
 
R, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science CompetitionsR, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science CompetitionsKrishna Sankar
 
Make Embeddings Semantic Again!
Make Embeddings Semantic Again!Make Embeddings Semantic Again!
Make Embeddings Semantic Again!Heiko Paulheim
 
Designing a second generation of open data platforms
Designing a second generation of open data platformsDesigning a second generation of open data platforms
Designing a second generation of open data platformsYannis Charalabidis
 
GalvanizeU Seattle: Eleven Almost-Truisms About Data
GalvanizeU Seattle: Eleven Almost-Truisms About DataGalvanizeU Seattle: Eleven Almost-Truisms About Data
GalvanizeU Seattle: Eleven Almost-Truisms About DataPaco Nathan
 
SSSW2015 Data Workflow Tutorial
SSSW2015 Data Workflow TutorialSSSW2015 Data Workflow Tutorial
SSSW2015 Data Workflow TutorialSSSW
 
Data Science Folk Knowledge
Data Science Folk KnowledgeData Science Folk Knowledge
Data Science Folk KnowledgeKrishna Sankar
 
Sql saturday el salvador 2016 - Me, A Data Scientist?
Sql saturday el salvador 2016 - Me, A Data Scientist?Sql saturday el salvador 2016 - Me, A Data Scientist?
Sql saturday el salvador 2016 - Me, A Data Scientist?Fabricio Quintanilla
 
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked DataIntroduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked DataSören Auer
 
Data Science in Future Tense
Data Science in Future TenseData Science in Future Tense
Data Science in Future TensePaco Nathan
 

Mais procurados (14)

How to become a Data Scientist?
How to become a Data Scientist? How to become a Data Scientist?
How to become a Data Scientist?
 
Workshop / Meetup: Visão geral sobre Big Data
Workshop / Meetup: Visão geral sobre Big DataWorkshop / Meetup: Visão geral sobre Big Data
Workshop / Meetup: Visão geral sobre Big Data
 
Creating knowledge out of interlinked data
Creating knowledge out of interlinked dataCreating knowledge out of interlinked data
Creating knowledge out of interlinked data
 
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...
 
R, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science CompetitionsR, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science Competitions
 
Make Embeddings Semantic Again!
Make Embeddings Semantic Again!Make Embeddings Semantic Again!
Make Embeddings Semantic Again!
 
Designing a second generation of open data platforms
Designing a second generation of open data platformsDesigning a second generation of open data platforms
Designing a second generation of open data platforms
 
GalvanizeU Seattle: Eleven Almost-Truisms About Data
GalvanizeU Seattle: Eleven Almost-Truisms About DataGalvanizeU Seattle: Eleven Almost-Truisms About Data
GalvanizeU Seattle: Eleven Almost-Truisms About Data
 
SSSW2015 Data Workflow Tutorial
SSSW2015 Data Workflow TutorialSSSW2015 Data Workflow Tutorial
SSSW2015 Data Workflow Tutorial
 
Data Science Folk Knowledge
Data Science Folk KnowledgeData Science Folk Knowledge
Data Science Folk Knowledge
 
Data Wrangling
Data WranglingData Wrangling
Data Wrangling
 
Sql saturday el salvador 2016 - Me, A Data Scientist?
Sql saturday el salvador 2016 - Me, A Data Scientist?Sql saturday el salvador 2016 - Me, A Data Scientist?
Sql saturday el salvador 2016 - Me, A Data Scientist?
 
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked DataIntroduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
 
Data Science in Future Tense
Data Science in Future TenseData Science in Future Tense
Data Science in Future Tense
 

Destaque

Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...SoftServe
 
Quarterly Technology Briefing - Big Data - Germany
Quarterly Technology Briefing - Big Data - GermanyQuarterly Technology Briefing - Big Data - Germany
Quarterly Technology Briefing - Big Data - GermanyThoughtworks
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data ArchitectureGuido Schmutz
 
Big Data & Analytics Architecture
Big Data & Analytics ArchitectureBig Data & Analytics Architecture
Big Data & Analytics ArchitectureArvind Sathi
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lakeJames Serra
 
Choosing the Right Big Data Architecture for your Business
Choosing the Right Big Data Architecture for your BusinessChoosing the Right Big Data Architecture for your Business
Choosing the Right Big Data Architecture for your BusinessChicago Hadoop Users Group
 
Creating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitectureCreating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitecturePerficient, Inc.
 
Big Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in ActionBig Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in ActionGuido Schmutz
 
New Analytical Architectures for Big Data
New Analytical Architectures for Big DataNew Analytical Architectures for Big Data
New Analytical Architectures for Big DataCasey Kiernan
 
Analyse the analyst hire QAs for the right reasons
Analyse the analyst   hire QAs for the right reasonsAnalyse the analyst   hire QAs for the right reasons
Analyse the analyst hire QAs for the right reasonsThoughtworks
 
Taking Swift for a spin
Taking Swift for a spinTaking Swift for a spin
Taking Swift for a spinThoughtworks
 
Personal retrospectives
Personal retrospectivesPersonal retrospectives
Personal retrospectivesThoughtworks
 
Using Clojure for Sentiment Analysis of the Twittersphere
Using Clojure for Sentiment Analysis of the TwittersphereUsing Clojure for Sentiment Analysis of the Twittersphere
Using Clojure for Sentiment Analysis of the TwittersphereThoughtworks
 
Ford Mondeo Case Study Analisys
Ford Mondeo  Case Study AnalisysFord Mondeo  Case Study Analisys
Ford Mondeo Case Study Analisystasmeen
 
A quick introduction to AWS Lambda
A quick introduction to AWS LambdaA quick introduction to AWS Lambda
A quick introduction to AWS Lambdaogeisser
 
Lambda and serverless - DevOps North East Jan 2017
Lambda and serverless - DevOps North East Jan 2017Lambda and serverless - DevOps North East Jan 2017
Lambda and serverless - DevOps North East Jan 2017Mike Shutlar
 
AWS Lambda - Event Driven Event-driven Code in the Cloud
AWS Lambda - Event Driven Event-driven Code in the CloudAWS Lambda - Event Driven Event-driven Code in the Cloud
AWS Lambda - Event Driven Event-driven Code in the CloudAmazon Web Services
 
USING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICS
USING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICSUSING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICS
USING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICSHCL Technologies
 
Эволюция Big Data и Information Management. Reference Architecture.
Эволюция Big Data и Information Management. Reference Architecture.Эволюция Big Data и Information Management. Reference Architecture.
Эволюция Big Data и Information Management. Reference Architecture.Andrey Akulov
 

Destaque (20)

Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
 
Quarterly Technology Briefing - Big Data - Germany
Quarterly Technology Briefing - Big Data - GermanyQuarterly Technology Briefing - Big Data - Germany
Quarterly Technology Briefing - Big Data - Germany
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data Architecture
 
Big Data & Analytics Architecture
Big Data & Analytics ArchitectureBig Data & Analytics Architecture
Big Data & Analytics Architecture
 
Big Data Architectural Patterns
Big Data Architectural PatternsBig Data Architectural Patterns
Big Data Architectural Patterns
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
 
Choosing the Right Big Data Architecture for your Business
Choosing the Right Big Data Architecture for your BusinessChoosing the Right Big Data Architecture for your Business
Choosing the Right Big Data Architecture for your Business
 
Creating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitectureCreating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data Architecture
 
Big Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in ActionBig Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in Action
 
New Analytical Architectures for Big Data
New Analytical Architectures for Big DataNew Analytical Architectures for Big Data
New Analytical Architectures for Big Data
 
Analyse the analyst hire QAs for the right reasons
Analyse the analyst   hire QAs for the right reasonsAnalyse the analyst   hire QAs for the right reasons
Analyse the analyst hire QAs for the right reasons
 
Taking Swift for a spin
Taking Swift for a spinTaking Swift for a spin
Taking Swift for a spin
 
Personal retrospectives
Personal retrospectivesPersonal retrospectives
Personal retrospectives
 
Using Clojure for Sentiment Analysis of the Twittersphere
Using Clojure for Sentiment Analysis of the TwittersphereUsing Clojure for Sentiment Analysis of the Twittersphere
Using Clojure for Sentiment Analysis of the Twittersphere
 
Ford Mondeo Case Study Analisys
Ford Mondeo  Case Study AnalisysFord Mondeo  Case Study Analisys
Ford Mondeo Case Study Analisys
 
A quick introduction to AWS Lambda
A quick introduction to AWS LambdaA quick introduction to AWS Lambda
A quick introduction to AWS Lambda
 
Lambda and serverless - DevOps North East Jan 2017
Lambda and serverless - DevOps North East Jan 2017Lambda and serverless - DevOps North East Jan 2017
Lambda and serverless - DevOps North East Jan 2017
 
AWS Lambda - Event Driven Event-driven Code in the Cloud
AWS Lambda - Event Driven Event-driven Code in the CloudAWS Lambda - Event Driven Event-driven Code in the Cloud
AWS Lambda - Event Driven Event-driven Code in the Cloud
 
USING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICS
USING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICSUSING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICS
USING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICS
 
Эволюция Big Data и Information Management. Reference Architecture.
Эволюция Big Data и Information Management. Reference Architecture.Эволюция Big Data и Information Management. Reference Architecture.
Эволюция Big Data и Information Management. Reference Architecture.
 

Semelhante a Big Data: Architectures and Approaches

Fujitsu IT Future 2013 : Alignement de l'IT avec les contraintes Business, té...
Fujitsu IT Future 2013 : Alignement de l'IT avec les contraintes Business, té...Fujitsu IT Future 2013 : Alignement de l'IT avec les contraintes Business, té...
Fujitsu IT Future 2013 : Alignement de l'IT avec les contraintes Business, té...Fujitsu France
 
Up close and personal - Future of Digital 2010
Up close and personal - Future of Digital 2010Up close and personal - Future of Digital 2010
Up close and personal - Future of Digital 2010Rob Manson
 
Introduction To Linked Data
Introduction To Linked DataIntroduction To Linked Data
Introduction To Linked DataLeigh Dodds
 
Linked Statistical Data: does it actually pay off?
Linked Statistical Data: does it actually pay off?Linked Statistical Data: does it actually pay off?
Linked Statistical Data: does it actually pay off?Oscar Corcho
 
Cloud Architecture + Cloud Architects / Jan 24th 2012
Cloud Architecture + Cloud Architects / Jan 24th 2012Cloud Architecture + Cloud Architects / Jan 24th 2012
Cloud Architecture + Cloud Architects / Jan 24th 2012Lothar Wieske
 
The web is too slow
The web is too slow The web is too slow
The web is too slow Andy Davies
 
Pivotal - Advanced Analytics for Telecommunications
Pivotal - Advanced Analytics for Telecommunications Pivotal - Advanced Analytics for Telecommunications
Pivotal - Advanced Analytics for Telecommunications Hortonworks
 
Web Integrated Data
Web Integrated DataWeb Integrated Data
Web Integrated DataLeigh Dodds
 
Drupal as Base For Your NEXT Mobile App
Drupal as Base For Your NEXT Mobile AppDrupal as Base For Your NEXT Mobile App
Drupal as Base For Your NEXT Mobile AppSumit Kataria
 
Austin cassandra meetup
Austin cassandra meetupAustin cassandra meetup
Austin cassandra meetupgdusbabek
 
Puppet Camp Tokyo 2014: Fireballs, ice bats and 1,000,000 plugins: a story of...
Puppet Camp Tokyo 2014: Fireballs, ice bats and 1,000,000 plugins: a story of...Puppet Camp Tokyo 2014: Fireballs, ice bats and 1,000,000 plugins: a story of...
Puppet Camp Tokyo 2014: Fireballs, ice bats and 1,000,000 plugins: a story of...Puppet
 
Http/2 - What's it all about?
Http/2  - What's it all about?Http/2  - What's it all about?
Http/2 - What's it all about?Andy Davies
 
Big Data, Big Local
Big Data, Big LocalBig Data, Big Local
Big Data, Big LocalTyler Bell
 
Enabling Microservices @Orbitz - DevOpsDays Chicago 2015
Enabling Microservices @Orbitz - DevOpsDays Chicago 2015Enabling Microservices @Orbitz - DevOpsDays Chicago 2015
Enabling Microservices @Orbitz - DevOpsDays Chicago 2015Steve Hoffman
 
Speed is Essential for a Great Web Experience
Speed is Essential for a Great Web ExperienceSpeed is Essential for a Great Web Experience
Speed is Essential for a Great Web ExperienceAndy Davies
 
Telecom APIs Getting Back to Basics
Telecom APIs Getting Back to BasicsTelecom APIs Getting Back to Basics
Telecom APIs Getting Back to BasicsCA API Management
 
Open Data - What does it mean for Government, Business and INSPIRE?
Open Data - What does it mean for Government, Business and INSPIRE?Open Data - What does it mean for Government, Business and INSPIRE?
Open Data - What does it mean for Government, Business and INSPIRE?Fingal Open Data
 
Behaviour-Driven Development: escrevendo especificações ágeis
Behaviour-Driven Development: escrevendo especificações ágeisBehaviour-Driven Development: escrevendo especificações ágeis
Behaviour-Driven Development: escrevendo especificações ágeisHugo Lopes Tavares
 

Semelhante a Big Data: Architectures and Approaches (20)

Fujitsu IT Future 2013 : Alignement de l'IT avec les contraintes Business, té...
Fujitsu IT Future 2013 : Alignement de l'IT avec les contraintes Business, té...Fujitsu IT Future 2013 : Alignement de l'IT avec les contraintes Business, té...
Fujitsu IT Future 2013 : Alignement de l'IT avec les contraintes Business, té...
 
Up close and personal - Future of Digital 2010
Up close and personal - Future of Digital 2010Up close and personal - Future of Digital 2010
Up close and personal - Future of Digital 2010
 
Introduction To Linked Data
Introduction To Linked DataIntroduction To Linked Data
Introduction To Linked Data
 
Linked Statistical Data: does it actually pay off?
Linked Statistical Data: does it actually pay off?Linked Statistical Data: does it actually pay off?
Linked Statistical Data: does it actually pay off?
 
Cloud Architecture + Cloud Architects / Jan 24th 2012
Cloud Architecture + Cloud Architects / Jan 24th 2012Cloud Architecture + Cloud Architects / Jan 24th 2012
Cloud Architecture + Cloud Architects / Jan 24th 2012
 
The web is too slow
The web is too slow The web is too slow
The web is too slow
 
Pivotal - Advanced Analytics for Telecommunications
Pivotal - Advanced Analytics for Telecommunications Pivotal - Advanced Analytics for Telecommunications
Pivotal - Advanced Analytics for Telecommunications
 
Web Integrated Data
Web Integrated DataWeb Integrated Data
Web Integrated Data
 
Drupal as Base For Your NEXT Mobile App
Drupal as Base For Your NEXT Mobile AppDrupal as Base For Your NEXT Mobile App
Drupal as Base For Your NEXT Mobile App
 
Austin cassandra meetup
Austin cassandra meetupAustin cassandra meetup
Austin cassandra meetup
 
Puppet Camp Tokyo 2014: Fireballs, ice bats and 1,000,000 plugins: a story of...
Puppet Camp Tokyo 2014: Fireballs, ice bats and 1,000,000 plugins: a story of...Puppet Camp Tokyo 2014: Fireballs, ice bats and 1,000,000 plugins: a story of...
Puppet Camp Tokyo 2014: Fireballs, ice bats and 1,000,000 plugins: a story of...
 
Http/2 - What's it all about?
Http/2  - What's it all about?Http/2  - What's it all about?
Http/2 - What's it all about?
 
Big Data, Big Local
Big Data, Big LocalBig Data, Big Local
Big Data, Big Local
 
Enabling Microservices @Orbitz - DevOpsDays Chicago 2015
Enabling Microservices @Orbitz - DevOpsDays Chicago 2015Enabling Microservices @Orbitz - DevOpsDays Chicago 2015
Enabling Microservices @Orbitz - DevOpsDays Chicago 2015
 
Speed is Essential for a Great Web Experience
Speed is Essential for a Great Web ExperienceSpeed is Essential for a Great Web Experience
Speed is Essential for a Great Web Experience
 
Informatica online training
Informatica online trainingInformatica online training
Informatica online training
 
Telecom APIs Getting Back to Basics
Telecom APIs Getting Back to BasicsTelecom APIs Getting Back to Basics
Telecom APIs Getting Back to Basics
 
Open Data - What does it mean for Government, Business and INSPIRE?
Open Data - What does it mean for Government, Business and INSPIRE?Open Data - What does it mean for Government, Business and INSPIRE?
Open Data - What does it mean for Government, Business and INSPIRE?
 
Behaviour-Driven Development: escrevendo especificações ágeis
Behaviour-Driven Development: escrevendo especificações ágeisBehaviour-Driven Development: escrevendo especificações ágeis
Behaviour-Driven Development: escrevendo especificações ágeis
 
Mobile Web Talk
Mobile Web TalkMobile Web Talk
Mobile Web Talk
 

Mais de Thoughtworks

Design System as a Product
Design System as a ProductDesign System as a Product
Design System as a ProductThoughtworks
 
Designers, Developers & Dogs
Designers, Developers & DogsDesigners, Developers & Dogs
Designers, Developers & DogsThoughtworks
 
Cloud-first for fast innovation
Cloud-first for fast innovationCloud-first for fast innovation
Cloud-first for fast innovationThoughtworks
 
More impact with flexible teams
More impact with flexible teamsMore impact with flexible teams
More impact with flexible teamsThoughtworks
 
Culture of Innovation
Culture of InnovationCulture of Innovation
Culture of InnovationThoughtworks
 
Developer Experience
Developer ExperienceDeveloper Experience
Developer ExperienceThoughtworks
 
When we design together
When we design togetherWhen we design together
When we design togetherThoughtworks
 
Hardware is hard(er)
Hardware is hard(er)Hardware is hard(er)
Hardware is hard(er)Thoughtworks
 
Customer-centric innovation enabled by cloud
 Customer-centric innovation enabled by cloud Customer-centric innovation enabled by cloud
Customer-centric innovation enabled by cloudThoughtworks
 
Amazon's Culture of Innovation
Amazon's Culture of InnovationAmazon's Culture of Innovation
Amazon's Culture of InnovationThoughtworks
 
When in doubt, go live
When in doubt, go liveWhen in doubt, go live
When in doubt, go liveThoughtworks
 
Don't cross the Rubicon
Don't cross the RubiconDon't cross the Rubicon
Don't cross the RubiconThoughtworks
 
Your test coverage is a lie!
Your test coverage is a lie!Your test coverage is a lie!
Your test coverage is a lie!Thoughtworks
 
Docker container security
Docker container securityDocker container security
Docker container securityThoughtworks
 
Redefining the unit
Redefining the unitRedefining the unit
Redefining the unitThoughtworks
 
Technology Radar Webinar UK - Vol. 22
Technology Radar Webinar UK - Vol. 22Technology Radar Webinar UK - Vol. 22
Technology Radar Webinar UK - Vol. 22Thoughtworks
 
A Tribute to Turing
A Tribute to TuringA Tribute to Turing
A Tribute to TuringThoughtworks
 
Rsa maths worked out
Rsa maths worked outRsa maths worked out
Rsa maths worked outThoughtworks
 

Mais de Thoughtworks (20)

Design System as a Product
Design System as a ProductDesign System as a Product
Design System as a Product
 
Designers, Developers & Dogs
Designers, Developers & DogsDesigners, Developers & Dogs
Designers, Developers & Dogs
 
Cloud-first for fast innovation
Cloud-first for fast innovationCloud-first for fast innovation
Cloud-first for fast innovation
 
More impact with flexible teams
More impact with flexible teamsMore impact with flexible teams
More impact with flexible teams
 
Culture of Innovation
Culture of InnovationCulture of Innovation
Culture of Innovation
 
Dual-Track Agile
Dual-Track AgileDual-Track Agile
Dual-Track Agile
 
Developer Experience
Developer ExperienceDeveloper Experience
Developer Experience
 
When we design together
When we design togetherWhen we design together
When we design together
 
Hardware is hard(er)
Hardware is hard(er)Hardware is hard(er)
Hardware is hard(er)
 
Customer-centric innovation enabled by cloud
 Customer-centric innovation enabled by cloud Customer-centric innovation enabled by cloud
Customer-centric innovation enabled by cloud
 
Amazon's Culture of Innovation
Amazon's Culture of InnovationAmazon's Culture of Innovation
Amazon's Culture of Innovation
 
When in doubt, go live
When in doubt, go liveWhen in doubt, go live
When in doubt, go live
 
Don't cross the Rubicon
Don't cross the RubiconDon't cross the Rubicon
Don't cross the Rubicon
 
Error handling
Error handlingError handling
Error handling
 
Your test coverage is a lie!
Your test coverage is a lie!Your test coverage is a lie!
Your test coverage is a lie!
 
Docker container security
Docker container securityDocker container security
Docker container security
 
Redefining the unit
Redefining the unitRedefining the unit
Redefining the unit
 
Technology Radar Webinar UK - Vol. 22
Technology Radar Webinar UK - Vol. 22Technology Radar Webinar UK - Vol. 22
Technology Radar Webinar UK - Vol. 22
 
A Tribute to Turing
A Tribute to TuringA Tribute to Turing
A Tribute to Turing
 
Rsa maths worked out
Rsa maths worked outRsa maths worked out
Rsa maths worked out
 

Big Data: Architectures and Approaches

Editor's Notes

  1. Dave: http://www.forbes.com/sites/gilpress/2013/05/09/a-very-short-history-of-big-data/ (reference)
  2. Ashok: Big data analytics is driving rapid growth for public cloud computing vendors; revenues for the top 50 public cloud providers shot up 47% in the fourth quarter of last year, to $6.2 billion.
  3. Dave: http://nsa.gov1.info/utah-data-center/
  4. Ashok: Who is that handsome man?
  5. Dave & Ashok: Growth in retail: use of iBeacons, precision marketing, and more sophisticated web analytics and CRM with greater penetration. Healthcare: remote monitoring and automated procedures.
  6. Ashok
  7. Ashok: Validation or discovery? Picture of a fork in the road?
  8. Ashok & Dave
  9. Dave
  10. Ashok: Exploring alternate models
  11. Dave
  12. Ashok: Lambda Architecture (section heading)
  13. Ashok: High-level description of components
  14. Dave: Batch. Hadoop 2.0/MR2 goal: allow a large cluster of machines to be shared between different frameworks. Similar to Mesos; both are steps towards a distributed data OS.
  15. Dave: Data lakes
  16. Dave
  17. Ashok
  18. Ashok
  19. Ashok: Fast and scalable analytics depends on efficient data structures: matching the algorithm to the data structure, and morphing the raw data into the data structure. Raw data > Data structure > Algorithm > Insight
  20. Conclusion
  21. Ashok: The balance is shifting from commercial to open source; innovations are coming from the open-source world.
  22. Ashok: Quantum computing. This is one, apparently!
  23. Closing statement before Q&A
  24. Dave