CWIN17 India / Bigdata architecture yashowardhan sowale
Capgemini
Bigdata architecture
1.
Bigdata Architecture Overview
Copyright © Capgemini 2016. All Rights Reserved
2.
Gartner Hype Cycle – Emerging Technologies
3.
Benefits
4.
Big Data and its Dimensions
Extracting insight from an immense volume, variety and velocity of data, in context, beyond what was previously possible.
• Variety: Manage the complexity of data in many different structures, ranging from relational, to logs, to raw text
• Velocity: Streaming data and large-volume data movement
• Volume: Scale from terabytes to petabytes (1K TBs) to zettabytes (1B TBs)
• Veracity: Having a lot of data in different volumes coming in at high speed is worthless if that data is incorrect. Organizations need to ensure that both the data and the analyses performed on it are correct.
• Value: Discovering value from multichannel datasets
5.
Applications for Big Data Analytics
• Homeland Security
• Finance
• Smarter Healthcare
• Multi-channel sales
• Telecom
• Manufacturing
• Traffic Control
• Trading Analytics
• Fraud and Risk
• Log Analysis
• Search Quality
• Retail: Churn
6.
General reference architecture for Big Data Analytics
Flow: Source data → Process → Information → Analyze → Insight → Act → Value
Source data
• Internal data: IT managed applications (ERP, SCM, CRM); Master and reference data; Business owned informal data; Documents, mail, images, voice, video; Web and mobile apps; B2B
• External data: Internet, Social, Internet of Things (machine, sensor); Third party data: market, weather, climate, geolocation; Open data
Process
• Ingest, Catalog, Stream, Store, Prepare, Refine, blend, Mask, Manage lifecycle
Information
• Data asset descriptions; Processed data; Measures, KPIs; Dimensions, Master data; Granular data; Events; Context information
Analyze
• Search: What is relevant?
• Explorative: How does it work?
• Descriptive: What happened?
• Diagnostic: Why did it happen?
• Predictive: What will happen?
• Prescriptive: How to act next?
Insight
• Customer Experience; Operational Process Optimization; Risk, Fraud; Disruptive Business Model
Act
• Business Applications: Customer campaign; Trigger activity
• Business Processes: Trigger event; Adjust process
• Decision makers: Approve/reject business opportunities; Develop new business models and products
Value
• Business performance: Customer profitability; Operational cost cutting; Risk prevention; Market share increase
• Performance improvement
Manage
• Data governance and security; Data privacy; Compliance
• Collaboration; Value generation; Program delivery; Data-driven culture; Information strategy; Skill development
• Master data mgmt; Metadata mgmt; Data quality mgmt
• Operations, SLAs; Orchestration
7.
The BDL is also aligned with our principles
Principles
• Unleash Data and Insights as-a-service
• Make Insight-driven Value a Crucial Business KPI
• Empower your People with Insights at the Point of Action
• Develop an Enterprise Data Science Culture
• Master Governance, Security and Privacy of your Data Assets
• Enable your Data Landscape for the Flood coming from Connected People and Things
• Embark on the Journey to Insights within your Business and Technology Context
How the BDL aligns
• It concerns both Business and (disruptive) Technology
• It works with high volumes of all kinds of data
• It integrates Unified Data Management capabilities to manage governance, security, privacy, MDM, RDM, etc.
• It also comes with a new, specific mindset that has to be addressed at the Enterprise level
• We (Capgemini) intend to offer the BDL as-a-Service
• Bringing Business Value by delivering Insights at the Point of Action is the motto of the BDL
8.
Business Data Lake Reference Architecture - Conceptual
Characteristics
• Store anything; analyze everything
• Blend traditional data elements with new data types
• Manage centrally, govern locally
• Future-proof design
• Highly scalable and available
Diagram labels: Structured Data Sources; Un/Semi-Structured Data Sources; Transactional Systems (RES/CRM); Customer Master (CRM); Data Ingestion Layer (Extract & Load, Streams); Data Lake Layer (Landing); Distributed Storage Layer; Distributed Compute Layer / Services; Data Distillation Layer; Data Quality; Governance Framework (Business Rules, Transformation, Aggregation); Data Provisioning Layer; ODS; Sandbox; SQL-on-Hadoop; In-Memory Grid; Data Virtualization or Blending; Marts; Data Dissemination Layer (HR Mart 1, HR Mart 2); Data Access Layer; Self-service; Data Visualization and Reporting; Advanced Analytics; API Layer; Integration; Data Governance (Audit, Lineage); Metadata Management; Data Security (Authentication, Authorization, Kerberos)
9.
Business Data Lake Reference Architecture - Logical
Diagram labels: Structured Data Sources; Un/Semi-Structured Data Sources; Transactional Systems (RES/CRM); Customer Master (CRM); Data Ingestion Layer (Extract & Load, Streams); Data Lake Layer (Landing); Data Distillation Layer; Data Quality; Governance Framework (Business Rules, Transformation, Aggregation); Data Provisioning Layer; ODS; Sandbox; SQL-on-Hadoop; In-Memory Grid; Data Virtualization or Blending; Marts; Data Dissemination Layer (HR Mart 1, HR Mart 2); Data Access Layer; Self-service; Data Visualization and Reporting; Advanced Analytics; API Layer; Data Governance (Audit, Lineage); Metadata Management; Data Security (Authentication, Authorization, Kerberos)
Technologies shown: Talend 6.3 or latest; Hortonworks HDP 2.5 or latest; Kafka; Spark Streaming/Storm; Spark; HBase; Hive; HBase / Hive Datamarts; Redshift; Zeppelin; RESTful Service; Ranger, Knox; Atlas
10.
Detailed layer breakup
11.
Reference architecture for data ingestion - Indicative
Functionality: Ingest data from a variety of sources, and with varying latency, into the Data Lake.
Data Sources (Structured, Semi-Structured, Unstructured)
Data Integration Services
• Data Sourcing: Source Extraction Services (XML, relational, other extracts); S/FTP based push (logs, text, other file based); Changed Data Management (delta extracts, event mgmt)
• Data Transformation: Transformation Services
• Fast Data Manipulation: sorting, file merges, joins, file splitting, others
• Transform Routines: aggregation, mappings, lookups, calculations, others
Common Services
• Metadata Management; Automation Services; Deployment (job & others); Error Handling; Clustering & Capacity
Data State
• Data at Rest (ETL pushdown, batch using standard DI tools or Sqoop)
• Data in Motion (fast data, processed via tools like Flume, Storm, Spark, etc.)
Data Persistence
• Big Data Transformations: user-defined functions / custom MR code (Java, Python, etc.) for complex logic
• ETL Pushdown Processing (execute mapping jobs on Hadoop cluster on HDFS/Hive/Spark….)
Characteristics
• The Data Ingestion design principles are based on integrating raw data characterized by extreme scale and variability, and making provisions for both ‘data at rest’ (batch) and ‘data in motion’ (low latency).
• The framework combines traditional data integration methodologies, leveraging the Extract-Transform-Load approach, and extends it to also process semi-structured and unstructured data elements.
• The classical model of tracking data elements through their lifecycle and providing for lineage can be added in this framework.
12.
Data Acquisition and Reconciliation
Data Acquisition
Data Acquisition can be described as a combination of Landing Zone and Data Validation, Delta Detection, and Data Enrichment.
• Landing Zone: The area where data from all the source systems across the client’s landscape lands, for consumption by downstream systems.
• Data Validation: The first checkpoint, where MDM-based checks are applied to the incoming source data files.
• Delta Detection: Applies to data feeds from source systems that can provide incremental delta data for regular, ongoing processing into the data lake solution.
• Data Enrichment: Processes used to enhance, refine or otherwise improve raw data. Data from various enrichment sources is pushed to the data lake via the Landing Zone to enrich existing data.
Data Reconciliation (Optional)
Data Reconciliation is part of data quality and ensures data integrity in the data lake. The reconciliation process checks whether the data has been loaded properly, to ensure the accuracy and completeness of the data.
• Master Data: This is a fairly simple process, as Master Data is not subject to frequent changes. The granularity of the data remains the same in the source and the target.
• Transactional Data: Reconciliation of Transactional Data is instrumental to the success of big data systems. Reconciliation can happen on the entire data set or on the incremental data, depending on the method by which the data is ingested.
• Separate metadata tables/files are designed specifically for reconciliation. These tables/files are populated with reconciliation queries, and reconciliation reports are generated after data is loaded into the data lake.
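As an illustration of the reconciliation check described on this slide, the following is a minimal sketch, not the deck's actual implementation. It assumes rows are available as Python tuples on both sides and compares a row count plus an order-independent checksum; the function names `fingerprint` and `reconcile` are hypothetical.

```python
import hashlib


def fingerprint(rows):
    """Return (row_count, checksum) for an iterable of row tuples.

    The checksum is a sum of per-row hashes modulo 2**64, so it does not
    depend on row order (an assumption made for this sketch).
    """
    count = 0
    checksum = 0
    for row in rows:
        count += 1
        digest = hashlib.sha256(repr(row).encode("utf-8")).digest()
        checksum = (checksum + int.from_bytes(digest[:8], "big")) % 2**64
    return count, checksum


def reconcile(source_rows, target_rows):
    """Build a small reconciliation report comparing source and target."""
    src_count, src_sum = fingerprint(source_rows)
    tgt_count, tgt_sum = fingerprint(target_rows)
    return {
        "source_count": src_count,
        "target_count": tgt_count,
        "counts_match": src_count == tgt_count,
        "checksums_match": src_sum == tgt_sum,
    }


# Hypothetical sample data: the target holds the same rows in another order.
source = [(1, "alice", 10.0), (2, "bob", 20.5)]
target = [(2, "bob", 20.5), (1, "alice", 10.0)]
report = reconcile(source, target)
```

The same function can be run on only the incremental rows when data is ingested as deltas, which matches the full versus incremental distinction made for transactional data.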
13.
Data Distillation in the Data Lake: approach to provisioning for data consumption
Functionality: Ability to ingest data from the storage tier and convert it to structured data for easier analysis by downstream applications. This is done through a combination of extraction, transformation and aggregation of high-quality data from the Data Lake, making it available for analytical and reporting applications. Transformation also involves data quality checks and corrections, such as profiling, validating and cleansing structured and unstructured data based on business rules. Data is distilled (or prepared) on a per-function basis and made available for consumption. This is consistent with the design practice of ‘manage data centrally and provision locally’.
Characteristics
• Uniform approach for distillation of information from the data lake
• A centralized Data Quality engine for application of uniform data quality rules across the enterprise
• An integrated Data Quality function to cleanse, standardize, enrich and de-duplicate data
• Console for design, development and validation of rules
• Data Quality Services for integration with operational systems, MDM
• An Exception Management solution for resolving data issues and errors
• Data quality processes running on the data will be translated into MapReduce for faster processing
Diagram labels: Data Persistence Layer; Distillation Layer; Extract; Transform; Aggregation (Σ); Secure; Data Quality Store; Data Quality Console; Data Quality Engine; Data Profiling; Data Cleansing; Match & Merge; Data Enrichment; Rule Manager; DQ Meta-data; Data Dashboard; Exception Management; Data Quality Configurator; Exception Repository; DQ Mart
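The cleanse, standardize and de-duplicate steps, together with exception handling, could look roughly like the sketch below. It is only an illustration under stated assumptions: records are plain dictionaries with hypothetical `name` and `email` fields, the single rule (an email must contain "@") stands in for a real rule manager, and the `exceptions` list stands in for an exception repository.

```python
def standardize(record):
    """Trim whitespace and normalise case on the fields used for matching."""
    return {
        "name": " ".join(record.get("name", "").split()).title(),
        "email": record.get("email", "").strip().lower(),
    }


def distill(records):
    """Return (clean_records, exceptions) after applying the quality rules."""
    clean, exceptions, seen_emails = [], [], set()
    for raw in records:
        rec = standardize(raw)
        # Rule: a record without a plausible email goes to the exception list.
        if "@" not in rec["email"]:
            exceptions.append({"record": raw, "reason": "invalid email"})
            continue
        # Match and merge, simplified here to "keep the first occurrence".
        if rec["email"] in seen_emails:
            continue
        seen_emails.add(rec["email"])
        clean.append(rec)
    return clean, exceptions


# Hypothetical input with a duplicate and a bad record.
raw_records = [
    {"name": "  ada   lovelace ", "email": "Ada@Example.com "},
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "Unknown", "email": "not-an-email"},
]
clean, exceptions = distill(raw_records)
```

In a real deployment the rules would be configured centrally and the job would run on the cluster; this sketch only shows the shape of the logic.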
14.
Data Persistence Layer: Schema on Read & Distill on Demand
Functionality: Create a single repository for information and deliver a single, silo-less store to handle all types of data for all reporting, analysis and discovery requirements.
Characteristics
• Deliver a single, comprehensive view of all data, across functional areas, to conduct deep analysis
• Multi-tiered Data Lake that serves distinct functionalities, e.g. landing, staging and curated stores
• A landing area containing both traditional and non-traditional data, characterized by attributes of value, veracity, volume, velocity and variety
• Eliminate the need for upfront schema design and rigid pre-configured models
• Easy and cost-effective configuration for scale up and scale down
• Store everything, distill on demand
Diagram labels: Hadoop Distributed File System (HDFS); Namenode; Datanodes; Replication; Job / Task Tracker; Storage Cluster/Rack; Data Ingestion; Data Lake (Landing, Staging, Curated); Audit; Metadata; Search
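To make "schema on read" concrete, here is a minimal sketch in which raw JSON lines are stored untouched and a schema is applied only when the data is read. The sample records, the `read_with_schema` helper and the two schemas are all hypothetical; on a Hadoop cluster the same idea is usually expressed with tools such as Hive or Spark rather than plain Python.

```python
import json

# Raw landing data is stored as-is; no schema is enforced at write time.
RAW_LANDING = [
    '{"id": "1", "amount": "10.5", "channel": "web"}',
    '{"id": "2", "amount": "3", "extra": "ignored"}',
]


def read_with_schema(raw_lines, schema):
    """Yield one dict per line, keeping only the schema's fields.

    `schema` maps a field name to a callable that casts the raw value.
    Fields missing from a record are returned as None.
    """
    for line in raw_lines:
        doc = json.loads(line)
        yield {
            field: cast(doc[field]) if field in doc else None
            for field, cast in schema.items()
        }


# Two consumers read the same raw data with different schemas.
finance_rows = list(read_with_schema(RAW_LANDING, {"id": int, "amount": float}))
marketing_rows = list(read_with_schema(RAW_LANDING, {"id": int, "channel": str}))
```

Because the schema lives with the reader, a new consumer can add a view over the same landing data without reloading it.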
15.
Approach to Data Provisioning
Functionality: Provision datasets to create various combinations of custom views, by specific functions/departments and also for cross-functional access.
Characteristics
• The Data Marts & Aggregate Structures layer will include subject-specific data mart structures that can be used by various tools to retrieve data and information.
• This layer will also support user-specific sandboxes for power users to perform activities such as data mining, identifying data patterns, and running analytical and statistical models using various tools.
• If required, there will be multiple versions of the subject areas for different production streams.
• Data marts and aggregate structures such as summary tables will be created based on business and performance requirements.
• As far as possible, database-managed aggregates such as computed views and indexes will be created to reduce ETL-based data movement.
• Data Virtualization will address combining datasets from multiple data stores across various layers in the data lake stack.
Diagram labels: Data Access Layer; Data Provisioning; Discovery Platform / Sandboxes; Analytical Views; Data Virtualization; Data Dissemination; HR Mart 1; HR Mart 2; HR Mart 3; HR Mart 4
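The idea of a database-managed aggregate, as opposed to an ETL job that copies data into a summary table, can be sketched with a SQL view. The example below uses Python's built-in `sqlite3` module purely for illustration; the table `hr_events`, the view `hr_mart_hours` and the sample rows are hypothetical, and the deck itself points to Hive/HBase data marts rather than SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Granular data as it might sit in a curated store.
    CREATE TABLE hr_events (employee_id INTEGER, department TEXT, hours REAL);
    INSERT INTO hr_events VALUES
        (1, 'Sales', 8.0), (2, 'Sales', 6.5), (3, 'IT', 7.0);

    -- The mart is a view: the database computes the aggregate on read,
    -- so no ETL job has to move or duplicate the rows.
    CREATE VIEW hr_mart_hours AS
        SELECT department,
               COUNT(*)   AS employees,
               SUM(hours) AS total_hours
        FROM hr_events
        GROUP BY department;
    """
)

rows = conn.execute(
    "SELECT department, employees, total_hours "
    "FROM hr_mart_hours ORDER BY department"
).fetchall()
```

Rows added to `hr_events` later are reflected in the view automatically, which is the reduction in ETL-based data movement the slide refers to.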
16.
© David Feinleib