Data Scientists: Myths &
Mathemagical Powers
      James Kobielus
James Kobielus shoots down
10 myths about Data Scientists



      “Data Scientists: Myths and Mathemagical Powers,”
    James Kobielus, Thinking Inside the Box, June 29, 2012
Myth #1




Data scientists are mythical
 beings, like the unicorns.
IBMbigdatahub.com
Myth #2




 Data scientists are an elite
bunch of precious eggheads.
Data scientists get their fingernails
  dirty dumping piles of data into
 analytical sandboxes, cleansing,
  and sifting through it for useful
patterns that may or may not exist.
  Then, they do it all over again.



It’s often mind-numbingly detailed
 grunt work, not the sport of
 armchair data philosophers.

              Reality #2
Myth #3




Data scientists are a nouveau
   fad that will soon fade.
The term “data scientist” has been
around for years, and the various
   advanced analytics specialties
  that fall under it are even older.
Recently, the term has been used
 in the convergence of disciplines
    that have become super-hot.


Steady growth in job listings and
 academic curricula is undeniable.
        This is no fad.

             Reality #3
Myth #4




Data scientists are all just
  PhD statisticians who
 failed to make tenure.
Many data scientists acquired
 their quantitative and statistical
   modeling skills in college, but
   pursued degrees in business
  administration, economics and
engineering. They actually know
    about business problems.


Many data scientists you’ll encounter
   in the working world are
  business domain specialists!

            Reality #4
Myth #5




  Data scientists are just BI
specialists with fancier titles.
Many longtime BI power users
 are, in fact, data scientists of a
 sort. They are business domain
  specialists whose jobs involve
multivariate analysis, forecasting,
what-if modeling, and simulation.



Career development may stall out
 if they don’t stay up to speed
   on topics like Hadoop and
      predictive modeling.

             Reality #5
Myth #6




 Data scientists aren’t really
scientists in any meaningful
     sense of the word.
Statistical controls are the
  bedrock of true science—the core
responsibility of the data scientist. If
 data scientists are confirming their
 findings through statistical controls
and real-world experiments, they’re
     scientists, plain and simple.


True science is nothing
without observational data.

               Reality #6
Myth #7




 Data scientists need fancy,
 expensive statistical power
tools to get their work done.
The job of the data scientist is to
 look for hidden patterns. They can
accomplish this through user-friendly
  visualization tools, search-driven
 BI tools and other approaches that
   don’t require a deep mastery of
          statistical analysis.


The market for cost-effective
 exploratory BI tools has many
vendors, including IBM Cognos.

              Reality #7
Myth #8




Data scientists simply pour
data into Hadoop and pull
out mind-blowing insights.
The data scientist will be the
first to tell you that Hadoop is
just another platform for deep
      exploration into data.




There isn’t a magic Ouija board
through which the big data spirits
   speak to us mere mortals.

           Reality #8
Myth #9




 Data scientists are analytics
junkies who couldn’t care less
 about business applications.
If you spend time with any real-
  world data scientist, they’ll bend
    your ear discussing how they
tackled a specific business problem,
 such as reducing customer churn,
  targeting offers across channels,
    and mitigating financial risks.


Most data scientists aren’t nerds.
They know people regard all this
big data lingo as confusing jargon.

             Reality #9
Myth #10




Data scientists don’t have any
responsibilities that force them
   out of their ivory towers.
That used to be the case. However,
 as next best action and real-world
experiments become ubiquitous, the
  data scientist is evolving into the
  role that stokes, tweaks and fuels
        the operational engine.



Data scientists test the
analytic-centric models at the heart
   of agile business processes.

             Reality #10
For more from James Kobielus and
  other big data thought leaders,
     visit The Big Data Hub at
       IBMbigdatahub.com

Mais conteúdo relacionado

Mais procurados

Artificial Intelligence Machine Learning Deep Learning Ppt Powerpoint Present...
Artificial Intelligence Machine Learning Deep Learning Ppt Powerpoint Present...Artificial Intelligence Machine Learning Deep Learning Ppt Powerpoint Present...
Artificial Intelligence Machine Learning Deep Learning Ppt Powerpoint Present...SlideTeam
 
Artificial Intelligence Overview Powerpoint Presentation Slides
Artificial Intelligence Overview Powerpoint Presentation SlidesArtificial Intelligence Overview Powerpoint Presentation Slides
Artificial Intelligence Overview Powerpoint Presentation SlidesSlideTeam
 
ChatGPT - AI.pdf
ChatGPT - AI.pdfChatGPT - AI.pdf
ChatGPT - AI.pdfBannoon1
 
Bias in Artificial Intelligence
Bias in Artificial IntelligenceBias in Artificial Intelligence
Bias in Artificial IntelligenceNeelima Kumar
 
Artificial intelligence - A human revolution
Artificial intelligence - A human revolutionArtificial intelligence - A human revolution
Artificial intelligence - A human revolutionAccenture BeLux
 
Big data Presentation
Big data PresentationBig data Presentation
Big data PresentationAswadmehar
 
What is Artificial Intelligence | Artificial Intelligence Tutorial For Beginn...
What is Artificial Intelligence | Artificial Intelligence Tutorial For Beginn...What is Artificial Intelligence | Artificial Intelligence Tutorial For Beginn...
What is Artificial Intelligence | Artificial Intelligence Tutorial For Beginn...Edureka!
 
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1DianaGray10
 
Artificial Intelligence High Technology PowerPoint Presentation Slides Comple...
Artificial Intelligence High Technology PowerPoint Presentation Slides Comple...Artificial Intelligence High Technology PowerPoint Presentation Slides Comple...
Artificial Intelligence High Technology PowerPoint Presentation Slides Comple...SlideTeam
 
The future of big data analytics
The future of big data analyticsThe future of big data analytics
The future of big data analyticsAhmed Banafa
 
Artificial intelligence and its application
Artificial intelligence and its applicationArtificial intelligence and its application
Artificial intelligence and its applicationMohammed Abdel Razek
 
Deep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial IntelligenceDeep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial IntelligenceLukas Masuch
 
A Data Driven Roadmap to Enterprise AI Strategy (Sponsored by Contino) - AWS ...
A Data Driven Roadmap to Enterprise AI Strategy (Sponsored by Contino) - AWS ...A Data Driven Roadmap to Enterprise AI Strategy (Sponsored by Contino) - AWS ...
A Data Driven Roadmap to Enterprise AI Strategy (Sponsored by Contino) - AWS ...Amazon Web Services
 
How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...ssuser4edc93
 
Large Language Models Bootcamp
Large Language Models BootcampLarge Language Models Bootcamp
Large Language Models BootcampData Science Dojo
 
Deep learning health care
Deep learning health care  Deep learning health care
Deep learning health care Meenakshi Sood
 

Mais procurados (20)

Machine learning
Machine learningMachine learning
Machine learning
 
Artificial Intelligence Machine Learning Deep Learning Ppt Powerpoint Present...
Artificial Intelligence Machine Learning Deep Learning Ppt Powerpoint Present...Artificial Intelligence Machine Learning Deep Learning Ppt Powerpoint Present...
Artificial Intelligence Machine Learning Deep Learning Ppt Powerpoint Present...
 
Artificial Intelligence Overview Powerpoint Presentation Slides
Artificial Intelligence Overview Powerpoint Presentation SlidesArtificial Intelligence Overview Powerpoint Presentation Slides
Artificial Intelligence Overview Powerpoint Presentation Slides
 
ChatGPT - AI.pdf
ChatGPT - AI.pdfChatGPT - AI.pdf
ChatGPT - AI.pdf
 
Bias in Artificial Intelligence
Bias in Artificial IntelligenceBias in Artificial Intelligence
Bias in Artificial Intelligence
 
Artificial intelligence - A human revolution
Artificial intelligence - A human revolutionArtificial intelligence - A human revolution
Artificial intelligence - A human revolution
 
Generative AI
Generative AIGenerative AI
Generative AI
 
Big data
Big dataBig data
Big data
 
Big data Presentation
Big data PresentationBig data Presentation
Big data Presentation
 
What is Artificial Intelligence | Artificial Intelligence Tutorial For Beginn...
What is Artificial Intelligence | Artificial Intelligence Tutorial For Beginn...What is Artificial Intelligence | Artificial Intelligence Tutorial For Beginn...
What is Artificial Intelligence | Artificial Intelligence Tutorial For Beginn...
 
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
 
Artificial Intelligence High Technology PowerPoint Presentation Slides Comple...
Artificial Intelligence High Technology PowerPoint Presentation Slides Comple...Artificial Intelligence High Technology PowerPoint Presentation Slides Comple...
Artificial Intelligence High Technology PowerPoint Presentation Slides Comple...
 
The future of big data analytics
The future of big data analyticsThe future of big data analytics
The future of big data analytics
 
Artificial intelligence and its application
Artificial intelligence and its applicationArtificial intelligence and its application
Artificial intelligence and its application
 
Deep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial IntelligenceDeep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial Intelligence
 
A Data Driven Roadmap to Enterprise AI Strategy (Sponsored by Contino) - AWS ...
A Data Driven Roadmap to Enterprise AI Strategy (Sponsored by Contino) - AWS ...A Data Driven Roadmap to Enterprise AI Strategy (Sponsored by Contino) - AWS ...
A Data Driven Roadmap to Enterprise AI Strategy (Sponsored by Contino) - AWS ...
 
How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...
 
Large Language Models Bootcamp
Large Language Models BootcampLarge Language Models Bootcamp
Large Language Models Bootcamp
 
Deep learning health care
Deep learning health care  Deep learning health care
Deep learning health care
 
What is big data?
What is big data?What is big data?
What is big data?
 

Destaque

Artificial Intelligence Presentation
Artificial Intelligence PresentationArtificial Intelligence Presentation
Artificial Intelligence Presentationlpaviglianiti
 
Hands-on Deep Learning in Python
Hands-on Deep Learning in PythonHands-on Deep Learning in Python
Hands-on Deep Learning in PythonImry Kissos
 
A Statistician's View on Big Data and Data Science (Version 1)
A Statistician's View on Big Data and Data Science (Version 1)A Statistician's View on Big Data and Data Science (Version 1)
A Statistician's View on Big Data and Data Science (Version 1)Prof. Dr. Diego Kuonen
 
How to Interview a Data Scientist
How to Interview a Data ScientistHow to Interview a Data Scientist
How to Interview a Data ScientistDaniel Tunkelang
 
Data By The People, For The People
Data By The People, For The PeopleData By The People, For The People
Data By The People, For The PeopleDaniel Tunkelang
 
Hadoop and Machine Learning
Hadoop and Machine LearningHadoop and Machine Learning
Hadoop and Machine Learningjoshwills
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning SystemsXavier Amatriain
 
How to Become a Data Scientist
How to Become a Data ScientistHow to Become a Data Scientist
How to Become a Data Scientistryanorban
 
A tutorial on deep learning at icml 2013
A tutorial on deep learning at icml 2013A tutorial on deep learning at icml 2013
A tutorial on deep learning at icml 2013Philip Zheng
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDevashish Shanker
 
Introduction to Mahout and Machine Learning
Introduction to Mahout and Machine LearningIntroduction to Mahout and Machine Learning
Introduction to Mahout and Machine LearningVarad Meru
 
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...Sebastian Raschka
 
Machine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification RulesMachine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification RulesPier Luca Lanzi
 
Tutorial on Deep learning and Applications
Tutorial on Deep learning and ApplicationsTutorial on Deep learning and Applications
Tutorial on Deep learning and ApplicationsNhatHai Phan
 
Tips for data science competitions
Tips for data science competitionsTips for data science competitions
Tips for data science competitionsOwen Zhang
 
Deep neural networks
Deep neural networksDeep neural networks
Deep neural networksSi Haem
 
Introduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningIntroduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningLars Marius Garshol
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural networkDEEPASHRI HK
 
10 R Packages to Win Kaggle Competitions
10 R Packages to Win Kaggle Competitions10 R Packages to Win Kaggle Competitions
10 R Packages to Win Kaggle CompetitionsDataRobot
 

Destaque (20)

Artificial Intelligence Presentation
Artificial Intelligence PresentationArtificial Intelligence Presentation
Artificial Intelligence Presentation
 
Hands-on Deep Learning in Python
Hands-on Deep Learning in PythonHands-on Deep Learning in Python
Hands-on Deep Learning in Python
 
A Statistician's View on Big Data and Data Science (Version 1)
A Statistician's View on Big Data and Data Science (Version 1)A Statistician's View on Big Data and Data Science (Version 1)
A Statistician's View on Big Data and Data Science (Version 1)
 
How to Interview a Data Scientist
How to Interview a Data ScientistHow to Interview a Data Scientist
How to Interview a Data Scientist
 
Data By The People, For The People
Data By The People, For The PeopleData By The People, For The People
Data By The People, For The People
 
Hadoop and Machine Learning
Hadoop and Machine LearningHadoop and Machine Learning
Hadoop and Machine Learning
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems
 
How to Become a Data Scientist
How to Become a Data ScientistHow to Become a Data Scientist
How to Become a Data Scientist
 
A tutorial on deep learning at icml 2013
A tutorial on deep learning at icml 2013A tutorial on deep learning at icml 2013
A tutorial on deep learning at icml 2013
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language Processing
 
Introduction to Mahout and Machine Learning
Introduction to Mahout and Machine LearningIntroduction to Mahout and Machine Learning
Introduction to Mahout and Machine Learning
 
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
 
Machine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification RulesMachine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification Rules
 
Tutorial on Deep learning and Applications
Tutorial on Deep learning and ApplicationsTutorial on Deep learning and Applications
Tutorial on Deep learning and Applications
 
Tips for data science competitions
Tips for data science competitionsTips for data science competitions
Tips for data science competitions
 
Deep neural networks
Deep neural networksDeep neural networks
Deep neural networks
 
Introduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningIntroduction to Big Data/Machine Learning
Introduction to Big Data/Machine Learning
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
 
10 R Packages to Win Kaggle Competitions
10 R Packages to Win Kaggle Competitions10 R Packages to Win Kaggle Competitions
10 R Packages to Win Kaggle Competitions
 
Robots
RobotsRobots
Robots
 

Semelhante a Myths and Mathemagical Superpowers of Data Scientists

Myths and Mathemagical Superpowers of Data Scientists
Myths and Mathemagical Superpowers of Data ScientistsMyths and Mathemagical Superpowers of Data Scientists
Myths and Mathemagical Superpowers of Data ScientistsIBM Analytics
 
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...Inside Analysis
 
Big Data for Beginners
Big Data for BeginnersBig Data for Beginners
Big Data for BeginnersMichael Perez
 
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)mark madsen
 
Big Data World Singapore 2017 - Moving Towards Digitization & Artificial Inte...
Big Data World Singapore 2017 - Moving Towards Digitization & Artificial Inte...Big Data World Singapore 2017 - Moving Towards Digitization & Artificial Inte...
Big Data World Singapore 2017 - Moving Towards Digitization & Artificial Inte...Garrett Teoh Hor Keong
 
Understanding the New World of Cognitive Computing
Understanding the New World of Cognitive ComputingUnderstanding the New World of Cognitive Computing
Understanding the New World of Cognitive ComputingDATAVERSITY
 
Big Data; Big Potential: How to find the talent who can harness its power
Big Data; Big Potential: How to find the talent who can harness its powerBig Data; Big Potential: How to find the talent who can harness its power
Big Data; Big Potential: How to find the talent who can harness its powerLucas Group
 
Who is a data scientist
Who is a data scientist  Who is a data scientist
Who is a data scientist prateek kumar
 
20 Emerging influencers in 2020 for big data
20 Emerging influencers in 2020 for big data20 Emerging influencers in 2020 for big data
20 Emerging influencers in 2020 for big dataRiver11river
 
Embracing data science
Embracing data scienceEmbracing data science
Embracing data scienceVipul Kalamkar
 
Ch1IntroductiontoDataScience.pptx
Ch1IntroductiontoDataScience.pptxCh1IntroductiontoDataScience.pptx
Ch1IntroductiontoDataScience.pptxAbderrahmanABID2
 
Realism credai dec 2010 article
Realism credai dec 2010 articleRealism credai dec 2010 article
Realism credai dec 2010 articlerealism.IN
 
Top 10 areas of expertise in data science
Top 10 areas of expertise in data scienceTop 10 areas of expertise in data science
Top 10 areas of expertise in data scienceGlobalTechCouncil
 
Big Data & Analytics Trends 2016 Vin Malhotra
Big Data & Analytics Trends 2016 Vin MalhotraBig Data & Analytics Trends 2016 Vin Malhotra
Big Data & Analytics Trends 2016 Vin MalhotraVin Malhotra
 
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactData Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactDr. Sunil Kr. Pandey
 
Data Scientist - Good Rebels -
Data Scientist - Good Rebels -Data Scientist - Good Rebels -
Data Scientist - Good Rebels -Good Rebels
 

Semelhante a Myths and Mathemagical Superpowers of Data Scientists (20)

Myths and Mathemagical Superpowers of Data Scientists
Myths and Mathemagical Superpowers of Data ScientistsMyths and Mathemagical Superpowers of Data Scientists
Myths and Mathemagical Superpowers of Data Scientists
 
Data science
Data scienceData science
Data science
 
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...
 
Data scientist
Data scientistData scientist
Data scientist
 
Big Data for Beginners
Big Data for BeginnersBig Data for Beginners
Big Data for Beginners
 
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
 
Big Data World Singapore 2017 - Moving Towards Digitization & Artificial Inte...
Big Data World Singapore 2017 - Moving Towards Digitization & Artificial Inte...Big Data World Singapore 2017 - Moving Towards Digitization & Artificial Inte...
Big Data World Singapore 2017 - Moving Towards Digitization & Artificial Inte...
 
Understanding the New World of Cognitive Computing
Understanding the New World of Cognitive ComputingUnderstanding the New World of Cognitive Computing
Understanding the New World of Cognitive Computing
 
Big Data; Big Potential: How to find the talent who can harness its power
Big Data; Big Potential: How to find the talent who can harness its powerBig Data; Big Potential: How to find the talent who can harness its power
Big Data; Big Potential: How to find the talent who can harness its power
 
Who is a data scientist
Who is a data scientist  Who is a data scientist
Who is a data scientist
 
Data science
Data scienceData science
Data science
 
20 Emerging influencers in 2020 for big data
20 Emerging influencers in 2020 for big data20 Emerging influencers in 2020 for big data
20 Emerging influencers in 2020 for big data
 
Embracing data science
Embracing data scienceEmbracing data science
Embracing data science
 
Big Data Challenges
Big Data ChallengesBig Data Challenges
Big Data Challenges
 
Ch1IntroductiontoDataScience.pptx
Ch1IntroductiontoDataScience.pptxCh1IntroductiontoDataScience.pptx
Ch1IntroductiontoDataScience.pptx
 
Realism credai dec 2010 article
Realism credai dec 2010 articleRealism credai dec 2010 article
Realism credai dec 2010 article
 
Top 10 areas of expertise in data science
Top 10 areas of expertise in data scienceTop 10 areas of expertise in data science
Top 10 areas of expertise in data science
 
Big Data & Analytics Trends 2016 Vin Malhotra
Big Data & Analytics Trends 2016 Vin MalhotraBig Data & Analytics Trends 2016 Vin Malhotra
Big Data & Analytics Trends 2016 Vin Malhotra
 
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactData Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
 
Data Scientist - Good Rebels -
Data Scientist - Good Rebels -Data Scientist - Good Rebels -
Data Scientist - Good Rebels -
 

Mais de David Pittman

Cloud Infrastructure & IT Optimization Expo Highlights
Cloud Infrastructure & IT Optimization Expo HighlightsCloud Infrastructure & IT Optimization Expo Highlights
Cloud Infrastructure & IT Optimization Expo HighlightsDavid Pittman
 
Data, Analytics and the Insurance Industry
Data, Analytics and the Insurance IndustryData, Analytics and the Insurance Industry
Data, Analytics and the Insurance IndustryDavid Pittman
 
Big Data & Analytics and the Retail Industry: Luxottica
Big Data & Analytics and the Retail Industry: Luxottica Big Data & Analytics and the Retail Industry: Luxottica
Big Data & Analytics and the Retail Industry: Luxottica David Pittman
 
Seattle Children's Hospital turns Big Data into better care
Seattle Children's Hospital turns Big Data into better careSeattle Children's Hospital turns Big Data into better care
Seattle Children's Hospital turns Big Data into better careDavid Pittman
 
First Tennessee Bank: applying analytics to drive higher ROI from market prog...
First Tennessee Bank: applying analytics to drive higher ROI from market prog...First Tennessee Bank: applying analytics to drive higher ROI from market prog...
First Tennessee Bank: applying analytics to drive higher ROI from market prog...David Pittman
 
Acquire, grow and retain customers with IBM Big Data & Analytics - Client Exa...
Acquire, grow and retain customers with IBM Big Data & Analytics - Client Exa...Acquire, grow and retain customers with IBM Big Data & Analytics - Client Exa...
Acquire, grow and retain customers with IBM Big Data & Analytics - Client Exa...David Pittman
 
Infographic: Big Data Exploration
Infographic: Big Data ExplorationInfographic: Big Data Exploration
Infographic: Big Data ExplorationDavid Pittman
 
Big Data in Retail - Examples in Action
Big Data in Retail - Examples in ActionBig Data in Retail - Examples in Action
Big Data in Retail - Examples in ActionDavid Pittman
 
Analytics: The Real-world Use of Big Data
Analytics: The Real-world Use of Big DataAnalytics: The Real-world Use of Big Data
Analytics: The Real-world Use of Big DataDavid Pittman
 

Myths and Mathemagical Superpowers of Data Scientists

  • 1. Data Scientists: Myths & Mathemagical Powers James Kobielus
  • 2. James Kobielus shoots down 10 myths about Data Scientists “Data Scientists: Myths and Mathemagical Powers,” James Kobielus, Thinking Inside the Box, June 29, 2012
  • 3. Myth #1 Data scientists are mythical beings, like the unicorns.
  • 6. Myth #2 Data scientists are an elite bunch of precious eggheads.
  • 7. Data scientists get their fingernails dirty dumping piles of data into analytical sandboxes, cleansing, and sifting through it for useful patterns that may or may not exist. Then, they do it all over again. Reality #2 IBMbigdatahub.com
  • 8. It’s often mind-numbingly detailed grunt work, not the sport of armchair data philosophers. Reality #2 IBMbigdatahub.com
  • 9. Myth #3 Data scientists are a nouveau fad that will soon fade.
  • 10. The term “data scientist” has been around for years, and the various advanced analytics specialties that fall under it are even older. Recently, the term has been used in the convergence of disciplines that have become super-hot. Reality #3 IBMbigdatahub.com
  • 11. Steady growth in job listings and academic curricula is undeniable. This is no fad. Reality #3 IBMbigdatahub.com
  • 12. Myth #4 Data scientists are all just PhD statisticians who failed to make tenure.
  • 13. Many data scientists acquired their quantitative and statistical modeling skills in college, but pursued degrees in business administration, economics and engineering. They actually know about business problems. Reality #4 IBMbigdatahub.com
  • 14. Many data scientists you’ll encounter in the working world are business domain specialists! Reality #4 IBMbigdatahub.com
  • 15. Myth #5 Data scientists are just BI specialists with fancier titles.
  • 16. Many longtime BI power users are, in fact, data scientists of a sort. They are business domain specialists whose jobs involve multivariate analysis, forecasting, what-if modeling, and simulation. Reality #5 IBMbigdatahub.com
  • 17. Career development may stall out if they don’t stay up to speed on topics like Hadoop and predictive modeling. Reality #5 IBMbigdatahub.com
  • 18. Myth #6 Data scientists aren’t really scientists in any meaningful sense of the word.
  • 19. Statistical controls are the bedrock of true science—the core responsibility of the data scientist. If data scientists are confirming their findings through statistical controls and real-world experiments, they’re scientists, plain and simple. Reality #6 IBMbigdatahub.com
  • 20. True science is nothing without observational data. Reality #6 IBMbigdatahub.com
  • 21. Myth #7 Data scientists need fancy, expensive statistical power tools to get their work done.
  • 22. The job of the data scientist is to look for hidden patterns. They can accomplish this through user-friendly visualization tools, search-driven BI tools and other approaches that don’t require a deep mastery of statistical analysis. Reality #7 IBMbigdatahub.com
  • 23. The market for cost-effective exploratory BI tools has many vendors, including IBM Cognos. Reality #7 IBMbigdatahub.com
  • 24. Myth #8 Data scientists simply pour data into Hadoop and pull out mind-blowing insights.
  • 25. The data scientist will be the first to tell you that Hadoop is just another platform for deep exploration into data. Reality #8 IBMbigdatahub.com
  • 26. There isn’t a magic Ouija board through which the big data spirits speak to us mere mortals. Reality #8 IBMbigdatahub.com
  • 27. Myth #9 Data scientists are analytics junkies who couldn’t care less about business applications.
  • 28. If you spend time with any real-world data scientist, they’ll bend your ear discussing how they tackled a specific business problem, such as reducing customer churn, targeting offers across channels, and mitigating financial risks. Reality #9 IBMbigdatahub.com
  • 29. Most data scientists aren’t nerds. They know people regard all this big data lingo as confusing jargon. Reality #9 IBMbigdatahub.com
  • 30. Myth #10 Data scientists don’t have any responsibilities that force them out of their ivory towers.
  • 31. That used to be the case. However, as next best action and real-world experiments become ubiquitous, the data scientist is evolving into the role that stokes, tweaks and fuels the operational engine. Reality #10 IBMbigdatahub.com
  • 32. Data scientists test the analytic-centric models at the heart of agile business processes. Reality #10 IBMbigdatahub.com
  • 33. For more from James Kobielus and other big data thought leaders, visit The Big Data Hub at IBMbigdatahub.com