SlideShare uma empresa Scribd logo
1 de 200
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Agenda
❖ Evolution of Data
❖ Introduction To Data Science
❖ Data Science Careers and Salary
❖ Statistics for Data Science
❖ What is Machine Learning?
❖ Types of Machine Learning
❖ What is Deep Learning?
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Evolution of Data
2.5 x 1018 Bytes
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Evolution of Data
3 MILLION
4.3 MILLION
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Data science, also known as data-driven science, is an
interdisciplinary field about scientific methods, processes, and
systems to extract knowledge or insights from data in various
forms, either structured or unstructured.
What is Data Science?
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Career : Data Science
Data Analyst AI/ML Engineer Data Scientist
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Data Analyst
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Data Scientist
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Machine Learning Engineer
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Salary Trends
Average Salary (US)
Average Salary (IND)
Data Analyst Data Scientist ML Engineers
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
RoadMap
Earn a Bachelor’s Degree
• Computer Science
• Mathematics
• Information Technology
• Statistics
• Finance
• Economics
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
RoadMap
edureka!
Earn a Bachelor’s Degree
Fine Tune Technical Skills
• Statistical methods and Packages
• R/Python and/or SAS languages
• Data warehousing and BI
• Data Cleaning, Visualization and Reporting Techniques
• Working knowledge of Hadoop & MapReduce and Machine learning techniques
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
RoadMap
edureka!
Earn a Bachelor’s Degree
Fine Tune Technical Skills
Develop Business Skills
• Analytic Problem-Solving
• Effective Communication
• Creative Thinking
• Industry Knowledge
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
RoadMap
edureka!
Earn a Bachelor’s Degree
Fine Tune Technical Skills
Develop Business Skills
Master’s Degree or Certifications Programs
• MS/MTech in CS, Statistics, ML
• Big Data Certifications
• Data Analysis / Machine Learning
Certifications
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
RoadMap
edureka!
Earn a Bachelor’s Degree
Fine Tune Technical Skills
Develop Business Skills
Master’s Degree or Certifications Programs
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
RoadMap
edureka!
Earn a Bachelor’s Degree
Fine Tune Technical Skills
Develop Business Skills
Master’s Degree or Certifications Programs
Work on Projects
Related to Field
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Data Analyst Skills
Analytical skills Communication skills Critical thinking
Attention to detail Mathematics skills Technical skills/tools
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Data Scientist Skills
Analytics & Statistics Machine Learning Algorithms Problem Solving Skills
Deep Learning Business Communication Technical skills/tools
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
ML Engineer Skills
Programming Languages Calculus & Statistics Signal Processing
Applied Maths Neural Networks Language Processing
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Statistics
Data Science Peripherals
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Statistics Prog Languages
Data Science Peripherals
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Statistics Prog Languages Software
Data Science Peripherals
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Statistics Prog Languages Software Machine Learning
Data Science Peripherals
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Statistics Prog Languages Software Machine Learning Big Data
Data Science Peripherals
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
What is Data ?
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
What is Data ?
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Variables & Research
Independent variable
Dependentvariable
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Population & Sampling
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Population & Sampling
Non-Probability
Probability
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Population & Sampling
Non-Probability
Probability
Random sampling
Systematic sampling
Stratified sampling
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Measures of Center
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Measures of Spread
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Measures of Spread
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Skewness
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Confusion Matrix
You can calculate the accuracy of your model with:
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Probability
Probability is the measure of how likely something will occur
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Probability
The equation describing a continuous probability distribution is called
a probability density function.
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Probability
The normal distribution is a probability distribution that associates the
normal random variable X with a cumulative probability .
The normal distribution is defined by the following equation:
Y = [ 1/σ * sqrt(2π) ] * e -(x - μ)2/2σ2
Where,
X is a normal random variable.
μ is the mean and
σ is the standard deviation.
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Probability
The central limit theorem states that the sampling distribution of the
mean of any independent, random variable will be normal or nearly
normal, if the sample size is large enough.
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
What is Machine Learning?
Machine Learning is a class of algorithms which is data-driven, i.e. unlike "normal"
algorithms it is the data that "tells" what the "good answer" is
Getting computers to program themselves and also teaching them to make decisions using data
“Where writing software is the bottleneck, let the data do the work instead.”
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Features of Machine Learning
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
How It Works?
Learn from Data
Find Hidden Insights
Train and Grow
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Applications of Machine Learning
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Applications of Machine Learning
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Applications of Machine Learning
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Market Trend: Machine Learning
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Machine Learning Life Cycle
Step 1 Step 2 Step 3 Step 4 Step 5 Step 6
Collecting Data
Data Wrangling
Analyse Data
Train Algorithm
Test Algorithm
Deployment
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
4
5
6
1
2
3
Step 1: Collecting Data
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
4
5
6
1
2
3
Data Wrangling
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
4
5
6
1
2
3
Analyse Data
model
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
4
5
6
1
2
3
Train Algorithm
model
Training set
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
4
5
6
1
2
3
Test Algorithm
model
Test set Accurate?
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
4
5
6
1
2
3
Operation and Optimization
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Important Python Libraries
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Types of Machine Learning
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Supervised Learning
It’s a
Face
Label
Face
Label
Non-Face
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Unsupervised Learning
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Reinforcement Learning
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Supervised Learning
Supervised learning is where you have input variables (x) and an output variable (Y)
and you use an algorithm to learn the mapping function from the input to the output
It is called Supervised Learning because the process of an algorithm learning from the training
dataset can be thought as a teacher supervising the learning process
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Supervised Learning
Training and Testing
Prediction
Historical
Data
Random
Sampling
Training
Dataset
Testing
Dataset
Machine
Learning
Statistical
Models
Prediction
&
Testing
Model Validation
Outcome
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Supervised Learning
Training and Testing
Prediction
New
Data
Model Predicted Outcome
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Supervised Learning Algorithms
Linear Regression
Logistic Regression
Decision Tree
Random Forest
Naïve Bayes Classifier
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Linear Regression
Linear Regression Analysis is a powerful technique used for predicting the unknown value of a
variable (Dependent Variable) from the known value of another variables (Independent Variable)
• A Dependent Variable(DV) is the variable to be predicted or explained in
a regression model
• An Independent Variable(IDV) is the variable related to the dependent
variable in a regression equation
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Simple Linear Regression
Dependent Variable Independent Variable
Y = a + bX
Y - Intercept Slope of the Line
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Regression Line
Linear Regression Analysis is a powerful technique used for predicting the unknown value of a variable
(Dependent Variable) from The regression line is simply a single line that best fits the data
(In terms of having the smallest overall distance from the line to the points)
Fitted Points
Regression Line
Copyright © 2019, edureka and/or its affiliates. All rights reserved.
Demo
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Hi I am john, I need some
baseline for pricing my Villas
and Independent Houses
Real Estate Company Use Case
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Dataset Description
Column Description
CRIM per capita crime rate by town
ZN proportion of residential land zoned for lots over 25,000 sq.ft.
INDUS proportion of non-retail business acres per town.
CHAS Charles River dummy variable (1 if tract bounds river; 0 otherwise)
NOX nitric oxides concentration (parts per 10 million)
RM average number of rooms per dwelling
AGE proportion of owner-occupied units built prior to 1940
DIS weighted distances to five Boston employment centres
RAD index of accessibility to radial highways
TAX full-value property-tax rate per $10,000
PTRATIO pupil-teacher ratio by town
B 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
LSTAT % lower status of the population
MEDV Median value of owner-occupied homes in $1000's
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Steps
Step 1 Step 2 Step 3 Step 4 Step 5 Step 6
Collecting Data
Data Wrangling
Analyse Data
Train Algorithm
Test Algorithm
Deployment
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Model Fitting
Fitting a model means that you're making your algorithm learn the relationship between predictors
and outcome so that you can predict the future values of the outcome .
So the best fitted model has a specific set of parameters which best defines the problem at hand
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Types of Fitting
Machine Learning algorithms first attempt to solve the problem of under-fitting; that is, of taking a line
that does not approximate the data well, and making it to approximate the data better.
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Need For Logistic Regression
WHO WILL WIN ?
Here, the best fit line in linear regression
is going below 0 and above 1
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
What is Logistic Regression?
The outcome(result)
will be binary(0/1)
0- If malignant
1- If benign
Logistic Regression is a statistical method for analysing a dataset in which there are one or more
independent variables that determine an outcome.
Outcome is a binary class type.
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
What is Logistic Regression?
Based on the threshold value set, we
decide the output from the function
The Logistic Regression Curve is called as “Sigmoid Curve”, also known as S-Curve
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
What is Polynomial Regression?
When we have non linear data, which can’t be predicted with a linear model. We switch to
Polynomial Regression.
Such a scenario is shown in the below graph
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
What is a Decision Tree?
A decision tree is a tree-like structure in which internal node represents test on an attribute
• Each branch represents outcome of test and each leaf
node represents class label (decision taken after
computing all attributes)
• A path from root to leaf represents classification rules.
True False
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Building a Decision Tree
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Building a Decision Tree
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Building a Decision Tree
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Building a Decision Tree
Copyright © 2019, edureka and/or its affiliates. All rights reserved.
Demo
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
What is Random Forest?
Random Forest is an ensemble classifier made using many Decision tree models
What are Ensemble models? How is it better from Decision Trees ?
Copyright © 2019, edureka and/or its affiliates. All rights reserved.
Demo
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
What is Naïve Bayes?
Random Forest is an ensemble classifier made using many Decision tree models
What are Ensemble models? How is it better from Random Forest?
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
What is Naïve Bayes ?
Naive Bayes is a simple but surprisingly powerful algorithm
for predictive modeling.
Bayes
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Bayes’ Theorem
Given a hypothesis H and evidence E , Bayes' theorem states
that the relationship between the probability of the hypothesis
before getting the evidence P(H) and the probability of the
hypothesis after getting the evidence P(H|E) is
P(H|E) = P(E|H).P(H)
P(E)
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Bayes’ Theorem Example
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Bayes’ Theorem Example
P(King) = 4/52 =1/13
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Bayes’ Theorem Example
P(King) = 4/52 =1/13
P(King|Face) = P(Face|King).P(King)
P(Face)
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Bayes’ Theorem Example
P(King) = 4/52 =1/13
P(King|Face) = P(Face|King).P(King)
P(Face)
P(Face|King) = 1
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Bayes’ Theorem Example
P(King) = 4/52 =1/13
P(King|Face) = P(Face|King).P(King)
P(Face)
P(Face|King) = 1
P(Face) =12/52 = 3/13
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Bayes’ Theorem Example
P(A|B) = P(A∩B)
P(B)
P(B|A) = P(B∩A)
P(A) P(A∩B) = P(A|B).P(B) = P(B|A).P(A)
= P(A|B) = P(B|A).P(A)
P(B)
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Bayes’ Theorem Proof
P(H|E) = P(E|H).P(H)
P(E)
Likelihood
How probable is the evidence
Given that our hypothesis is true?
Posterior
How probable is our Hypothesis
Given the observed evidence?
(Not directly computable)
Prior
How probable was our hypothesis
Before observing the evidence?
Marginal
How probable is the new evidence
Under all possible hypothesis?
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Naïve Bayes: Working
PYTHON CERTIFICATION TRAINING www.edureka.co/python
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Classification Steps
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Classification Steps
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Classification Steps
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Classification Steps
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Classification Steps
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Industrial Use Cases
News Categorization Spam Filtering
Weather Predictions
Copyright © 2019, edureka and/or its affiliates. All rights reserved.
Demo
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Unsupervised Learning
Unsupervised learning is a type of machine learning algorithm used to draw inferences from datasets
consisting of input data Without labelled responses
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Unsupervised Learning: Process Flow
Training data is collection of
information without any label
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
What is Clustering?
“Clustering is the process of dividing the datasets into
groups, consisting of similar data-points”
It means grouping of objects based on the information
found in the data, describing the objects or their
relationship
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Why is Clustering Used?
The goal of clustering is to determine the intrinsic grouping
in a set of Unlabelled Data
Sometimes, Partitioning is the goal
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Where is it used?
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Types of Clustering
Exclusive Clustering
Overlapping Clustering
Hierarchical Clustering
K-Means Clustering
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Types of Clustering
Exclusive Clustering
Overlapping Clustering
Hierarchical Clustering
C-Means Clustering
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Types of Clustering
Exclusive Clustering
Overlapping Clustering
Hierarchical Clustering
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
K-Means Clustering
The process by which objects are classified into a
predefined number of groups so that they are as
much dissimilar as possible from one group to
another group, but as much similar as possible
within each group.
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
K-Means Algorithm Working
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
First we need to decide the number of clusters to be made. (Guessing)
K-Means Clustering : Steps
1
Let’s assume , Number of clusters = 3
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
First we need to decide the number of clusters to be made. (Guessing)
Then we provide centroids of all the clusters. (Guessing)
K-Means Clustering : Steps
1
2
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
First we need to decide the number of clusters to be made. (Guessing)
Then we provide centroids of all the clusters. (Guessing)
The Algorithm calculates Euclidian distance of the points from each
centroid and assigns the point to the closest cluster.
K-Means Clustering : Steps
1
2
3
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
First we need to decide the number of clusters to be made. (Guessing)
Then we provide centroids of all the clusters. (Guessing)
The Algorithm calculates Euclidian distance of the points from each
centroid and assigns the point to the closest cluster.
Next the Centroids are calculated again, when we have our new cluster.
K-Means Clustering : Steps
1
2
3
4
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
First we need to decide the number of clusters to be made. (Guessing)
Then we provide centroids of all the clusters. (Guessing)
The Algorithm calculates Euclidian distance of the points from each
centroid and assigns the point to the closest cluster.
Next the Centroids are calculated again, when we have our new cluster.
The distance of the points from the centre of clusters are calculated
again and points are assigned to the closest cluster.
K-Means Clustering : Steps
1
2
3
4
5
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
First we need to decide the number of clusters to be made. (Guessing)
Then we provide centroids of all the clusters. (Guessing)
The Algorithm calculates Euclidian distance of the points from each
centroid and assigns the point to the closest cluster.
Next the Centroids are calculated again, when we have our new cluster.
The distance of the points from the centre of clusters are calculated
again and points are assigned to the closest cluster.
And then again the new centroid for the cluster is calculated.
K-Means Clustering : Steps
1
2
3
4
5
6
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
First we need to decide the number of clusters to be made. (Guessing)
Then we provide centroids of all the clusters. (Guessing)
The Algorithm calculates Euclidian distance of the points from each
centroid and assigns the point to the closest cluster.
Next the Centroids are calculated again, when we have our new cluster.
The distance of the points from the centre of clusters are calculated
again and points are assigned to the closest cluster.
And then again the new centroid for the cluster is calculated.
These steps are repeated until we have a repetition in centroids or new
centroids are very close to the previous ones.
K-Means Clustering : Steps
1
2
3
4
5
6
7
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
How to Decide the number of Clusters
The Elbow Method :
First of all, compute the sum of squared error (SSE) for some values of k (for
example 2, 4, 6, 8, etc.). The SSE is defined as the sum of the squared distance
between each member of the cluster and its centroid. Mathematically:
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Pros and Cons: K-Means Clustering
• Simple, understandable
• Items automatically assigned to clusters
• Must define number of clusters
• All items forced into clusters
• Unable to handle noisy data and outliers
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Fuzzy C – Means Clustering
Fuzzy C-Means is an extension of K-Means, the popular simple clustering technique
Fuzzy clustering (also referred to as soft clustering) is a form of Clustering
in which each data point can belong to more than one cluster
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Pros and Cons: C-Means Clustering
• Allows a data point to be in multiple clusters
• A more natural representation of the behaviour of genes
• Genes usually are involved in multiple functions
• Need to define c, the number of clusters
• Need to determine membership cut-off value
• Clusters are sensitive to initial assignment of centroids
• Fuzzy c-means is not a deterministic algorithm
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Hierarchical Clustering
Hierarchical clustering is an alternative approach which builds a hierarchy from the bottom-up,
and doesn’t require us to specify the number of clusters beforehand
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Pros and Cons: Hierarchical Clustering
• No assumption of a particular number of clusters
• May corresponds to meaningful taxonomies
• Once a decision is made to combine two clusters, it can’t be
undone
• Too slow for large datasets
Copyright © 2019, edureka and/or its affiliates. All rights reserved.
Demo
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Market Basket Analysis
Market basket analysis explains the combinations of products that frequently
co-occur in transactions.
Market Basket Analysis algorithms
1. Association Rule Mining
2. Apriori
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Association Rule Mining
Association rule mining is a technique that shows how items are associated to each other.
Customer who purchase laptops are more likely to purchase laptop bags.
Customer who purchase bread have a 60% likelihood of also purchasing jam.
Example:
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Association Rule Mining
A B
Example of Association rule
➢ It means that if a person buys item A then he will also buy item B
➢ Three common ways to measure association:
Support
Confidence
Lift
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Association Rule Mining
Support gives fraction of
transactions which contains
the item A and B
𝑓𝑟𝑒𝑞 𝐴, 𝐵
𝑁
Support =
Confidence gives how often
the items A & B occur
together, given no. of times
A occurs
𝑓𝑟𝑒𝑞 𝐴, 𝐵
𝑓𝑟𝑒𝑞(𝐴)
Confidence =
Lift indicates the strength of
a rule over the random co-
occurrence of A and B
𝑆𝑢𝑝𝑝𝑜𝑟𝑡
𝑆𝑢𝑝𝑝 𝐴 𝑥 𝑆𝑢𝑝𝑝(𝐵)
Lift =
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Association Rule Mining Example
T1 A B C
T2 A C D
T3 B C D
T4 A D E
T5 B C E
A B C A C D B C D
A D E B C E
Transactions at a local store
Set of items {A, B, C, D, E}
Set of transactions {T1, T2, T3, T4, T5}
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Association Rule Mining Example
Consider the following
association rules:
1. A D
2. C A
3. A C
4. B & C A
=>
=>
=>
=>
Rule Support Confidence Lift
A D 2/5 2/3 10/9
C A 2/5 2/4 5/6
A C 2/5 2/3 5/6
B, C A 1/5 1/3 5/9
=>
=>
=>
=>
Calculate support, confidence and lift for these rules:
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Apriori Algorithm
Apriori algorithm uses frequent item sets to generate association rules. It is based on the concept
that a subset of a frequent itemset must also be a frequent itemset.
But what is a
frequent item set?
Frequent Itemset: an itemset
whose support value is greater
than a threshold value.
Example: If {A,B} is a frequent
item set, then {A} and {B} should
be frequent item sets
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Apriori Algorithm
Consider the following transactions:
TID Items
T1 1 3 4
T2 2 3 5
T3 1 2 3 5
T4 2 5
T5 1 3 5
Min. support count = 2
Note:
Now the first step is to
build a list of item sets of
size one by using this
transactional dataset.
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Apriori Algorithm – First Iteration
TID Items
T1 1 3 4
T2 2 3 5
T3 1 2 3 5
T4 2 5
T5 1 3 5
Step 1: Create item sets of size one & calculate their support values
Itemset Support
{1} 3
{2} 3
{3} 4
{4} 1
{5} 4
Itemset Support
{1} 3
{2} 3
{3} 4
{5} 4
Table: C1|
Table: F1|
Item sets with support value less than min. support value (i.e. 2) are eliminated
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Apriori Algorithm – Second Iteration
TID Items
T1 1 3 4
T2 2 3 5
T3 1 2 3 5
T4 2 5
T5 1 3 5
Step 2: Create item sets of size two & calculate their support values.
All the combinations of item sets in F1| are used in this iteration
Itemset Support
{1,2} 1
{1,3} 3
{1,5} 2
{2,3} 2
{2,5} 3
{3,5} 3
Itemset Support
{1,3} 3
{1,5} 2
{2,3} 2
{2,5} 3
{3,5} 3
Table: C2|
Table: F2|
Item sets with support less than 2 it are eliminated
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Apriori Algorithm – Third Iteration
TID Items
T1 1 3 4
T2 2 3 5
T3 1 2 3 5
T4 2 5
T5 1 3 5
Step 3: Create item sets of size three & calculate their support values.
All the combinations of item sets in F2| are used in this iteration
Itemset Support
{1,2,3}
{1,2,5}
{1,3,5}
{2,3,5}
Table: C3|
Before calculating support
values, let’s perform
pruning on the dataset!
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Apriori Algorithm – Pruning
TID Items
T1 1 3 4
T2 2 3 5
T3 1 2 3 5
T4 2 5
T5 1 3 5
After the combinations are made, divide C3| item sets to check if there
any other subsets whose support is less than min support value.
Itemset Support
{1,3} 3
{1,5} 2
{2,3} 2
{2,5} 3
{3,5} 3
Table: C3|
Table: F2|
If any of the subsets of these item sets are not there in FI2 then we remove that itemset
Itemset In F2|?
{1,2,3}
{1,2},{1,3},{2,3}
NO
{1,2,5}
{1,2},{1,5},{2,5}
NO
{1,3,5}
{1,5},{1,3},{3,5}
YES
{2,3,5}
{2,3},{2,5},{3,5}
YES
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Apriori Algorithm – Fourth Iteration
TID Items
T1 1 3 4
T2 2 3 5
T3 1 2 3 5
T4 2 5
T5 1 3 5
Using the item sets of C3|, create new itemset C4|.
Itemset Support
{1,2,3,5} 1
Table: F3|
Table: C4|
Since support of C4| is less than 2, stop and return to the previous itemset, i.e. CI3
Itemset Support
{1,3,5} 2
{2,3,5} 2
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Apriori Algorithm – Subset Creation
Frequent Item set F3|
Itemset Support
{1,3,5} 2
{2,3,5} 2
• Generate all non empty subsets for each frequent item set
❖ For I = {1,3,5}, subsets are {1,3}, {1,5}, {3,5}, {1}, {3}, {5}
❖ For I = {2,3,5}, subsets are {2,3}, {2,5}, {3,5}, {2}, {3}, {5}
• For every subsets S of I, output the rule:
S → (I-S) (S recommends I-S)
if support(I)/support(S) >= min_conf value
Let’s assume our
minimum confidence
value is 60%
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Apriori Algorithm – Applying Rules
Applying rules to item sets of F3|:
1. {1,3,5}
✓ Rule 1: {1,3} → ({1,3,5} - {1,3}) means 1 & 3 → 5
Confidence = support(1,3,5)/support(1,3) = 2/3 = 66.66% > 60%
Rule 1 is selected
✓ Rule 2: {1,5} → ({1,3,5} - {1,5}) means 1 & 5 → 3
Confidence = support(1,3,5)/support(1,5) = 2/2 = 100% > 60%
Rule 2 is selected
✓ Rule 3: {3,5} → ({1,3,5} - {3,5}) means 3 & 5 → 1
Confidence = support(1,3,5)/support(3,5) = 2/3 = 66.66% > 60%
Rule 3 is selected
TID Items
T1 1 3 4
T2 2 3 5
T3 1 2 3 5
T4 2 5
T5 1 3 5
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Apriori Algorithm – Applying Rules
Applying rules to item sets of F3|:
1. {1,3,5}
✓ Rule 4: {1} → ({1,3,5} - {1}) means 1 → 3 & 5
Confidence = support(1,3,5)/support(1) = 2/3 = 66.66% > 60%
Rule 4 is selected
✓ Rule 5: {3} → ({1,3,5} - {3}) means 3 → 1 & 5
Confidence = support(1,3,5)/support(3) = 2/4 = 50% <60%
Rule 5 is rejected
✓ Rule 6: {5} → ({1,3,5} - {5}) means 5 → 1 & 3
Confidence = support(1,3,5)/support(3) = 2/4 = 50% < 60%
Rule 6 is rejected
TID Items
T1 1 3 4
T2 2 3 5
T3 1 2 3 5
T4 2 5
T5 1 3 5
Copyright © 2019, edureka and/or its affiliates. All rights reserved.
Demo
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
What is Reinforcement Learning?
Reinforcement learning is a type of Machine Learning where an agent learns to
behave in an environment by performing actions and seeing the results
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Analogy
Scenario 1: Baby starts crawling and makes it to the candy
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Analogy
Scenario 2: Baby starts crawling but falls due to some hurdle in between
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Analogy
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Reinforcement Learning Definitions
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Reinforcement Learning Definitions
Agent: The RL algorithm that learns from trial and error
Environment: The world through which the agent moves
Action (A): All the possible steps that the agent can take
State (S): Current condition returned by the environment
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Reinforcement Learning Definitions
Reward (R): An instant return from the environment to appraise the last action
Policy (π): The approach that the agent uses to determine the next action based on
the current state
Value (V): The expected long-term return with discount, as opposed to the short-term
reward R
Action-value (Q): This similar to Value, except, it takes an extra parameter, the current
action (A)
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Reward Maximization
Reward maximization theory states that, a RL agent must be trained in such a way that, he
takes the best action so that the reward is maximum.
Agent
Opponent
Reward
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Exploration & Exploitation
Agent
Opponent
Reward
Exploitation is about using the already known exploited information to heighten the rewards
Exploration is about exploring and capturing more information about an environment
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
The K-Armed Bandit Problem
• The K-armed bandit is a metaphor representing a casino slot machine with k pull levers (or arms)
• The user or customer pulls any one of the levers to win a predefined reward
• The objective is to select the lever that will provide the user with the highest reward
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
The Epsilon Greedy Algorithm
Takes whatever action seems best at the present moment
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
The Epsilon Greedy Algorithm
With probability 1 – epsilon, the Epsilon-Greedy
algorithm exploits the best known option
With probability epsilon / 2, the Epsilon-Greedy algorithm explores
the best known option
With probability epsilon / 2, the Epsilon-Greedy algorithm explores
the worst known option
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Markov Decision Process
The mathematical approach for mapping a solution in reinforcement learning is called
Markov Decision Process (MDP)
The following parameters are used to attain a solution:
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Markov Decision Process- Shortest Path Problem
A
B
C
D
30
-20
-10
10
50
15
Goal: Find the shortest path between A and D with minimum possible cost
In this problem,
• Set of states are denoted by nodes i.e. {A, B, C, D}
• Action is to traverse from one node to another {A -> B, C -> D}
• Reward is the cost represented by each edge
• Policy is the path taken to reach the destination {A -> C -> D}
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Understanding Q-Learning With An Example
• 5 rooms in a building connected by
doors
• each room is numbered 0 through 4
• The outside of the building can be
thought of as one big room (5)
• Doors 1 and 4 lead into the building
from room 5 (outside)
Place an agent in any one of the rooms (0,1,2,3,4) and the goal is to reach outside the building (room 5)
0
4
3
1
2
5
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Understanding Q-Learning With An Example
0
4
3
1
2
5
Let’s represent the rooms on a graph, each room as a node, and each door as a link
1
2 3
40
5 Goal
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Understanding Q-Learning With An Example
Next step is to associate a reward value to each door:
• doors that lead directly to the goal have a
reward of 100
• Doors not directly connected to the target
room have zero reward
• Because doors are two-way, two arrows are
assigned to each room
• Each arrow contains an instant reward value
1
2 3
40
5 Goal
00
0
0
0
0
0 0
0
0
100
100
100
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Understanding Q-Learning With An Example
The terminology in Q-Learning includes the terms state and action:
• Room (including room 5) represents a state
• agent's movement from one room to another represents an action
• In the figure, a state is depicted as a node, while "action" is represented by the arrows
Example (Agent traverse from room 2 to room5):
1. Initial state = state 2
2. State 2 -> state 3
3. State 3 -> state (2, 1, 4)
4. State 4 -> state 5
1
2 3
40
5 Goal
00
0
0
0
0
0 0
0
0
100
100
100
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Understanding Q-Learning With An Example
We can put the state diagram and the instant reward values into a reward table, matrix R.
The -1's in the table represent null values
1
2 3
40
5 Goal
00
0
0
0
0
0 0
0
0
100
100
100
0 1 2 3 4 5
-1 -1 -1 -1 0 -1
-1 -1 -1 0 -1 100
-1 -1 -1 0 -1 -1
-1 0 0 -1 0 -1
0 -1 -1 0 -1 100
-1 0 -1 -1 0 100
0
1
2
3
4
5
State
Action
R =
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Understanding Q-Learning With An Example
Add another matrix Q, representing the memory of what the agent has learned through experience.
• The rows of matrix Q represent the current state of the agent
• columns represent the possible actions leading to the next state
• Formula to calculate the Q matrix:
Q(state, action) = R(state, action) + Gamma * Max [Q(next state, all actions)]
The Gamma parameter has a range of 0 to 1 (0 <= Gamma > 1).
• If Gamma is closer to zero, the agent will tend to consider only
immediate rewards.
• If Gamma is closer to one, the agent will consider future rewards
with greater weight
Note
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Q – Learning Algorithm
1
2
3
4
5
6
7
8
9
Set the gamma parameter, and environment rewards in matrix R
Initialize matrix Q to zero
Select a random initial state
Set initial state = current state
Select one among all possible actions for the current state
Using this possible action, consider going to the next state
Get maximum Q value for this next state based on all possible actions
Compute: Q(state, action) = R(state, action) + Gamma * Max[Q(next state, all actions)]
Repeat above steps until current state = goal state
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Q – Learning Example
First step is to set the value of the learning parameter Gamma = 0.8, and the initial state as Room 1.
Next, initialize matrix Q as a zero matrix:
• From room 1 you can either go to room 3 or 5, let’s select room 5.
• From room 5, calculate maximum Q value for this next state based on all possible actions:
Q(state, action) = R(state, action) + Gamma * Max[Q(next state, all actions)]
Q(1,5) = R(1,5) + 0.8 * Max[Q(5,1), Q(5,4), Q(5,5)] = 100 + 0.8 * 0 = 100
1
3
5
1
4
5
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Q – Learning Example
First step is to set the value of the learning parameter Gamma = 0.8, and the initial state as Room 1.
Next, initialize matrix Q as a zero matrix:
• From room 1 you can either go to room 3 or 5, let’s select room 5.
• From room 5, calculate maximum Q value for this next state based on all possible actions
Q(state, action) = R(state, action) + Gamma * Max[Q(next state, all actions)]
Q(1,5) = R(1,5) + 0.8 * Max[Q(5,1), Q(5,4), Q(5,5)] = 100 + 0.8 * 0 = 100
1
3
5
1
4
5
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Q – Learning Example
For the next episode, we start with a randomly chosen initial state, i.e. state 3
• From room 3 you can either go to room 1,2 or 4, let’s select room 1.
• From room 1, calculate maximum Q value for this next state based on all possible actions:
Q(state, action) = R(state, action) + Gamma * Max[Q(next state, all actions)]
Q(3,1) = R(3,1) + 0.8 * Max[Q(1,3), Q(1,5)]= 0 + 0.8 * [0, 100] = 80
The matrix Q get’s updated
3
1
2
4
3
5
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Q – Learning Example
For the next episode, the next state, 1, now becomes the current state. We repeat the inner loop of the Q
learning algorithm because state 1 is not the goal state.
• From room 1 you can either go to room 3 or 5, let’s select room 5.
• From room 5, calculate maximum Q value for this next state based on all possible actions:
Q(state, action) = R(state, action) + Gamma * Max[Q(next state, all actions)]
Q(1,5) = R(1,5) + 0.8 * Max[Q(5,1), Q(5,4), Q(5,5)] = 100 + 0.8 * 0 = 100
The matrix Q remains the same since, Q(1,5) is already fed to the agent
1
3
5
1
4
5
Copyright © 2019, edureka and/or its affiliates. All rights reserved.
Demo
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
AI and ML and DL
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Limitations of Machine Learning
Traditional Machine Learning algorithms have failed to solve crucial problems of AI, such as
Natural language processing, image recognition and so on.
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Limitations of Machine Learning
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Deep Learning to Rescue
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Deep Learning to Rescue
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
What is Deep Learning ?
Deep Learning is a subset of Machine Learning where similar Machine Learning Algorithms are
used to train Deep Neural Networks so as to achieve better accuracy in those cases where the
former was not performing up to the mark.
Basically, Deep learning mimics the way our brain functions i.e. it learns from experience.
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Applications of Deep Learning
• Automatic Machine Translation
• Object Classification in Photographs
• Automatic Handwriting Generation
• Character Text Generation
• Image Caption Generation
• Colorization of Black and White Images
• Automatic Game Playing
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Applications of Deep Learning
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
How Neuron Works?
Deep learning is a form of machine learning that uses a model of computing that's very much
inspired by the structure of the brain, so lets understand that first.
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
A Perceptron
Each neuron has a set of inputs, each of which is given a specific weight. The neuron
computes some function on these weighted inputs and gives the output.
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Role of Weights and Bias
• For a perceptron, there can be one more input called bias
• While the weights determine the slope of the classifier line, bias allows us to shift the line towards left or right
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Activation Functions
1.Linear or Identity
2.Unit or Binary Step
3.Sigmoid or Logistic
4.Tanh
5.ReLU
6.SoftMax
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Activation Functions
LINEAR UNIT STEP
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Activation Functions
Sigmoid Tan h
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Activation Functions
ReLu SoftMax
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Perceptron Example
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Perceptron Example
Copyright © 2019, edureka and/or its affiliates. All rights reserved.
Demo
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
What are Tensors?
Tensors are the standard way of representing data
in deep learning
Tensors are just multidimensional arrays, an extension of 2-dimensional
tables (matrices) to data with higher dimension.
Tensor 0f Dimension -6 Tensor 0f Dimension –[6,4] Tensor 0f Dimension –[6,4,2]
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
What is TensorFlow?
In Tensorflow, computation is approached as a
dataflow graph
Tensor Flow
ADD
MATMUL
Data
RESULT
Copyright © 2019, edureka and/or its affiliates. All rights reserved.
Demo
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Perceptron Problems
• Single-Layer Perceptrons cannot classify non-linearly separable data points.
• Complex problems, that involve a lot of parameters cannot be solved by Single-Layer Perceptrons.
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Deep Neural Network
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Deep Neural Network
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Training Network Weights
• We can estimate the weight values for our training data using ‘stochastic gradient descent’ optimizer.
• Stochastic gradient descent requires two parameters:
• Learning Rate: Used to limit the amount each weight is corrected each time it is updated.
• Epochs: The number of times to run through the training data while updating the weight.
• These, along with the training data will be the arguments to the function.
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
MNIST Dataset
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Deep Neural Network
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Companies Hiring
DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
Data Science Master Program
YouTube Video Link in the Description
Data Science Full Course | Edureka

Mais conteúdo relacionado

Mais procurados

Introduction to Data Science.pptx
Introduction to Data Science.pptxIntroduction to Data Science.pptx
Introduction to Data Science.pptxVrishit Saraswat
 
Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...Simplilearn
 
Introduction of Data Science
Introduction of Data ScienceIntroduction of Data Science
Introduction of Data ScienceJason Geng
 
Introduction to data science.pptx
Introduction to data science.pptxIntroduction to data science.pptx
Introduction to data science.pptxSadhanaParameswaran
 
Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data scienceSampath Kumar
 
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...Edureka!
 
Data Science Training | Data Science Tutorial | Data Science Certification | ...
Data Science Training | Data Science Tutorial | Data Science Certification | ...Data Science Training | Data Science Tutorial | Data Science Certification | ...
Data Science Training | Data Science Tutorial | Data Science Certification | ...Edureka!
 
Python for Data Science | Python Data Science Tutorial | Data Science Certifi...
Python for Data Science | Python Data Science Tutorial | Data Science Certifi...Python for Data Science | Python Data Science Tutorial | Data Science Certifi...
Python for Data Science | Python Data Science Tutorial | Data Science Certifi...Edureka!
 
Python Machine Learning Tutorial | Machine Learning Algorithms | Python Train...
Python Machine Learning Tutorial | Machine Learning Algorithms | Python Train...Python Machine Learning Tutorial | Machine Learning Algorithms | Python Train...
Python Machine Learning Tutorial | Machine Learning Algorithms | Python Train...Edureka!
 
Python Programming Tutorial | Edureka
Python Programming Tutorial | EdurekaPython Programming Tutorial | Edureka
Python Programming Tutorial | EdurekaEdureka!
 
Introduction To Data Science
Introduction To Data ScienceIntroduction To Data Science
Introduction To Data ScienceSpotle.ai
 
introduction to data science
introduction to data scienceintroduction to data science
introduction to data sciencebhavesh lande
 
Introduction to data science club
Introduction to data science clubIntroduction to data science club
Introduction to data science clubData Science Club
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceEdureka!
 
PPT on Data Science Using Python
PPT on Data Science Using PythonPPT on Data Science Using Python
PPT on Data Science Using PythonNishantKumar1179
 

Mais procurados (20)

Introduction to Data Science.pptx
Introduction to Data Science.pptxIntroduction to Data Science.pptx
Introduction to Data Science.pptx
 
Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...
 
Introduction of Data Science
Introduction of Data ScienceIntroduction of Data Science
Introduction of Data Science
 
Introduction to data science.pptx
Introduction to data science.pptxIntroduction to data science.pptx
Introduction to data science.pptx
 
Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data science
 
Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data science
 
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...
 
Data Science Training | Data Science Tutorial | Data Science Certification | ...
Data Science Training | Data Science Tutorial | Data Science Certification | ...Data Science Training | Data Science Tutorial | Data Science Certification | ...
Data Science Training | Data Science Tutorial | Data Science Certification | ...
 
Python for Data Science | Python Data Science Tutorial | Data Science Certifi...
Python for Data Science | Python Data Science Tutorial | Data Science Certifi...Python for Data Science | Python Data Science Tutorial | Data Science Certifi...
Python for Data Science | Python Data Science Tutorial | Data Science Certifi...
 
Data science
Data science Data science
Data science
 
Introduction to Data Analytics
Introduction to Data AnalyticsIntroduction to Data Analytics
Introduction to Data Analytics
 
Python Machine Learning Tutorial | Machine Learning Algorithms | Python Train...
Python Machine Learning Tutorial | Machine Learning Algorithms | Python Train...Python Machine Learning Tutorial | Machine Learning Algorithms | Python Train...
Python Machine Learning Tutorial | Machine Learning Algorithms | Python Train...
 
Data science
Data scienceData science
Data science
 
Python Programming Tutorial | Edureka
Python Programming Tutorial | EdurekaPython Programming Tutorial | Edureka
Python Programming Tutorial | Edureka
 
Introduction To Data Science
Introduction To Data ScienceIntroduction To Data Science
Introduction To Data Science
 
introduction to data science
introduction to data scienceintroduction to data science
introduction to data science
 
Introduction to data science club
Introduction to data science clubIntroduction to data science club
Introduction to data science club
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
PPT on Data Science Using Python
PPT on Data Science Using PythonPPT on Data Science Using Python
PPT on Data Science Using Python
 
Data science Big Data
Data science Big DataData science Big Data
Data science Big Data
 

Semelhante a Data Science Full Course | Edureka

Predictive Analytics Using R | Edureka
Predictive Analytics Using R | EdurekaPredictive Analytics Using R | Edureka
Predictive Analytics Using R | EdurekaEdureka!
 
Data Analytics For Beginners | Introduction To Data Analytics | Data Analytic...
Data Analytics For Beginners | Introduction To Data Analytics | Data Analytic...Data Analytics For Beginners | Introduction To Data Analytics | Data Analytic...
Data Analytics For Beginners | Introduction To Data Analytics | Data Analytic...Edureka!
 
Best Selenium certification course
Best Selenium certification courseBest Selenium certification course
Best Selenium certification courseKumarNaik21
 
Data science Nagarajan and madhav.pptx
Data science Nagarajan and madhav.pptxData science Nagarajan and madhav.pptx
Data science Nagarajan and madhav.pptxNagarajanG35
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...MLconf
 
Il ruolo chiave degli Advanced Analytics per la Supply Chain
Il ruolo chiave degli Advanced Analytics per la Supply ChainIl ruolo chiave degli Advanced Analytics per la Supply Chain
Il ruolo chiave degli Advanced Analytics per la Supply ChainACTOR
 
ACTOR - "Il ruolo chiave degli Advanced Analytics per la Supply Chain. Intel...
ACTOR -  "Il ruolo chiave degli Advanced Analytics per la Supply Chain. Intel...ACTOR -  "Il ruolo chiave degli Advanced Analytics per la Supply Chain. Intel...
ACTOR - "Il ruolo chiave degli Advanced Analytics per la Supply Chain. Intel...logisticaefficiente
 
Data science training in hyd ppt converted (1)
Data science training in hyd ppt converted (1)Data science training in hyd ppt converted (1)
Data science training in hyd ppt converted (1)SayyedYusufali
 
Data science training in hyd pdf converted (1)
Data science training in hyd pdf converted (1)Data science training in hyd pdf converted (1)
Data science training in hyd pdf converted (1)SayyedYusufali
 
Data science training in hydpdf converted (1)
Data science training in hydpdf  converted (1)Data science training in hydpdf  converted (1)
Data science training in hydpdf converted (1)SayyedYusufali
 
Data Scientist Roles and Responsibilities | Data Scientist Career | Data Scie...
Data Scientist Roles and Responsibilities | Data Scientist Career | Data Scie...Data Scientist Roles and Responsibilities | Data Scientist Career | Data Scie...
Data Scientist Roles and Responsibilities | Data Scientist Career | Data Scie...Edureka!
 
Which institute is best for data science?
Which institute is best for data science?Which institute is best for data science?
Which institute is best for data science?DIGITALSAI1
 
Best Selenium certification course
Best Selenium certification courseBest Selenium certification course
Best Selenium certification courseKumarNaik21
 
Data science training in hyd ppt (1)
Data science training in hyd ppt (1)Data science training in hyd ppt (1)
Data science training in hyd ppt (1)SayyedYusufali
 
Data science training institute in hyderabad
Data science training institute in hyderabadData science training institute in hyderabad
Data science training institute in hyderabadVamsiNihal
 
Data science training in Hyderabad
Data science  training in HyderabadData science  training in Hyderabad
Data science training in Hyderabadsaitejavella
 
Data science training Hyderabad
Data science training HyderabadData science training Hyderabad
Data science training HyderabadNithinsunil1
 
Data science online training in hyderabad
Data science online training in hyderabadData science online training in hyderabad
Data science online training in hyderabadVamsiNihal
 

Semelhante a Data Science Full Course | Edureka (20)

Predictive Analytics Using R | Edureka
Predictive Analytics Using R | EdurekaPredictive Analytics Using R | Edureka
Predictive Analytics Using R | Edureka
 
How to crack down big data?
How to crack down big data? How to crack down big data?
How to crack down big data?
 
Data Science and Analysis.pptx
Data Science and Analysis.pptxData Science and Analysis.pptx
Data Science and Analysis.pptx
 
Data Analytics For Beginners | Introduction To Data Analytics | Data Analytic...
Data Analytics For Beginners | Introduction To Data Analytics | Data Analytic...Data Analytics For Beginners | Introduction To Data Analytics | Data Analytic...
Data Analytics For Beginners | Introduction To Data Analytics | Data Analytic...
 
Best Selenium certification course
Best Selenium certification courseBest Selenium certification course
Best Selenium certification course
 
Data science Nagarajan and madhav.pptx
Data science Nagarajan and madhav.pptxData science Nagarajan and madhav.pptx
Data science Nagarajan and madhav.pptx
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
 
Il ruolo chiave degli Advanced Analytics per la Supply Chain
Il ruolo chiave degli Advanced Analytics per la Supply ChainIl ruolo chiave degli Advanced Analytics per la Supply Chain
Il ruolo chiave degli Advanced Analytics per la Supply Chain
 
ACTOR - "Il ruolo chiave degli Advanced Analytics per la Supply Chain. Intel...
ACTOR -  "Il ruolo chiave degli Advanced Analytics per la Supply Chain. Intel...ACTOR -  "Il ruolo chiave degli Advanced Analytics per la Supply Chain. Intel...
ACTOR - "Il ruolo chiave degli Advanced Analytics per la Supply Chain. Intel...
 
Data science training in hyd ppt converted (1)
Data science training in hyd ppt converted (1)Data science training in hyd ppt converted (1)
Data science training in hyd ppt converted (1)
 
Data science training in hyd pdf converted (1)
Data science training in hyd pdf converted (1)Data science training in hyd pdf converted (1)
Data science training in hyd pdf converted (1)
 
Data science training in hydpdf converted (1)
Data science training in hydpdf  converted (1)Data science training in hydpdf  converted (1)
Data science training in hydpdf converted (1)
 
Data Scientist Roles and Responsibilities | Data Scientist Career | Data Scie...
Data Scientist Roles and Responsibilities | Data Scientist Career | Data Scie...Data Scientist Roles and Responsibilities | Data Scientist Career | Data Scie...
Data Scientist Roles and Responsibilities | Data Scientist Career | Data Scie...
 
Which institute is best for data science?
Which institute is best for data science?Which institute is best for data science?
Which institute is best for data science?
 
Best Selenium certification course
Best Selenium certification courseBest Selenium certification course
Best Selenium certification course
 
Data science training in hyd ppt (1)
Data science training in hyd ppt (1)Data science training in hyd ppt (1)
Data science training in hyd ppt (1)
 
Data science training institute in hyderabad
Data science training institute in hyderabadData science training institute in hyderabad
Data science training institute in hyderabad
 
Data science training in Hyderabad
Data science  training in HyderabadData science  training in Hyderabad
Data science training in Hyderabad
 
Data science training Hyderabad
Data science training HyderabadData science training Hyderabad
Data science training Hyderabad
 
Data science online training in hyderabad
Data science online training in hyderabadData science online training in hyderabad
Data science online training in hyderabad
 

Mais de Edureka!

What to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | EdurekaWhat to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | EdurekaEdureka!
 
Top 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | EdurekaTop 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | EdurekaEdureka!
 
Top 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | EdurekaTop 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | EdurekaEdureka!
 
Tableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | EdurekaTableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | EdurekaEdureka!
 
Top 5 PMP Certifications | Edureka
Top 5 PMP Certifications | EdurekaTop 5 PMP Certifications | Edureka
Top 5 PMP Certifications | EdurekaEdureka!
 
Top Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | EdurekaTop Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | EdurekaEdureka!
 
Linux Mint Tutorial | Edureka
Linux Mint Tutorial | EdurekaLinux Mint Tutorial | Edureka
Linux Mint Tutorial | EdurekaEdureka!
 
How to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| EdurekaHow to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| EdurekaEdureka!
 
Importance of Digital Marketing | Edureka
Importance of Digital Marketing | EdurekaImportance of Digital Marketing | Edureka
Importance of Digital Marketing | EdurekaEdureka!
 
RPA in 2020 | Edureka
RPA in 2020 | EdurekaRPA in 2020 | Edureka
RPA in 2020 | EdurekaEdureka!
 
Email Notifications in Jenkins | Edureka
Email Notifications in Jenkins | EdurekaEmail Notifications in Jenkins | Edureka
Email Notifications in Jenkins | EdurekaEdureka!
 
EA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | EdurekaEA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | EdurekaEdureka!
 
Cognitive AI Tutorial | Edureka
Cognitive AI Tutorial | EdurekaCognitive AI Tutorial | Edureka
Cognitive AI Tutorial | EdurekaEdureka!
 
AWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | EdurekaAWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | EdurekaEdureka!
 
Blue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | EdurekaBlue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | EdurekaEdureka!
 
Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka Edureka!
 
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | EdurekaA star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | EdurekaEdureka!
 
Kubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | EdurekaKubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | EdurekaEdureka!
 
Introduction to DevOps | Edureka
Introduction to DevOps | EdurekaIntroduction to DevOps | Edureka
Introduction to DevOps | EdurekaEdureka!
 
ITIL® Tutorial for Beginners | ITIL® Foundation Training | Edureka
ITIL® Tutorial for Beginners | ITIL® Foundation Training | EdurekaITIL® Tutorial for Beginners | ITIL® Foundation Training | Edureka
ITIL® Tutorial for Beginners | ITIL® Foundation Training | EdurekaEdureka!
 

Mais de Edureka! (20)

What to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | EdurekaWhat to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | Edureka
 
Top 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | EdurekaTop 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | Edureka
 
Top 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | EdurekaTop 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | Edureka
 
Tableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | EdurekaTableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | Edureka
 
Top 5 PMP Certifications | Edureka
Top 5 PMP Certifications | EdurekaTop 5 PMP Certifications | Edureka
Top 5 PMP Certifications | Edureka
 
Top Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | EdurekaTop Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | Edureka
 
Linux Mint Tutorial | Edureka
Linux Mint Tutorial | EdurekaLinux Mint Tutorial | Edureka
Linux Mint Tutorial | Edureka
 
How to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| EdurekaHow to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| Edureka
 
Importance of Digital Marketing | Edureka
Importance of Digital Marketing | EdurekaImportance of Digital Marketing | Edureka
Importance of Digital Marketing | Edureka
 
RPA in 2020 | Edureka
RPA in 2020 | EdurekaRPA in 2020 | Edureka
RPA in 2020 | Edureka
 
Email Notifications in Jenkins | Edureka
Email Notifications in Jenkins | EdurekaEmail Notifications in Jenkins | Edureka
Email Notifications in Jenkins | Edureka
 
EA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | EdurekaEA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | Edureka
 
Cognitive AI Tutorial | Edureka
Cognitive AI Tutorial | EdurekaCognitive AI Tutorial | Edureka
Cognitive AI Tutorial | Edureka
 
AWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | EdurekaAWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | Edureka
 
Blue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | EdurekaBlue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | Edureka
 
Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka
 
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | EdurekaA star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
 
Kubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | EdurekaKubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | Edureka
 
Introduction to DevOps | Edureka
Introduction to DevOps | EdurekaIntroduction to DevOps | Edureka
Introduction to DevOps | Edureka
 
ITIL® Tutorial for Beginners | ITIL® Foundation Training | Edureka
ITIL® Tutorial for Beginners | ITIL® Foundation Training | EdurekaITIL® Tutorial for Beginners | ITIL® Foundation Training | Edureka
ITIL® Tutorial for Beginners | ITIL® Foundation Training | Edureka
 

Último

TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 

Último (20)

TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 

Data Science Full Course | Edureka

  • 1.
  • 2. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Agenda ❖ Evolution of Data ❖ Introduction To Data Science ❖ Data Science Careers and Salary ❖ Statistics for Data Science ❖ What is Machine Learning? ❖ Types of Machine Learning ❖ What is Deep Learning?
  • 3. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science
  • 4. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Evolution of Data 2.5 x 1018 Bytes
  • 5. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Evolution of Data 3 MILLION 4.3 MILLION
  • 6. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Data science, also known as data-driven science, is an interdisciplinary field about scientific methods, processes, and systems to extract knowledge or insights from data in various forms, either structured or unstructured. What is Data Science?
  • 7. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Career : Data Science Data Analyst AI/ML Engineer Data Scientist
  • 8. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Data Analyst
  • 9. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Data Scientist
  • 10. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Machine Learning Engineer
  • 11. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Salary Trends Average Salary (US) Average Salary (IND) Data Analyst Data Scientist ML Engineers
  • 12. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science RoadMap Earn a Bachelor’s Degree • Computer Science • Mathematics • Information Technology • Statistics • Finance • Economics
  • 13. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science RoadMap edureka! Earn a Bachelor’s Degree Fine Tune Technical Skills • Statistical methods and Packages • R/Python and/or SAS languages • Data warehousing and BI • Data Cleaning, Visualization and Reporting Techniques • Working knowledge of Hadoop & MapReduce and Machine learning techniques
  • 14. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science RoadMap edureka! Earn a Bachelor’s Degree Fine Tune Technical Skills Develop Business Skills • Analytic Problem-Solving • Effective Communication • Creative Thinking • Industry Knowledge
  • 15. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science RoadMap edureka! Earn a Bachelor’s Degree Fine Tune Technical Skills Develop Business Skills Master’s Degree or Certifications Programs • MS/MTech in CS, Statistics, ML • Big Data Certifications • Data Analysis / Machine Learning Certifications
  • 16. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science RoadMap edureka! Earn a Bachelor’s Degree Fine Tune Technical Skills Develop Business Skills Master’s Degree or Certifications Programs
  • 17. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science RoadMap edureka! Earn a Bachelor’s Degree Fine Tune Technical Skills Develop Business Skills Master’s Degree or Certifications Programs Work on Projects Related to Field
  • 18. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Data Analyst Skills Analytical skills Communication skills Critical thinking Attention to detail Mathematics skills Technical skills/tools
  • 19. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Data Scientist Skills Analytics & Statistics Machine Learning Algorithms Problem Solving Skills Deep Learning Business Communication Technical skills/tools
  • 20. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science ML Engineer Skills Programming Languages Calculus & Statistics Signal Processing Applied Maths Neural Networks Language Processing
  • 21. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Statistics Data Science Peripherals
  • 22. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Statistics Prog Languages Data Science Peripherals
  • 23. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Statistics Prog Languages Software Data Science Peripherals
  • 24. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Statistics Prog Languages Software Machine Learning Data Science Peripherals
  • 25. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Statistics Prog Languages Software Machine Learning Big Data Data Science Peripherals
  • 26. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science What is Data ?
  • 27. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science What is Data ?
  • 28. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Variables & Research Independent variable Dependentvariable
  • 29. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Population & Sampling
  • 30. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Population & Sampling Non-Probability Probability
  • 31. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Population & Sampling Non-Probability Probability Random sampling Systematic sampling Stratified sampling
  • 32. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Measures of Center
  • 33. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Measures of Spread
  • 34. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Measures of Spread
  • 35. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Skewness
  • 36. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Confusion Matrix You can calculate the accuracy of your model with:
  • 37. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Probability Probability is the measure of how likely something will occur
  • 38. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Probability The equation describing a continuous probability distribution is called a probability density function.
  • 39. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Probability The normal distribution is a probability distribution that associates the normal random variable X with a cumulative probability . The normal distribution is defined by the following equation: Y = [ 1/σ * sqrt(2π) ] * e -(x - μ)2/2σ2 Where, X is a normal random variable. μ is the mean and σ is the standard deviation.
  • 40. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Probability The central limit theorem states that the sampling distribution of the mean of any independent, random variable will be normal or nearly normal, if the sample size is large enough.
  • 41. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science What is Machine Learning? Machine Learning is a class of algorithms which is data-driven, i.e. unlike "normal" algorithms it is the data that "tells" what the "good answer" is Getting computers to program themselves and also teaching them to make decisions using data “Where writing software is the bottleneck, let the data do the work instead.”
  • 42. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Features of Machine Learning
  • 43. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science How It Works? Learn from Data Find Hidden Insights Train and Grow
  • 44. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Applications of Machine Learning
  • 45. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Applications of Machine Learning
  • 46. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Applications of Machine Learning
  • 47. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Market Trend: Machine Learning
  • 48. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Machine Learning Life Cycle Step 1 Step 2 Step 3 Step 4 Step 5 Step 6 Collecting Data Data Wrangling Analyse Data Train Algorithm Test Algorithm Deployment
  • 49. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science 4 5 6 1 2 3 Step 1: Collecting Data
  • 50. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science 4 5 6 1 2 3 Data Wrangling
  • 51. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science 4 5 6 1 2 3 Analyse Data model
  • 52. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science 4 5 6 1 2 3 Train Algorithm model Training set
  • 53. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science 4 5 6 1 2 3 Test Algorithm model Test set Accurate?
  • 54. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science 4 5 6 1 2 3 Operation and Optimization
  • 55. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Important Python Libraries
  • 56. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Types of Machine Learning
  • 57. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Supervised Learning It’s a Face Label Face Label Non-Face
  • 58. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Unsupervised Learning
  • 59. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Reinforcement Learning
  • 60. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Supervised Learning Supervised learning is where you have input variables (x) and an output variable (Y) and you use an algorithm to learn the mapping function from the input to the output It is called Supervised Learning because the process of an algorithm learning from the training dataset can be thought as a teacher supervising the learning process
  • 61. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Supervised Learning Training and Testing Prediction Historical Data Random Sampling Training Dataset Testing Dataset Machine Learning Statistical Models Prediction & Testing Model Validation Outcome
  • 62. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Supervised Learning Training and Testing Prediction New Data Model Predicted Outcome
  • 63. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Supervised Learning Algorithms Linear Regression Logistic Regression Decision Tree Random Forest Naïve Bayes Classifier
  • 64. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Linear Regression Linear Regression Analysis is a powerful technique used for predicting the unknown value of a variable (Dependent Variable) from the known value of another variables (Independent Variable) • A Dependent Variable(DV) is the variable to be predicted or explained in a regression model • An Independent Variable(IDV) is the variable related to the dependent variable in a regression equation
  • 65. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Simple Linear Regression Dependent Variable Independent Variable Y = a + bX Y - Intercept Slope of the Line
  • 66. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Regression Line Linear Regression Analysis is a powerful technique used for predicting the unknown value of a variable (Dependent Variable) from The regression line is simply a single line that best fits the data (In terms of having the smallest overall distance from the line to the points) Fitted Points Regression Line
  • 67. Copyright © 2019, edureka and/or its affiliates. All rights reserved. Demo
  • 68. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Hi I am john, I need some baseline for pricing my Villas and Independent Houses Real Estate Company Use Case
  • 69. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Dataset Description Column Description CRIM per capita crime rate by town ZN proportion of residential land zoned for lots over 25,000 sq.ft. INDUS proportion of non-retail business acres per town. CHAS Charles River dummy variable (1 if tract bounds river; 0 otherwise) NOX nitric oxides concentration (parts per 10 million) RM average number of rooms per dwelling AGE proportion of owner-occupied units built prior to 1940 DIS weighted distances to five Boston employment centres RAD index of accessibility to radial highways TAX full-value property-tax rate per $10,000 PTRATIO pupil-teacher ratio by town B 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town LSTAT % lower status of the population MEDV Median value of owner-occupied homes in $1000's
  • 70. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Steps Step 1 Step 2 Step 3 Step 4 Step 5 Step 6 Collecting Data Data Wrangling Analyse Data Train Algorithm Test Algorithm Deployment
  • 71. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Model Fitting Fitting a model means that you're making your algorithm learn the relationship between predictors and outcome so that you can predict the future values of the outcome . So the best fitted model has a specific set of parameters which best defines the problem at hand
  • 72. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Types of Fitting Machine Learning algorithms first attempt to solve the problem of under-fitting; that is, of taking a line that does not approximate the data well, and making it to approximate the data better.
  • 73. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Need For Logistic Regression WHO WILL WIN ? Here, the best fit line in linear regression is going below 0 and above 1
  • 74. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science What is Logistic Regression? The outcome(result) will be binary(0/1) 0- If malignant 1- If benign Logistic Regression is a statistical method for analysing a dataset in which there are one or more independent variables that determine an outcome. Outcome is a binary class type.
  • 75. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science What is Logistic Regression? Based on the threshold value set, we decide the output from the function The Logistic Regression Curve is called as “Sigmoid Curve”, also known as S-Curve
  • 76. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science What is Polynomial Regression? When we have non linear data, which can’t be predicted with a linear model. We switch to Polynomial Regression. Such a scenario is shown in the below graph
  • 77. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science What is a Decision Tree? A decision tree is a tree-like structure in which internal node represents test on an attribute • Each branch represents outcome of test and each leaf node represents class label (decision taken after computing all attributes) • A path from root to leaf represents classification rules. True False
  • 78. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Building a Decision Tree
  • 79. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Building a Decision Tree
  • 80. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Building a Decision Tree
  • 81. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Building a Decision Tree
  • 82. Copyright © 2019, edureka and/or its affiliates. All rights reserved. Demo
  • 83. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science What is Random Forest? Random Forest is an ensemble classifier made using many Decision tree models What are Ensemble models? How is it better from Decision Trees ?
  • 84. Copyright © 2019, edureka and/or its affiliates. All rights reserved. Demo
  • 85. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science What is Naïve Bayes? Random Forest is an ensemble classifier made using many Decision tree models What are Ensemble models? How is it better from Random Forest?
  • 86. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science What is Naïve Bayes ? Naive Bayes is a simple but surprisingly powerful algorithm for predictive modeling. Bayes
  • 87. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Bayes’ Theorem Given a hypothesis H and evidence E , Bayes' theorem states that the relationship between the probability of the hypothesis before getting the evidence P(H) and the probability of the hypothesis after getting the evidence P(H|E) is P(H|E) = P(E|H).P(H) P(E)
  • 88. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Bayes’ Theorem Example
  • 89. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Bayes’ Theorem Example P(King) = 4/52 =1/13
  • 90. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Bayes’ Theorem Example P(King) = 4/52 =1/13 P(King|Face) = P(Face|King).P(King) P(Face)
  • 91. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Bayes’ Theorem Example P(King) = 4/52 =1/13 P(King|Face) = P(Face|King).P(King) P(Face) P(Face|King) = 1
  • 92. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Bayes’ Theorem Example P(King) = 4/52 =1/13 P(King|Face) = P(Face|King).P(King) P(Face) P(Face|King) = 1 P(Face) =12/52 = 3/13
  • 93. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Bayes’ Theorem Example P(A|B) = P(A∩B) P(B) P(B|A) = P(B∩A) P(A) P(A∩B) = P(A|B).P(B) = P(B|A).P(A) = P(A|B) = P(B|A).P(A) P(B)
  • 94. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Bayes’ Theorem Proof P(H|E) = P(E|H).P(H) P(E) Likelihood How probable is the evidence Given that our hypothesis is true? Posterior How probable is our Hypothesis Given the observed evidence? (Not directly computable) Prior How probable was our hypothesis Before observing the evidence? Marginal How probable is the new evidence Under all possible hypothesis?
  • 95. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Naïve Bayes: Working PYTHON CERTIFICATION TRAINING www.edureka.co/python
  • 96. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Classification Steps
  • 97. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Classification Steps
  • 98. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Classification Steps
  • 99. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Classification Steps
  • 100. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Classification Steps
  • 101. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Industrial Use Cases News Categorization Spam Filtering Weather Predictions
  • 102. Copyright © 2019, edureka and/or its affiliates. All rights reserved. Demo
  • 103. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Unsupervised Learning Unsupervised learning is a type of machine learning algorithm used to draw inferences from datasets consisting of input data Without labelled responses
  • 104. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Unsupervised Learning: Process Flow Training data is collection of information without any label
  • 105. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science What is Clustering? “Clustering is the process of dividing the datasets into groups, consisting of similar data-points” It means grouping of objects based on the information found in the data, describing the objects or their relationship
  • 106. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Why is Clustering Used? The goal of clustering is to determine the intrinsic grouping in a set of Unlabelled Data Sometimes, Partitioning is the goal
  • 107. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Where is it used?
  • 108. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Types of Clustering Exclusive Clustering Overlapping Clustering Hierarchical Clustering K-Means Clustering
  • 109. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Types of Clustering Exclusive Clustering Overlapping Clustering Hierarchical Clustering C-Means Clustering
  • 110. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Types of Clustering Exclusive Clustering Overlapping Clustering Hierarchical Clustering
  • 111. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science K-Means Clustering The process by which objects are classified into a predefined number of groups so that they are as much dissimilar as possible from one group to another group, but as much similar as possible within each group.
  • 112. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science K-Means Algorithm Working
  • 113. Copyright © 2017, edureka and/or its affiliates. All rights reserved. First we need to decide the number of clusters to be made. (Guessing) K-Means Clustering : Steps 1 Let’s assume , Number of clusters = 3
  • 114. Copyright © 2017, edureka and/or its affiliates. All rights reserved. First we need to decide the number of clusters to be made. (Guessing) Then we provide centroids of all the clusters. (Guessing) K-Means Clustering : Steps 1 2
  • 115. Copyright © 2017, edureka and/or its affiliates. All rights reserved. First we need to decide the number of clusters to be made. (Guessing) Then we provide centroids of all the clusters. (Guessing) The Algorithm calculates Euclidian distance of the points from each centroid and assigns the point to the closest cluster. K-Means Clustering : Steps 1 2 3
  • 116. Copyright © 2017, edureka and/or its affiliates. All rights reserved. First we need to decide the number of clusters to be made. (Guessing) Then we provide centroids of all the clusters. (Guessing) The Algorithm calculates Euclidian distance of the points from each centroid and assigns the point to the closest cluster. Next the Centroids are calculated again, when we have our new cluster. K-Means Clustering : Steps 1 2 3 4
  • 117. Copyright © 2017, edureka and/or its affiliates. All rights reserved. First we need to decide the number of clusters to be made. (Guessing) Then we provide centroids of all the clusters. (Guessing) The Algorithm calculates Euclidian distance of the points from each centroid and assigns the point to the closest cluster. Next the Centroids are calculated again, when we have our new cluster. The distance of the points from the centre of clusters are calculated again and points are assigned to the closest cluster. K-Means Clustering : Steps 1 2 3 4 5
  • 118. Copyright © 2017, edureka and/or its affiliates. All rights reserved. First we need to decide the number of clusters to be made. (Guessing) Then we provide centroids of all the clusters. (Guessing) The Algorithm calculates Euclidian distance of the points from each centroid and assigns the point to the closest cluster. Next the Centroids are calculated again, when we have our new cluster. The distance of the points from the centre of clusters are calculated again and points are assigned to the closest cluster. And then again the new centroid for the cluster is calculated. K-Means Clustering : Steps 1 2 3 4 5 6
  • 119. Copyright © 2017, edureka and/or its affiliates. All rights reserved. First we need to decide the number of clusters to be made. (Guessing) Then we provide centroids of all the clusters. (Guessing) The Algorithm calculates Euclidian distance of the points from each centroid and assigns the point to the closest cluster. Next the Centroids are calculated again, when we have our new cluster. The distance of the points from the centre of clusters are calculated again and points are assigned to the closest cluster. And then again the new centroid for the cluster is calculated. These steps are repeated until we have a repetition in centroids or new centroids are very close to the previous ones. K-Means Clustering : Steps 1 2 3 4 5 6 7
  • 120. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science How to Decide the number of Clusters The Elbow Method : First of all, compute the sum of squared error (SSE) for some values of k (for example 2, 4, 6, 8, etc.). The SSE is defined as the sum of the squared distance between each member of the cluster and its centroid. Mathematically:
  • 121. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Pros and Cons: K-Means Clustering • Simple, understandable • Items automatically assigned to clusters • Must define number of clusters • All items forced into clusters • Unable to handle noisy data and outliers
  • 122. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Fuzzy C – Means Clustering Fuzzy C-Means is an extension of K-Means, the popular simple clustering technique Fuzzy clustering (also referred to as soft clustering) is a form of Clustering in which each data point can belong to more than one cluster
  • 123. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Pros and Cons: C-Means Clustering • Allows a data point to be in multiple clusters • A more natural representation of the behaviour of genes • Genes usually are involved in multiple functions • Need to define c, the number of clusters • Need to determine membership cut-off value • Clusters are sensitive to initial assignment of centroids • Fuzzy c-means is not a deterministic algorithm
  • 124. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Hierarchical Clustering Hierarchical clustering is an alternative approach which builds a hierarchy from the bottom-up, and doesn’t require us to specify the number of clusters beforehand
  • 125. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Pros and Cons: Hierarchical Clustering • No assumption of a particular number of clusters • May corresponds to meaningful taxonomies • Once a decision is made to combine two clusters, it can’t be undone • Too slow for large datasets
  • 126. Copyright © 2019, edureka and/or its affiliates. All rights reserved. Demo
  • 127. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Market Basket Analysis Market basket analysis explains the combinations of products that frequently co-occur in transactions. Market Basket Analysis algorithms 1. Association Rule Mining 2. Apriori
  • 128. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Association Rule Mining Association rule mining is a technique that shows how items are associated to each other. Customer who purchase laptops are more likely to purchase laptop bags. Customer who purchase bread have a 60% likelihood of also purchasing jam. Example:
  • 129. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Association Rule Mining A B Example of Association rule ➢ It means that if a person buys item A then he will also buy item B ➢ Three common ways to measure association: Support Confidence Lift
  • 130. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Association Rule Mining Support gives fraction of transactions which contains the item A and B 𝑓𝑟𝑒𝑞 𝐴, 𝐵 𝑁 Support = Confidence gives how often the items A & B occur together, given no. of times A occurs 𝑓𝑟𝑒𝑞 𝐴, 𝐵 𝑓𝑟𝑒𝑞(𝐴) Confidence = Lift indicates the strength of a rule over the random co- occurrence of A and B 𝑆𝑢𝑝𝑝𝑜𝑟𝑡 𝑆𝑢𝑝𝑝 𝐴 𝑥 𝑆𝑢𝑝𝑝(𝐵) Lift =
  • 131. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Association Rule Mining Example T1 A B C T2 A C D T3 B C D T4 A D E T5 B C E A B C A C D B C D A D E B C E Transactions at a local store Set of items {A, B, C, D, E} Set of transactions {T1, T2, T3, T4, T5}
  • 132. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Association Rule Mining Example Consider the following association rules: 1. A D 2. C A 3. A C 4. B & C A => => => => Rule Support Confidence Lift A D 2/5 2/3 10/9 C A 2/5 2/4 5/6 A C 2/5 2/3 5/6 B, C A 1/5 1/3 5/9 => => => => Calculate support, confidence and lift for these rules:
  • 133. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Apriori Algorithm Apriori algorithm uses frequent item sets to generate association rules. It is based on the concept that a subset of a frequent itemset must also be a frequent itemset. But what is a frequent item set? Frequent Itemset: an itemset whose support value is greater than a threshold value. Example: If {A,B} is a frequent item set, then {A} and {B} should be frequent item sets
  • 134. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Apriori Algorithm Consider the following transactions: TID Items T1 1 3 4 T2 2 3 5 T3 1 2 3 5 T4 2 5 T5 1 3 5 Min. support count = 2 Note: Now the first step is to build a list of item sets of size one by using this transactional dataset.
  • 135. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Apriori Algorithm – First Iteration TID Items T1 1 3 4 T2 2 3 5 T3 1 2 3 5 T4 2 5 T5 1 3 5 Step 1: Create item sets of size one & calculate their support values Itemset Support {1} 3 {2} 3 {3} 4 {4} 1 {5} 4 Itemset Support {1} 3 {2} 3 {3} 4 {5} 4 Table: C1| Table: F1| Item sets with support value less than min. support value (i.e. 2) are eliminated
  • 136. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Apriori Algorithm – Second Iteration TID Items T1 1 3 4 T2 2 3 5 T3 1 2 3 5 T4 2 5 T5 1 3 5 Step 2: Create item sets of size two & calculate their support values. All the combinations of item sets in F1| are used in this iteration Itemset Support {1,2} 1 {1,3} 3 {1,5} 2 {2,3} 2 {2,5} 3 {3,5} 3 Itemset Support {1,3} 3 {1,5} 2 {2,3} 2 {2,5} 3 {3,5} 3 Table: C2| Table: F2| Item sets with support less than 2 it are eliminated
  • 137. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Apriori Algorithm – Third Iteration TID Items T1 1 3 4 T2 2 3 5 T3 1 2 3 5 T4 2 5 T5 1 3 5 Step 3: Create item sets of size three & calculate their support values. All the combinations of item sets in F2| are used in this iteration Itemset Support {1,2,3} {1,2,5} {1,3,5} {2,3,5} Table: C3| Before calculating support values, let’s perform pruning on the dataset!
  • 138. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Apriori Algorithm – Pruning TID Items T1 1 3 4 T2 2 3 5 T3 1 2 3 5 T4 2 5 T5 1 3 5 After the combinations are made, divide C3| item sets to check if there any other subsets whose support is less than min support value. Itemset Support {1,3} 3 {1,5} 2 {2,3} 2 {2,5} 3 {3,5} 3 Table: C3| Table: F2| If any of the subsets of these item sets are not there in FI2 then we remove that itemset Itemset In F2|? {1,2,3} {1,2},{1,3},{2,3} NO {1,2,5} {1,2},{1,5},{2,5} NO {1,3,5} {1,5},{1,3},{3,5} YES {2,3,5} {2,3},{2,5},{3,5} YES
  • 139. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Apriori Algorithm – Fourth Iteration TID Items T1 1 3 4 T2 2 3 5 T3 1 2 3 5 T4 2 5 T5 1 3 5 Using the item sets of C3|, create new itemset C4|. Itemset Support {1,2,3,5} 1 Table: F3| Table: C4| Since support of C4| is less than 2, stop and return to the previous itemset, i.e. CI3 Itemset Support {1,3,5} 2 {2,3,5} 2
  • 140. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Apriori Algorithm – Subset Creation Frequent Item set F3| Itemset Support {1,3,5} 2 {2,3,5} 2 • Generate all non empty subsets for each frequent item set ❖ For I = {1,3,5}, subsets are {1,3}, {1,5}, {3,5}, {1}, {3}, {5} ❖ For I = {2,3,5}, subsets are {2,3}, {2,5}, {3,5}, {2}, {3}, {5} • For every subsets S of I, output the rule: S → (I-S) (S recommends I-S) if support(I)/support(S) >= min_conf value Let’s assume our minimum confidence value is 60%
  • 141. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Apriori Algorithm – Applying Rules Applying rules to item sets of F3|: 1. {1,3,5} ✓ Rule 1: {1,3} → ({1,3,5} - {1,3}) means 1 & 3 → 5 Confidence = support(1,3,5)/support(1,3) = 2/3 = 66.66% > 60% Rule 1 is selected ✓ Rule 2: {1,5} → ({1,3,5} - {1,5}) means 1 & 5 → 3 Confidence = support(1,3,5)/support(1,5) = 2/2 = 100% > 60% Rule 2 is selected ✓ Rule 3: {3,5} → ({1,3,5} - {3,5}) means 3 & 5 → 1 Confidence = support(1,3,5)/support(3,5) = 2/3 = 66.66% > 60% Rule 3 is selected TID Items T1 1 3 4 T2 2 3 5 T3 1 2 3 5 T4 2 5 T5 1 3 5
  • 142. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Apriori Algorithm – Applying Rules Applying rules to item sets of F3|: 1. {1,3,5} ✓ Rule 4: {1} → ({1,3,5} - {1}) means 1 → 3 & 5 Confidence = support(1,3,5)/support(1) = 2/3 = 66.66% > 60% Rule 4 is selected ✓ Rule 5: {3} → ({1,3,5} - {3}) means 3 → 1 & 5 Confidence = support(1,3,5)/support(3) = 2/4 = 50% <60% Rule 5 is rejected ✓ Rule 6: {5} → ({1,3,5} - {5}) means 5 → 1 & 3 Confidence = support(1,3,5)/support(3) = 2/4 = 50% < 60% Rule 6 is rejected TID Items T1 1 3 4 T2 2 3 5 T3 1 2 3 5 T4 2 5 T5 1 3 5
  • 143. Copyright © 2019, edureka and/or its affiliates. All rights reserved. Demo
  • 144. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science What is Reinforcement Learning? Reinforcement learning is a type of Machine Learning where an agent learns to behave in an environment by performing actions and seeing the results
  • 145. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Analogy Scenario 1: Baby starts crawling and makes it to the candy
  • 146. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Analogy Scenario 2: Baby starts crawling but falls due to some hurdle in between
  • 147. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Analogy
  • 148. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Reinforcement Learning Definitions
  • 149. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Reinforcement Learning Definitions Agent: The RL algorithm that learns from trial and error Environment: The world through which the agent moves Action (A): All the possible steps that the agent can take State (S): Current condition returned by the environment
  • 150. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Reinforcement Learning Definitions Reward (R): An instant return from the environment to appraise the last action Policy (π): The approach that the agent uses to determine the next action based on the current state Value (V): The expected long-term return with discount, as opposed to the short-term reward R Action-value (Q): This similar to Value, except, it takes an extra parameter, the current action (A)
  • 151. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Reward Maximization Reward maximization theory states that, a RL agent must be trained in such a way that, he takes the best action so that the reward is maximum. Agent Opponent Reward
  • 152. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Exploration & Exploitation Agent Opponent Reward Exploitation is about using the already known exploited information to heighten the rewards Exploration is about exploring and capturing more information about an environment
  • 153. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science The K-Armed Bandit Problem • The K-armed bandit is a metaphor representing a casino slot machine with k pull levers (or arms) • The user or customer pulls any one of the levers to win a predefined reward • The objective is to select the lever that will provide the user with the highest reward
  • 154. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science The Epsilon Greedy Algorithm Takes whatever action seems best at the present moment
  • 155. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science The Epsilon Greedy Algorithm With probability 1 – epsilon, the Epsilon-Greedy algorithm exploits the best known option With probability epsilon / 2, the Epsilon-Greedy algorithm explores the best known option With probability epsilon / 2, the Epsilon-Greedy algorithm explores the worst known option
  • 156. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Markov Decision Process The mathematical approach for mapping a solution in reinforcement learning is called Markov Decision Process (MDP) The following parameters are used to attain a solution:
  • 157. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Markov Decision Process- Shortest Path Problem A B C D 30 -20 -10 10 50 15 Goal: Find the shortest path between A and D with minimum possible cost In this problem, • Set of states are denoted by nodes i.e. {A, B, C, D} • Action is to traverse from one node to another {A -> B, C -> D} • Reward is the cost represented by each edge • Policy is the path taken to reach the destination {A -> C -> D}
  • 158. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Understanding Q-Learning With An Example • 5 rooms in a building connected by doors • each room is numbered 0 through 4 • The outside of the building can be thought of as one big room (5) • Doors 1 and 4 lead into the building from room 5 (outside) Place an agent in any one of the rooms (0,1,2,3,4) and the goal is to reach outside the building (room 5) 0 4 3 1 2 5
  • 159. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Understanding Q-Learning With An Example 0 4 3 1 2 5 Let’s represent the rooms on a graph, each room as a node, and each door as a link 1 2 3 40 5 Goal
  • 160. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Understanding Q-Learning With An Example Next step is to associate a reward value to each door: • doors that lead directly to the goal have a reward of 100 • Doors not directly connected to the target room have zero reward • Because doors are two-way, two arrows are assigned to each room • Each arrow contains an instant reward value 1 2 3 40 5 Goal 00 0 0 0 0 0 0 0 0 100 100 100
  • 161. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Understanding Q-Learning With An Example The terminology in Q-Learning includes the terms state and action: • Room (including room 5) represents a state • agent's movement from one room to another represents an action • In the figure, a state is depicted as a node, while "action" is represented by the arrows Example (Agent traverse from room 2 to room5): 1. Initial state = state 2 2. State 2 -> state 3 3. State 3 -> state (2, 1, 4) 4. State 4 -> state 5 1 2 3 40 5 Goal 00 0 0 0 0 0 0 0 0 100 100 100
  • 162. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Understanding Q-Learning With An Example We can put the state diagram and the instant reward values into a reward table, matrix R. The -1's in the table represent null values 1 2 3 40 5 Goal 00 0 0 0 0 0 0 0 0 100 100 100 0 1 2 3 4 5 -1 -1 -1 -1 0 -1 -1 -1 -1 0 -1 100 -1 -1 -1 0 -1 -1 -1 0 0 -1 0 -1 0 -1 -1 0 -1 100 -1 0 -1 -1 0 100 0 1 2 3 4 5 State Action R =
  • 163. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Understanding Q-Learning With An Example Add another matrix Q, representing the memory of what the agent has learned through experience. • The rows of matrix Q represent the current state of the agent • columns represent the possible actions leading to the next state • Formula to calculate the Q matrix: Q(state, action) = R(state, action) + Gamma * Max [Q(next state, all actions)] The Gamma parameter has a range of 0 to 1 (0 <= Gamma > 1). • If Gamma is closer to zero, the agent will tend to consider only immediate rewards. • If Gamma is closer to one, the agent will consider future rewards with greater weight Note
  • 164. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Q – Learning Algorithm 1 2 3 4 5 6 7 8 9 Set the gamma parameter, and environment rewards in matrix R Initialize matrix Q to zero Select a random initial state Set initial state = current state Select one among all possible actions for the current state Using this possible action, consider going to the next state Get maximum Q value for this next state based on all possible actions Compute: Q(state, action) = R(state, action) + Gamma * Max[Q(next state, all actions)] Repeat above steps until current state = goal state
  • 165. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Q – Learning Example First step is to set the value of the learning parameter Gamma = 0.8, and the initial state as Room 1. Next, initialize matrix Q as a zero matrix: • From room 1 you can either go to room 3 or 5, let’s select room 5. • From room 5, calculate maximum Q value for this next state based on all possible actions: Q(state, action) = R(state, action) + Gamma * Max[Q(next state, all actions)] Q(1,5) = R(1,5) + 0.8 * Max[Q(5,1), Q(5,4), Q(5,5)] = 100 + 0.8 * 0 = 100 1 3 5 1 4 5
  • 166. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Q – Learning Example First step is to set the value of the learning parameter Gamma = 0.8, and the initial state as Room 1. Next, initialize matrix Q as a zero matrix: • From room 1 you can either go to room 3 or 5, let’s select room 5. • From room 5, calculate maximum Q value for this next state based on all possible actions Q(state, action) = R(state, action) + Gamma * Max[Q(next state, all actions)] Q(1,5) = R(1,5) + 0.8 * Max[Q(5,1), Q(5,4), Q(5,5)] = 100 + 0.8 * 0 = 100 1 3 5 1 4 5
  • 167. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Q – Learning Example For the next episode, we start with a randomly chosen initial state, i.e. state 3 • From room 3 you can either go to room 1,2 or 4, let’s select room 1. • From room 1, calculate maximum Q value for this next state based on all possible actions: Q(state, action) = R(state, action) + Gamma * Max[Q(next state, all actions)] Q(3,1) = R(3,1) + 0.8 * Max[Q(1,3), Q(1,5)]= 0 + 0.8 * [0, 100] = 80 The matrix Q get’s updated 3 1 2 4 3 5
  • 168. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Q – Learning Example For the next episode, the next state, 1, now becomes the current state. We repeat the inner loop of the Q learning algorithm because state 1 is not the goal state. • From room 1 you can either go to room 3 or 5, let’s select room 5. • From room 5, calculate maximum Q value for this next state based on all possible actions: Q(state, action) = R(state, action) + Gamma * Max[Q(next state, all actions)] Q(1,5) = R(1,5) + 0.8 * Max[Q(5,1), Q(5,4), Q(5,5)] = 100 + 0.8 * 0 = 100 The matrix Q remains the same since, Q(1,5) is already fed to the agent 1 3 5 1 4 5
  • 169. Copyright © 2019, edureka and/or its affiliates. All rights reserved. Demo
  • 170. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science AI and ML and DL
  • 171. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Limitations of Machine Learning Traditional Machine Learning algorithms have failed to solve crucial problems of AI, such as Natural language processing, image recognition and so on.
  • 172. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Limitations of Machine Learning
  • 173. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Deep Learning to Rescue
  • 174. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Deep Learning to Rescue
  • 175. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science What is Deep Learning ? Deep Learning is a subset of Machine Learning where similar Machine Learning Algorithms are used to train Deep Neural Networks so as to achieve better accuracy in those cases where the former was not performing up to the mark. Basically, Deep learning mimics the way our brain functions i.e. it learns from experience.
  • 176. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Applications of Deep Learning • Automatic Machine Translation • Object Classification in Photographs • Automatic Handwriting Generation • Character Text Generation • Image Caption Generation • Colorization of Black and White Images • Automatic Game Playing
  • 177. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Applications of Deep Learning
  • 178. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science How Neuron Works? Deep learning is a form of machine learning that uses a model of computing that's very much inspired by the structure of the brain, so lets understand that first.
  • 179. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science A Perceptron Each neuron has a set of inputs, each of which is given a specific weight. The neuron computes some function on these weighted inputs and gives the output.
  • 180. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Role of Weights and Bias • For a perceptron, there can be one more input called bias • While the weights determine the slope of the classifier line, bias allows us to shift the line towards left or right
  • 181. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Activation Functions 1.Linear or Identity 2.Unit or Binary Step 3.Sigmoid or Logistic 4.Tanh 5.ReLU 6.SoftMax
  • 182. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Activation Functions LINEAR UNIT STEP
  • 183. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Activation Functions Sigmoid Tan h
  • 184. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Activation Functions ReLu SoftMax
  • 185. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Perceptron Example
  • 186. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Perceptron Example
  • 187. Copyright © 2019, edureka and/or its affiliates. All rights reserved. Demo
  • 188. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science What are Tensors? Tensors are the standard way of representing data in deep learning Tensors are just multidimensional arrays, an extension of 2-dimensional tables (matrices) to data with higher dimension. Tensor 0f Dimension -6 Tensor 0f Dimension –[6,4] Tensor 0f Dimension –[6,4,2]
  • 189. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science What is TensorFlow? In Tensorflow, computation is approached as a dataflow graph Tensor Flow ADD MATMUL Data RESULT
  • 190. Copyright © 2019, edureka and/or its affiliates. All rights reserved. Demo
  • 191. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Perceptron Problems • Single-Layer Perceptrons cannot classify non-linearly separable data points. • Complex problems, that involve a lot of parameters cannot be solved by Single-Layer Perceptrons.
  • 192. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Deep Neural Network
  • 193. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Deep Neural Network
  • 194. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Training Network Weights • We can estimate the weight values for our training data using ‘stochastic gradient descent’ optimizer. • Stochastic gradient descent requires two parameters: • Learning Rate: Used to limit the amount each weight is corrected each time it is updated. • Epochs: The number of times to run through the training data while updating the weight. • These, along with the training data will be the arguments to the function.
  • 195. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science MNIST Dataset
  • 196. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Deep Neural Network
  • 197. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Companies Hiring
  • 198. DATA SCIENCE CERTIFICATION TRAINING www.edureka.co/data-science Data Science Master Program
  • 199. YouTube Video Link in the Description