Credit Card Fraud Detection 2023
By Shivam Tiwari
What is credit card fraud?
Credit card fraud is the unauthorized use of someone's card for purchases, causing financial loss and inconvenience.
Examples:
• Insider fraud
• Phishing
• Skimming
• Identity theft
“To predict whether the transactions are fraudulent or not”
Work Flow
• Data Acquisition & Description
• Data Preprocessing
• Exploratory Data Analysis
• Data Preparation
• Model Selection and Model Training
• Conclusion
Data preprocessing
Dataset: Credit Card Fraud Detection Dataset 2023
Columns: 31
Rows: 568630
Features (columns):
• id: Unique identifier for each transaction
• V1-V28: Anonymized features representing various transaction attributes (e.g., time, location, etc.)
• Amount: The transaction amount
• Class: Binary label indicating whether the transaction is fraudulent (1) or not (0)
Checks:
• No missing values.
• No duplicates.
• Data types also look fine.
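The checks listed for this slide (missing values, duplicates, data types) can be reproduced along the following lines. This is a minimal sketch on a tiny made-up frame, not the author's notebook: the values in `df` are illustrative assumptions, and only the column names follow the dataset description.

```python
import pandas as pd

# Tiny illustrative frame (hypothetical values); only the column names
# mirror the dataset described in the slides.
df = pd.DataFrame({
    "id": [0, 1, 2, 3],
    "V1": [-0.26, 0.98, -0.26, -0.15],
    "Amount": [17982.10, 6531.37, 2513.54, 5384.44],
    "Class": [0, 0, 1, 1],
})

missing_per_column = df.isnull().sum()       # NaN count in each column
n_duplicates = int(df.duplicated().sum())    # rows identical to an earlier row
column_types = df.dtypes                     # dtype of each column

print(missing_per_column)
print("duplicate rows:", n_duplicates)
print(column_types)
```

On the real dataset the same three calls would be run on the full frame loaded from the CSV file.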
The dataset has 568,630 rows and 31 columns, with no null values and a balanced class distribution, which gives a reliable foundation for in-depth analysis.
“Enhanced security, reduced financial losses, and improved customer trust through identification of fraudulent credit card transactions.”
EDA: Exploratory Data Analysis
Exploratory Data Analysis (EDA) is a statistical approach to analyzing and visualizing data sets. It helps discover patterns, relationships, and insights for better understanding and decision-making.
• df.info(): provides concise information about a DataFrame, including data types, non-null counts, and memory usage.
• df.describe(): summarizes DataFrame statistics such as mean, standard deviation, and quartiles, giving insight into the distribution and central tendency of the numerical columns.
• df.shape: gives the number of rows and columns in the DataFrame.
• df.dtypes: shows the data type of each column in the DataFrame.
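The four inspection calls can be tried on a small frame as sketched below. The frame and its values are hypothetical stand-ins for the real dataset. Note that `shape` and `dtypes` are attributes, so they take no parentheses.

```python
import pandas as pd

# Hypothetical miniature of the dataset; values are made up for illustration.
df = pd.DataFrame({
    "V1": [0.5, -1.0, 0.25, 1.25],
    "Amount": [100.0, 250.0, 75.0, 175.0],
    "Class": [0, 1, 0, 1],
})

df.info()                   # prints dtypes, non-null counts and memory usage
summary = df.describe()     # count, mean, std, min, quartiles, max per column
n_rows, n_cols = df.shape   # attribute: (rows, columns)
types = df.dtypes           # attribute: dtype of each column

print(summary)
print(n_rows, n_cols)
print(types)
```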
Observations from the correlation heatmap:
1. V17 and V18 are highly correlated.
2. V16 and V17 are highly correlated.
3. V9 and V10 are also positively correlated.
4. V14 has a negative correlation with V4.
A heatmap visually represents data intensity using color variations. With a sequential colormap such as 'BuPu', darker shades indicate higher values and lighter shades indicate lower values.
# Let's look at the data as a correlation heatmap
import matplotlib.pyplot as plt
import seaborn as sns

fig = plt.figure(figsize=[20, 12])
sns.heatmap(df.corr(), cmap='BuPu', annot=True)
plt.title('Correlation Heatmap', color='red')
plt.show()
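`df.corr()` computes the Pearson correlation coefficient for each pair of columns by default, so every cell of the heatmap holds one such coefficient. As a rough sketch of what a single cell contains, the coefficient can be computed by hand in plain Python. The function name `pearson_r` and the sample values are assumptions for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 4.0, 6.0, 8.0]   # rises with a: strong positive correlation
c = [8.0, 6.0, 4.0, 2.0]   # falls as a rises: strong negative correlation

r_ab = pearson_r(a, b)
r_ac = pearson_r(a, c)
print(r_ab, r_ac)
```

Values near +1 or -1 correspond to the "highly correlated" pairs in the observations, and values near 0 to pairs with no linear relationship.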
df.skew()
id       -6.579536e-16
V1       -8.341717e-02
V2       -1.397952e+00
V3        1.462221e-02
V4       -4.416893e-02
V5        1.506414e+00
V6       -2.016110e-01
V7        1.902687e+01
V8        2.999722e-01
V9        1.710575e-01
V10       7.404136e-01
V11      -2.089056e-02
V12       6.675895e-02
V13       1.490639e-02
V14       2.078348e-01
V15       1.123298e-02
V16       2.664070e-01
V17       3.730610e-01
V18       1.291911e-01
V19      -1.017123e-02
V20      -1.556460e+00
V21      -1.089833e-01
V22       3.185295e-01
V23      -9.968746e-02
V24       6.608974e-02
V25       2.300804e-02
V26      -1.895874e-02
V27       2.755452e+00
V28       1.724978e+00
Amount    1.655585e-03
Class     0.000000e+00
dtype: float64
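Observations:
Features like V1 and V23 are negatively skewed, but only slightly (about -0.08 and -0.10). The strongest negative skew in the output is in V2 and V20 (about -1.40 and -1.56), and V7 is strongly positively skewed (about 19.03).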
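The sign of a skewness value tells which side the long tail is on. pandas documents `skew()` as an unbiased estimate normalized by N-1. The plain-Python sketch below computes the adjusted Fisher-Pearson coefficient, which is the usual form of that estimate. The function name `sample_skewness` and the three sample lists are assumptions for illustration.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (bias-corrected); needs n >= 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    g1 = m3 / m2 ** 1.5                             # biased skewness
    return math.sqrt(n * (n - 1)) / (n - 2) * g1    # bias correction

symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_tailed = [1.0, 1.0, 1.0, 2.0, 10.0]
left_tailed = [-v for v in right_tailed]

skew_symmetric = sample_skewness(symmetric)   # zero: no tail on either side
skew_right = sample_skewness(right_tailed)    # positive: long right tail
skew_left = sample_skewness(left_tailed)      # negative: long left tail
print(skew_symmetric, skew_right, skew_left)
```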
plt.figure(figsize=(6, 4))  # Adjust the figure size as needed
sns.countplot(x='Class', data=df)
plt.xlabel('Class')
plt.ylabel('Count')
plt.title('Distribution of Class')
plt.show()
df['Amount'].plot.box()
A box plot, or box-and-whisker plot, displays the distribution of a dataset,
showing the median, quartiles, and outliers. It provides a visual summary of
central tendency and spread.
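The numbers a box plot draws can be computed directly. The sketch below finds the quartiles and flags points beyond 1.5 times the interquartile range, which is matplotlib's default whisker rule. The `amounts` list is made up, and the "inclusive" quantile method is an assumption chosen to resemble linear interpolation.

```python
import statistics

# Hypothetical transaction amounts with one extreme value.
amounts = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0]

# Quartiles: the box spans q1..q3 and the line inside it is the median.
q1, median, q3 = statistics.quantiles(amounts, n=4, method="inclusive")
iqr = q3 - q1

# Points outside these fences are drawn individually as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [a for a in amounts if a < lower_fence or a > upper_fence]

print(q1, median, q3)
print(lower_fence, upper_fence, outliers)
```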
# Assuming 'df' is a DataFrame and 'Amount' is a column in it
sns.kdeplot(data=df['Amount'], fill=True)
plt.show()
A KDE (Kernel Density Estimate) plot depicts the probability density function of a
continuous variable, smoothing data distribution visually.
Observation: Amount is fairly normally distributed.
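A kernel density estimate places a small smooth bump on every data point and averages the bumps. The sketch below does this with Gaussian kernels in plain Python and checks that the resulting curve integrates to about 1, as a probability density should. The function name `gaussian_kde`, the sample values and the fixed bandwidth of 0.5 are assumptions for illustration. Plotting libraries normally choose the bandwidth automatically.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density at x with Gaussian kernels."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Average of Gaussian bumps centred on each sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

samples = [1.0, 2.0, 2.5, 3.0, 4.0]
density = gaussian_kde(samples, bandwidth=0.5)

# Riemann sum over a grid wide enough to cover both tails (-5 to 10).
step = 0.01
grid = [-5.0 + i * step for i in range(1501)]
area = sum(density(x) for x in grid) * step

print(density(2.5), density(0.0), area)
```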
# Let's plot histograms for a few features
fig, axes = plt.subplots(2, 2, figsize=(10, 6))
df['V1'].plot(kind='hist', ax=axes[0,0], title='Distribution of V1')
df['V10'].plot(kind='hist', ax=axes[0,1], title='Distribution of V10')
df['V12'].plot(kind='hist', ax=axes[1,0], title='Distribution of V12')
df['V23'].plot(kind='hist', ax=axes[1,1], title='Distribution of V23')
plt.suptitle('Distribution of V1, V10, V12 and V23', size=14)
plt.tight_layout()
plt.show()
Data Preparation
Dividing the dataset into “X” and “Y”
Shape of X: (568630, 29)
Shape of Y: (568630,)
Feature Scaling
Let's standardize all our features to bring them onto the same scale.
# I have used StandardScaler
from sklearn.preprocessing import StandardScaler

sc = StandardScaler()
x_scaled = sc.fit_transform(x)
x_scaled_df = pd.DataFrame(x_scaled, columns=x.columns)
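Standardization subtracts each column's mean and divides by its standard deviation, so every feature ends up with mean 0 and unit variance. The plain-Python sketch below shows the arithmetic for one column. The function name `standardize` and the sample values are assumptions. It divides by n (population standard deviation), which is what `StandardScaler` uses.

```python
import math

def standardize(column):
    """z-score a list of numbers: subtract the mean, divide by the std."""
    n = len(column)
    mean = sum(column) / n
    # Population standard deviation (divide by n), as StandardScaler does.
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / std for v in column]

amount = [10.0, 20.0, 30.0, 40.0, 50.0]   # hypothetical column values
scaled = standardize(amount)

scaled_mean = sum(scaled) / len(scaled)
scaled_variance = sum(v ** 2 for v in scaled) / len(scaled)
print(scaled)
print(scaled_mean, scaled_variance)
```

The slides fit the scaler on the full dataset before splitting. A common alternative is to fit it on the training split only and reuse it to transform the test split, so that test-set statistics do not influence the scaling.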
Model Selection and Model Training
Dividing the dataset into training data and testing data
# Let's split our dataset into train and test
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(
    x_scaled_df, y, test_size=0.25, random_state=15, stratify=y)
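The `stratify=y` argument keeps the proportion of fraudulent and legitimate transactions the same in the training and test parts. The idea can be sketched in plain Python as below. This illustrates the principle only and is not scikit-learn's implementation. The function name `stratified_split` and the toy labels are assumptions.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_size, seed):
    """Return (train_idx, test_idx) keeping class proportions in both parts."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Take the same fraction of every class for the test part.
        n_test = round(len(indices) * test_size)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx

labels = [0] * 40 + [1] * 40   # balanced toy labels
train_idx, test_idx = stratified_split(labels, test_size=0.25, seed=15)

fraud_in_test = sum(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx), fraud_in_test)
```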
Decision Tree Model
Classification report
Accuracy Score: 99.80022228787687

Random Forest Classifier Model
Classification report
Accuracy Score: 99.98454426173694

Logistic Regression Model
Classification report
Accuracy Score: 96.44480085538626

Summary
Algorithm                   Accuracy
Decision Tree Model         99.800222%
Random Forest Classifier    99.9845442%
Logistic Regression         96.4448008%
[Confusion matrix and classification report for each model: shown as images on the slide]
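The slides report accuracy for three classifiers but do not show the training code. A minimal sketch of how such a comparison might be set up with scikit-learn follows. The synthetic data from `make_classification`, the hyperparameters (`n_estimators=50`, `max_iter=1000`) and the variable names are all assumptions, so the printed accuracies will not match the numbers reported in the slides.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the scaled features and labels (assumption).
x, y = make_classification(n_samples=600, n_features=10, random_state=15)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=15, stratify=y
)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=15),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=15),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# Fit each model on the training part and score it on the held-out part.
accuracies = {}
for name, model in models.items():
    model.fit(x_train, y_train)
    predictions = model.predict(x_test)
    accuracies[name] = accuracy_score(y_test, predictions) * 100

for name, accuracy in accuracies.items():
    print(f"{name}: {accuracy:.2f}%")
```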
Conclusion
• We performed exploratory data analysis on the different features.
• We prepared our data and built different ML models.
• We compared three models (Decision Tree, Random Forest and Logistic Regression) on accuracy and precision.
• The Decision Tree model scores about 99.80% accuracy on the test dataset. The highest accuracy in the summary table is about 99.98% and the lowest about 96.44%.
• We created a confusion matrix for each model to see the details of its prediction accuracy.
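A confusion matrix breaks accuracy down into four counts, from which precision and recall follow. The plain-Python sketch below shows that arithmetic on ten made-up predictions, treating class 1 (fraud) as the positive class. The function name `confusion_counts` and the label lists are assumptions for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN with label 1 (fraud) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # fraud caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # false alarm
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # fraud missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)   # correctly cleared
    return tp, fp, fn, tn

# Hypothetical true labels and model predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)   # share of flagged transactions that are fraud
recall = tp / (tp + fn)      # share of fraud that was flagged
f1 = 2 * precision * recall / (precision + recall)

print(tp, fp, fn, tn)
print(accuracy, precision, recall, f1)
```

In fraud detection a missed fraud (false negative) and a false alarm (false positive) usually carry different costs, which is why precision and recall are reported alongside accuracy.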
Thank YOU

Combating Fraudulent Transactions: A Deep Dive into Credit Card Fraud Detection
