Data Science Project: Advancements in Fetal Health Classification

Fetal health classification is a critical task in obstetrics, as it can help to identify and manage fetal health problems
early on. Accurate assessment of fetal health is crucial for timely intervention and improved healthcare outcomes for
both mothers and their babies.
The healthcare industry (also called the medical industry or health economy) is an aggregation
and integration of sectors within the economic system that provides goods and services to treat
patients with curative, preventive, rehabilitative, and palliative care.
CAPSTONE PROJECT
Topic: Fetal Health Classification
Overview of the Project:

Problem statement:
Classify fetal health in order to help prevent child and maternal mortality.
Each record is assigned to one of 3 classes:
1. Normal
2. Suspect
3. Pathological

Models:
I implemented 3 different models, with accuracies between roughly 90% and 94%:
• K-Nearest Neighbours (KNN)
• Support Vector Machine (SVM)
• Random forest (RF)
I used confusion matrices to understand and compare model performance.

Steps to follow:
• Loading, Understanding Data
• EDA (Exploratory Data Analysis)
• Correlation, Class Distribution
• Train Test Split
• Over Sampling
• Visualization
• Data Normalization
• Algorithm
• Model Evaluation
• Comparison

Loading, Understanding the Data
1. Libraries imported:
• NumPy
• pandas
• Matplotlib
• seaborn
2. Loaded the data into a dataframe (df).
3. Understanding the data:
• df.shape: there are 2126 rows and 22 columns.
• df.info(): gives information about the data, such as column names, data types, and non-null counts.
• df.isnull().sum(): there are no null values.
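The loading and inspection steps above can be sketched as follows. This is only an illustration: the tiny inline CSV and its column names (baseline_value, accelerations, fetal_health) are hypothetical stand-ins, since the slides do not show the file name or the column list.

```python
import io

import pandas as pd

# Hypothetical stand-in for the project's CSV file. The real dataset is
# reported to have 2126 rows and 22 columns; the column names here are assumed.
csv_text = """baseline_value,accelerations,fetal_health
120,0.000,2
132,0.006,1
133,0.003,1
134,0.003,3
"""

# With a real file this would be pd.read_csv("<path to the CSV>").
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)                  # (number of rows, number of columns)
df.info()                        # column names, dtypes, non-null counts
null_counts = df.isnull().sum()  # number of missing values per column
print(null_counts)
```

If every entry of null_counts is zero, no imputation step is needed before modelling.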

• EDA (Exploratory Data Analysis):
A heat map is a visualization that shows which pairs of independent variables are highly correlated.
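A heat map of this kind is drawn from a correlation matrix. The sketch below computes one on made-up data and lists the strongly correlated pairs; the column names, the 0.8 threshold, and the helper highly_correlated_pairs are assumptions for illustration, not taken from the project.

```python
import numpy as np
import pandas as pd

# Made-up data: feature_b is almost a scaled copy of feature_a,
# feature_c is independent noise.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
toy = pd.DataFrame({
    "feature_a": a,
    "feature_b": a * 2.0 + rng.normal(scale=0.1, size=200),
    "feature_c": rng.normal(size=200),
})

# Pearson correlation matrix; these are the values a heat map colours.
corr = toy.corr()


def highly_correlated_pairs(corr_matrix, threshold=0.8):
    """Return (col_i, col_j, r) for each distinct pair with |r| >= threshold."""
    cols = corr_matrix.columns
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            r = corr_matrix.iloc[i, j]
            if abs(r) >= threshold:
                pairs.append((cols[i], cols[j], float(r)))
    return pairs


pairs = highly_correlated_pairs(corr)
print(pairs)
```

With seaborn installed, seaborn.heatmap(corr, annot=True) would render the same matrix as a coloured grid.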

The class distribution shows that the classes are not equally represented.
We counted the number of records in each class of the dataframe.

The data was imbalanced, so we balanced it using an oversampling method.
After oversampling, we can see the difference between the original class
distribution and the resampled class distribution.
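The slides do not say which oversampling technique was used. As one possible reading, here is a minimal sketch of random oversampling, which duplicates minority-class rows until every class is as large as the biggest one. The function random_oversample, the toy data, and the column name fetal_health are assumptions. In practice the resampling is usually applied to the training split only, so that duplicated rows do not leak into the test data.

```python
import pandas as pd


def random_oversample(frame, target_col, random_state=0):
    """Duplicate minority-class rows (sampling with replacement) until
    every class has as many rows as the largest class."""
    counts = frame[target_col].value_counts()
    max_count = counts.max()
    parts = []
    for label, count in counts.items():
        rows = frame[frame[target_col] == label]
        if count < max_count:
            extra = rows.sample(
                n=max_count - count, replace=True, random_state=random_state
            )
            rows = pd.concat([rows, extra])
        parts.append(rows)
    # Shuffle so the classes are not grouped together in the result.
    combined = pd.concat(parts)
    return combined.sample(frac=1.0, random_state=random_state).reset_index(drop=True)


# Toy imbalanced data: 7 rows of class 1, 2 of class 2, 1 of class 3.
toy = pd.DataFrame({
    "feature": range(10),
    "fetal_health": [1, 1, 1, 1, 1, 1, 1, 2, 2, 3],
})
balanced = random_oversample(toy, "fetal_health")

print(toy["fetal_health"].value_counts().to_dict())       # before
print(balanced["fetal_health"].value_counts().to_dict())  # after
```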

Fetal health classification pie chart
The pie chart shows that 78% of the records are normal, 14% are suspect, and
8% are pathological.
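Percentages like these can be computed directly from the target column. The sketch below uses a made-up label series of 100 rows built to match the reported 78/14/8 split; the numeric coding 1/2/3 follows the class list in the problem statement, and the column name fetal_health is an assumption.

```python
import pandas as pd

# Class codes as listed in the problem statement.
label_names = {1: "Normal", 2: "Suspect", 3: "Pathological"}

# Made-up labels matching the reported proportions (not the real data).
labels = pd.Series([1] * 78 + [2] * 14 + [3] * 8, name="fetal_health")

# Share of each class, in percent.
percentages = (labels.map(label_names).value_counts(normalize=True) * 100).round(1)
print(percentages)
```

With Matplotlib installed, percentages.plot.pie(autopct="%1.0f%%") would draw the corresponding pie chart.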

Model Building
Libraries imported for building models:
• Separated the data into x and y.
• Divided the data into X_train, Y_train, X_test, and Y_test to fit the model on the training data and test it on the
test data.

KNN / k-Nearest Neighbours
• Declared the feature vector and target variable.
• Performed the train-test split.
• Trained the KNN / k-Nearest Neighbours model.
• Made predictions on the test data and calculated the model accuracy score, which is 91%.
• The confusion matrix provided insight into the model's performance and helped me evaluate metrics such as
accuracy, precision, recall, and F1 score.
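The KNN steps above can be sketched with scikit-learn as follows. This is not the project's code: the data is synthetic (generated with make_classification to resemble a 3-class, imbalanced problem with 21 features), and the choices n_neighbors=5, test_size=0.25, and the stratified split are assumptions, so the printed accuracy will not match the 91% reported on the real data.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the feature matrix X and target y.
X, y = make_classification(
    n_samples=600, n_features=21, n_informative=8, n_classes=3,
    weights=[0.78, 0.14, 0.08], random_state=0,
)

# Hold out 25% of the rows for testing, keeping class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Fit on the training split only, then predict on the held-out split.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

knn_accuracy = accuracy_score(y_test, y_pred)
print(f"KNN accuracy on synthetic data: {knn_accuracy:.3f}")
```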

Support Vector Machine (SVM)
• Separated the data into x and y.
• Divided the data into X_train, Y_train, X_test, and Y_test to fit the model on the training data and test it on
the test data.
• Scaled the features to a common range using MinMaxScaler (min-max normalization).
• Made predictions on the test data and calculated the model accuracy score, which is around 90%.
• The confusion matrix provided insight into the model's performance and helped me evaluate metrics such as
accuracy, precision, recall, and F1 score.
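One way to combine MinMaxScaler with an SVM is a scikit-learn pipeline, which fits the scaler on the training split only and reuses the same scaling for the test split. The sketch below runs on synthetic data; the RBF kernel, C=1.0, and the split settings are assumptions, as the slides do not list the hyperparameters.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for the project's features and target.
X, y = make_classification(
    n_samples=600, n_features=21, n_informative=8, n_classes=3,
    weights=[0.78, 0.14, 0.08], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# MinMaxScaler rescales each feature to [0, 1] using the training minimum
# and maximum; the SVM is then trained on the scaled features.
svm_model = make_pipeline(MinMaxScaler(), SVC(kernel="rbf", C=1.0))
svm_model.fit(X_train, y_train)

svm_accuracy = accuracy_score(y_test, svm_model.predict(X_test))
print(f"SVM accuracy on synthetic data: {svm_accuracy:.3f}")

# Inspect the scaled training features produced by the fitted scaler.
scaler = svm_model.named_steps["minmaxscaler"]
X_train_scaled = scaler.transform(X_train)
print(X_train_scaled.min(), X_train_scaled.max())
```

Scaling matters for SVM and KNN because both rely on distances between samples, so a feature with a large numeric range would otherwise dominate.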

Random forest (RF)
• Divided the data into X_train, Y_train, X_test, and Y_test to fit the model on the training data and test it on
the test data.
• Scaled the features to a common range using MinMaxScaler (min-max normalization).
• Made predictions on the test data and calculated the model accuracy score, which is around 94%.
• The confusion matrix provided insight into the model's performance and helped me evaluate metrics such as
accuracy, precision, recall, and F1 score.
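The confusion matrix and the per-class metrics mentioned above can be obtained as in this sketch, shown here for a random forest on synthetic data. The settings n_estimators=100 and random_state=0 are assumptions, and mapping the synthetic labels 0, 1, 2 to Normal, Suspect, Pathological is purely illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the project's features and target.
X, y = make_classification(
    n_samples=600, n_features=21, n_informative=8, n_classes=3,
    weights=[0.78, 0.14, 0.08], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)
y_pred = rf.predict(X_test)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])
print(cm)

# Precision, recall, and F1 score per class, plus overall accuracy.
report = classification_report(
    y_test, y_pred, labels=[0, 1, 2],
    target_names=["Normal", "Suspect", "Pathological"],
    zero_division=0,
)
print(report)
```

On imbalanced data the per-class recall for the minority classes is often more informative than overall accuracy, since a model can score well by favouring the majority class.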

Conclusion
After analysing and comparing the performance of the different classification models, it was
observed that the random forest gives the best result (around 94% accuracy, compared with 91% for
KNN and around 90% for SVM). Hence we can use the random forest as our model.
