This use case shows how Machine Learning can help the hotel sector plan ahead. The lecturer is Andrés González, Associate Professor at EOI Business School and CTO and co-founder of Cleverdata.io.
Machine Learning School for Business Schools 2021: Virtual Conference.
11. ML Phases
• Go to the market to buy ingredients: Gathering RAW data
• Cleaning: Cleaning Data
• Transforming and Selecting: Transforming and Feature Engineering
• Cooking: Training the Model
• Tasting the Dish: Evaluating Prediction Quality
15. RAW Data
One year of historical reservation data (.xlsx file)
Characteristics
• 260,000 reservations
• 80 fields
• 57 categorical
• 9 numeric
• 10 date
• 3 text
• 1 incorrect field
• Size: 150 MB
18. Data Cleaning
Row Deletion
• Reservations without check-in
• Cancelled reservations
• Rows with errors
Column Deletion
• IDs vs names
• Columns with very little data
Other Actions
• Give dates a format
• Remove accents
• Transform .xlsx -> .csv
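The cleaning actions listed above could be sketched roughly as follows, using only the Python standard library. The field names (`CHECKIN_DATE`, `STATUS`, `HOTEL_NAME`), the `"CANCELLED"` status value, the `DD/MM/YYYY` input date format and the 5% fill threshold are illustrative assumptions, not details taken from the original dataset.

```python
import unicodedata
from datetime import datetime


def strip_accents(text):
    """Remove diacritics from a string, e.g. 'Málaga' -> 'Malaga'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def clean_reservations(rows, min_fill_ratio=0.05):
    """Apply row deletion, column deletion and formatting to a list of dicts.

    All field names and the status value are hypothetical.
    """
    # Row deletion: drop reservations without check-in and cancelled ones.
    kept = [
        r for r in rows
        if r.get("CHECKIN_DATE") and r.get("STATUS") != "CANCELLED"
    ]
    if not kept:
        return []

    # Column deletion: drop columns that are filled in too few rows.
    dense = []
    for col in kept[0]:
        filled = sum(1 for r in kept if r.get(col) not in (None, ""))
        if filled / len(kept) >= min_fill_ratio:
            dense.append(col)

    cleaned = []
    for r in kept:
        out = {col: r.get(col) for col in dense}
        try:
            parsed = datetime.strptime(out["CHECKIN_DATE"], "%d/%m/%Y")
        except ValueError:
            continue  # Row with errors: the date cannot be parsed.
        # Other actions: give dates one format (ISO 8601) and remove accents.
        out["CHECKIN_DATE"] = parsed.date().isoformat()
        for col, value in out.items():
            if isinstance(value, str):
                out[col] = strip_accents(value)
        cleaned.append(out)
    return cleaned
```

The cleaned rows could then be written out with `csv.DictWriter`. Reading the .xlsx file itself needs a third-party library (for example `pandas.read_excel`) and is left out of this sketch.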
19. Clean Dataset
Dirty
• 260,000 reservations
• 80 fields
• 57 categorical
• 9 numeric
• 10 date
• 3 text
• 1 incorrect field
• Size: 150 MB
Clean
• 150,000 rows
• 46 fields
• 26 categorical
• 9 numeric
• 10 date
• 1 text
• Size: 75 MB
21. Transformations & FE
Country Grouping
• Many countries to predict (210)
• Total number of groups: 20
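The slides do not say how the 20 groups were built. One common approach, shown here purely as an assumption, is to keep the most frequent countries as their own groups and lump the rest into a catch-all group. The function name `group_countries` and the `"OTHER"` label are hypothetical.

```python
from collections import Counter


def group_countries(countries, n_groups=20, other_label="OTHER"):
    """Keep the most frequent countries and map all others to other_label."""
    counts = Counter(countries)
    # Reserve one of the n_groups for the catch-all label.
    top = {name for name, _ in counts.most_common(n_groups - 1)}
    return [c if c in top else other_label for c in countries]
```

With `n_groups=20`, this keeps the 19 most frequent countries and assigns every remaining country to `"OTHER"`, which reduces the number of classes the model has to predict.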
New Fields
• RESERV_ANTICIPATION (calculated): check-in date - reservation date
• HOTEL_COUNTRY (name of the country)
• HOTEL_STARS (1-5)
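The calculated RESERV_ANTICIPATION field can be illustrated with a minimal sketch. Expressing the result in whole days is an assumption, since the slides do not state the unit, and the function name is hypothetical.

```python
from datetime import date


def reserv_anticipation(checkin, reservation):
    """Days between the reservation date and the check-in date (unit assumed)."""
    return (checkin - reservation).days
```

For example, a reservation made on 5 June 2019 for a check-in on 5 July 2019 gives `reserv_anticipation(date(2019, 7, 5), date(2019, 6, 5))`, which evaluates to 30.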