Exploring New York Neighborhoods
for the best Italian Restaurants
Using Data Analytics
(The Battle of Neighborhoods)
CHIBUIKE OSIGWE
Preface
As part of the IBM Data Science professional program Capstone Project, we worked
on real datasets to experience what a data scientist goes through in real life. The
main objectives of this project were to define a business problem, look for data on
the web, and use Foursquare location data to compare different neighborhoods of
New York and determine which neighborhood is suitable for starting a new
restaurant business. In this project, we go through the whole process step by step,
from problem design and data preparation to final analysis, and finally provide a
conclusion that business stakeholders can use to make their decisions.
Contents
Preface
Contents
Introduction
    1.1 Background
    1.2 Problem
    1.3 Target Audience
Data Acquisition and Methodology
    2.1 Data Source
    2.2 Methodology
Exploratory Data Analysis
    3.1 Number of Neighborhoods
    3.2 Italian Restaurants Per Borough
    3.3 Italian Restaurants Per Neighborhood
Conclusion and Recommendation
    4.1 Recommendation and Discussion
    4.2 Conclusion
Introduction
1.1 Background
New York City (NYC), often called the City of New York or simply New York (NY),
is the most populous city in the United States. With an estimated 2018 population of
8,398,748 distributed over about 302.6 square miles (784 km²), New York is also the
most densely populated major city in the United States.[10] Located at the southern
tip of the U.S. state of New York, the city is the center of the New York metropolitan
area, the largest metropolitan area in the world by urban landmass.[11] With almost
20 million people in its metropolitan statistical area and approximately 23 million in
its combined statistical area, it is one of the world's most populous megacities. New
York City has been described as the cultural, financial, and media capital of the
world, significantly influencing commerce,[12] entertainment, research, technology,
education, politics, tourism, art, fashion, and sports. Home to the headquarters of the
United Nations,[13] New York is an important center for international
diplomacy.[14][15]
Situated on one of the world's largest natural harbors, New York City is composed
of five boroughs, each of which is a county of the State of New York.[16] The five
boroughs–Brooklyn, Queens, Manhattan, the Bronx, and Staten Island–were
consolidated into a single city in 1898.[17] The city and its metropolitan area
constitute the premier gateway for legal immigration to the United States. As many
as 800 languages are spoken in New York,[18] making it the most linguistically
diverse city in the world. New York is home to more than 3.2 million residents born
outside the United States,[19] the largest foreign-born population of any city in the
world as of 2016.[20][21] As of 2019, the New York metropolitan area is estimated to
produce a gross metropolitan product (GMP) of $2.0 trillion. If greater New York
City were a sovereign state, it would have the 12th highest GDP in the world.[22]
New York is home to the highest number of billionaires of any city in the world.
Figure 1: A Typical Italian Restaurant
1.2 Problem
This final project explores the best locations for Italian restaurants throughout the
city of New York. Food Business News stated that worldwide pasta sales were up
for the second year in a row, with the United States holding the largest market
(Donley, 2018). New York is a major metropolitan area with more than 8.4 million
people living within city limits (Quick Facts, 2018). Most of the Italian immigration
into the United States occurred during the late 19th and early 20th centuries, with
over two million immigrants arriving between 1900 and 1910. Italian families first
settled in the Little Italy neighborhood around Mulberry Street, which has continued
to thrive ever since. Italians account for one of the largest immigrant groups in the
United States, and with almost 100,000 Manhattan inhabitants reporting Italian
ancestry, the need to find and enjoy Italian cuisine is on the rise. This report explores
which neighborhoods and boroughs of New York City have the most, as well as the
best, Italian restaurants. Additionally, I will attempt to answer the questions “Where
should I open an Italian restaurant?” and “Where should I stay if I want great Italian
food?”
1.3 Target Audience
Who will be most interested in this project? What types of clients or groups of people
will benefit?
1. Business people who want to invest in or open an Italian restaurant in New
York. This analysis will be a comprehensive guide to starting or expanding
restaurants targeting the Italian crowd.
2. Freelancers who would love to have their own restaurant as a side business.
This analysis will give an idea of how beneficial it is to open a restaurant and
what the pros and cons of this business are.
3. Members of the Italian community who want to find neighborhoods with many
options for Italian restaurants.
4. Business analysts or data scientists who wish to analyze the neighborhoods
of New York using exploratory data analysis and other statistical and machine
learning techniques to obtain all the necessary data, perform some operations
on it, and finally be able to tell a story with it.
Data Acquisition and Methodology
2.1 Data Source
In order to answer the above questions, data on New York City neighborhoods and
boroughs, including boundaries, latitude, longitude, restaurants, and restaurant
ratings and tips, are required.
• New York City data containing the neighborhoods and boroughs, latitudes,
and longitudes will be obtained from the data
source: https://cocl.us/new_york_dataset
• New York City data containing neighborhood boundaries will be obtained
from the data source:
https://data.cityofnewyork.us/City-Government/Borough-Boundaries/tqmj-j8zm
• All data related to the locations and quality of Italian restaurants will be
obtained via the Foursquare API, accessed through the Requests library in Python.
2.2 Methodology
Data will be collected from https://cocl.us/new_york_dataset, then cleaned and
processed into a data frame. Foursquare will be used to locate all venues, which will
then be filtered to Italian restaurants. Ratings, tips, and likes by users will be counted
and added to the data frame. Data will be sorted based on rankings. Finally, the data
will be visually assessed using graphing from various Python libraries.
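The first step of this methodology, turning the neighborhood dataset into tabular
rows, can be sketched as follows. This is a minimal sketch rather than the code used
for this report: it assumes the file is GeoJSON-like, with each feature carrying a
borough and a name under "properties" and a [longitude, latitude] pair under
"geometry". The inline sample, its rounded coordinates, and the function name
neighborhoods_to_rows are illustrative only.

```python
import json

# Illustrative stand-in for the downloaded dataset. The layout
# (features -> properties / geometry) is an assumption, and the
# coordinates are rounded placeholders, not values from the report.
SAMPLE_DATASET = """
{
  "features": [
    {"properties": {"borough": "Bronx", "name": "Belmont"},
     "geometry": {"coordinates": [-73.89, 40.86]}},
    {"properties": {"borough": "Manhattan", "name": "Lenox Hill"},
     "geometry": {"coordinates": [-73.96, 40.77]}}
  ]
}
"""


def neighborhoods_to_rows(dataset):
    """Flatten GeoJSON-style features into one dict per neighborhood."""
    rows = []
    for feature in dataset.get("features", []):
        properties = feature["properties"]
        # GeoJSON stores points as [longitude, latitude].
        longitude, latitude = feature["geometry"]["coordinates"][:2]
        rows.append({
            "Borough": properties["borough"],
            "Neighborhood": properties["name"],
            "Latitude": latitude,
            "Longitude": longitude,
        })
    return rows


neighborhood_rows = neighborhoods_to_rows(json.loads(SAMPLE_DATASET))
for row in neighborhood_rows:
    print(row)
```

A list of dicts in this shape can be passed directly to pandas.DataFrame to obtain
the data frame described above.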
Exploratory Data Analysis
3.1 Number of Neighborhoods
The Foursquare API is a very useful online service used by many developers and by
other applications such as Uber. In this project I have used it to retrieve information
about the places present in the neighborhoods of New York. The API returns a JSON
response, which we need to turn into a data frame. Here I have chosen 100 popular
spots for each neighborhood within a radius of 1 km.
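The JSON-to-data-frame step can be sketched as below. This is a hedged illustration,
not the code used for this report: the endpoint URL, the parameter names, and the
response layout (response -> groups -> items -> venue) reflect the legacy Foursquare
v2 venues/explore API as I understand it and should be checked against current
documentation. The credentials, the version date, the sample payload, and the helper
names are all hypothetical.

```python
# Assumed legacy v2 endpoint; verify against current Foursquare documentation.
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_params(latitude, longitude, client_id, client_secret,
                         version="20180605", radius=1000, limit=100):
    """Query parameters asking for up to `limit` venues within `radius` metres."""
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,  # API version date; placeholder value
        "ll": f"{latitude},{longitude}",
        "radius": radius,
        "limit": limit,
    }


def flatten_venues(payload):
    """Turn the nested JSON payload into flat rows, one per venue."""
    rows = []
    for group in payload.get("response", {}).get("groups", []):
        for item in group.get("items", []):
            venue = item["venue"]
            categories = venue.get("categories", [])
            rows.append({
                "ID": venue.get("id"),
                "Name": venue.get("name"),
                "Category": categories[0]["name"] if categories else None,
            })
    return rows


def keep_italian(rows):
    """Keep only rows whose category is exactly 'Italian Restaurant'."""
    return [row for row in rows if row["Category"] == "Italian Restaurant"]


# Hypothetical payload shaped like the assumed response; not real API output.
SAMPLE_PAYLOAD = {"response": {"groups": [{"items": [
    {"venue": {"id": "a1", "name": "Trattoria Example",
               "categories": [{"name": "Italian Restaurant"}]}},
    {"venue": {"id": "b2", "name": "Example Deli",
               "categories": [{"name": "Deli / Bodega"}]}},
    {"venue": {"id": "c3", "name": "Uncategorized Spot", "categories": []}},
]}]}}

params = build_explore_params(40.86, -73.89, "CLIENT_ID", "CLIENT_SECRET")
venue_rows = flatten_venues(SAMPLE_PAYLOAD)
italian_rows = keep_italian(venue_rows)
print(params["ll"], len(venue_rows), [row["Name"] for row in italian_rows])
```

With the Requests library, a live call would take the form
requests.get(EXPLORE_URL, params=params).json(), and the result would be passed to
flatten_venues in place of the sample payload.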
From Figure 2 below, it can be seen that Manhattan has the lowest number of
neighborhoods while Queens has the highest. Brooklyn and Staten Island appear to
be roughly on par, which suggests the two boroughs are comparable in this respect.
Using the Folium package, the coordinates of the neighborhoods belonging to the
five boroughs were plotted on a map after being retrieved. This can be seen in
Figure 3.
3.2 Italian Restaurants Per Borough
A total of 233 restaurants were returned from the analysis, each belonging to
a particular borough and neighborhood.
Figure 2: Neighborhoods per borough
Figure 3: A snapshot of the boroughs and neighborhoods around New York
Figure 4: Italian Restaurants Per Borough
From Figure 4 above, it can be seen that Manhattan has the highest number of Italian
restaurants despite having the fewest neighborhoods, with up to 100 Italian
restaurants in the borough. Queens has the fewest, with a total of 20. Additionally,
Brooklyn and Staten Island are almost on par, showing close competition between
the two.
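A per-borough tally like the one discussed above can be computed with a simple
count over the restaurant rows. The sketch below uses only the standard library and
a handful of made-up rows (placeholders, not the 233 restaurants returned in this
analysis); the column names are assumptions.

```python
from collections import Counter

# Placeholder rows for illustration only; not the report's data.
restaurant_rows = [
    {"Borough": "Manhattan", "Neighborhood": "Greenwich Village", "Name": "A"},
    {"Borough": "Manhattan", "Neighborhood": "West Village", "Name": "B"},
    {"Borough": "Manhattan", "Neighborhood": "Lenox Hill", "Name": "C"},
    {"Borough": "Bronx", "Neighborhood": "Belmont", "Name": "D"},
    {"Borough": "Bronx", "Neighborhood": "Belmont", "Name": "E"},
    {"Borough": "Queens", "Neighborhood": "Astoria", "Name": "F"},
]

# Count restaurants per borough and per (borough, neighborhood) pair.
per_borough = Counter(row["Borough"] for row in restaurant_rows)
per_neighborhood = Counter(
    (row["Borough"], row["Neighborhood"]) for row in restaurant_rows
)

for borough, count in per_borough.most_common():
    print(f"{borough}: {count}")

# The single neighborhood with the most restaurants, and its borough.
(top_borough, top_neighborhood), top_count = per_neighborhood.most_common(1)[0]
print(f"Top neighborhood: {top_neighborhood} ({top_borough}) with {top_count}")
```

With pandas, the equivalent is a groupby on the Borough column followed by size().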
Figure 5: A picture of the Neighborhoods and Boroughs showing the total number
of Italian restaurants
Figure 6: Italian Restaurants Per Neighborhood
This shows that Manhattan accounts for the highest number of Italian restaurants
despite having the smallest number of neighborhoods. Figure 5 shows the returned
values, giving the total number of Italian restaurants.
3.3 Italian Restaurants Per Neighborhood
From Figure 6, it can be seen that the neighborhood of Belmont has the highest
number of Italian restaurants, with over 16. This is followed by Greenwich Village,
then West Village, down to Lenox Hill, which has the lowest of those shown. The
distribution of Italian restaurant counts is highly skewed, showing that the
restaurants are unevenly dispersed across the neighborhoods.
From Figure 7, it is evident that Belmont belongs to the Bronx. This means that
the Bronx contains the single neighborhood with the most Italian restaurants.
Figure 7: Belmont neighborhood
Figure 8: Map showing the restaurant density of the neighborhoods and boroughs
The map shows a dense cluster of restaurants around Manhattan, including Lenox
Hill.
Conclusion and Recommendation
4.1 Recommendation and Discussion
Queens and the Bronx have the fewest Italian restaurants per borough. However,
of note, Belmont in the Bronx is the neighborhood with the most Italian restaurants
in all of NYC. Despite having the fewest neighborhoods of the five boroughs,
Manhattan has the most Italian restaurants. Based on this information, I would state
that Manhattan and Queens are the best locations for Italian cuisine in NYC. To have
the best shot at success, I would open an Italian restaurant in Queens. Queens has
many neighborhoods and the fewest Italian restaurants, so competition is lighter
than in other boroughs.
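The competition argument above can be made explicit as a ratio of Italian
restaurants to neighborhoods in each borough. The sketch below uses the two
restaurant counts quoted in section 3.2 (about 100 for Manhattan, 20 for Queens).
The neighborhood counts are placeholders chosen only to respect the stated ordering
(Manhattan fewest, Queens most) and are not figures from this report; the function
name is hypothetical.

```python
# Restaurant counts quoted in section 3.2 of this report.
restaurants = {"Manhattan": 100, "Queens": 20}

# PLACEHOLDER neighborhood counts (assumption, not from the report).
neighborhoods = {"Manhattan": 40, "Queens": 80}


def restaurants_per_neighborhood(restaurant_counts, neighborhood_counts):
    """Average number of Italian restaurants per neighborhood, by borough."""
    return {
        borough: restaurant_counts[borough] / neighborhood_counts[borough]
        for borough in restaurant_counts
        if neighborhood_counts.get(borough)  # skip missing or zero counts
    }


density = restaurants_per_neighborhood(restaurants, neighborhoods)
least_crowded = min(density, key=density.get)

for borough, value in sorted(density.items(), key=lambda pair: pair[1]):
    print(f"{borough}: {value:.2f} Italian restaurants per neighborhood")
print(f"Lowest density: {least_crowded}")
```

A lower ratio only indicates fewer existing competitors per neighborhood; it says
nothing about demand, which is why the population distribution is also considered.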
According to this analysis, Queens will provide the least competition for a new
Italian restaurant, as there are very few Italian restaurants, or none at all, in
several of its neighborhoods. Also, judging by the population distribution, it appears
to be densely populated with Italian residents, which helps a new restaurant by
raising the likelihood of customer visits. Therefore, this region could potentially be
an excellent place for starting quality Italian restaurants.
Some drawbacks of this analysis are that the clustering is based only on data
obtained from the Foursquare API, and that the data on the Italian population
distribution in each neighborhood is based on the 2016 census, which is not up to
date. Thus, there is a gap of around three years in the population distribution data.
Even though there are many areas where it can be improved, this analysis has
provided some good insights, preliminary information on possibilities, and a head
start on this business problem by laying the stepping stones properly.
4.2 Conclusion
Finally, to conclude this project, we have had a chance to solve a business problem
the way a real-life data scientist would. We have used many Python libraries to
fetch the data, manipulate its contents, and analyze and visualize the datasets.
We have used the Foursquare API to explore the venues in the neighborhoods of
New York and obtained a good amount of data online. We also applied visualization
techniques for insights and used Folium to visualize the results on a map.
The drawbacks and areas for improvement show that this analysis can be further
improved with more data and simpler code. Similarly, we can use this project to
analyze other scenarios, such as opening a restaurant with a different cuisine or
opening a new gym. I hope that this project serves as initial guidance for taking on
more complex real-life challenges using data science.
Find the code for this analysis on GitHub.
Find me on LinkedIn!

Mais conteúdo relacionado

Semelhante a Explore NYC's Best Italian Restaurants

Port Dickson Essay. Online assignment writing service.
Port Dickson Essay. Online assignment writing service.Port Dickson Essay. Online assignment writing service.
Port Dickson Essay. Online assignment writing service.Inell Campbell
 
Sat Essay Blank Paper
Sat Essay Blank PaperSat Essay Blank Paper
Sat Essay Blank PaperTania Knapp
 
Graebel_CitySynopsis_Chicago_US
Graebel_CitySynopsis_Chicago_USGraebel_CitySynopsis_Chicago_US
Graebel_CitySynopsis_Chicago_USPat Liberati
 
City of San Antonio - Texas Digitization Expo 2010
City of San Antonio - Texas Digitization Expo 2010City of San Antonio - Texas Digitization Expo 2010
City of San Antonio - Texas Digitization Expo 2010Sarah Walch, CA
 
Essay On Apparel Industry. Online assignment writing service.
Essay On Apparel Industry. Online assignment writing service.Essay On Apparel Industry. Online assignment writing service.
Essay On Apparel Industry. Online assignment writing service.Amy Colantuoni
 
2018 LA Tech & Venture Scene | Amplify.LA
2018 LA Tech & Venture Scene | Amplify.LA2018 LA Tech & Venture Scene | Amplify.LA
2018 LA Tech & Venture Scene | Amplify.LAEric Pakravan
 
CONFERENCE PAPER.Explosive Economic Growth in the San Francisco Bay Area has ...
CONFERENCE PAPER.Explosive Economic Growth in the San Francisco Bay Area has ...CONFERENCE PAPER.Explosive Economic Growth in the San Francisco Bay Area has ...
CONFERENCE PAPER.Explosive Economic Growth in the San Francisco Bay Area has ...David Woltering
 
Ibm capstone assignment (part 2)ppt
Ibm capstone assignment (part 2)pptIbm capstone assignment (part 2)ppt
Ibm capstone assignment (part 2)pptArpitVasava1
 
Public commentdraftanalysis6292012
Public commentdraftanalysis6292012Public commentdraftanalysis6292012
Public commentdraftanalysis6292012cookcountyblog
 
E-Gov To We-Gov in Moscow. Best Practices In Open Government.
E-Gov To We-Gov in Moscow. Best Practices In Open Government.E-Gov To We-Gov in Moscow. Best Practices In Open Government.
E-Gov To We-Gov in Moscow. Best Practices In Open Government.The Glover Park Group
 
There is Something Going on in the LA Tech Market by Upfront Ventures
There is Something Going on in the LA Tech Market by Upfront VenturesThere is Something Going on in the LA Tech Market by Upfront Ventures
There is Something Going on in the LA Tech Market by Upfront VenturesMark Suster
 
Pay For Someone To Write Your Paper
Pay For Someone To Write Your PaperPay For Someone To Write Your Paper
Pay For Someone To Write Your PaperJackie Gold
 
Spatial Patterns of Urban Innovation and Productivity
Spatial Patterns of Urban Innovation and ProductivitySpatial Patterns of Urban Innovation and Productivity
Spatial Patterns of Urban Innovation and ProductivityRadu Stancut
 
Paid Writing Assignments -. Online assignment writing service.
Paid Writing Assignments -. Online assignment writing service.Paid Writing Assignments -. Online assignment writing service.
Paid Writing Assignments -. Online assignment writing service.Ashley Carter
 
DSDP Demographic Study 2016
DSDP Demographic Study 2016DSDP Demographic Study 2016
DSDP Demographic Study 2016Caroline Stevens
 
Capstone Project: The Battle of Neighborhoods (Week 2)
Capstone Project: The Battle of Neighborhoods (Week 2)Capstone Project: The Battle of Neighborhoods (Week 2)
Capstone Project: The Battle of Neighborhoods (Week 2)TewodrosTazeze
 

Semelhante a Explore NYC's Best Italian Restaurants (20)

Port Dickson Essay. Online assignment writing service.
Port Dickson Essay. Online assignment writing service.Port Dickson Essay. Online assignment writing service.
Port Dickson Essay. Online assignment writing service.
 
Sat Essay Blank Paper
Sat Essay Blank PaperSat Essay Blank Paper
Sat Essay Blank Paper
 
Graebel_CitySynopsis_Chicago_US
Graebel_CitySynopsis_Chicago_USGraebel_CitySynopsis_Chicago_US
Graebel_CitySynopsis_Chicago_US
 
City of San Antonio - Texas Digitization Expo 2010
City of San Antonio - Texas Digitization Expo 2010City of San Antonio - Texas Digitization Expo 2010
City of San Antonio - Texas Digitization Expo 2010
 
Essay On Apparel Industry. Online assignment writing service.
Essay On Apparel Industry. Online assignment writing service.Essay On Apparel Industry. Online assignment writing service.
Essay On Apparel Industry. Online assignment writing service.
 
2018 LA Tech & Venture Scene | Amplify.LA
2018 LA Tech & Venture Scene | Amplify.LA2018 LA Tech & Venture Scene | Amplify.LA
2018 LA Tech & Venture Scene | Amplify.LA
 
CONFERENCE PAPER.Explosive Economic Growth in the San Francisco Bay Area has ...
CONFERENCE PAPER.Explosive Economic Growth in the San Francisco Bay Area has ...CONFERENCE PAPER.Explosive Economic Growth in the San Francisco Bay Area has ...
CONFERENCE PAPER.Explosive Economic Growth in the San Francisco Bay Area has ...
 
Ibm capstone assignment (part 2)ppt
Ibm capstone assignment (part 2)pptIbm capstone assignment (part 2)ppt
Ibm capstone assignment (part 2)ppt
 
Public commentdraftanalysis6292012
Public commentdraftanalysis6292012Public commentdraftanalysis6292012
Public commentdraftanalysis6292012
 
E-Gov To We-Gov in Moscow. Best Practices In Open Government.
E-Gov To We-Gov in Moscow. Best Practices In Open Government.E-Gov To We-Gov in Moscow. Best Practices In Open Government.
E-Gov To We-Gov in Moscow. Best Practices In Open Government.
 
There is Something Going on in the LA Tech Market by Upfront Ventures
There is Something Going on in the LA Tech Market by Upfront VenturesThere is Something Going on in the LA Tech Market by Upfront Ventures
There is Something Going on in the LA Tech Market by Upfront Ventures
 
Pay For Someone To Write Your Paper
Pay For Someone To Write Your PaperPay For Someone To Write Your Paper
Pay For Someone To Write Your Paper
 
Woltering-PAPER
Woltering-PAPERWoltering-PAPER
Woltering-PAPER
 
Spatial Patterns of Urban Innovation and Productivity
Spatial Patterns of Urban Innovation and ProductivitySpatial Patterns of Urban Innovation and Productivity
Spatial Patterns of Urban Innovation and Productivity
 
Ibm slide
Ibm slideIbm slide
Ibm slide
 
Paid Writing Assignments -. Online assignment writing service.
Paid Writing Assignments -. Online assignment writing service.Paid Writing Assignments -. Online assignment writing service.
Paid Writing Assignments -. Online assignment writing service.
 
012411 Informe final_PTrivelli
012411 Informe final_PTrivelli012411 Informe final_PTrivelli
012411 Informe final_PTrivelli
 
DSDP Demographic Study 2016
DSDP Demographic Study 2016DSDP Demographic Study 2016
DSDP Demographic Study 2016
 
CAI-OthelloRetailAnalysisFinal2014
CAI-OthelloRetailAnalysisFinal2014CAI-OthelloRetailAnalysisFinal2014
CAI-OthelloRetailAnalysisFinal2014
 
Capstone Project: The Battle of Neighborhoods (Week 2)
Capstone Project: The Battle of Neighborhoods (Week 2)Capstone Project: The Battle of Neighborhoods (Week 2)
Capstone Project: The Battle of Neighborhoods (Week 2)
 

Último

RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
Vision, Mission, Goals and Objectives ppt..pptx
Vision, Mission, Goals and Objectives ppt..pptxVision, Mission, Goals and Objectives ppt..pptx
Vision, Mission, Goals and Objectives ppt..pptxellehsormae
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhYasamin16
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...ssuserf63bd7
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...GQ Research
 

Último (20)

RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
Vision, Mission, Goals and Objectives ppt..pptx
Vision, Mission, Goals and Objectives ppt..pptxVision, Mission, Goals and Objectives ppt..pptx
Vision, Mission, Goals and Objectives ppt..pptx
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
 

Explore NYC's Best Italian Restaurants

  • 1. 1 Exploring New York Neighborhoods for the best Italian Restaurants Using Data Analytics (The Battle of Neighborhoods) CHIBUIKE OSIGWE
  • 2. i Exploring New York Neighborhoods for the best Italian Restaurants Using Data Analytics (The Battle of Neighborhoods) CHIBUIKE OSIGWE
  • 3. ii Preface As a part of the IBM Data Science professional program Capstone Project, we worked on the real datasets to get an experience of what a data scientist goes through in real life. Main objectives of this project were to define a business problem, look for data in the web and use Foursquare location data to compare different neighborhoods of New York to figure out which neighborhood is suitable for starting a new restaurant business. In this project, we will go through all the process in a step by step manner from problem designing, data preparation to final analysis and finally will provide a conclusion that can be leveraged by the business stakeholders to make their decisions.
  • 4. iii Content Preface....................................................................................................................... ii Content..................................................................................................................... iii Introduction................................................................................................................1 1.1 Background.......................................................................................................1 1.2 Problem.............................................................................................................2 1.3 Target Audience................................................................................................3 Data Acquisition and Methodology...........................................................................4 2.1 Data Source.......................................................................................................4 2.2 Methodology.....................................................................................................4 Exploratory Data Analysis.........................................................................................5 3.1 Number of Neighborhoods ...............................................................................5 3.2 Italian Restaurants Per Borough.......................................................................5 3.3 Italian Restaurants Per Neighborhood..............................................................9 Conclusion and Recommendation ...........................................................................12 4.1 Recommendation and Discussion...................................................................12 4.2 Conclusion ......................................................................................................13
  • 5. Introduction
    1.1 Background
    New York City (NYC), often called the City of New York or simply New York (NY), is the most populous city in the United States. With an estimated 2018 population of 8,398,748 distributed over about 302.6 square miles (784 km²), New York is also the most densely populated major city in the United States.[10] Located at the southern tip of the U.S. state of New York, the city is the center of the New York metropolitan area, the largest metropolitan area in the world by urban landmass.[11] With almost 20 million people in its metropolitan statistical area and approximately 23 million in its combined statistical area, it is one of the world's most populous megacities. New York City has been described as the cultural, financial, and media capital of the world, significantly influencing commerce,[12] entertainment, research, technology, education, politics, tourism, art, fashion, and sports. Home to the headquarters of the United Nations,[13] New York is an important center for international diplomacy.[14][15] Situated on one of the world's largest natural harbors, New York City is composed of five boroughs, each of which is a county of the State of New York.[16] The five boroughs (Brooklyn, Queens, Manhattan, the Bronx, and Staten Island) were consolidated into a single city in 1898.[17] The city and its metropolitan area constitute the premier gateway for legal immigration to the United States. As many as 800 languages are spoken in New York,[18] making it the most linguistically diverse city in the world. New York is home to more than 3.2 million residents born outside the United States,[19] the largest foreign-born population of any city in the world as of 2016.[20][21] As of 2019, the New York
  • 6. metropolitan area is estimated to produce a gross metropolitan product (GMP) of $2.0 trillion. If greater New York City were a sovereign state, it would have the 12th highest GDP in the world.[22] New York is home to the highest number of billionaires of any city in the world.
    Figure 1: A Typical Italian Restaurant
    1.2 Problem
    This final project explores the best locations for Italian restaurants throughout the city of New York. Food Business News stated that worldwide pasta sales were up for the second year in a row, with the United States holding the largest market (Donley, 2018). New York is a major metropolitan area with more than 8.4 million people (Quick Facts, 2018) living within city limits. Most of the Italian immigration
  • 7. into the United States occurred during the late 19th and early 20th centuries, with over two million immigrants arriving between 1900 and 1910. Italian families first settled in the Little Italy neighborhood around Mulberry Street, which has continued to thrive ever since. Italian Americans are among the largest ancestry groups in the United States, and with almost 100,000 Manhattan inhabitants reporting Italian ancestry, the demand for Italian cuisine is on the rise. This report explores which neighborhoods and boroughs of New York City have the most, as well as the best, Italian restaurants. Additionally, I will attempt to answer the questions “Where should I open an Italian restaurant?” and “Where should I stay if I want great Italian food?”
    1.3 Target Audience
    Who will be most interested in this project? What type of clients or groups of people will benefit?
    1. Business people who want to invest in or open an Italian restaurant in New York. This analysis will be a comprehensive guide to starting or expanding restaurants targeting the Italian crowd.
    2. Freelancers who would love to have their own restaurant as a side business. This analysis will give an idea of how beneficial it is to open a restaurant and what the pros and cons of this business are.
    3. Members of the Italian community who want to find neighborhoods with many options for Italian restaurants.
    4. Business analysts or data scientists who wish to analyze the neighborhoods of New York using exploratory data analysis and other statistical and machine learning techniques to obtain the necessary data, perform some operations on it and, finally, tell a story with it.
  • 8. Data Acquisition and Methodology
    2.1 Data Source
    To answer the above questions, data on New York City neighborhoods and boroughs is required, including boundaries, latitude, longitude, restaurants, and restaurant ratings and tips.
    • New York City data containing the neighborhoods and boroughs, with their latitudes and longitudes, will be obtained from the data source: https://cocl.us/new_york_dataset
    • New York City data containing neighborhood boundaries will be obtained from the data source: https://data.cityofnewyork.us/City-Government/Borough-Boundaries/tqmj-j8zm
    • All data related to the locations and quality of Italian restaurants will be obtained via the Foursquare API, accessed through the requests library in Python.
    2.2 Methodology
    Data will be collected from https://cocl.us/new_york_dataset, then cleaned and processed into a data frame. Foursquare will be used to locate all venues, which will then be filtered for Italian restaurants. Ratings, tips, and likes by users will be counted and added to the data frame. Data will be sorted based on rankings. Finally, the data will be assessed visually using plots from various Python libraries.
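The first methodology step, turning the neighborhood dataset into rows ready for a data frame, can be sketched roughly as follows. This is a minimal sketch under assumptions, not the code used in the report: it assumes the dataset is a GeoJSON-style file with a `features` list whose entries carry `properties.borough`, `properties.name`, and `geometry.coordinates` in longitude, latitude order. The two sample records, their coordinates, and the function name `neighborhoods_from_geojson` are illustrative only.

```python
import json

# Hypothetical two-record sample that mimics the ASSUMED layout of the
# neighborhood dataset; the coordinates are illustrative, not authoritative.
SAMPLE = """
{"features": [
  {"properties": {"borough": "Bronx", "name": "Belmont"},
   "geometry": {"coordinates": [-73.888, 40.857]}},
  {"properties": {"borough": "Manhattan", "name": "Lenox Hill"},
   "geometry": {"coordinates": [-73.959, 40.768]}}
]}
"""


def neighborhoods_from_geojson(raw):
    """Flatten GeoJSON-style features into one dict per neighborhood."""
    rows = []
    for feature in json.loads(raw)["features"]:
        # GeoJSON stores points as [longitude, latitude].
        lon, lat = feature["geometry"]["coordinates"][:2]
        rows.append({
            "Borough": feature["properties"]["borough"],
            "Neighborhood": feature["properties"]["name"],
            "Latitude": lat,
            "Longitude": lon,
        })
    return rows


rows = neighborhoods_from_geojson(SAMPLE)
for row in rows:
    print(row)
```

A list of dicts like `rows` can be passed directly to `pandas.DataFrame(rows)` if a data frame is wanted; plain Python is used here only to keep the sketch dependency-free.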
  • 9. Exploratory Data Analysis
    3.1 Number of Neighborhoods
    The Foursquare API is a very useful online service used by many developers and by other applications such as Uber. In this project I have used it to retrieve information about the places present in the neighborhoods of New York. The API returns a JSON file, which we need to turn into a data frame. Here I have chosen 100 popular spots for each neighborhood within a radius of 1 km. From Figure 2 below, it can be seen that Manhattan has the lowest number of neighborhoods while Queens has the highest. Brooklyn and Staten Island appear to be roughly on par, which suggests some competition between the two boroughs. Using the Folium package, the coordinates of the neighborhoods belonging to the five boroughs were plotted on a map after being retrieved. This can be seen in Figure 3.
    3.2 Italian Restaurants Per Borough
    A total of 233 restaurants were returned from the analysis, each belonging to a particular borough and neighborhood.
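The venue lookup described above (100 spots within 1 km, JSON flattened into rows, then filtered for Italian restaurants) might look roughly like the sketch below. Everything here is an assumption rather than the report's actual code: the parameter names (`client_id`, `client_secret`, `v`, `ll`, `radius`, `limit`) and the nested `response.groups[].items[].venue` layout follow the legacy Foursquare v2 venues/explore endpoint as commonly used at the time and should be checked against current Foursquare documentation. The sample response, venue names, and helper names are hypothetical, and no request is actually sent.

```python
def explore_params(lat, lon, client_id, client_secret,
                   version="20180605", radius_m=1000, limit=100):
    """Build query parameters for one neighborhood (ASSUMED v2-style names)."""
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,            # API version date; placeholder value
        "ll": f"{lat},{lon}",    # latitude,longitude of the neighborhood
        "radius": radius_m,      # 1 km search radius, as in the text
        "limit": limit,          # up to 100 venues, as in the text
    }


def italian_restaurants(response_json, borough, neighborhood):
    """Flatten an explore-style response and keep only Italian restaurants."""
    rows = []
    for group in response_json.get("response", {}).get("groups", []):
        for item in group.get("items", []):
            venue = item["venue"]
            categories = [c["name"] for c in venue.get("categories", [])]
            if "Italian Restaurant" in categories:
                rows.append({
                    "Borough": borough,
                    "Neighborhood": neighborhood,
                    "ID": venue["id"],
                    "Name": venue["name"],
                })
    return rows


# Hypothetical response fragment used only to exercise the filter.
SAMPLE_RESPONSE = {
    "response": {"groups": [{"items": [
        {"venue": {"id": "hypothetical-1", "name": "Hypothetical Trattoria",
                   "categories": [{"name": "Italian Restaurant"}]}},
        {"venue": {"id": "hypothetical-2", "name": "Hypothetical Coffee Bar",
                   "categories": [{"name": "Coffee Shop"}]}},
    ]}]}
}

params = explore_params(40.857, -73.888, "YOUR_ID", "YOUR_SECRET")
found = italian_restaurants(SAMPLE_RESPONSE, "Bronx", "Belmont")
print(params["ll"], found)
```

In a real run, `params` would be passed to `requests.get(url, params=params)` and the parsed JSON handed to `italian_restaurants`; keeping the parsing separate from the network call makes the filter easy to check offline.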
  • 10. Figure 2: Neighborhoods per borough
    Figure 3: A snapshot of the boroughs and neighborhoods around New York
  • 11. Figure 4: Italian Restaurants Per Borough
    From Figure 4 above, it can be deduced that Manhattan has the highest number of Italian restaurants despite having the fewest neighborhoods, with up to 100 Italian restaurants in the borough. Queens has the fewest, with a total of 20. Additionally, Brooklyn and Staten Island are almost on par, which suggests strong competition between the two.
  • 12. Figure 5: A picture of the neighborhoods and boroughs showing the total number of Italian restaurants
    Figure 6: Italian Restaurants Per Neighborhood
  • 13. This shows that Manhattan accounts for the highest number of Italian restaurants despite having the smallest number of neighborhoods. Figure 5 shows the total number of Italian restaurants returned.
    3.3 Italian Restaurants Per Neighborhood
    From Figure 6, it can be deduced that the neighborhood of Belmont has the highest number of Italian restaurants, with more than 16. It is followed by Greenwich Village and then West Village, down to Lenox Hill, which has the lowest. The distribution of Italian restaurant counts is highly skewed, showing that the restaurants are unevenly spread across the neighborhoods. From Figure 7, it is evident that Belmont belongs to the Bronx. This means that the Bronx contains the single neighborhood with the most Italian restaurants.
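The per-borough and per-neighborhood counts discussed here come down to a simple group-and-count. A minimal sketch, assuming each restaurant is a row with `Borough` and `Neighborhood` fields; the seven rows below are made-up placeholders chosen only to mirror the pattern in the text, not the report's data:

```python
from collections import Counter

# Hypothetical restaurant rows; names and counts are placeholders.
restaurants = [
    {"Borough": "Bronx", "Neighborhood": "Belmont", "Name": "A"},
    {"Borough": "Bronx", "Neighborhood": "Belmont", "Name": "B"},
    {"Borough": "Bronx", "Neighborhood": "Belmont", "Name": "C"},
    {"Borough": "Manhattan", "Neighborhood": "Greenwich Village", "Name": "D"},
    {"Borough": "Manhattan", "Neighborhood": "Greenwich Village", "Name": "E"},
    {"Borough": "Manhattan", "Neighborhood": "West Village", "Name": "F"},
    {"Borough": "Manhattan", "Neighborhood": "Lenox Hill", "Name": "G"},
]

# Count restaurants per borough and per (borough, neighborhood) pair.
per_borough = Counter(r["Borough"] for r in restaurants)
per_neighborhood = Counter(
    (r["Borough"], r["Neighborhood"]) for r in restaurants
)

# The single busiest neighborhood and the borough it belongs to.
top_neighborhood, top_count = per_neighborhood.most_common(1)[0]

print(per_borough.most_common())
print(top_neighborhood, top_count)
```

Keying the neighborhood counter on the (borough, neighborhood) pair keeps the borough attached to each neighborhood, which is what lets the top neighborhood be traced back to its borough even when another borough has more restaurants overall.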
  • 14. Figure 7: Belmont neighborhood
  • 15. Figure 8: Map showing the restaurant density of the neighborhoods and boroughs
    Judging from the locations plotted, the map shows dense clustering around Manhattan and Lenox Hill.
  • 16. Conclusion and Recommendation
    4.1 Recommendation and Discussion
    Queens and the Bronx have the fewest Italian restaurants of the boroughs. However, of note, Belmont in the Bronx is the neighborhood with the most Italian restaurants in all of NYC. Despite having the fewest neighborhoods of the five boroughs, Manhattan has the most Italian restaurants. Based on this information, I would state that Manhattan and Queens are the best locations for Italian cuisine in NYC. To have the best shot at success, I would open an Italian restaurant in Queens. Queens has many neighborhoods and the fewest Italian restaurants, making competition easier than in other boroughs. According to this analysis, Queens will provide the least competition for a new Italian restaurant, as Italian restaurants are sparse, or absent altogether, in several of its neighborhoods. The population distribution also suggests that the area has a large Italian community, which increases the likelihood of customer visits. Therefore, this borough could be a good place for starting quality Italian restaurants.
    This analysis has some drawbacks: the clustering is based only on data obtained from the Foursquare API, and the data about the Italian population distribution in each neighborhood is based on the 2016 census, which is not up to date. Thus, there is a gap of around 3 years in the population distribution data. Even though there are many areas where it can be improved, this analysis has provided some good insights, preliminary information on possibilities, and a head start on this business problem by laying the stepping stones properly.
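The competition argument above (many neighborhoods, few Italian restaurants) can be made explicit as restaurants per neighborhood for each borough. The sketch below is only an illustration of that reasoning: the restaurant counts of roughly 100 for Manhattan and 20 for Queens are taken from the text, while the neighborhood counts of 40 and 80 and the function name `competition_ranking` are placeholders assumed for the example.

```python
def competition_ranking(restaurants_by_borough, neighborhoods_by_borough):
    """Rank boroughs by Italian restaurants per neighborhood, lowest first."""
    density = {
        borough: restaurants_by_borough[borough] / neighborhoods_by_borough[borough]
        for borough in restaurants_by_borough
    }
    # A lower value means fewer competitors per neighborhood.
    return sorted(density.items(), key=lambda pair: pair[1])


# Restaurant counts are approximate figures from the text;
# neighborhood counts are PLACEHOLDERS, not values from the report.
restaurant_counts = {"Manhattan": 100, "Queens": 20}
neighborhood_counts = {"Manhattan": 40, "Queens": 80}

ranking = competition_ranking(restaurant_counts, neighborhood_counts)
for borough, per_neighborhood in ranking:
    print(f"{borough}: {per_neighborhood:.2f} Italian restaurants per neighborhood")
```

A ratio like this only captures supply; it says nothing about demand, rents, or foot traffic, so it is best read as a first screening step rather than a recommendation on its own.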
  • 17. 4.2 Conclusion
    To conclude, in this project we had the chance to solve a business problem the way a real-life data scientist would. We used many Python libraries to fetch the data, manipulate its contents, and analyze and visualize the datasets. We used the Foursquare API to explore the venues in the neighborhoods of New York and gathered a good amount of data from online sources. We also applied visualization techniques to gain insights and used Folium to display the results on a map. The drawbacks and areas for improvement show that this analysis can be taken further with more data and simpler code. Similarly, this project can be used to analyze other scenarios, such as opening a restaurant with a different cuisine or opening a new gym. I hope that this project serves as initial guidance for taking on more complex real-life challenges using data science. Find the code for this analysis on GitHub. Find me on LinkedIn!