Popular Text Analytics Algorithms
What is text analytics?
It is all about deriving high-quality structured
data for analysis from unstructured text.
Why is text analytics used?
It is used to measure customer opinions, product reviews, and
feedback, to provide search facilities, and to perform sentiment analysis
and entity modeling in support of data-backed decision making.
What are the primary steps in text analytics?
1. Text acquisition and preparation
2. Processing and analysis
3. Reporting (visualization/presentation)
For instance, social media chatter around a brand can have a rapidly
spiraling impact (remember the post that showed a Kentucky man being
violently removed from his United Airlines seat on an overbooked flight,
and how it led to a social media disaster for the airline?).
In addition to social media data, other sources of unstructured text
include e-mail messages, call center notes, and customer records.
What type of information can be extracted?
Terms
Named entity
Concept
Sentiment
Terms
These are extracted based on keywords (from your own site
or a competitor's site).
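Keyword-based term extraction can be sketched as a simple frequency count. The snippet below is a minimal illustration, not the deck's own method: the stopword list, the sample review, and the function name `extract_terms` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "and", "is", "was", "to", "of", "on", "it", "but"}

def extract_terms(text, top_n=3):
    """Return the top_n most frequent non-stopword tokens with their counts."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(top_n)

# Made-up sample text, used only to exercise the function.
review = "The flight was late and the flight crew was rude, but the seat was fine."
terms = extract_terms(review)
print(terms)
```

Real pipelines usually add stemming or lemmatization and a much larger stopword list before counting.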
Named entities
These are extracted to answer the ‘who’, ‘what’, or
‘where’. Examples include names, locations,
timestamps, and products.
Concept
These are extracted to answer the ‘about’ of a piece of
content. A concept describes the idea behind the content.
Sentiment
This is extracted to gauge the overall feeling around a
brand at the moment. The United Airlines example above
evidently carries negative sentiment, signaling
unhappy customers and potential business losses.
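One simple way to gauge sentiment is to count words from positive and negative word lists. The sketch below assumes a tiny hand-made lexicon (`POSITIVE`, `NEGATIVE`) and a made-up post; production systems typically rely on much larger lexicons or trained classifiers.

```python
import re

# Hypothetical mini-lexicons, chosen only for this illustration.
POSITIVE = {"good", "great", "love", "happy", "excellent"}
NEGATIVE = {"bad", "violent", "violently", "disaster", "unhappy", "terrible"}

def sentiment_score(text):
    """Return (#positive - #negative) / #tokens, a value between -1 and 1."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)

def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# Made-up post loosely modeled on the airline example.
post = "Passenger violently removed from his seat, a disaster for the airline"
print(label(sentiment_score(post)))
```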
What type of tools/algorithms are used for text analytics?
Decision tree
Naive-Bayes
Support Vector Machine
K-nearest neighbours
Artificial Neural Networks
Fuzzy C-Means
LDA
Decision Trees
A decision tree is a model that repeatedly
splits data into smaller groups until each
group falls into a single class or value range.
It comes in handy for both classification
and regression tasks.
Popular algorithms in Decision trees
ID3: Iterative Dichotomiser 3 builds a decision tree
that splits data on the attribute with the highest information
gain (that is, the largest reduction in entropy) until every
group holds homogeneous data.
C4.5: This successor to ID3 also relies on entropy, but ranks
splits by gain ratio (information gain normalized by the
entropy of the split itself). Unlike ID3, it accepts both
continuous and discrete features and handles incomplete data too.
CART: Classification and Regression Tree follows the same
splitting idea as C4.5 but always produces binary splits. One
notable difference is that, for classification, CART uses
Gini impurity (to assess the ‘purity’ or homogeneity of
a node) instead of the entropy-based measures used
by C4.5.
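The split criteria named above can be computed directly from class counts. The following is a minimal sketch of entropy, information gain, and Gini impurity; the toy labels and the "contains the word goal" split are assumptions made up for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of its children."""
    n = len(parent)
    weighted = sum(len(ch) / n * entropy(ch) for ch in children)
    return entropy(parent) - weighted

# Toy data: six documents split on a hypothetical "contains the word 'goal'" feature.
parent = ["sport", "sport", "sport", "politics", "politics", "politics"]
has_goal = ["sport", "sport", "sport"]
no_goal = ["politics", "politics", "politics"]

print("information gain:", information_gain(parent, [has_goal, no_goal]))
print("gini of parent:", gini(parent))
```

A tree builder would evaluate every candidate split this way and keep the one with the highest gain (ID3) or the lowest weighted Gini impurity (CART).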
Naive-Bayes
This is a popular technique for classifying
text and documents into categories
(for example, deciding whether a
document is about Sport or Politics
based on the occurrence of certain
words). It is a simple way to assign
class or category labels to instances
or cases.
Naive-Bayes
Rather than being a single distinct algorithm, it is a family of algorithms built on
Bayes' theorem and one underlying (“naive”) assumption: given the class, the value
of a feature is independent of the value of any other feature.
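To make the independence assumption concrete, here is a minimal sketch of a multinomial Naive Bayes text classifier with add-one (Laplace) smoothing. The training sentences, the function names `train` and `predict`, and the choice of the multinomial variant are assumptions for illustration only.

```python
from collections import Counter, defaultdict
from math import log

def train(docs):
    """docs: list of (text, label). Returns class counts, per-class word counts, vocabulary."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_docs[label] += 1
        for w in text.lower().split():
            word_counts[label][w] += 1
            vocab.add(w)
    return class_docs, word_counts, vocab

def predict(text, model):
    """Pick the class with the highest log P(class) + sum of log P(word | class)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, float("-inf")
    for label in class_docs:
        score = log(class_docs[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for w in text.lower().split():
            if w not in vocab:
                continue  # words never seen in training are ignored
            # Add-one smoothing keeps unseen (word, class) pairs from zeroing the product.
            score += log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up training set for the Sport vs. Politics example.
training = [
    ("the team won the match", "Sport"),
    ("the striker scored a goal", "Sport"),
    ("the senate passed the bill", "Politics"),
    ("the minister won the election", "Politics"),
]
model = train(training)
print(predict("the team scored", model))
```

Because each word contributes its own independent term to the score, training reduces to counting words per class.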
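Support Vector Machines
This is a supervised machine learning
algorithm that can be applied to
classification and regression
problems. It looks for the boundary that
separates the classes with the widest margin.
Its essential component is the kernel trick:
a kernel function implicitly maps the features
into a higher-dimensional space, so data that is
not linearly separable in its original form can
be split by a linear boundary there.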
Applications of SVM
It is used in hypertext categorization, classification of images,
and facial recognition applications.
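The kernel trick used by SVMs can be checked numerically on a small case. The sketch below assumes a degree-2 polynomial kernel on 2-D inputs (one of many possible kernels) and shows that it equals an ordinary dot product in an explicit 3-D feature space, where XOR-style points become linearly separable. All names and data points are illustrative.

```python
from math import sqrt, isclose

def poly_kernel(x, y):
    """Degree-2 polynomial kernel k(x, y) = (x . y)^2 for 2-D inputs."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def phi(x):
    """Explicit feature map whose dot product reproduces poly_kernel."""
    return (x[0] ** 2, sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

x, y = (1.0, 2.0), (3.0, -1.0)
same = isclose(poly_kernel(x, y), dot(phi(x), phi(y)), rel_tol=1e-9, abs_tol=1e-9)
print("kernel equals dot product in feature space:", same)

# XOR-style labels: no straight line separates them in the original 2-D space,
# but the sign of the middle feature (sqrt(2) * x1 * x2) does after mapping.
xor_points = {(1, 1): +1, (-1, -1): +1, (1, -1): -1, (-1, 1): -1}
separable = all((phi(p)[1] > 0) == (lab > 0) for p, lab in xor_points.items())
print("separable after mapping:", separable)
```

The practical benefit is that the SVM only ever evaluates `poly_kernel`, never `phi`, so the higher-dimensional space does not have to be built explicitly.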
K Nearest Neighbors
k-NN is used in search applications where
you are looking for items similar to a given one.
You determine similarity by creating
a vector representation of the items
and then comparing how similar or
dissimilar they are using a distance
metric such as Euclidean distance. The k closest
items (the nearest neighbors) decide the result.
Applications of k-NN
A familiar example of k-NN at work is an e-commerce site’s
product recommendation feature. You can also use k-NN for
Concept Search (finding semantically similar documents).
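The vector-and-distance idea behind k-NN fits in a few lines. This is a minimal sketch assuming hypothetical 2-D document vectors (counts of the words "goal" and "vote") and a majority vote among the k nearest examples; it requires Python 3.8 or later for `math.dist`.

```python
from collections import Counter
from math import dist  # Euclidean distance between two points

def knn_predict(query, examples, k=3):
    """examples: list of (vector, label). Majority vote among the k nearest vectors."""
    nearest = sorted(examples, key=lambda ex: dist(query, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical document vectors: (count of "goal", count of "vote").
examples = [
    ((3, 0), "Sport"), ((2, 1), "Sport"), ((4, 0), "Sport"),
    ((0, 3), "Politics"), ((1, 4), "Politics"), ((0, 2), "Politics"),
]
print(knn_predict((3, 1), examples))
```

For a recommendation or similar-document search, the same `sorted(...)[:k]` step would return the neighbors themselves instead of voting on a label.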
Artificial Neural Networks
ANNs are primarily used for classification
with non-linear decision boundaries.
Loosely modeled on the working
of the human brain, an ANN passes its input
through hidden layers of interconnected units
(which correspond to the neurons in the brain).
Algorithms to train ANN
Gradient Descent
Evolutionary Algorithms
Genetic Algorithms
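Of the training approaches listed, gradient descent is the simplest to show. The sketch below trains a single sigmoid neuron (the smallest possible network, with no hidden layer) on the logical AND function; the learning rate, epoch count, and data are assumptions chosen for illustration.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def train_neuron(data, lr=0.5, epochs=5000):
    """Full-batch gradient descent on cross-entropy loss for one sigmoid neuron."""
    w1 = w2 = b = 0.0
    n = len(data)
    for _ in range(epochs):
        g1 = g2 = gb = 0.0
        for (x1, x2), y in data:
            # For cross-entropy loss, the gradient w.r.t. the pre-activation is (prediction - target).
            err = sigmoid(w1 * x1 + w2 * x2 + b) - y
            g1 += err * x1
            g2 += err * x2
            gb += err
        # Step against the averaged gradient.
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        b -= lr * gb / n
    return w1, w2, b

# Logical AND as a toy training set.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w1, w2, b = train_neuron(and_gate)
predictions = [round(sigmoid(w1 * x1 + w2 * x2 + b)) for (x1, x2), _ in and_gate]
print(predictions)
```

Multi-layer networks apply the same update rule, with backpropagation supplying the gradient for each layer's weights.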
Applications of ANN
Image compression, handwriting analysis, and stock market
movement prediction are some areas where ANNs prove
useful.
Fuzzy C-Means
This is a useful form of clustering that
adds value when an item can belong
to more than one cluster: each item
receives a degree of membership in every
cluster. It works on the principle that,
once the clustering is over, the items
in a cluster are as similar as possible
to each other.
Steps in Fuzzy C-Means
1. Pick: choose the number of clusters into which the items can be categorized.
2. Assign: give each data point a coefficient for its membership in each cluster.
3. Repeat: recompute each cluster's center and every coefficient until the change
in the coefficients between two iterations is no more than the pre-defined
sensitivity threshold value.
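The steps above can be sketched for one-dimensional data as follows. This is a minimal illustration, not a reference implementation: the fuzzifier `m=2`, the threshold `tol`, the random seed, and the sample points are all assumed values.

```python
import random

def fuzzy_c_means(points, n_clusters=2, m=2.0, tol=1e-5, max_iter=100, seed=0):
    """Return (centers, memberships) for 1-D points; memberships[i][j] is point i's degree in cluster j."""
    rng = random.Random(seed)
    # Assign: random initial coefficients, each row normalized to sum to 1.
    u = []
    for _ in points:
        row = [rng.random() for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    for _ in range(max_iter):
        # Each center is the mean of the points weighted by membership ** m.
        centers = []
        for j in range(n_clusters):
            weights = [u[i][j] ** m for i in range(len(points))]
            centers.append(sum(w * x for w, x in zip(weights, points)) / sum(weights))
        # Recompute coefficients from the distances to every center.
        new_u = []
        for x in points:
            d = [max(abs(x - c), 1e-12) for c in centers]  # clamp to avoid division by zero
            new_u.append([
                1.0 / sum((d[j] / d[k]) ** (2.0 / (m - 1.0)) for k in range(n_clusters))
                for j in range(n_clusters)
            ])
        # Repeat until the largest coefficient change falls within the threshold.
        change = max(abs(a - b) for ra, rb in zip(u, new_u) for a, b in zip(ra, rb))
        u = new_u
        if change <= tol:
            break
    return centers, u

# Two tight groups plus one point (3.0) halfway between them.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 3.0]
centers, memberships = fuzzy_c_means(points)
print(sorted(centers))
print(memberships[-1])  # the in-between point belongs partly to both clusters
```

The in-between point is where fuzzy clustering differs from hard clustering: it keeps a substantial membership in both clusters instead of being forced into one.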
Applications of Fuzzy C-Means
Disciplines like bioinformatics, healthcare, and economics
make use of fuzzy c-means.
Latent Dirichlet Allocation (LDA)
It is a topic model: it treats each
document as a mixture of topics and
each topic as a distribution over words,
which helps in discovering the themes that
run through a collection of documents.
Primary steps in LDA
1. Provide an estimate of the potential number of topics.
2. The algorithm assigns each word to a topic.
3. The algorithm checks the accuracy of the topic assignments in a loop.
This helps in ensuring coherent topic clustering.
An example of LDA
Suppose there are three separate sentences.
1. I eat chicken and vegetables
2. Chickens are pets
3. My dog loves to eat chicken
With LDA, topic clustering for these 3 sentences is done as follows:
• Sentence 1 = 100% Topic B
• Sentence 2 = 100% Topic A
• Sentence 3 = 33% Topic A and 67% Topic B
From the words in each topic, we infer that the two clusters for sentence
classification are Pets (Topic A) and Food (Topic B).
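The assign-and-recheck loop can be sketched with collapsed Gibbs sampling, one common way to fit LDA (the slides do not say which inference method they have in mind). The hyperparameters `alpha` and `beta`, the iteration count, the seed, and the hand-picked content words from the three sentences are all assumptions. With so little text, the resulting proportions depend heavily on the seed and should not be expected to match the percentages above.

```python
import random

def lda_gibbs(docs, n_topics=2, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Return per-document topic proportions estimated by collapsed Gibbs sampling."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    V = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]              # topic counts per document
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics
    assign = []
    # Start by giving every word a random topic.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(z_doc)
    # Revisit each assignment in a loop and resample it given all the others.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta) / (topic_total[k] + V * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    # Smoothed share of each topic in each document.
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]

# Content words of the three example sentences, with stopwords removed by hand.
docs = [
    ["eat", "chicken", "vegetables"],
    ["chicken", "pets"],
    ["dog", "loves", "eat", "chicken"],
]
proportions = lda_gibbs(docs)
for row in proportions:
    print([round(p, 2) for p in row])
```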
A pioneer in custom and large-scale web data extraction.
www.promptcloud.com | sales@promptcloud.com

Student profile product demonstration on grades, ability, well-being and mind...
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptx
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded data
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 

Popular Text Analytics Algorithms

  • 2. What is text analytics? It is all about deriving high-quality structured data for analysis from unstructured text.
  • 3. Why is text analytics used? It is used to measure customer opinions, product reviews, and feedback, and to provide search, sentiment analysis, and entity modeling in support of data-backed decision making.
  • 4. What are the primary steps in text analytics? (1) Text acquisition and preparation; (2) processing and analysis; (3) reporting (visualization/presentation).
  • 5. For instance, social media chatter around a brand can have a rapidly spiraling impact (remember the post that showed a Kentucky man being violently removed from his United Airlines seat on an overbooked flight, and how it led to a social media disaster for the airline?).
  • 6. In addition to social media data, other examples of unstructured text sources include e-mail messages, call center notes, and customer records.
  • 8. What type of information can be extracted? Terms, named entities, concepts, and sentiment.
  • 9. Terms: These are extracted based on keywords (on your own site or a competitor's site).
  • 10. Named entities: These are extracted to answer the ‘who’, ‘what’, or ‘where’. Examples include names, locations, timestamps, and products.
  • 11. Concept: These are extracted to answer the ‘about’ of a piece of content. A concept describes the idea behind the content.
  • 12. Sentiment: This is extracted to gauge the overall feeling around a brand at a given moment. The United Airlines example above is (evidently) a case of negative sentiment, denoting unhappy customers and potential business losses.
  • 13. What type of tools/algorithms are used for text analytics? Decision trees, Naive Bayes, support vector machines, k-nearest neighbors, artificial neural networks, fuzzy c-means, and LDA.
  • 14. Decision Trees: This is a model that repeatedly splits data into groups or classes. It comes in handy for tasks like classification or regression.
  • 15. Popular algorithms in decision trees: ID3: Iterative Dichotomiser 3 builds a decision tree by splitting the data on the feature with the highest information gain (that is, the largest reduction in entropy) until every group is homogeneous. C4.5: This algorithm also uses information gain and entropy to classify data, just like ID3. Unlike ID3, it accepts both continuous and discrete features and handles incomplete data. CART: Classification and Regression Tree works much like C4.5. One notable difference is that CART uses Gini impurity (to assess the ‘purity’, or homogeneity, of a node) instead of the information gain/entropy used by C4.5.
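The split criteria named on the decision-tree slides (entropy, information gain, and Gini impurity) can be sketched in a few lines. This is a minimal illustration rather than an implementation of ID3, C4.5, or CART; the function names and the toy Sport/Political labels are assumptions made for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels (0 means perfectly pure)."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the child groups."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Toy labels (hypothetical): two candidate splits of the same four documents.
parent = ["sport", "sport", "political", "political"]
pure_split = [["sport", "sport"], ["political", "political"]]
mixed_split = [["sport", "political"], ["sport", "political"]]

print(information_gain(parent, pure_split))   # the split that separates the classes
print(information_gain(parent, mixed_split))  # the split that separates nothing
print(gini(parent))
```

A tree builder would evaluate every candidate split this way and keep the one with the highest gain (or, in the CART style, the lowest weighted Gini impurity).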
  • 16. Naive Bayes: This is a popular technique for classifying text and documents into categories (for example, deciding whether a document is Sport or Political based on the occurrence of certain words). It is a simple way to assign class or category labels to instances or cases.
  • 17. Naive Bayes: Rather than being a single distinct algorithm, it is a family of algorithms that share one underlying assumption: “the value of a given feature is independent of the value of any other feature”, given the class.
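The Sport/Political example can be sketched as a small multinomial Naive Bayes classifier. This is only one member of the Naive Bayes family, written from scratch for illustration; the training sentences, the add-one (Laplace) smoothing, and the function names are assumptions, not something taken from the slides.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Count documents per class and words per class from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict(text, model):
    """Return the label with the highest log-probability for the text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Independence assumption: per-word likelihoods are simply multiplied
            # (added in log space). Add-one smoothing avoids zero probabilities.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data for the example.
model = train([
    ("the team won the match", "Sport"),
    ("the striker scored a goal", "Sport"),
    ("the minister won the election", "Political"),
    ("parliament passed the bill", "Political"),
])

print(predict("the team scored a goal", model))
print(predict("parliament passed the election bill", model))
```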
  • 18. Support Vector Machines: This is a supervised machine learning algorithm that can be applied to classification and regression problems. Its essential component is the kernel trick, which uses a kernel function to compare data points as if they had been mapped into a higher-dimensional feature space, so that data that is not linearly separable in its original features can be separated by a linear boundary in that space.
  • 19. Applications of SVM: It is used in hypertext categorization, image classification, and facial recognition applications.
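The kernel trick mentioned for support vector machines can be made concrete with a small check. The sketch below assumes a degree-2 polynomial kernel on 2-D points (chosen for illustration; the slides do not name a kernel) and shows that evaluating the kernel in the original space gives the same number as a dot product in an explicit 3-D feature space, without ever building that space.

```python
import math


def poly_kernel(x, y):
    """Degree-2 polynomial kernel, computed directly on the original 2-D points."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2


def feature_map(x):
    """Explicit 3-D feature map whose dot product equals poly_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


x, y = (1.0, 2.0), (3.0, 0.5)
kernel_value = poly_kernel(x, y)
mapped_value = dot(feature_map(x), feature_map(y))

print(kernel_value, mapped_value)
```

An SVM only needs these pairwise kernel values, which is why it can learn a non-linear boundary in the original features while still solving a linear problem.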
  • 20. K Nearest Neighbors: k-NN is used in search tasks where you are looking for items similar to a given one. You determine similarity by creating a vector representation of each item and then comparing how similar or dissimilar the vectors are using a distance metric such as Euclidean distance.
  • 21. Applications of k-NN: A good example of k-NN’s usefulness is an e-commerce site’s product recommendation feature. You can also use k-NN for concept search (finding semantically similar documents).
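The similarity search described for k-NN can be sketched as follows. The product names and their two-dimensional vectors are hypothetical; a real recommender would derive vectors from product attributes or text.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def nearest_neighbors(query, items, k):
    """Return the names of the k items whose vectors are closest to the query."""
    ranked = sorted(items, key=lambda name: euclidean(query, items[name]))
    return ranked[:k]


# Hypothetical item vectors (for example, "sportiness" and "kitchen-ness" scores).
items = {
    "running shoes": (0.9, 0.1),
    "trail shoes": (0.8, 0.2),
    "coffee maker": (0.1, 0.9),
    "espresso machine": (0.2, 0.8),
}

neighbors = nearest_neighbors((0.88, 0.1), items, k=2)
print(neighbors)
```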
  • 22. Artificial Neural Networks: ANNs are primarily used for classification with non-linear decision boundaries. Loosely modeled on the working of the human brain, an ANN operates on hidden units (which correspond to the neurons in the brain).
  • 23. Algorithms to train ANNs: gradient descent, evolutionary algorithms, and genetic algorithms.
  • 24. Applications of ANN: Image compression, handwriting analysis, and stock market movement prediction are some areas where ANNs prove useful.
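Gradient descent, the first of the training algorithms listed for ANNs, can be sketched on the smallest possible case: a single sigmoid neuron. This is a simplification, since a one-neuron model has no hidden layer and can only learn a linear boundary; the OR-function data, the learning rate, and the epoch count are assumptions chosen for the example.

```python
import math


def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(samples, learning_rate=1.0, epochs=2000):
    """Fit one sigmoid neuron with batch gradient descent on cross-entropy loss."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for inputs, target in samples:
            output = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
            error = output - target  # gradient of the loss w.r.t. the pre-activation
            for i, x in enumerate(inputs):
                grad_w[i] += error * x
            grad_b += error
        # Step against the average gradient.
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict(inputs, weights, bias):
    """Classify as 1 when the neuron's output is at least 0.5."""
    return 1 if sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias) >= 0.5 else 0


# Toy data (assumption): the logical OR function.
or_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_neuron(or_samples)
predictions = [predict(x, weights, bias) for x, _ in or_samples]
print(predictions)
```

A full network applies the same update to every layer, using backpropagation to obtain the gradients for the hidden units.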
  • 25. Fuzzy C-Means: This is a form of clustering that adds value when items can belong to more than one cluster. It works on the principle that, once clustering is complete, the items in a cluster are as similar as possible to each other.
  • 26. Steps in Fuzzy C-Means: Pick: choose the number of clusters into which the items can be categorized. Assign: give each data point a coefficient for its degree of membership in each cluster. Repeat: recompute the cluster centers and the coefficients until the change in the coefficients between two iterations is no more than the predefined sensitivity threshold.
  • 27. Applications of Fuzzy C-Means: Disciplines such as bioinformatics, healthcare, and economics make successful use of fuzzy c-means.
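The pick/assign/repeat loop for fuzzy c-means can be sketched for one-dimensional data. Several choices here are assumptions made for the example: the fuzzifier m = 2, the deterministic starting centers spread evenly over the data range (implementations often start from random coefficients instead), and the toy data set.

```python
def update_memberships(points, centers, m):
    """Membership coefficient of every point in every cluster (each row sums to 1)."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in points:
        # A tiny offset avoids division by zero when a point sits on a center.
        dists = [abs(x - c) + 1e-12 for c in centers]
        rows.append([1.0 / sum((d_j / d_k) ** power for d_k in dists) for d_j in dists])
    return rows


def fuzzy_c_means(points, n_clusters, m=2.0, threshold=1e-6, max_iter=100):
    """Cluster 1-D points; n_clusters must be at least 2."""
    lo, hi = min(points), max(points)
    # Assumption: start with centers spread evenly across the data range.
    centers = [lo + (hi - lo) * j / (n_clusters - 1) for j in range(n_clusters)]
    memberships = update_memberships(points, centers, m)
    for _ in range(max_iter):
        # Each center is the mean of all points, weighted by membership ** m.
        centers = [
            sum(u[j] ** m * x for u, x in zip(memberships, points))
            / sum(u[j] ** m for u in memberships)
            for j in range(n_clusters)
        ]
        updated = update_memberships(points, centers, m)
        change = max(
            abs(a - b) for old, new in zip(memberships, updated) for a, b in zip(old, new)
        )
        memberships = updated
        if change <= threshold:  # stop once the coefficients have settled
            break
    return centers, memberships


# Toy data (assumption): two tight groups plus one point halfway between them.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 3.0]
centers, memberships = fuzzy_c_means(points, n_clusters=2)
print(centers)
print(memberships[-1])  # the middle point belongs partly to both clusters
```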
  • 28. Latent Dirichlet Allocation (LDA): It is a topic model that treats each document as a mixture of topics and each topic as a distribution over words, and it helps discover the topics that characterize a collection of documents.
  • 29. Primary steps in LDA: 01 Provide an estimate of the potential number of topics. 02 The algorithm assigns each word to a topic. 03 The algorithm checks and updates the topic assignments in a loop. This helps ensure coherent topic clustering.
  • 30. An example of LDA: Suppose there are three separate sentences. 1. I eat chicken and vegetables. 2. Chicken are pets. 3. My dog loves to eat chicken. With LDA, topic clustering for these three sentences is done as follows: Sentence 1 = 100% Topic B; Sentence 2 = 100% Topic A; Sentence 3 = 33% Topic A and 67% Topic B. From this we infer that there are two clusters for sentence classification: Pets (Topic A) and Food (Topic B).
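The percentages in the LDA example are shares of words assigned to each topic. The sketch below does not run LDA itself; it only shows that final counting step. The word-level topic assignments, and the decision to keep only three content words per sentence, are hypothetical and were chosen to reproduce the figures on the slide.

```python
from collections import Counter

# Hypothetical word-to-topic assignments, as LDA might produce after convergence.
# Topic A = Pets, Topic B = Food. The same word can land in different topics
# in different sentences.
assignments = {
    "Sentence 1": [("eat", "B"), ("chicken", "B"), ("vegetables", "B")],
    "Sentence 2": [("chicken", "A"), ("pets", "A")],
    "Sentence 3": [("dog", "A"), ("eat", "B"), ("chicken", "B")],
}


def topic_mixture(tokens, topics=("A", "B")):
    """Share of a sentence's words assigned to each topic."""
    counts = Counter(topic for _, topic in tokens)
    return {topic: counts[topic] / len(tokens) for topic in topics}


mixtures = {name: topic_mixture(tokens) for name, tokens in assignments.items()}
for name, mixture in mixtures.items():
    print(name, {topic: f"{share:.0%}" for topic, share in mixture.items()})
```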
  • 31. A pioneer in custom and large-scale web data extraction. www.promptcloud.com | sales@promptcloud.com