Prediction - the future of game analytics - white paper

Prediction: The Future of Game Analytics
(It’s Already Here--It’s Just Not Very Evenly Distributed)
WHITE PAPER
© 2014. Ninja Metrics®
www.ninjametrics.com

www.ninjametrics.com | contact@ninjametrics.com
Ninja Metrics®
Table Of Contents
Get Distributed
The Value of Big Data
Understanding Historical Data
Predictive Analytics
Getting Your Action Items On
Red Ball vs. Blue: “Real-Time” Analytics
Predictive is Not Real-Time
Data Models and Why Your Education Probably Wasn’t Good Enough

Get Distributed
The futurist William Gibson wrote that “the future is
already here—it’s just not very evenly distributed.”
That’s both social commentary and cautionary tale in one. It tells us that some technological
advances feel like science fiction simply because most people don’t yet know about them or have
access to them.
In this white paper, we will outline some of those advances in the field of data analytics for
games, and encourage you to be among those with access.
The Value of Big Data
First, Understand Users
“Big Data” is the flavor of the moment, but it’s not always clear what the term means. We’ve
worked with dozens of game companies and more often than not “big data” means the kitchen
sink approach, i.e. let’s just collect everything in case we need it.
That’s fair enough, but using big data intelligently involves more than just a big database. It
means collecting the right kinds of data that answer specific questions addressing specific
business needs. It’s all doable, but it requires planning and smart decisions up front that take into
account the nature of your customers.
Doctor House Doesn’t Believe His Patients and Neither Should You
You may be familiar with the popular TV show
House. The principal character, Dr. Greg House,
is a curmudgeonly mentat of a medical analyst
who never trusts his patients to tell him the
truth. Weighed down by their own egos, cultures,
superstitions and fears, House’s patients regularly
lie to themselves and to him, obscuring key facts
of their illness and making diagnosis difficult.
House teaches us that listening to your patients
may not always be the best way to help them.
Listening to players talk about your game is
flawed in the same way. First, there’s the problem
of sampling. Listening to players who scream the loudest, i.e. your bored, lonely, raging forum
trolls, does not yield a scientifically representative sample of all players. Their complaints are
proof that a phenomenon exists, but not that it’s common, or even important.
Secondly, these people may be flat out wrong in their observations—not because they’re evil,
stupid or angry (though those are always possible), but because they are a poor gauge of even
their own behavior.
Consider the following classic case from social science: How do you know how much TV someone
watches? The most common approach to answering this question is to simply ask them. The
problem is that they tend to be wrong.
Ask yourself how many hours of TV you watched last week or last month. Your estimate will
usually be wrong. Someone who actually watched 30 hours might answer 25 hours, 32 hours or
35 hours. Sometimes people are wrong due to ego and worrying about
appearance, like House’s patients, but sometimes they just can’t
estimate well.
Webb’s Unobtrusive Observation
Eugene Webb (1966) was a pioneering social scientist in the
1960s who invented the term “unobtrusive observation.”
His insight was that it’s a lot better to watch people do
something and measure it yourself than to ask them
to recall it. He also made sure that the watching didn’t
interfere with the behavior.
Webb’s classic case looked at the popularity of museum
exhibits. When asked which exhibit they liked the best,
museum visitors would consistently give answers that
made them sound intelligent and wise. For example, many
would say they liked the exhibit on atomic structures even
though this exhibit room was consistently devoid of visitors.
The visitors were clearly lying to feel better about themselves and
appear wiser.
So how could the museum learn the truth
without following the guests around?
Webb’s simple approach was novel and revolutionary. His team collected “unobtrusive” data by
counting the number of nose smudges on the glass cases and the wear of the floor tiles in front
of the exhibits. Those turned out to be far better gauges of actual popularity. It’s a simple insight,
and one that can be brought to bear in the realm of big data and games.
Game logs represent an analyst’s dream for data quality and purity. If a game is instrumented
intelligently, it will yield a flawless record of every action, transaction and interaction the players
make. And this is what powers good big data. Once it’s available, the challenge then becomes
how to make intelligent use of it.
Understanding Historical Data
What Just Happened?
The past is the best building block for understanding the present as well as the future. On the
simplest level, any game company needs to know how many customers it has, how much they’ve
spent, and what they’ve done.
Ninja Metrics labels this part of our dashboard “Basic Metrics.” It contains the
standard measures of past behavior, such as Daily Active Users
(DAU), Average Revenue per User (ARPU) and K-factor.
Such basic metrics are invaluable for understanding
aggregate trends, and if they can be augmented with
add-on functionality like A/B testing and segmentation,
they can become quite handy indeed. For example, if an
A group gets one kind of content (or mechanic, or CRM
intervention, etc.) and a B group another, then the ARPUs
of those two groups can be compared to see which
performed better.
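As a rough illustration of that comparison, the sketch below computes ARPU for two test groups. The player records, the group labels and the `arpu` helper are all made up for the example; none of this is a Ninja Metrics API.

```python
from collections import defaultdict

# Hypothetical per-player records: (player_id, ab_group, revenue_usd).
players = [
    (1, "A", 0.00), (2, "A", 4.99), (3, "A", 0.00), (4, "A", 9.99),
    (5, "B", 0.00), (6, "B", 0.00), (7, "B", 1.99), (8, "B", 0.00),
]

def arpu(records):
    """Average revenue per user for each group (all users, paying or not)."""
    revenue = defaultdict(float)
    users = defaultdict(int)
    for _player_id, group, spent in records:
        revenue[group] += spent
        users[group] += 1
    return {group: revenue[group] / users[group] for group in users}

by_group = arpu(players)
for group in sorted(by_group):
    print(f"Group {group}: ARPU ${by_group[group]:.2f}")
```

With real data you would also want a significance test before declaring a winner, since a gap this size in eight players could easily be noise.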
Instrumenting for Historical Data
To get that historical data, ‘instrumenting’ your game is
critical. Instrumenting your game is all about making sure
the events behind these metrics are actually recorded as part of the game.
Analytics companies will tell you which events to capture and how to
report them, and it’s pretty straightforward. For example, when an event
happens (say a user logs in) the game system needs to be able to say “User 3482
logged in at time XX:XX:XX.” The analytics company supplies a piece of code that then fires this
event off to its servers, typically up in the cloud, where the algorithms and reporting are applied. This
code is supplied in a library that essentially says “wherever you have your log-in event happening,
put this line of code here.” If the events are already instrumented, this process shouldn’t take
more than a day.
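To make the “one line of code” idea concrete, here is a minimal sketch of what such a call might look like. The `track_event` and `build_event` names and the event fields are hypothetical, not any vendor’s actual library; a real integration would replace `send` with the vendor’s upload call.

```python
import json
from datetime import datetime, timezone

def build_event(user_id, event_name, **properties):
    """Build one analytics event as a plain dict (hypothetical schema)."""
    return {
        "user_id": user_id,
        "event": event_name,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "properties": properties,
    }

def track_event(user_id, event_name, send=print, **properties):
    """Serialize the event and hand it to `send`.

    `send` defaults to print so the sketch runs without a network.
    """
    payload = json.dumps(build_event(user_id, event_name, **properties))
    send(payload)
    return payload

# The single line placed wherever the log-in happens in the game code:
track_event(3482, "login")
```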
Obviously, if your analytics software requires data for a metric you haven’t instrumented for,
you’ll need to work that into your development cycle. For example, if you want to know how
many players are on level 8 and you don’t collect an event like “advanced from level 7 to 8,”
that metric isn’t going to show up. But in general, this is not a complicated process and simply
requires time and planning.
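The level 8 example can be sketched in a few lines. The event tuples and the `players_on_level` helper are invented for illustration; the point is simply that the count can only be derived if the level-up events were captured in the first place.

```python
# Hypothetical level-up events: (user_id, from_level, to_level), in time order.
level_events = [
    (3482, 6, 7), (3482, 7, 8),
    (1001, 7, 8),
    (2002, 6, 7),
]

def players_on_level(events, level):
    """Count players whose most recent level-up landed them on `level`."""
    current = {}
    for user_id, _from_level, to_level in events:
        current[user_id] = to_level  # later events overwrite earlier ones
    return sum(1 for lvl in current.values() if lvl == level)

print(players_on_level(level_events, 8))  # two players are on level 8
print(players_on_level([], 8))            # no events recorded, no metric
```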
So clearly, having a solid foundation of historical player data is hugely important. It supplies you
with data to populate the familiar basic metrics of DAU, ARPU, ARPPU, Churn Rate and many
others. But these metrics have one key limitation: they’re based on historical data and therefore
reactive by their very nature.
To be proactive, you need to peer into the future using big data and predictive analytics. It isn’t
easy but it’s not magic either.
Predictive Analytics
There’s a Minority Report For a Reason
If you’ve seen the movie version of Philip K. Dick’s
“Minority Report,” you saw a future society where the
police use people known as ‘precogs’ to predict the
future and stop crimes before they happen. It’s a fun
premise, and of course it goes spectacularly wrong. It
turns out that predictions made by the precogs aren’t
100% accurate, and when one of them screams out
“NO!” it’s considered a ‘minority report.’ The minority
report throws doubt on the validity of the prediction.
In other words, the police might be arresting the wrong
person.
It seems that with great power indeed comes great
responsibility. And sometimes we screw it up.
So let’s not screw this up.
Predictive Analytics is Not Magic But It’s Not 100% Either
With today’s advances in big data analytics, the ability to accurately predict player behavior is a
reality.
The science can get confusing but here’s a fairly simple way of understanding it.
Let’s say a computer watches all of the events that happen in a game and it starts to recognize
patterns. Some patterns are repeated, while others aren’t. When the patterns are repeated, the
machine “learns” that and starts looking for that pattern to occur again, this time starting to
make a prediction about what is going to come next.
Say the computer sees A-B-C-D, over and over. After a while it recognizes it. Then it sees A-B-C,
and you ask it what is going to happen next. It says “D,” of course, but it can also tell you how
likely this prediction is to be correct. How can it do that? Well, when it looked into the past, it
wasn’t always A-B-C-D. Occasionally it was A-B-C-Q. So the computer also starts to understand
likelihood, and can tell you how often that guess has turned out right. That’s the prediction. No
magic, really.
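The A-B-C-D story can be written down as a tiny frequency model. This is only a sketch of the idea, assuming a made-up history in which A-B-C was followed by D three times and by Q once; production models are far richer, and the function names here are ours for the example.

```python
from collections import Counter, defaultdict

def learn_patterns(sequences, context_len=3):
    """Count which event follows each run of `context_len` events."""
    followers = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq) - context_len):
            context = tuple(seq[i:i + context_len])
            followers[context][seq[i + context_len]] += 1
    return followers

def predict_next(followers, context):
    """Return (most likely next event, share of past cases where it occurred)."""
    counts = followers[tuple(context)]
    if not counts:
        return None, 0.0
    event, hits = counts.most_common(1)[0]
    return event, hits / sum(counts.values())

# Hypothetical history: three A-B-C-D runs and one A-B-C-Q.
history = [list("ABCD"), list("ABCD"), list("ABCD"), list("ABCQ")]
model = learn_patterns(history)
print(predict_next(model, "ABC"))  # ('D', 0.75)
```

The second value is the likelihood the text talks about: how often that guess turned out right in the past.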
How Right Is It?
There’s a technique used to test and verify these predictions. It’s
called (bear with me here) cross-validation.
Here’s how its simplest form works:
Let’s say you have a really big data set. The computer
takes the whole thing and splits it into two halves. With
the first half it looks for patterns and builds up its
model. By the end, it says “We see A-B-C and then D
happens 75% of the time.” Now let’s take that model
and see if it’s accurate in the second, totally untouched
half of the data. If A-B-C-D happens 75% of the time in this data as
well, we all start feeling pretty good about the model’s predictive
power. But we start thinking of it as being 75% accurate, not
100%.
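The two-halves check described above is what statisticians would call a holdout split; k-fold cross-validation repeats the same idea over several different splits. Here is a minimal sketch on synthetic data. The 1,000 sequences and the 75% rate are assumptions chosen to match the example, not real game logs.

```python
import random

def d_rate(sequences):
    """Share of A-B-C sequences that ended in D."""
    abc = [s for s in sequences if s[:3] == "ABC"]
    return sum(s == "ABCD" for s in abc) / len(abc)

# Hypothetical data set: 1,000 observed sequences, 75% of which end in D.
data = ["ABCD"] * 750 + ["ABCQ"] * 250
random.Random(42).shuffle(data)  # fixed seed so the split is repeatable

half = len(data) // 2
train, holdout = data[:half], data[half:]

learned = d_rate(train)     # what the model "learned" from the first half
observed = d_rate(holdout)  # what actually happens in the untouched half
print(f"learned {learned:.2f}, observed in holdout {observed:.2f}")
```

If the two numbers land close together, the pattern generalizes; if they diverge, the model has mostly memorized its first half.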
Why not 100%?
The short answer is that the world has a lot of moving pieces. For
example, you might make a prediction that Bob Smith will spend $20
tomorrow. But what if poor Bob gets hit by a bus or has a big breakup with his fiancé? Those
events could very likely alter his behavior (especially the getting hit by a bus). And you certainly
didn’t plan for those data points in your model. So when Bob doesn’t show up and spend his $20,
you scratch your head. Was your model wrong? No, just incomplete.
Secondly, it’s very easy to cheat with these models. For example, by ignoring false positives or
false negatives, anyone could just tell you that all of the players
will spend $20 tomorrow and guarantee that they’ll have covered
everyone who does. Counting only the players they were right about, they could
report the model as 100% accurate.
This happens more often than you’d think in scientific circles
and when it happens it’s labeled ‘junk science.’ And rightly
so.
Putting Trust in the F-Score
So, to be responsible, Ninja Metrics advocates using
something called an “F-score.” This takes one stat that
penalizes false positives (precision) and another that penalizes false
negatives (recall) and, in its standard form, combines them with a harmonic
mean, so a weak showing on either one drags the whole score down. The result
is a number between 0 and 1 that is far harder to game than a raw accuracy figure.
To score well, you need genuinely good performance.
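For readers who want to see the arithmetic, here is a small sketch of the standard F1 form of the F-score, which is the harmonic mean of precision and recall. The ten-player outcome is made up; the example shows how the “everyone will spend” cheat from earlier scores poorly even though it catches every spender.

```python
def f_score(actual, predicted):
    """F1 score: harmonic mean of precision and recall (0.0 to 1.0)."""
    tp = sum(a and p for a, p in zip(actual, predicted))
    fp = sum((not a) and p for a, p in zip(actual, predicted))
    fn = sum(a and (not p) for a, p in zip(actual, predicted))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)  # hurt by false positives
    recall = tp / (tp + fn)     # hurt by false negatives
    return 2 * precision * recall / (precision + recall)

# Hypothetical outcome: 2 of 10 players actually spend tomorrow.
actual = [True, True] + [False] * 8

# The cheat: claim that everyone will spend. Recall is perfect, precision is 20%.
everyone = [True] * 10
print(round(f_score(actual, everyone), 3))  # 0.333
```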
For reference, F-scores have been used for a long time in churn models
by the telecommunication companies. For industries like these where the
data isn’t particularly detailed, high quality F-scores tend to hit the .35 to .45 range on average.
With game data, you can get much higher F-scores because the data detail in gaming is really,
really good. A .50 to .70 score is very good. A .80 to .90 is spectacular. Anything over that is
rare indeed. Our models tend to hit .85 to .90 on average, and that’s after 7 years of R&D and
academic specialization in game player data algorithms.
Still, at the end of the day, seeing that confidence value in a predictive model is
key. Any consultant or company that gives you a predictive value without that
is no better than voodoo. Science is all about transparency and provability.
Insist on it.
Getting Your Action Items On
Now that you have a value as well as a level of
confidence in it, how do you turn it into actions?
In other words, how do you turn predictive insight into targeted promotions or
player interventions?
Predicting Player Churn Rate
Say you’re trying to predict player churn. You run the model and it comes out with an
overall accuracy of .85 (we’ll say “%” from here on out, but an F-score is best). An overall number
is a great start but it’s not very actionable. What is actionable is a score for an individual player.
Let’s consider two players – A and B:
• Player A has a probability of quitting of 60% in the next week, and the model is 90% accurate.
• Player B has a probability of quitting of 80% in the next week, and the model is 20% accurate.
There are a couple ways of working with this.
One is simple. Just set a threshold percentage and say that anyone over this threshold is worth
taking action on, period. That’s fine, but it doesn’t take into account the fact that you have scarce
resources that you can devote to keeping that player. Also, it doesn’t prevent you from offering
an inappropriate promotion or intervention to a player that wasn’t even planning on quitting the
game in the first place. Imagine the ham-handedness of a “Don’t Go!” promotion aimed at a loyal
and happy player.
The second way of thinking about this is to multiply the two percentages together. Player
A is 60% likely to quit, and we’re 90% sure. Taken together, if we had, say, 100 players
like this, we could count on 54 of them (90% of 60%) quitting next week. It’s a little like
an expected value. Just consider these players 54% likely to leave and react
appropriately. By the same math, Player B comes out at only 16% (20% of 80%), despite the
scarier headline number.
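Both approaches fit in a few lines. The sketch below multiplies probability by confidence for Players A and B and then applies a cut-off; the 50% threshold is an arbitrary value picked for the example, not a recommendation.

```python
def adjusted_churn(probability, confidence):
    """Discount a player's churn probability by how much we trust the model."""
    return probability * confidence

# Players A and B from the example: (churn probability, model confidence).
players = {"A": (0.60, 0.90), "B": (0.80, 0.20)}

THRESHOLD = 0.50  # hypothetical cut-off for taking action

for name, (prob, conf) in players.items():
    score = adjusted_churn(prob, conf)
    action = "intervene" if score >= THRESHOLD else "leave alone"
    print(f"Player {name}: {score:.0%} adjusted churn risk -> {action}")
```

Note that a raw 70% threshold on probability alone would have flagged Player B and skipped Player A, which is the opposite of what the adjusted numbers suggest.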
Predicting Player Lifetime Value
Prediction of a player’s future spending has always been
elusive. But with predictive modeling, a player’s lifetime value,
or LTV, can be reduced to just another predictive metric. You
are predicting both spending and lifespan.
Let’s say there are two new players:
• Player C has an LTV of $150, a likelihood of churning out of
30%, and the model is saying it’s 90% accurate.
• Player D has an LTV of $80, a likelihood of churning out of 90%
and the model is likewise 90% accurate.
What are they each “worth”? First, both models are pretty accurate, so
let’s just take them as trustworthy and ignore the x .9 part.
Simply multiply the LTV by the churn probability. Player C is $150 x 30%, or $45 of expected value at risk,
and Player D is $80 x 90%, or $72 of expected value at risk.
Which player do you want to spend resources to keep?
If all you’re going to do is send a zero-cost email, by all means send it to them both. But when
are you willing to give away something more costly like a free item, a free month of play or a fruit
basket in the mail? Compare the cost of those interventions with the expected value and make a
rational decision. If a fruit basket costs $50 and will save the player, then it makes rational sense
to send it to Player D, but not C.
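The fruit basket decision is simple enough to sketch directly. The helper names are ours, and the code makes the same simplifying assumption as the text: the intervention actually saves the player.

```python
def value_at_risk(ltv, churn_probability):
    """Expected revenue lost if the player churns: LTV x churn probability."""
    return ltv * churn_probability

def worth_intervening(ltv, churn_probability, cost):
    """True when the value at risk exceeds the cost of the intervention."""
    return value_at_risk(ltv, churn_probability) > cost

FRUIT_BASKET = 50.0  # intervention cost in dollars, from the example

for name, ltv, churn in [("C", 150.0, 0.30), ("D", 80.0, 0.90)]:
    at_risk = value_at_risk(ltv, churn)
    decision = worth_intervening(ltv, churn, FRUIT_BASKET)
    print(f"Player {name}: ${at_risk:.0f} at risk, send basket: {decision}")
```

If an intervention only works some of the time, you would multiply the value at risk by its success rate before comparing it to the cost.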
Ninja Metrics’ Katana software provides all of these numbers to enable exactly these kinds of
transparent and rational decisions. We know that game developers will have their own tastes for
how they handle interventions—community managers, customer service reps, email campaigns,
push notification systems, etc.—and we support whatever systems we learn about. The best
possible analytics system will allow you to set your own thresholds, decide on your own course of
action, connect directly to the vendor or system for that action, then track the results.
Red Ball vs. Blue: “Real-Time” Analytics
Back to Minority Report: you may recall that the system etched
the future criminal’s name on a little wooden ball. If the ball was
blue, the crime was going to occur some time off in the future.
The police had time to plan a proper response. If it was red, the
crime was imminent and the police had to react immediately.
In the game world, this would be like viewing a list of possible quitters with high probability of
quitting in the next 48 hours and then running a quick intervention to reduce churn.
With Ninja Metrics’ software, blue and red ball analysis is possible. You can use dashboards to
monitor and review and think big picture, but you can also automate these decisions.
For example, if you know that Player X is going to quit in two weeks, great, you have some time
for management to take action. But what about Player Y, who’s quitting tomorrow? How nimble
are you? You should have any “red ball” players set off a trigger that sends notification to the
right person.
For example, any time a player with an LTV over $5 has a churn probability above 60%, with a
confidence level of 70% or more, send an urgent email to Lisa Brown, the game’s community manager.
Predictive is Not Real-Time
Nearly every analytics company says they offer “real-time analytics.” To some
extent, this is statistical and marketing sleight of hand. Yes, we can all process
your users and tell you how many logged in, or are on level 5, or have spent
money. That’s easy, and it can be as fast as a (very) big Excel spreadsheet.
But predictive modeling is definitely not instantaneous. It takes time to run
these models. How long depends on the number of players and the
number of data points you’re considering—not to mention how
well your analytics team deals with Hadoop and MapReduce!
If your game has 10,000 players, these models are going
to be very fast. But if your game has 10 million players,
it might take a few hours. And heavy duty models like
Ninja Metrics’ Social Value system take longer still.
The key is to consider the trade-off between the cost
of running your models and your ability to quickly
act on their results. We suggest running them daily
because of the trade-offs of processing costs and
many companies’ inability to act on things any faster.
Remember the red ball. So, if you’re ninja nimble and
can afford extra costs, consider running your models
more frequently.
It’s about actionability at the end of the day. Given a
specific result from the predictive models, would you act? If
you had more actionable data, more often, would it be worth
it to your business to run a promotion or intervention? If so,
consider springing for it. If not, don’t waste resources.
Data Models and Why Your Education Probably Wasn’t Good Enough
Most marketers are trained in a business school, or perhaps some
kind of social science program like communication, sociology,
etc. In these programs, we learn useful statistical tools such as
correlations, ANOVAs, and most often, regression models.
In a regression model, we have an outcome variable (dependent variable) and some number of
predictors (independent variables). Applying this to calculating player churn, we might have a
model that looks something like:
Quitting = Gender + Time Spent + Character Type + Error
And we’d take a look at the overall stats for the model and determine if we thought it was
trustworthy enough. Maybe we’d look at the standard error of the model,
maybe the r-squared, etc. If it’s “good enough,” we’ll take a look at the
coefficients on the independent variables, see which ones reached
statistical significance and how big they are, and in what
direction. Fair enough.
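For anyone who hasn’t run one in a while, here is a stripped-down sketch of that workflow. It is a deliberate simplification: one predictor (time spent) instead of three, invented data, and a plain least-squares line even though a yes/no outcome like quitting would normally call for logistic regression.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(xs, ys, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical data: weekly hours played vs. whether the player quit (1) or not (0).
hours = [1, 2, 3, 5, 8, 10, 12, 15]
quit_ = [1, 1, 0, 1, 0, 0, 1, 0]

intercept, slope = fit_line(hours, quit_)
fit = r_squared(hours, quit_, intercept, slope)
print(f"slope {slope:.3f}, r-squared {fit:.2f}")
```

The sign of the slope gives the direction of the effect and the r-squared gives the overall fit, which are the two things the paragraph above describes checking.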
Here’s the problem.
Those models aren’t very good. They’re serviceable:
sometimes you get an r-squared of around .46 and you’re
reasonably confident given the tools you used, i.e.
statistical models based on sampling.
But if we want models that reach accuracy levels of over
.5, we’re going to have to leave the world of old-school
statistical modeling and get with big data.
And big data is the realm of computer science, not social
science.
Do You Want the Good News or the Bad News?
The good news is that computer science has models with power that,
frankly, beat the crap out of regular social science and b-school approaches. It’s just night and
day. You can now have models that hit 60%, 80%, sometimes 90%+ accuracy levels.
The bad news is that they don’t look like regressions anymore and you need new training to
understand them. The results come out in tables, if-then statements, rule sets and other long,
unfathomable formats. Literally no human being can intuit much of it. Ninja Metrics has spent 7
years figuring out best practices in this new area of science and it’s still challenging.
So how can a model be good if it’s not even understandable?
Fair question. But do you really need to understand why it works, or is it good enough to
understand that it just ‘is’ with a high degree of certainty?
Imagine you want to know if Player A is going to spend money next month. There’s a black box
there that will tell you, with 85% accuracy, if she will. But you can’t know why. Or, there’s another
transparent box that will tell you, with 40% accuracy, and you can know why. Which box do you
want?
From a practical, actionable point of view, it’s actually an easy question to answer. If you’re going
to run interventions and test their effectiveness anyway, you’re going to get the “why” eventually.
And if you’re going to send everyone the same email anyway, it’s irrelevant.
Don’t get me wrong. I’m a long-time modeler who likes to know the “why”. But if I can get 80%+
confidence levels without knowing the “why,” I’m happy to give it up. I would prefer to have other
parts of my dashboard focus on “why” issues. It’s the smarter entrance to the rabbit hole. And
again, it’s actionable.
If you’re a larger gaming company, you may have a sharp analyst down in the BI department.
That’s great, but she’s probably not running an automated model every day. And even if she is,
does she have the question-asking and contextual skills (mostly right brain) of a social scientist as
well as the hard-core big data skills of a computer scientist (mostly left brain)? In this early era of
big data a person like this is rare and in extremely high demand.
And this is precisely why we built Ninja Metrics. It enables marketing, BI and the developers to
ask the right questions and then automate all the answers.
Our Katana system is essentially a team of PhDs in a box, working daily.
If you have some PhDs on staff already, fantastic. They will take the tool even farther. Ninja
Metrics supplies scads of new and powerful DVs and IVs (dependent and independent variables) for them to play with and investigate
further—all in an easy to use, automated system. They’ll have quick answers they can’t come up
with on their own, and will conceive of more uses for them than we will ever think up. Win-win.
References
Webb, E., D. Campbell, et al. (1966). Unobtrusive measures: Non-reactive research in the social
sciences. Chicago, Rand McNally and Company.
