February 12, 2009

Data mining will save us all

Most people these days could tell you that making a videogame involves writing a design document and then making the game. What most people don't think about is that almost no game is identical to its original design doc by the time you play it. Designers think or hope players will respond in a certain way to a given set of mechanics, which is not always the case. This results in problems that need to be solved.

The better a designer you become, the more "correct" your designs will be on the first try, which means you can spend more time making cool stuff and less time remaking uncool stuff. But even the best designers will have some problems to address. Only bad designers ever think their game has no problems. Successfully identifying which problems to address, and how, is the mark of a good designer.

The scientific method

I've said before that it's important to seriously consider every reported problem, but most reports, especially those submitted by players, are so conflicting and subjective that they're impossible to easily verify. This is why a scientific approach to problem solving is very useful.

As you may remember, one common version of the scientific method involves 4 major steps:
  1. Gather data (observations about something that is unknown, unexplained, or new).
  2. Hypothesize an explanation for those observations.
  3. Deduce a consequence of that explanation (a prediction). Formulate an experiment to see if the predicted consequence is observed.
  4. Wait for corroboration. If there is corroboration, go to step 3. If not, the hypothesis is falsified. Go to step 2.

Steps 1 and 4 of this process are very difficult to achieve manually, without devolving into subjective arguments.

Most games with large playerbases and complex systems are impossible for any one brain to completely comprehend. There are trends and patterns that are too subtle to notice based on anecdotal evidence. Without some sort of overview, it's impossible to prove that a problem is real, or that a problem has been successfully solved.
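To make "prove that a problem has been solved" a bit more concrete, here's a minimal sketch of one way to check whether a change actually moved a number, rather than arguing about it. It uses a standard two-proportion z-test, which is my choice for illustration and not the only option. The function name and the quest-abandonment numbers are made up for the example.

```python
import math

def two_proportion_z_test(hits_a, total_a, hits_b, total_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    rate_a = hits_a / total_a
    rate_b = hits_b / total_b
    # Pooled rate under the assumption that both samples share one true rate.
    pooled = (hits_a + hits_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        raise ValueError("rates are all 0 or all 1; the test is undefined")
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical data: players who abandoned a quest, before and after a patch.
z, p = two_proportion_z_test(420, 2000, 310, 2000)
print(f"z = {z:.2f}, p = {p:.6f}")
```

A small p-value suggests the drop in abandonment is unlikely to be noise. It doesn't tell you why the number moved, so you still need the hypothesis from step 2.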

This is where data mining comes in

All companies worth their salt recognize the need for objective evaluation of how well their game's design is working, and many implement some sort of large-scale data mining to provide them with raw data to analyze. Some companies even make this data public:

CCP's economic analyses of EVE:

Valve's game demographic data, win/loss ratios and kill maps for TF2:

Players even have their own versions of data mining. Thottbot shows mob locations, drop rates, resources, and lots of other data just by tracking where players who have installed a special tool are, and what they see:

Which data should you be mining?

The short answer is, as much as you possibly can without noticeably slowing down your game. Here are some examples for an MMO:
  • Where players are killed, their class, their level, which enemy killed them, and which power killed them.
  • Where players are killing things, which enemies are killed (farmed) the most, by class and level.
  • XP gain rates per hour, by level, location, and class.
  • Item drops by mob, level and location.
  • Quests accomplished, by class and level
  • Abandoned quests, by class and level
  • All stats of all characters, per level and class
  • All chosen abilities of all characters, per level and class
  • Gear choices for all characters, per level and class
  • Group compositions by class, per level
  • Class distribution per faction and level
  • All the above data, as it pertains to PvP
  • All wealth for all characters, and deltas per hour
  • All player income and expenditures, by zone and level
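As a rough sketch of what the first bullet might look like in practice, here's one way to record death events and slice them by any combination of fields. Everything here (the DeathEvent record, its field names, the deaths_by helper, and the sample zones and mobs) is hypothetical and only meant to show the shape of the data.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class DeathEvent:
    """One logged player death (hypothetical schema)."""
    zone: str
    player_class: str
    player_level: int
    killer: str
    power: str

def deaths_by(events, *fields):
    """Count death events grouped by the given attribute names."""
    return Counter(tuple(getattr(e, f) for f in fields) for e in events)

# Made-up sample data.
events = [
    DeathEvent("Darkwood", "Mage", 12, "Wolf", "Bite"),
    DeathEvent("Darkwood", "Mage", 13, "Wolf", "Bite"),
    DeathEvent("Darkwood", "Warrior", 12, "Wolf", "Howl"),
    DeathEvent("Saltmarsh", "Mage", 20, "Crab", "Pinch"),
]

by_zone_and_class = deaths_by(events, "zone", "player_class")
top_killers = deaths_by(events, "killer", "power").most_common(1)
print(by_zone_and_class)
print(top_killers)
```

The point of logging flat records like this is that the same events can answer questions you haven't thought of yet: group by zone today, by class and power tomorrow.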
You'll need a nice tool that can send out automatic reports of this data in various visualizations, and the ability to create charts based on any combination of data. Then this information can begin to tell you things that will help you make your game better:
If you're generating automated daily reports on all this data, and you have an alert system set up to catch and report any wildly out-of-whack numbers immediately, you'll be able to fix any terrible bugs before they become well-known or cause permanent damage to your game or playerbase.
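Here's a minimal sketch of what such an alert check could look like, assuming you keep a short history of each daily metric. It flags any metric whose value today is more than a few standard deviations from its recent average. The metric names, the numbers, and the threshold of 3 are all assumptions for illustration, and a real system would probably want something more robust than a plain z-score.

```python
import statistics

def find_anomalies(history, today, threshold=3.0):
    """Return {metric: z_score} for metrics far outside their recent history."""
    alerts = {}
    for metric, value in today.items():
        past = history.get(metric, [])
        if len(past) < 2:
            continue  # not enough history to judge
        mean = statistics.mean(past)
        stdev = statistics.stdev(past)
        if stdev == 0:
            # Flat history: any change at all is worth a look.
            if value != mean:
                alerts[metric] = float("inf")
            continue
        z = (value - mean) / stdev
        if abs(z) > threshold:
            alerts[metric] = z
    return alerts

# Made-up daily numbers for two hypothetical metrics.
history = {
    "gold_created_per_hour": [1000, 1050, 980, 1020, 990],
    "xp_per_hour_level_10": [500, 510, 495, 505, 490],
}
today = {"gold_created_per_hour": 25000, "xp_per_hour_level_10": 502}

alerts = find_anomalies(history, today)
for metric, z in alerts.items():
    print(f"ALERT: {metric} is {z:.1f} standard deviations from normal")
```

In this example only the gold metric trips the alarm, which is the kind of thing you'd want to hear about right away, since it could mean a duplication bug or an exploit.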

A good model for a successful data mining/notification system should feel a bit like what your credit card company does. You can log in at any time to make sure things are going OK, and they'll periodically send you reports. When there's some suspect activity on your card, their fraud-prevention bells go off and they call you on the phone immediately to make sure the purchases are legitimate.

For more on data mining, check out Sara Jensen Schubert's blog; she has lots of interesting posts on it.


Ali I said...

finally, a post about my job! :P but seriously, it's really funny how so many of your posts could easily apply to my work, or my company.

Mike Darga said...

Yeah, I guess a lot of this stuff isn't really specific to game design, even though that's the context I'm discussing it in.

Or maybe that means you should become a game developer =)