insight they all share, is that data-based decision making doesn't need to be limited to the conscious preferences of the masses. Instead, it is possible to study the results of decisions and tease out from inside the data the factors that lead to success. This chapter is about how simple regressions are changing decisions by improving predictions. By sifting through aggregations of data, the regression technique can uncover the levers of causation that are hidden to casual and even expert observation. And even when experts feel that a particular factor is an important determinant of some outcome, the regression technique literally can price it out.
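To make "pricing out" a factor concrete, here is a minimal sketch of a one-variable regression in plain Python. The data, the variable names, and the helper functions are all invented for illustration and do not come from any study described here. The fitted slope is the "price": the estimated change in the outcome for each one-unit increase in the factor.

```python
# A minimal sketch of simple (one-predictor) least-squares regression.
# All numbers below are hypothetical and exist only to show the mechanics.

def fit_simple_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Predicted outcome for a given value of the factor."""
    return intercept + slope * x


# Hypothetical factor (say, a score from 1 to 5) and a hypothetical outcome.
factor = [1, 2, 3, 4, 5]
outcome = [12.0, 9.0, 8.0, 5.0, 4.0]

intercept, slope = fit_simple_regression(factor, outcome)
forecast = predict(intercept, slope, 6)

print(f"each extra unit of the factor changes the outcome by {slope:.2f}")
print(f"predicted outcome when the factor is 6: {forecast:.2f}")
```

A real analysis would use many more observations and usually several factors at once, but the logic is the same: the data, not the expert, assigns each factor its weight.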
Just for fun, Garth Sundem, in his book Geek Logik, used a regression to create a formula to predict how long celebrity marriages will last. (It turns out that having more Google hits reduces a marriage's chances, especially if the top Google hits include sexually suggestive photos!) eHarmony, Perfectmatch, and True.com are doing the same kind of thing, but they're doing it for profit. These services are engaged in a new kind of Super Crunching competition. The game's afoot, and it's a very different kind of game.
Harrah's Feels Your Pain
The same kind of statistical matchmaking is also happening inside companies like Lowe's and Circuit City, which are using Super Crunching to select job applicants. Employers want to predict which job applicants are going to make a commitment to their job. Unlike traditional aptitude tests that try to suss out an applicant's IQ, the modern tests are much closer to eHarmony's questionnaire in trying to evaluate three underlying personality traits of the applicants: their conscientiousness, agreeableness, and extroversion. Data mining shows that these personality traits are better predictors of worker productivity (especially turnover) than more traditional ability testing. Barbara Ehrenreich was appalled when she took an employment test at a Minneapolis Wal-Mart and was told that she had given the wrong answer when she agreed with the proposition "there is room in every corporation for a non-conformist." Yet regressions suggest that people who think Wal-Mart is for non-conformists aren't a good fit and are more likely to turn over. It's one thing to argue that Wal-Mart and other employers should reorganize their mind-numbing jobs to make them less boring. But in a world where mind-numbing jobs are legal, it's hard for me to see what's wrong with a statistically validated test that helps match employees that are most compatible with those jobs.
Mining for non-obvious predictors is not just about hiring good applicants. It's also helping businesses keep their costs down, especially the costs of stagnant inventory. Businesses that can do a better job of predicting demand can do a better job of predicting when they are about to run out of something. And it can be just as important for businesses to know when they're not about to run out of something. Instead of bearing the costs of large inventories lying around, Super Crunching allows firms to move to just-in-time purchasing. Stores like Wal-Mart and Target try to get as close as possible to having no excess inventory on hand at all. "What they have on the shelf is what they've got," said Scott Gnau, general manager of the data-mining company Teradata. "If I buy six cans of yellow corn off the shelf, and there are now three cans left, somebody knows that happened immediately so they can make sure that the truck coming my way gets some more corn loaded on it. It's gotten to the point that as you're putting stuff in your trunk, the store is loading the truck at the distribution center." These prediction strategies can be based on highly specific details about likely demand. Before Hurricane Ivan hit Florida in 2004, Wal-Mart already had started rushing strawberry Pop-Tarts to