In an unpredictable world, can we rely on big data? Experiments suggest more information does not increase the accuracy of predictions.

by Alastair Dryburgh
Last Updated: 01 Apr 2016

Consider the following experiment, commissioned by the CIA, no less, in the late 1990s. Eight experienced horserace handicappers were shown a list of 88 different pieces of information that they might use to predict a horse's performance: the weight carried, how often the horse had placed first in the past, the number of days since its last race, and so on.

They were then asked to say which pieces of information were the most important: which five they would use if they could use only five, then which 10, which 20 and which 40.

Next, they were given the data from 40 races and asked to predict the winning horse in each, first using their five chosen pieces of information, then 10, 20 and 40. Each time, they were also asked to say how confident they felt in their prediction.

The result? More information did not increase the accuracy of the prediction. Using 40 pieces of information was no better than using five. But using more information did increase the handicappers' confidence in their predictions.

Other studies found similar results with clinical psychologists and doctors.

Now, I trained as a mathematician, so you might expect me to be attracted by the idea of big data. But, actually, my studies left me with a properly humble view of the value of quantitative techniques in an unpredictable world.

Which is more valuable: a prediction that has a 40% chance of coming true, which you believe has a 40% chance of coming true, or a prediction that has a 50% chance of coming true, which you believe has a 70% chance of coming true?

The first is a realistic picture of reality that is useful (up to a point). The second is dangerous overconfidence. 'Mathematics', as one of my old teachers once said, 'requires no maturity of mind.' But success in business requires precisely that: judgement, intuition and grasp of nuance.
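To put rough numbers on that comparison, here is a minimal sketch in Python. It rests on assumptions that are mine rather than the experiment's: the forecaster stakes 100 on a bet whenever the chance they believe in exceeds the break-even chance implied by the odds on offer, and the odds offered break even at 60%. The function name and parameters are hypothetical, chosen only for illustration.

```python
def expected_profit(true_p, believed_p, break_even_p, stake=100.0):
    """Average profit per bet for a forecaster who bets only when the
    probability they believe in exceeds the break-even probability.

    true_p       -- the real chance the prediction comes true
    believed_p   -- the chance the forecaster thinks it has
    break_even_p -- the chance at which the offered odds pay out fairly
    """
    if believed_p <= break_even_p:
        return 0.0  # the forecaster sees no edge and declines the bet

    # Net winnings on a successful bet at odds that are fair at break_even_p.
    payout = stake * (1.0 - break_even_p) / break_even_p

    # Win the payout with probability true_p, lose the stake otherwise.
    return true_p * payout - (1.0 - true_p) * stake


# Hypothetical odds that break even at 60%.
calibrated = expected_profit(true_p=0.4, believed_p=0.4, break_even_p=0.6)
overconfident = expected_profit(true_p=0.5, believed_p=0.7, break_even_p=0.6)
print(calibrated, round(overconfident, 2))
```

Under these assumptions the calibrated forecaster declines the bet and loses nothing, while the overconfident one takes it and loses about 16.67 per 100 staked on average, despite holding the more accurate prediction.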

So, when big data come a-calling, ask yourself: do you actually need more information? Is it improving the quality of your prediction, or merely your confidence in it?

