Science journalists have a really tough job. They must, in the little space afforded by a newspaper column, convey some measure of truth and a large helping of human interest to the reader. And readers are tricky beasts! Even when an article is not about science but about something that will directly affect their own lives today (a story about the financial crisis, perhaps, or a crime committed in their area), readers have a tendency to stop reading partway through the text and move on to the next gripping headline.
In order to convey the maximum of information in the minimum of time and effort on the reader’s part, one tool journalists love to use is statistics. What better way to interest a reader than to tell them that 50% of men will suffer from erectile dysfunction at some time in their lives, or that 1 in 10 humans are gay or bisexual? Bam! One easy sentence, such high impact.
Naturally, the reality of any given situation is rather more complex than newspaper statistics portray. It must be: you were given only one number to explain all of the results! And of course, when politicians or salespeople get involved the situation gets even more twisty. If a statistic sounds like it supports their view, they won’t look too closely at its validity. I’m sure that all of you, having made it to university, have already figured this out and found a way to deal with it. One can deal with the reporting of statistics in many ways, ranging from a flat refusal to lend credence to any statistic (which, by the way, is not too unjustified) all the way up to proper research, tracking down the citations and the methodology used. Most of us lack the tools or connections to carry out the latter, and besides, one would seldom have enough motivation to spend the required time doing so. Today I’d like to share with you a couple of shortcuts to help you interpret the meaning of a statistic.
When people quote you a statistic, it’s important to think about exactly what it means. Almost always when people use statistics, they are trying to prove some point or other. You should always double-check whether the fact they use actually supports their point, because often enough it doesn’t. Occasionally the misrepresentation is malicious, but I think that most of the time it is simply because the quoter has not fully understood the meaning of the statistic themselves. The key word here is exactly.
I’ll give a simple example of this. Say there was a 50% year-over-year increase in cases of breast cancer reported in the year after breast cancer screenings were made available for free in some sector of the population [1]. One might quote this statistic as evidence that the rate of breast cancer was increasing, or that the system was failing. What exactly does this statistic tell us? The figure does not say directly that there is an increasing rate of breast cancer in the population. Nor does it say anything about the rate of success of treatments for breast cancer, or the survival rate of those suffering from it. The only thing this figure measures is the number of diagnoses of breast cancer. It is unjustified to use this figure by itself to claim an increase in the rate of breast cancer (though if more facts were compiled one might be able to argue a case). If one looks at the broader situation, it should be no surprise that an increase in breast cancer diagnoses follows a year of freely available screening: more people were tested, so more existing cases were found.
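To make the arithmetic concrete, here is a rough sketch in Python. Every number in it is invented for illustration, just like the statistic itself: I assume the true number of cases stays fixed at 1,000 and that free screening raises the fraction of cases that get detected from 40% to 60%. The names `percent_change`, `true_cases`, `detected_before` and `detected_after` are my own.

```python
def percent_change(old, new):
    """Relative change from old to new, as a fraction (0.5 means +50%)."""
    return (new - old) / old


# Hypothetical numbers, invented for illustration only.
true_cases = 1000        # people who actually have the disease, same in both years
detected_before = 0.40   # assumed fraction of cases found before free screening
detected_after = 0.60    # assumed fraction of cases found once screening is free

# Diagnoses are the cases that exist AND get found.
diagnoses_before = true_cases * detected_before
diagnoses_after = true_cases * detected_after

rise_in_diagnoses = percent_change(diagnoses_before, diagnoses_after)
rise_in_true_cases = percent_change(true_cases, true_cases)

print(f"Change in diagnoses:  {rise_in_diagnoses:+.0%}")
print(f"Change in true cases: {rise_in_true_cases:+.0%}")
```

Under these assumptions the count of diagnoses jumps by half while the underlying number of cases does not move at all, which is exactly the gap between what the figure measures and what it gets quoted as showing.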
Secondly, it is important to carefully consider the context of any given statistic. “Thanks to my intervention, sales have risen 13%!” This figure sounds good by itself. But you’ve got to then ask other relevant questions, such as “by how much have our competitors’ sales risen in the same period?” or “how have our expenditures changed?” Once the bigger picture is taken into account the figure might not be so great after all. At the end of the day, statistics are a tool to help us make decisions, and you are almost never in a situation where you are making a choice between doing one thing and doing nothing. Any new medical treatment these days must not only be better than no action, and not only be better than a placebo; to count as a success it must give a better result than the best available treatment (although a cheaper treatment with similar results is also worthwhile).
Any improvement seems great when you compare it to nothing, and it’s hard to see how a statistic given in isolation in this manner could be misleading. The sample size can be huge and comprehensive, the methodology perfect and the results correctly interpreted, and yet it could still be meaningless without a comparison. For that reason, I think the single most important question to ask in the face of any figure is: how would this have been different had another choice been made?
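As a small sketch of that question applied to the 13% sales figure, the Python below compares the same rise against a few baselines. The competitor growth rates are hypothetical, and `growth_relative_to` is a helper name I made up; it simply divides one growth factor by the other.

```python
def growth_relative_to(own_growth, baseline_growth):
    """Growth measured against a baseline instead of against nothing.

    Both arguments are fractions (0.13 means +13%). A negative result
    means we did worse than the baseline despite growing.
    """
    return (1 + own_growth) / (1 + baseline_growth) - 1


own_growth = 0.13  # the quoted "sales have risen 13%"

# Hypothetical baselines, invented for illustration only.
baselines = [
    ("compared to nothing", 0.00),
    ("competitors also up 13%", 0.13),
    ("competitors up 20%", 0.20),
]

for label, baseline in baselines:
    relative = growth_relative_to(own_growth, baseline)
    print(f"{label:>24}: {relative:+.1%}")
```

The 13% only looks impressive on the first line. Against competitors who grew just as fast it amounts to standing still, and against faster competitors it is a loss of ground.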
Until next time, I hope you are now better equipped to apply some healthy skepticism to any figure quoted to you!
Acknowledgement – A lot of this week’s column is inspired by the work of Ben Goldacre. Try his book, “Bad Science,” or his blog at www.badscience.net.
[1] This is not an actual statistic. I made this up for the purpose of this example.