Let’s start off proceedings with a quote that tries to explain one of the most pervasive yet still misunderstood uses of statistics today – polling – and with it the whole philosophy of inferring something about a population from a random sample:
Polling a sample of the population has often been likened to tasting soup: if it is well stirred then you only need to have one spoonful to tell what the whole bowl is like. [Source: BBC News’ Election 2005 Guide to pollsters’ methodology]
Excellent. On the other hand, they go and mess it up by correctly – if necessarily obliquely – explaining the “3% margin of error”:
Polling companies generally claim that 95% of the time, a poll of 1,000 people will be accurate within a margin of error of +/-3%.
but then going on to say
This means that figure in the poll could be up to three percentage points higher or lower than that shown.
Well, it could also be 50% higher or lower than that shown, assuming a normal distribution for the likelihood (just with a vanishingly small probability), so that’s not really useful. This might be a minor example, but it shows how careful we must be when talking about statistics in natural, human language. All too often when the media pundits discuss these polls, someone will pipe up with something like “Oh, but Party A is only 4% ahead of Party B, so with the 3% margin of error, maybe Party B is actually ahead!” They say this even if Party A has been 3-4% ahead of Party B for eight weeks in a row. Indeed, the BBC falls into this trap:
So if the Tories are on 32% and Labour is on 38%, there is a chance they could both be on 35%.
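To put rough numbers on that "chance", here is a minimal sketch in Python. It assumes a simple random sample of 1,000 people and the usual normal approximation, which real polls (with their weighting and quotas) only loosely satisfy. The function names are my own, and the second function loosely treats the sampling distribution centred on the observed gap as a statement about the true gap.

```python
import math
from statistics import NormalDist


def margin_of_error(p, n, confidence=0.95):
    """Half-width of the normal-approximation interval for a share p
    estimated from a simple random sample of size n (an assumption)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(p * (1 - p) / n)


def prob_gap_reversed(p_a, p_b, n):
    """Approximate chance that the true gap (A minus B) is zero or negative,
    given observed shares p_a > p_b from a single poll of size n."""
    # Variance of the difference of two shares from the same sample;
    # the two shares are negatively correlated, which widens the spread.
    variance = (p_a + p_b - (p_a - p_b) ** 2) / n
    gap = NormalDist(mu=p_a - p_b, sigma=math.sqrt(variance))
    return gap.cdf(0.0)


n = 1000
moe = margin_of_error(0.5, n)            # worst case: a share of 50%
tie = prob_gap_reversed(0.38, 0.32, n)   # Labour 38%, Tories 32%

print(f"95% margin of error for n={n}: +/-{moe:.1%}")
print(f"Chance the 6-point lead is really a tie or worse: {tie:.1%}")
```

Under these assumptions the margin of error comes out at a little over 3%, matching the pollsters’ claim, while the chance that a six-point lead is really a tie or a deficit is on the order of 1%. So yes, there is "a chance", but it is nowhere near as likely as the BBC’s phrasing suggests, and that is from a single poll rather than eight weeks of them.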
I think a Pollwatch should be set up for the most egregious implementations or analyses of polls, if there isn’t one already. Or maybe there would be just too many examples.