Archive for Statistical blunders

Bad Bayes still bad

Tamino, a notorious “climate change” blogger, is alleged to also be a statistician. He certainly seems to know something about time series. (Thanks to this investigation, we know that Tamino is Grant Foster, writer of “blog diatribe”-style climate papers. His affiliation in the linked paper is “Tempo Analytics, Westbrook, Maine”, but I can’t find any other reference to it online.)

Unfortunately he might be somewhat off-base when it comes to other statistical principles. His discussion of Bayesian analysis is so confused that I’ll leave it to Andrew Gelman, professor of statistics at Columbia University, to summarise it for us:

Kent Holsinger sends along this statistics discussion from a climate scientist. I don’t really feel like going into the details on this one, except to note that this appears to be a discussion between two physicists about statistics. The blog in question appears to be pretty influential, with about 70 comments on most of its entries. When it comes to blogging, I suppose it’s good to have strong opinions even (especially?) when you don’t know what you’re talking about.

Update: Gelman repeated himself on his academic blog, where he elaborates on his opinion in the comments. It’s strange that when I tried commenting (twice) on “Tamino”’s blog to refer him to Gelman’s comments, neither comment appeared; but when someone else posted the same link with the qualifier that “[Gelman] comes around to Tamino’s side” in his later comments [which is not actually true], it was let through.

At the time of writing, the comment thread ends with “Tamino” abusing a commenter who was trying to correct one of his calculations, before eventually admitting that he was indeed wrong. Oh dear.


Another sampling from the great frequentist malpractice genre in the sky

That this isn’t well-known amongst the general public is a disgrace, but the “scientific method” as carried out by academic careerists has long been only a poor substitute for real science:

It’s science’s dirtiest secret: The “scientific method” of testing hypotheses by statistical analysis stands on a flimsy foundation. Statistical tests are supposed to guide scientists in judging whether an experimental result reflects some real effect or is merely a random fluke, but the standard methods mix mutually inconsistent philosophies and offer no meaningful basis for making such decisions. Even when performed correctly, statistical tests are widely misunderstood and frequently misinterpreted. As a result, countless conclusions in the scientific literature are erroneous, and tests of medical dangers or treatments are often contradictory and confusing.

Then follow the usual errors relating to the interpretation of hypothesis tests and other applied frequentist gunk. There is an interesting point made about how randomisation isn’t all it’s cracked up to be (although what the alternative should be is anyone’s guess), before… behold!

Such sad statistical situations suggest that the marriage of science and math may be desperately in need of counseling. Perhaps it could be provided by the Rev. Thomas Bayes.

A lovely line. Whether this latest example of the litany against the standard operating procedure of too many scientists from all disciplines will change anything more than the previous attempts to do so is moot.


Bayesian calculation of Bayesian calculation?

From Schneier’s Security blog, a lucid and highly readable commentary on security-related news, comes this comment:

The Home Secretary, John Reid, stated in December that an attempted terrorist attack in the UK over Christmas was “highly likely” … Since there wasn’t one, I think Bayes’ Theorem tells us that it is “highly likely” that Reid, and hence also MI5, either don’t know what they’re talking about, or else were lying.

From my limited experience, if nothing else, I can reason that this is not true, or at the very least not necessarily true: a single non-occurrence is only weak evidence against a probabilistic forecast. But what to do with such calculations, which one could argue are boundedly rational given ignorance about Bayesian matters and only a very limited amount of time to work through the logic? Is this a tolerable consequence of increasing awareness of Bayesianism?
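To sketch why the commenter’s inference doesn’t follow, here is a toy Bayesian update. Every number in it is an assumption of mine rather than anything in the quote: reading “highly likely” as an attack probability of 0.9, positing a “bluffing” alternative of 0.5, and starting from an agnostic 50/50 prior.

```python
# Toy Bayesian treatment of the Reid/MI5 inference. All numbers are
# made-up assumptions for illustration, not anything from the quote.
#
# H1 "reliable": when MI5 say "highly likely", the attack probability
#                really is high, say 0.9.
# H2 "bluffing": the warning is uninformative; true probability 0.5.
prior = {"reliable": 0.5, "bluffing": 0.5}          # agnostic prior
p_no_attack = {"reliable": 0.1, "bluffing": 0.5}    # P(no attack | H)

# Observation: no attack took place over Christmas.
unnorm = {h: prior[h] * p_no_attack[h] for h in prior}
total = sum(unnorm.values())
posterior = {h: unnorm[h] / total for h in unnorm}

print({h: round(p, 3) for h, p in posterior.items()})
# {'reliable': 0.167, 'bluffing': 0.833}
```

One quiet Christmas moves the odds against MI5 by a likelihood ratio of five, which is evidence of something, but hardly grounds to declare it “highly likely” they were lying; and under a prior that favoured MI5 to begin with, the posterior could easily still favour them.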


Polls and mass confusion

Let’s start off proceedings with a quote trying to explain one of the most pervasive yet still misunderstood uses of statistics today – polling. And with it, the whole philosophy of making inferences about a population from a random sample:

Polling a sample of the population has often been likened to tasting soup: if it is well stirred then you only need to have one spoonful to tell what the whole bowl is like. [Source: BBC News’ Election 2005 Guide to pollsters’ methodology]

Excellent. On the other hand, they then go and mess it up. They start by correctly – if necessarily obliquely – explaining the “3% margin of error”:

Polling companies generally claim that 95% of the time, a poll of 1,000 people will be accurate within a margin of error of +/-3%.

but then go on to say

This means that figure in the poll could be up to three percentage points higher or lower than that shown.

Well, it could also be 50% higher or lower than that shown, assuming a normal distribution for the likelihood, so that’s not really useful. This might be a minor example, but it shows how careful we must be when talking about statistics in natural, human language. All too often when the media pundits discuss these polls, someone will pipe up with something like “Oh, but Party A is only 4% ahead of Party B, so with the 3% margin of error, maybe Party B is actually ahead!”. Even if Party A has been 3-4% ahead of Party B for eight weeks in a row. Indeed, the BBC falls into this trap:

So if the Tories are on 32% and Labour is on 38%, there is a chance they could both be on 35%.
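For what it’s worth, the ±3% figure is just the normal-approximation confidence half-width for a sample proportion, maximised at p = 0.5. A quick sketch (function names are mine, and it assumes simple random sampling, which pollsters’ weighted samples only approximate):

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of the approximate 95% confidence interval for a
    sample proportion, via the normal approximation to the binomial."""
    return z * math.sqrt(p * (1 - p) / n)

# Worst case p = 0.5 with n = 1000 gives the familiar ~3 points:
print(round(100 * margin_of_error(0.5, 1000), 1))  # 3.1

def margin_of_lead(p_a, p_b, n, z=1.96):
    """Margin of error for the *lead* p_a - p_b when both shares come
    from the same multinomial sample, so their errors are negatively
    correlated rather than independent."""
    var = (p_a * (1 - p_a) + p_b * (1 - p_b) + 2 * p_a * p_b) / n
    return z * math.sqrt(var)

# The BBC's 38%/32% example: a 6-point gap vs the lead's own margin.
print(round(100 * margin_of_lead(0.38, 0.32, 1000), 1))  # 5.2
```

The two shares come from the same sample, so their errors move against each other; the 95% margin on the lead itself is about 5.2 points here, and the 6-point gap sits just outside it. The “both on 35%” reading treats the two ±3% intervals as if they were independent, which for a single poll they are not.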

I think a Pollwatch should be set up for the most egregious uses and analyses of polls, if there isn’t one already. Or maybe there would be just too many examples.
