[R] Incorrect p value for binom.test?

Fri Feb 6 08:49:37 CET 2009

On Thu, 5 Feb 2009, Albyn Jones wrote:

> The computation 2*sum(dbinom(c(10:25),25,0.061)) does not correspond
> to any reasonable definition of p-value.  For a symmetric
> distribution, it is fine to use 2 times the tail area of one tail.
> For an asymmetric distribution, this is silly.
>

"Silly" is much too strong. There is a perfectly good reason to compare 2*sum(dbinom(c(10:25),25,0.061)) to a two-sided test threshold.

The argument is that what we are really doing in usual two-sided location tests is two one-sided tests at alpha/2 rather than one two-sided test at alpha. The null hypothesis is being compared to two different alternatives (better or worse vs same) and the decisions about the future would be different depending on which tail we ended up using.

This argument says that we should compare a one-sided tail area such as sum(dbinom(c(10:25),25,0.061)) to alpha/2; equivalently that we should compare 2*sum(dbinom(c(10:25),25,0.061)) to alpha [or to informal standards for strength of evidence or whatever you typically do with p-values]. I'm not saying that this is the only sensible way to handle and interpret p-values in two-sided tests, but I really don't think it can be dismissed as 'silly'.
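To make the two comparisons concrete, here is a rough R sketch. It assumes the observed count was 10 successes out of 25 (inferred from the 10:25 range in the quoted code), and the alpha of 0.05 and the variable names are purely illustrative, not anything from the original question.

```r
# Assumed data: 10 successes in 25 trials, null probability 0.061.
# x, n, p0 and alpha are illustrative names/values, not from the original post.
x <- 10
n <- 25
p0 <- 0.061
alpha <- 0.05

# Upper-tail area P(X >= x) under the null
one_sided <- sum(dbinom(x:n, n, p0))

# "Two one-sided tests" version: double the tail and compare to alpha
doubled <- 2 * one_sided

# What binom.test reports (default alternative is "two.sided")
bt <- binom.test(x, n, p0)

print(c(one_sided = one_sided, doubled = doubled, binom_test = bt$p.value))

# These two comparisons are the same decision written two ways
print(one_sided < alpha / 2)
print(doubled < alpha)
```

With these numbers no outcome in the lower tail is as improbable as 10 successes, so I would expect binom.test to report just the upper tail area, and the doubled value to be twice that. Which of the two you compare to alpha depends on which of the arguments above you accept.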

      -thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle

PS: Daniel Dennett has described as an occupational hazard for philosophers the tendency to go from "I can't imagine X" to "No one can imagine X" to "X is inconceivable".  The transition from "I can't imagine how X would be used" to "X is useless" is somewhat similar, as is the Extreme Bayesian transition from "X wasn't derived by a formal consideration of posterior expected loss" to "X can't be derived by a formal consideration of posterior expected loss" to "X is incoherent". Why, yes, I am grumpy about a reviewer. How did you guess?