[Rd] numerical issues in chisq.test(simulate=TRUE) (PR#8224)

dgrove@fhcrc.org
Thu Oct 20 08:08:22 CEST 2005


Hi,

This report deals with p-values coming from chisq.test using
the simulate.p.value=TRUE option (abbreviated to sim=TRUE below).
The issue is numerical accuracy and was brought up in previous
bug reports 3486 and 3896.  The bug was considered fixed but
apparently was only mostly fixed.  It is the typical problem of
two values that are mathematically equal not ending up
numerically equal.

Consider this series of three 2x2 tables:

     [,1] [,2]
[1,]    1    7
[2,]    0   15

     [,1] [,2]
[1,]    1    7
[2,]    0   16

     [,1] [,2]
[1,]    1    7
[2,]    0   17


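The p-values returned from chisq.test(m, sim=TRUE)$p.value are
 0.3543228, 0.0004997501 and 0.3273363 respectively.

The 2nd seems a bit unlikely, huh?  It equals 1/2001, which looks
like the smallest value the simulation can return with 2000
replicates, i.e. no simulated statistic was counted as being at
least as large as the observed one.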

I checked into it and the value I'm getting for the statistic
(calculated in R code) is 4*.Machine$double.eps less than the
value (which should be equal) that is returned from the C-code
that does the simulation.
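
A minimal sketch of how a discrepancy of this size can wipe out the
count behind the simulated p-value.  It does not call the C code: the
vector sim is a hypothetical stand-in for simulated statistics, and the
downward direction of the perturbation, its size, and the tolerance tol
are all assumptions chosen for illustration.

```r
# Second table from the report
m <- matrix(c(1, 0, 7, 16), 2, 2)

# Pearson statistic without continuity correction, computed in plain R
E <- outer(rowSums(m), colSums(m)) / sum(m)
stat <- sum((m - E)^2 / E)

# Hypothetical stand-in for simulated statistics: the same mathematical
# value, pushed a few ulps downward (not real output of the C code)
eps <- .Machine$double.eps
sim <- rep(stat * (1 - 4 * eps), 5)

# A strict comparison counts none of the perturbed values
n_strict <- sum(sim >= stat)

# A comparison with a small relative tolerance counts all of them
tol <- 64 * eps                      # hypothetical tolerance
n_tolerant <- sum(sim >= stat * (1 - tol))

c(strict = n_strict, tolerant = n_tolerant)
```

Whether a relative tolerance like this is the right remedy inside
chisq.test is a separate question; the sketch only shows that a
difference of a few ulps is enough to change the count.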


Code for creating/testing the three matrices shown above:
m <- matrix(c(1,0,7,15),2,2) ; chisq.test(m, sim=TRUE)$p.value
m <- matrix(c(1,0,7,16),2,2) ; chisq.test(m, sim=TRUE)$p.value
m <- matrix(c(1,0,7,17),2,2) ; chisq.test(m, sim=TRUE)$p.value
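
As a rough check on what the simulated p-values should be close to,
here is a sketch under the assumption that the simulation conditions on
both margins.  With a first-column total of 1 there are only two
possible tables, and the observed one (the single count landing in the
smaller row) is the more extreme, so the reference value is simply
8 / (8 + k).  The comparison with fisher.test is my own cross-check,
not part of the original report.

```r
k <- c(15, 16, 17)

# Probability that the single first-column count lands in row 1,
# given row totals 8 and k
exact <- 8 / (8 + k)
round(exact, 4)   # 0.3478 0.3333 0.3200

# Cross-check against the two-sided exact test on the same tables
fisher_p <- sapply(k, function(j) {
  fisher.test(matrix(c(1, 0, 7, j), 2, 2))$p.value
})
all.equal(exact, fisher_p, tolerance = 1e-6)
```

The first and third simulated values (0.3543228 and 0.3273363) sit
within Monte Carlo error of these references; the second does not.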


Running SuSE 9.3 on an AMD Athlon 4000+


> version
platform i686-pc-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status   Patched
major    2
minor    1.1
year     2005
month    07
day      29
language R


Thanks,
Doug


Douglas Grove
Statistical Research Associate
Fred Hutchinson Cancer Research Center
Seattle WA 98109


