[Rd] numerical issues in chisq.test(simulate=TRUE) (PR#8224)
dgrove@fhcrc.org
Thu Oct 20 08:08:22 CEST 2005
Hi,
This report concerns p-values returned by chisq.test with the
simulate.p.value=TRUE option. The issue is numerical accuracy,
and it was raised in previous bug reports 3486 and 3896.
The bug was considered fixed, but apparently it was only mostly
fixed. It is the typical problem of two values that are
mathematically equal not ending up numerically equal.
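As a generic illustration of this kind of problem (not taken from the chisq.test source), the sketch below shows two mathematically equal doubles failing a ">=" comparison, and the same comparison succeeding once a small relative tolerance is allowed. The factor 64 is an arbitrary choice made for this example.

```r
# Two quantities that are equal on paper but differ in floating point
a <- 0.1 + 0.2
b <- 0.3

a == b                     # FALSE: a is slightly larger than b
b >= a                     # FALSE, although mathematically b equals a

# Tolerant versions of the same comparisons
isTRUE(all.equal(a, b))    # TRUE: equal up to a small relative tolerance
rel_tol <- 64 * .Machine$double.eps   # arbitrary tolerance for this sketch
b >= a * (1 - rel_tol)     # TRUE: ">=" with a little slack
```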
Consider this series of three 2x2 tables:

     [,1] [,2]
[1,]    1    7
[2,]    0   15

     [,1] [,2]
[1,]    1    7
[2,]    0   16

     [,1] [,2]
[1,]    1    7
[2,]    0   17
The p-values returned from chisq.test(m, sim=TRUE)$p.value are
0.3543228, 0.0004997501 and 0.3273363, respectively.
The second seems a bit unlikely, huh?
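A quick check of why the second value looks suspicious: it matches 1/(B+1) for B = 2000 replicates (the default), which is what one would get if none of the simulated statistics were counted as at least as large as the observed one. The sketch assumes the p-value has the form (1 + count)/(B + 1); the variable names are made up for the example.

```r
B <- 2000          # number of Monte Carlo replicates (the default)
n_extreme <- 0     # assumed: no simulated statistic counted as >= observed
p_min <- (1 + n_extreme) / (B + 1)
print(p_min, digits = 7)
```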
I looked into it, and the value I'm getting for the statistic
(calculated in R code) is 4*.Machine$double.eps less than the
value returned from the C code that does the simulation,
although the two should be equal.
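One way to make such a comparison robust is to allow a small relative tolerance when counting simulated statistics. The following is only a sketch under stated assumptions, not the code used inside chisq.test: sim_pvalue and rel_tol are hypothetical names, the tolerance of 64 * .Machine$double.eps is an arbitrary choice, and the simulation is done in R with r2dtable() rather than in C.

```r
# Hypothetical helper: Monte Carlo p-value with a tolerant ">=" comparison.
sim_pvalue <- function(m, B = 2000, rel_tol = 64 * .Machine$double.eps) {
  E <- outer(rowSums(m), colSums(m)) / sum(m)   # expected counts
  stat <- function(tab) sum((tab - E)^2 / E)    # Pearson chi-squared statistic
  observed <- stat(m)
  # r2dtable() draws B random tables with the same margins as m
  sims <- vapply(r2dtable(B, rowSums(m), colSums(m)), stat, numeric(1))
  # Count simulated statistics >= observed, up to a small relative tolerance
  (1 + sum(sims >= observed * (1 - rel_tol))) / (B + 1)
}

set.seed(1)
m <- matrix(c(1, 0, 7, 16), 2, 2)
p <- sim_pvalue(m)
p
```

For this table only two tables share the observed margins, and the observed one has probability 8/24, so a value near 1/3 is what one would expect here.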
Code for creating/testing the three matrices shown above:
m <- matrix(c(1,0,7,15),2,2) ; chisq.test(m, sim=TRUE)$p.value
m <- matrix(c(1,0,7,16),2,2) ; chisq.test(m, sim=TRUE)$p.value
m <- matrix(c(1,0,7,17),2,2) ; chisq.test(m, sim=TRUE)$p.value
Running SuSE 9.3 on an AMD Athlon 4000+
> version
platform i686-pc-linux-gnu
arch i686
os linux-gnu
system i686, linux-gnu
status Patched
major 2
minor 1.1
year 2005
month 07
day 29
language R
Thanks,
Doug
Douglas Grove
Statistical Research Associate
Fred Hutchinson Cancer Research Center
Seattle WA 98109