[Rd] numerical issues in chisq.test(simulate=TRUE) (PR#8224)
maechler@stat.math.ethz.ch
maechler at stat.math.ethz.ch
Thu Oct 20 11:00:02 CEST 2005
Thank you, Douglas (and Simone) for the bug report.
>>>>> "Simone" == Simone Giannerini <sgiannerini at gmail.com>
>>>>> on Thu, 20 Oct 2005 10:10:01 +0200 writes:
Simone> Hi,
Simone> I obtain the same result under Win. XP SP2 on AMD 64 3700+
Simone> platform i386-pc-mingw32
Simone> arch i386
Simone> os mingw32
Simone> system i386, mingw32
Simone> status
Simone> major 2
Simone> minor 2.0
Simone> year 2005
Simone> month 10
Simone> day 06
Simone> svn rev 35749
Simone> language R
>> m <- matrix(c(1,0,7,15),2,2) ; chisq.test(m, sim=TRUE)$p.value
Simone> [1] 0.3598201
>> m <- matrix(c(1,0,7,16),2,2) ; chisq.test(m, sim=TRUE)$p.value
Simone> [1] 0.0004997501
>> m <- matrix(c(1,0,7,17),2,2) ; chisq.test(m, sim=TRUE)$p.value
Simone> [1] 0.3403298
So, it's only the middle matrix giving problems for you.
It doesn't for me on a AMD 64-bit platform, even for 1000 simulations:
> m <- cbind(1:0, c(7,16))
> summary(p <- replicate(1000, chisq.test(m, sim=TRUE)$p.value))
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.3043 0.3268 0.3338 0.3337 0.3405 0.3663
but it does show the problem on 32-bit one.
A fix is easy and will be in R-patched (and R-devel of
course) soon.
Martin Maechler, ETH Zurich
Simone> Ciao
Simone> Simone
Simone> On 10/20/05, dgrove at fhcrc.org <dgrove at fhcrc.org> wrote:
>> Hi,
>>
>> This report deals with p-values coming from chisq.test using
>> the simulate.p=TRUE option. The issue is numerical accuracy
>> and was brought up in previous bug reports 3486 and 3896.
>> The bug was considered fixed but apparently was only mostly
>> fixed. Just the typical problem of two values that are
>> mathematically equal not ending up numerically equivalent.
>>
>> Consider this series of three 2x2 tables:
>>
>> [1,] 1 7
>> [2,] 0 15
>>
>> [1,] 1 7
>> [2,] 0 16
>>
>> [1,] 1 7
>> [2,] 0 17
>>
>>
>> The pvals returned from chisq.test(m, sim=TRUE)$p.value are
>> 0.3543228, 0.0004997501 and 0.3273363 respectively.
>>
>> The 2nd seems a bit unlikely, huh?
>>
>> I checked into it and the value I'm getting for the statistic
>> (calculated in R code) is 4*.Machine$double.eps less than the
>> value (which should be equal) that is returned from the C-code
>> that does the simulation.
>>
>>
>> Code for creating/testing the three matrices shown above:
>> m <- matrix(c(1,0,7,15),2,2) ; chisq.test(m, sim=TRUE)$p.value
>> m <- matrix(c(1,0,7,16),2,2) ; chisq.test(m, sim=TRUE)$p.value
>> m <- matrix(c(1,0,7,17),2,2) ; chisq.test(m, sim=TRUE)$p.value
>>
>>
>> Running SuSE9.3 on a AMD Athlon4000+
>>
>>
>> > version
>> platform i686-pc-linux-gnu
>> arch i686
>> os linux-gnu
>> system i686, linux-gnu
>> status Patched
>> major 2
>> minor 1.1
>> year 2005
>> month 07
>> day 29
>> language R
>>
>>
>> Thanks,
>> Doug
>>
>>
>> Douglas Grove
>> Statistical Research Associate
>> Fred Hutchinson Cancer Research Center
>> Seattle WA 98109
More information about the R-devel
mailing list