[Rd] chisq.test with simulate.p.value=TRUE (PR#13292)

Wed Nov 19 14:22:16 CET 2008

 <constant <at> unb.br> writes:

> For many tables, chisq.test with simulate.p.value=TRUE gives a p value that is
> obviously incorrect and inversely proportional to the number of replicates:
> 
> > data(HairEyeColor)
> > x <- margin.table(HairEyeColor, c(1, 2))
> > chisq.test(x,simulate.p.value=TRUE,B=2000)
>         Pearson's Chi-squared test with simulated p-value (based on 2000
>         replicates)
> data:  x
> X-squared = 138.2898, df = NA, p-value = 0.0004998
> 
> > chisq.test(x,simulate.p.value=TRUE,B=10000)
> X-squared = 138.2898, df = NA, p-value = 1e-04
> 
> > chisq.test(x,simulate.p.value=TRUE,B=100000)
> X-squared = 138.2898, df = NA, p-value = 1e-05
> 
> > chisq.test(x,simulate.p.value=TRUE,B=1000000)
> X-squared = 138.2898, df = NA, p-value = 1e-06
> ...
> 

  I tried to answer this the other day, but the answer must
have gotten lost.  The standard analytical chi-squared test
here gives p < 2.2e-16 (i.e. very, very small).  The values given
above, up to the limited display of significant digits, are
precisely 1/(B+1), e.g. 1/2001 = 0.0004998 for B=2000; that is,
none of the simulated chi-squared values is ever as large as
the observed chi-squared statistic.  The observed value itself
is included in the ensemble, so the p-value is reported as
1/(B+1) rather than as <1/B (you can read about the reasons for
this elsewhere [?]).  Bottom line:
why do you think these results are "obviously incorrect"?
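
  A minimal sketch for checking the 1/(B+1) values numerically,
assuming a stock R installation (HairEyeColor ships with the
datasets package).  The seed and the names floor_p and res are
illustrative choices, not anything taken from the report:

```r
# 4 x 4 Hair-by-Eye table, as in the original report.
x <- margin.table(HairEyeColor, c(1, 2))

# Smallest p-value the simulation can return for each B in the report:
# the observed statistic counts as one member of the ensemble of B + 1.
B <- c(2000, 10000, 1e5, 1e6)
floor_p <- 1 / (B + 1)
print(signif(floor_p, 4))

# One simulated run; the seed is arbitrary and only makes the run repeatable.
set.seed(1)
res <- chisq.test(x, simulate.p.value = TRUE, B = 2000)

# TRUE means no simulated statistic reached the observed one.
print(isTRUE(all.equal(res$p.value, 1 / 2001)))

# Asymptotic p-value for comparison (no simulation).
print(chisq.test(x)$p.value)
```

  If the comparison prints TRUE, the simulated p-value sits at its
lower bound, which is what one would expect when the analytical
p-value is far below 1/(B+1).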

  Ben Bolker