[R] Why two chisq.test p values differ when the contingency
Peter Dalgaard BSA
p.dalgaard at biostat.ku.dk
Wed Jul 16 01:03:38 CEST 2003
"Shi, Tao" <shidaxia at yahoo.com> writes:
> Hi, Ted and Dennis:
>
> Thanks for your speedy replies! I don't think this happens just randomly; rather, I think it may be due to the way the chisq.test function handles simulation. Here is why: (Ted, I think there is an error in your code: "tx" should be t(x).)
>
> > x
>      [,1] [,2]
> [1,]  149  151
> [2,]    1    8
> > c2x<-chisq.test(x, simulate.p.value=T, B=100000)$p.value
> > for(i in (1:20)){c2x<-c(c2x,chisq.test(x, simulate.p.value=T,
> + B=100000)$p.value)}
> > c2tx<-chisq.test(t(x), simulate.p.value=T, B=100000)$p.value
> > for(i in (1:20)){c2tx<-c(c2tx,chisq.test(t(x), simulate.p.value=T,
> + B=100000)$p.value)}
> > cbind(c2x,c2tx)
>           c2x    c2tx
>  [1,] 0.03727 0.01629
>  [2,] 0.03682 0.01662
I agree that this looks dodgy. The simulation works by sampling
tables consistent with the given marginals, so the p value should be
invariant under transposition. I venture a guess that the algorithm
is somehow forgetting to count tables that are identical to the
observed table. (Notice that there are really only ten tables
consistent with those marginals, with probabilities
> dhyper(0:9,150,159,9)
[1] 0.002258834 0.020194876 0.079185170 0.178727311 0.255905014 0.241046013
[7] 0.149366119 0.058713525 0.013284864 0.001318274
and the differences between c2x and c2tx look suspiciously close to
0.020194876...)
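Since there are only ten tables, the guess can be checked by brute
force instead of by simulation. What follows is only a sketch of that
check, not what chisq.test does internally: the helper names pearson
and table_for are made up for illustration, and I am assuming the
uncorrected Pearson statistic is the one being compared. It sums the
hypergeometric probabilities of the tables at least as extreme as the
observed one, once counting tables that tie with it and once not.

```r
# Observed table: row totals 300 and 9, column totals 150 and 159.
x <- matrix(c(149, 1, 151, 8), nrow = 2)

# Pearson chi-square statistic without continuity correction
# (hypothetical helper, not part of chisq.test).
pearson <- function(tab) {
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  sum((tab - expected)^2 / expected)
}

# The table with the same marginals as x and count a in cell [2,1]
# (hypothetical helper).
table_for <- function(a) {
  matrix(c(150 - a, a, 150 + a, 9 - a), nrow = 2)
}

a    <- 0:9
prob <- dhyper(a, 150, 159, 9)                      # same call as above
stat <- sapply(a, function(k) pearson(table_for(k)))
obs  <- pearson(x)

p_with_ties    <- sum(prob[stat >= obs])  # counts the observed table
p_without_ties <- sum(prob[stat >  obs])  # leaves it out

p_with_ties                    # about 0.0371
p_without_ties                 # about 0.0169
p_with_ties - p_without_ties   # dhyper(1, 150, 159, 9), about 0.0202
```

If the guess is right, the c2x column is close to the first number and
the c2tx column to the second.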
You might want to file this as a bug report.
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907