[R] Why two chisq.test p values differ when the contingency

Peter Dalgaard BSA p.dalgaard at biostat.ku.dk
Wed Jul 16 01:03:38 CEST 2003


"Shi, Tao" <shidaxia at yahoo.com> writes:

> Hi, Ted and Dennis:
>  
> Thanks for your speedy replies!  I don't think this happens just by chance; rather, I think it may be due to the way the chisq.test function handles simulation.  Here is why (Ted, I think there is an error in your code: "tx" should be t(x)):
> 
> > x
>      [,1] [,2]
> [1,]  149  151
> [2,]    1    8
> > c2x<-chisq.test(x, simulate.p.value=T, B=100000)$p.value
> > for(i in (1:20)){c2x<-c(c2x,chisq.test(x, simulate.p.value=T,
> +                        B=100000)$p.value)}
> > c2tx<-chisq.test(t(x), simulate.p.value=T, B=100000)$p.value
> > for(i in (1:20)){c2tx<-c(c2tx,chisq.test(t(x), simulate.p.value=T,
> +                         B=100000)$p.value)}
> > cbind(c2x,c2tx)
>           c2x    c2tx
>  [1,] 0.03727 0.01629
>  [2,] 0.03682 0.01662

I agree that this looks dodgy. The simulation works by sampling
tables consistent with the given marginals, so the result should be
invariant under transposition. I venture a guess that the algorithm is
somehow forgetting to count simulated tables that are identical to the
observed table. (Notice that there are really only ten tables
consistent with those marginals, with probabilities

> dhyper(0:9,150,159,9)
 [1] 0.002258834 0.020194876 0.079185170 0.178727311 0.255905014 0.241046013
 [7] 0.149366119 0.058713525 0.013284864 0.001318274

and the differences between c2x and c2tx look suspiciously close to
0.020194876, the probability of the observed table...)
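
The guess can be checked without any simulation by enumerating the ten tables directly. The following is only a sketch: the names stat_for, p_incl and p_excl are made up for illustration, and it assumes the uncorrected Pearson statistic is the quantity being compared. It says nothing about what chisq.test does internally.

```r
# Observed table from the thread: rows sum to 300 and 9,
# columns sum to 150 and 159.
x <- matrix(c(149, 1, 151, 8), nrow = 2)

# With the margins fixed, a table is determined by k = cell [2,1].
k <- 0:9
prob <- dhyper(k, 150, 159, 9)

# Pearson statistic (no continuity correction) for the table with [2,1] = k.
stat_for <- function(k) {
  tab <- matrix(c(150 - k, k, 150 + k, 9 - k), nrow = 2)
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  sum((tab - expected)^2 / expected)
}

stats <- sapply(k, stat_for)
obs <- stat_for(x[2, 1])   # k = 1 is the observed table

# Exact p-value counting the observed table (k = 0, 1, 8, 9) ...
p_incl <- sum(prob[stats >= obs - 1e-7])
# ... and the value obtained if ties with the observed table are dropped.
p_excl <- sum(prob[stats > obs + 1e-7])

c(inclusive = p_incl, exclusive = p_excl)
# roughly 0.0371 and 0.0169
```

The inclusive value sits near the simulated c2x column and the exclusive value near c2tx, which is consistent with the observed table not being counted in one of the two orientations.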

You might want to file this as a bug report.
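
For comparison, here is a hand-rolled simulation that treats x and t(x) identically. It is a sketch under assumptions, not the code behind chisq.test: pearson and sim_p are hypothetical helper names, tables are drawn with r2dtable, and ties with the observed statistic are counted as "at least as extreme".

```r
set.seed(1)
x <- matrix(c(149, 1, 151, 8), nrow = 2)

# Uncorrected Pearson statistic for a two-way table.
pearson <- function(tab) {
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  sum((tab - expected)^2 / expected)
}

# Monte Carlo p-value: sample B tables with the margins of 'tab' and
# count those whose statistic is at least the observed one (ties included).
sim_p <- function(tab, B = 20000) {
  sims <- r2dtable(B, rowSums(tab), colSums(tab))
  stats <- vapply(sims, pearson, numeric(1))
  (1 + sum(stats >= pearson(tab) - 1e-7)) / (B + 1)
}

p_x  <- sim_p(x)
p_tx <- sim_p(t(x))
c(x = p_x, tx = p_tx)
```

Both values should agree up to Monte Carlo error, since transposing the table only swaps the roles of the row and column margins.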

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907



