[R] Chi-Square Test Disagreement

Berwin A Turlach berwin at maths.uwa.edu.au
Wed Nov 26 17:46:31 CET 2008


G'day Andy,

On Wed, 26 Nov 2008 14:51:50 +0000
Andrew Choens <andy.choens at gmail.com> wrote:

> I was asked by my boss to do an analysis on a large data set, and I am
> trying to convince him to let me use R rather than SPSS. 

Very laudable of you. :)

> This is the output from R:
> > chisq.test(test29)
> 
> 	Pearson's Chi-squared test
> 
> data:  test29
> X-squared = 9.593, df = 4, p-value = 0.04787
> 
> But, the same data in SPSS generates a p value of .051. It's a small
> but important difference. 

Chuck has already explained the reason for this small difference.  I just
take issue with it being called an important difference.  In my opinion,
this difference is not important at all.  It would only be important to
people who still stick to arbitrary cut-off points that are mainly due
to historical coincidence and the lack of computing power at that time
in history.  If somebody tells you that this difference is important,
ask them whether they would be willing to finance a room full of
calculators (in the sense of Pearson's time) for you, and whether they
want you to do all your calculations and analyses with these calculators
in future.  Alternatively, you could ask whether they would like the
anaesthetist at their next operation to use chloroform, given their
nostalgic penchant for out-dated rituals/methods.

> I played around and rescaled things, and tried different values for
> B, but I never could get R to reach .051.

Well, I have no problem when using simulated p-values to get something
close to 0.051; look at the last try.  The second one might also be
noteworthy.  Unfortunately, I didn't save the seed beforehand.

> test29 <- matrix(c(110,358,71,312,29,139,31,77,13,32), byrow=TRUE,
+                  ncol=2)
> test29
     [,1] [,2]
[1,]  110  358
[2,]   71  312
[3,]   29  139
[4,]   31   77
[5,]   13   32
> chisq.test(test29, simul=TRUE)

	Pearson's Chi-squared test with simulated p-value (based on 2000 replicates)

data:  test29 
X-squared = 9.593, df = NA, p-value = 0.04798

> chisq.test(test29, simul=TRUE)

	Pearson's Chi-squared test with simulated p-value (based on 2000 replicates)

data:  test29 
X-squared = 9.593, df = NA, p-value = 0.05697

> chisq.test(test29, simul=TRUE, B=20000)

	Pearson's Chi-squared test with simulated p-value (based on 20000 replicates)

data:  test29 
X-squared = 9.593, df = NA, p-value = 0.0463

> chisq.test(test29, simul=TRUE, B=20000)

	Pearson's Chi-squared test with simulated p-value (based on 20000 replicates)

data:  test29 
X-squared = 9.593, df = NA, p-value = 0.0499

> chisq.test(test29, simul=TRUE, B=20000)

	Pearson's Chi-squared test with simulated p-value (based on 20000 replicates)

data:  test29 
X-squared = 9.593, df = NA, p-value = 0.0486

> chisq.test(test29, simul=TRUE, B=20000)

	Pearson's Chi-squared test with simulated p-value (based on 20000 replicates)

data:  test29 
X-squared = 9.593, df = NA, p-value = 0.05125
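
The runs above differ only because of simulation noise.  As a rough
sketch, the following shows how one might fix the seed beforehand so a
run can be repeated, and how much spread to expect in a simulated
p-value.  The seed value 2008 and the helper name mc_se are illustrative
assumptions, not anything used in the runs above, and treating the
simulated p-value as a simple binomial proportion over B replicates is
only an approximation.

```r
# Same table as in the transcript above.
test29 <- matrix(c(110, 358, 71, 312, 29, 139, 31, 77, 13, 32),
                 byrow = TRUE, ncol = 2)

# Fix the seed first so that a simulated p-value can be reproduced later.
# The value 2008 is arbitrary (an assumption, not the seed used above).
set.seed(2008)
B <- 20000
sim <- chisq.test(test29, simulate.p.value = TRUE, B = B)

# Approximate Monte Carlo standard error of a simulated p-value,
# treating it as a binomial proportion over B replicates.
# mc_se is a hypothetical helper defined here for illustration.
mc_se <- function(p, B) sqrt(p * (1 - p) / B)

sim$p.value             # simulated p-value for this seed
mc_se(sim$p.value, B)   # its approximate standard error

# Rough spread to expect, using the asymptotic p-value 0.04787
# as a stand-in for the true value:
mc_se(0.04787, 2000)    # about 0.0048
mc_se(0.04787, 20000)   # about 0.0015
```

On this rough reckoning, both 0.05697 (with B=2000) and 0.05125 (with
B=20000) lie within about two such standard errors of 0.04787, so
landing on either side of 0.05 is not surprising.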


Cheers,

	Berwin

=========================== Full address =============================
Berwin A Turlach                            Tel.: +65 6516 4416 (secr)
Dept of Statistics and Applied Probability        +65 6516 6650 (self)
Faculty of Science                          FAX : +65 6872 3919       
National University of Singapore     
6 Science Drive 2, Blk S16, Level 7          e-mail: statba at nus.edu.sg
Singapore 117546                    http://www.stat.nus.edu.sg/~statba


