[R] Pearson chi-square test

Meyners, Michael meyners.m at pg.com
Tue Sep 27 17:26:49 CEST 2011


I suspect that the chi-square test might not be appropriate, because your data are constrained: the number of observations for A is the same in both contingency tables. I further suspect that no test is readily available for this situation, but I might be wrong. Randomization tests could perhaps help here, but that would take a bit of thinking and programming. chisq.test might give you an approximate solution, but I can't say how good the approximation will be (it might also depend on the data, btw).
Best, Michael
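
The randomization idea mentioned above could be sketched roughly as follows. This is only one possible scheme, not a method given in the thread: it assumes that, under the null hypothesis, B and C are exchangeable within each observation, which is a stronger assumption than equality of the two joint distributions. The helper name table_distance, the swap scheme and the number of permutations (n_perm) are illustrative choices.

```r
# Example data as in the original question
set.seed(1)
A <- cut(runif(100), c(0.0, 0.35, 0.50, 0.65, 1.00), labels = FALSE)
B <- cut(runif(100), c(0.0, 0.25, 0.40, 0.75, 1.00), labels = FALSE)
C <- cut(runif(100), c(0.0, 0.25, 0.50, 0.80, 1.00), labels = FALSE)

# Hypothetical helper: distance between table(a, b) and table(a, cc),
# using sum((x - y)^2 / (x + y)) and skipping cells empty in both tables.
table_distance <- function(a, b, cc, lev = 1:4) {
  x <- table(factor(a, lev), factor(b, lev))
  y <- table(factor(a, lev), factor(cc, lev))
  keep <- (x + y) > 0
  sum((x[keep] - y[keep])^2 / (x[keep] + y[keep]))
}

observed <- table_distance(A, B, C)

# ASSUMPTION: B and C are exchangeable within an observation under the null,
# so their labels are swapped at random, independently for each observation.
n_perm <- 999
perm_stats <- replicate(n_perm, {
  swap <- runif(length(A)) < 0.5
  table_distance(A, ifelse(swap, C, B), ifelse(swap, B, C))
})

# Permutation p-value, counting the observed statistic as one permutation
p_value <- (1 + sum(perm_stats >= observed)) / (n_perm + 1)
p_value
```

Because A is left untouched in every permutation, the constraint that both tables share the same A margin is respected by construction. Whether the exchangeability assumption is reasonable depends on how B and C were obtained.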


From: Michael Haenlein
Sent: Tuesday, September 27, 2011 17:05
To: r-help at r-project.org
Cc: Meyners, Michael
Subject: RE: [R] Pearson chi-square test

Dear Michael,
 
Thanks very much for your answers!
 
The purpose of my analysis is to test whether the contingency table x is different from the contingency table y.
Or, to put it differently, whether there is a significant difference between the joint distribution A&B and A&C.
 
Based on your answer, I'm wondering whether chisq.test is really the best way to do this.
Or is there perhaps a different function or package I should use altogether?
 
Thanks,
 
Michael
 
 
 
-----Original Message-----
From: Meyners, Michael 
Sent: Tuesday, September 27, 2011 17:00
To: Michael Haenlein; r-help at r-project.org
Subject: RE: [R] Pearson chi-square test
 
Just for completeness: the manual calculation you'd want is most likely
 
sum((x-y)^2  / (x+y))
 
(that's one you can find on the Wikipedia link you provided). To get the same from chisq.test, try something like 
 
chisq.test(data.frame(x,y)[,c(3,6)])
 
(there are surely smarter ways, but at least it works here). Note that something like 
 
chisq.test(as.vector(x), as.vector(y)) 
 
will give a different test, i.e. one based on a contingency table of x crossed with y.
M. 
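
A small check of the two statements above, using the example data from the original question. It is a sketch under two assumptions: both tables have the same total (100 observations each), which is what makes the manual formula agree with chisq.test on the two-column table, and cells that are empty in both tables are dropped to avoid 0/0. The variable names (xv, yv, two_col, crossed) are illustrative.

```r
# Example data as in the original question
set.seed(1)
A <- cut(runif(100), c(0.0, 0.35, 0.50, 0.65, 1.00), labels = FALSE)
B <- cut(runif(100), c(0.0, 0.25, 0.40, 0.75, 1.00), labels = FALSE)
C <- cut(runif(100), c(0.0, 0.25, 0.50, 0.80, 1.00), labels = FALSE)
x <- table(A, B)
y <- table(A, C)

# Flatten both tables into vectors of cell counts
xv <- as.vector(x)
yv <- as.vector(y)
keep <- (xv + yv) > 0   # drop cells that are empty in both tables

# Manual statistic
manual <- sum((xv[keep] - yv[keep])^2 / (xv[keep] + yv[keep]))

# Same statistic from chisq.test on a (cells x 2) matrix of counts
two_col <- cbind(x = xv, y = yv)[keep, , drop = FALSE]
from_test <- unname(suppressWarnings(chisq.test(two_col))$statistic)
isTRUE(all.equal(manual, from_test))

# The two-vector form cross-tabulates the count values themselves,
# which is a different test altogether
crossed <- table(xv, yv)
s_vec <- unname(suppressWarnings(chisq.test(xv, yv))$statistic)
s_tab <- unname(suppressWarnings(chisq.test(crossed))$statistic)
isTRUE(all.equal(s_vec, s_tab))
```

suppressWarnings is used only because small expected counts make chisq.test warn that the chi-square approximation may be inaccurate; the statistic itself is unaffected.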
 
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Meyners, Michael
> Sent: Tuesday, September 27, 2011 13:28
> To: Michael Haenlein; r-help at r-project.org
> Subject: Re: [R] Pearson chi-square test
> 
> Not sure what you want to test here with two matrices, but reading the
> manual helps here as well:
> 
> y   a vector; ignored if x is a matrix.
> 
> x and y are matrices in your example, so it comes as no surprise that
> you get different results. On top of that, your manual calculation is
> not correct if you want to test whether two samples come from the same
> distribution (so don't be surprised if R still gives a different
> value...).
> 
> HTH, Michael
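
The point about y being ignored can be checked directly. The sketch below uses the example data from the original question and compares chisq.test(x, y) with chisq.test(x), and chisq.test(y, x) with chisq.test(y). If they agree, the two values reported in the question would simply be the independence tests of A vs. B and of A vs. C. The variable names (s_xy, s_x, s_yx, s_y) are illustrative.

```r
# Example data as in the original question
set.seed(1)
A <- cut(runif(100), c(0.0, 0.35, 0.50, 0.65, 1.00), labels = FALSE)
B <- cut(runif(100), c(0.0, 0.25, 0.40, 0.75, 1.00), labels = FALSE)
C <- cut(runif(100), c(0.0, 0.25, 0.50, 0.80, 1.00), labels = FALSE)
x <- table(A, B)
y <- table(A, C)

# A two-way table counts as a matrix, so the second argument is ignored
s_xy <- unname(suppressWarnings(chisq.test(x, y))$statistic)
s_x  <- unname(suppressWarnings(chisq.test(x))$statistic)
s_yx <- unname(suppressWarnings(chisq.test(y, x))$statistic)
s_y  <- unname(suppressWarnings(chisq.test(y))$statistic)

isTRUE(all.equal(s_xy, s_x))   # test of independence between A and B
isTRUE(all.equal(s_yx, s_y))   # test of independence between A and C
```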
> 
> > -----Original Message-----
> > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Michael Haenlein
> > Sent: Tuesday, September 27, 2011 12:45
> > To: r-help at r-project.org
> > Subject: [R] Pearson chi-square test
> >
> > Dear all,
> >
> > I have some trouble understanding the chisq.test function.
> > Take the following example:
> >
> > set.seed(1)
> > A <- cut(runif(100),c(0.0, 0.35, 0.50, 0.65, 1.00), labels=FALSE)
> > B <- cut(runif(100),c(0.0, 0.25, 0.40, 0.75, 1.00), labels=FALSE)
> > C <- cut(runif(100),c(0.0, 0.25, 0.50, 0.80, 1.00), labels=FALSE)
> > x <- table(A,B)
> > y <- table(A,C)
> >
> > When I calculate the test statistic by hand I get a value of
> > approximately 75.9:
> > http://en.wikipedia.org/wiki/Pearson's_chi-square_test#Calculating_the_test-statistic
> > sum((x-y)^2/y)
> >
> > But when I do chisq.test(x,y) I get a value of 12.2, while
> > chisq.test(y,x) gives a value of 10.3.
> >
> > I understand that I must be doing something wrong here, but I'm not
> > sure what.
> >
> > Thanks,
> >
> > Michael


