[R] chisq.test using amalgamation automatically (possible ?!?)

Gabor Grothendieck ggrothendieck at gmail.com
Sun Jun 26 18:23:05 CEST 2005


On 6/26/05, Mohammad Ehsanul Karim <wildscop at yahoo.com> wrote:
> Dear List,
> 
> 
> If any of the observed and/or expected frequencies is less
> than 5, then chisq.test (Pearson's Chi-squared Test for Count
> Data, from package:stats) gives a warning message. For
> example,
> 
> x<-c(10, 14, 10, 11, 11, 7, 8, 4, 1, 4, 4, 2, 1, 1, 2,
> 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1)
> y<-c(9.13112391745095, 13.1626482033341,
> 12.6623267638188, 11.0130706413029, 9.16415925139016,
> 7.47441794889028, 6.03743388141852, 4.85350508692505,
> 3.89248001363859, 3.11803140037476, 2.49617540962629,
> 1.99774139023269, 1.5985926374167, 1.27909653584089,
> 1.02341602646530, 0.818828097315106,
> 0.655132353196336, 0.524159229418155,
> 0.418022824890164, 0.335528136508225,
> 0.268448671671046, 0.214779801990545,
> 0.171840507806838, 0.137485729582785,
> 0.109999238967747, 0.0880079144684513,
> 0.070413150156564)
> 
> # using amalgamation: keep cells 1-7, pool 8-9, 10-11 and 12-27
> obs  <- c(x[1:7], sum(x[8:9]), sum(x[10:11]), sum(x[12:27]))
> expd <- c(y[1:7], sum(y[8:9]), sum(y[10:11]), sum(y[12:27]))
> Chi.Sq <- sum((obs - expd)^2 / expd)
> pchisq(Chi.Sq, df = 9, ncp = 0, lower.tail = FALSE,
>        log.p = FALSE) # result being 0.8830207
> 
> but chisq.test(x,y) gives the following output with
> incorrect df:
> 
>        Pearson's Chi-squared test
> 
> data:  x and y
> X-squared = 216, df = 208, p-value = 0.3373
> 
> Warning message:
> Chi-squared approximation may be incorrect in:
> chisq.test(x, y)
> 
> 
> 
> Is there any way to use chisq.test directly without getting
> the warning message in such cases (that is, amalgamating
> cells automatically so that we don't have to check whether
> each element is less than 5), perhaps by means of
> programming?
> 
> 
> 
> Any hints, help, support, or references will be highly
> appreciated.
> Thank you for your time.
> 

Check out ?combine.levels in package Hmisc.
Also, in the chisq.test call above perhaps you meant this:
  chisq.test(x,p=y/sum(y))
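
A minimal base-R sketch of both points, for illustration only. As
documented in ?chisq.test, when x and y are both plain vectors they are
coerced to factors and cross-tabulated, which is where df = 208 comes
from: 9 distinct values in x and 27 in y give (9-1)*(27-1). The helper
name pool_cells and its min_expected argument are hypothetical (they are
not part of stats or Hmisc); the helper simply merges adjacent cells from
left to right until each pooled cell has an expected count of at least 5,
and folds any leftover tail into the last pooled cell.

```r
# Data from the original question
x <- c(10, 14, 10, 11, 11, 7, 8, 4, 1, 4, 4, 2, 1, 1, 2,
       1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1)
y <- c(9.13112391745095, 13.1626482033341, 12.6623267638188,
       11.0130706413029, 9.16415925139016, 7.47441794889028,
       6.03743388141852, 4.85350508692505, 3.89248001363859,
       3.11803140037476, 2.49617540962629, 1.99774139023269,
       1.5985926374167, 1.27909653584089, 1.02341602646530,
       0.818828097315106, 0.655132353196336, 0.524159229418155,
       0.418022824890164, 0.335528136508225, 0.268448671671046,
       0.214779801990545, 0.171840507806838, 0.137485729582785,
       0.109999238967747, 0.0880079144684513, 0.070413150156564)

# Goodness-of-fit form: x holds counts, p holds cell probabilities.
# This still warns here, because many expected counts are below 5.
chisq.test(x, p = y / sum(y))

# Hypothetical helper: pool adjacent cells until every pooled cell
# has an expected count of at least min_expected.
pool_cells <- function(obs, expd, min_expected = 5) {
  stopifnot(length(obs) == length(expd))
  group <- integer(length(expd))
  g <- 1L
  running <- 0
  for (i in seq_along(expd)) {
    group[i] <- g
    running <- running + expd[i]
    if (running >= min_expected) {  # current pooled cell is large enough
      g <- g + 1L
      running <- 0
    }
  }
  # A trailing group that never reached the threshold joins the previous one
  if (g > 1L && any(group == g)) group[group == g] <- g - 1L
  list(observed = as.vector(tapply(obs, group, sum)),
       expected = as.vector(tapply(expd, group, sum)))
}

pooled <- pool_cells(x, y)
res <- chisq.test(pooled$observed,
                  p = pooled$expected / sum(pooled$expected))
res
```

On this data the helper should reproduce the grouping used in the
question (cells 1-7 kept, then 8-9, 10-11 and 12-27 pooled). Because p
must sum to 1, y is rescaled by sum(y), so the statistic can differ
slightly from the hand computation that uses y directly.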



