[R] chisq.test() as a goodness of fit test

Thu Jan 13 21:12:36 CET 2005

(Ted Harding) <Ted.Harding at nessie.mcc.ac.uk> writes:

> This is not more difficult, since the hard work is in
> calculating the elements of p. After that, with E=N*p,
> 
>   X2 <- sum(((O-E)^2)/E)
> 
> has (approximately) the chi-squared distribution with df=(k-1-r),
> where k is the number of "bins" and r is the number of parameters
> that have been estimated. So get 1-pchisq(X2,df).
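
A minimal sketch of the recipe quoted above, with the parameter estimated from the binned counts themselves. It is an illustration, not code from the thread: the helper names gof_chisq and bin_probs and the counts are made up, and the degrees of freedom are taken as k - 1 - r (the usual convention, where the extra 1 reflects the expected counts being forced to sum to N).

```r
# Hypothetical helper: Pearson goodness-of-fit test for binned counts.
# O: observed counts per bin; p: fitted bin probabilities (summing to 1);
# r: number of parameters estimated from the binned data.
gof_chisq <- function(O, p, r = 0) {
  stopifnot(length(O) == length(p), abs(sum(p) - 1) < 1e-8)
  E  <- sum(O) * p               # expected counts, E = N * p
  X2 <- sum((O - E)^2 / E)       # Pearson statistic
  df <- length(O) - 1 - r        # k - 1 - r
  list(X2 = X2, df = df, p.value = pchisq(X2, df, lower.tail = FALSE))
}

# Made-up counts of 0, 1, 2 and "3 or more" events.
O <- c(30, 40, 20, 10)

# Poisson bin probabilities for the bins 0, 1, 2, 3+.
bin_probs <- function(lambda) {
  c(dpois(0:2, lambda), ppois(2, lambda, lower.tail = FALSE))
}

# Estimate lambda from the binned counts by maximising the
# multinomial likelihood (i.e. minimising its negative log).
negloglik  <- function(lambda) -sum(O * log(bin_probs(lambda)))
lambda_hat <- optimize(negloglik, interval = c(0.01, 10))$minimum

res <- gof_chisq(O, bin_probs(lambda_hat), r = 1)
res
```

Calling chisq.test(O, p = bin_probs(lambda_hat)) should report the same statistic, but with k - 1 d.f., since chisq.test() has no way of knowing that a parameter was estimated.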

As Achim indicated, this only works if you estimate the parameters
from the binned data (and I suspect that, in principle, you also need to
have decided on the bins in advance). My old Stat-1 notes had a claim that
if you use the mean and variance of the unbinned data to estimate the
normal distribution, then the distribution of X2 lies between chi-squares
with k-3 and k-1 d.f.
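
A rough sketch of what that bracketing looks like in practice, assuming the bounds are k - 1 - r and k - 1 with r = 2 for the normal (the Chernoff-Lehmann result, as I understand it). The data, seed and break points are invented for illustration, and sd() uses the n - 1 divisor, so it is only close to the ML estimate.

```r
set.seed(1)                          # arbitrary seed, for reproducibility only
x <- rnorm(200, mean = 70, sd = 10)  # made-up measurements

# Bins fixed in advance, open-ended at both extremes.
breaks <- c(-Inf, 55, 60, 65, 70, 75, 80, 85, Inf)
O <- as.vector(table(cut(x, breaks)))
k <- length(O)

# Parameters estimated from the raw (unbinned) data.
p  <- diff(pnorm(breaks, mean = mean(x), sd = sd(x)))
E  <- length(x) * p
X2 <- sum((O - E)^2 / E)

# The reference distribution is taken to lie between chi-square(k - 3)
# and chi-square(k - 1), so the p-value is bracketed, not pinned down.
p_lower <- pchisq(X2, df = k - 3, lower.tail = FALSE)
p_upper <- pchisq(X2, df = k - 1, lower.tail = FALSE)
c(X2 = X2, p_lower = p_lower, p_upper = p_upper)
```

If both bounds fall on the same side of the chosen significance level, the conclusion does not depend on which d.f. is used; otherwise the test is inconclusive under this approach.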

Incidentally, my .02 DKK is that you're more likely to want a test
against smoother alternatives than the omnibus alternative implied by
the chi-square. For instance, if you have digit-preference effects in the
distribution (e.g., some weight measurements rounded to the nearest half
kg), they can produce a highly significant X2, even though the deviation is
of a kind that has little importance for the validity of subsequent
analyses. I haven't ever seen any such tests for the case of estimated
parameters, though...
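
A small simulation sketch of the digit-preference point. Everything in it is an assumption made for illustration: the weights, the 30% heaping rate, the bin widths, and the helper name pearson. To keep it simple, the normal parameters are treated as known, so the estimated-parameter question does not arise.

```r
set.seed(2)                               # arbitrary seed
n     <- 5000
w_raw <- rnorm(n, mean = 70, sd = 10)     # made-up weights in kg

# Assume 30% of the measurements are rounded to the nearest half kg.
heaped    <- runif(n) < 0.3
w         <- w_raw
w[heaped] <- round(w[heaped] * 2) / 2

# Hypothetical helper: Pearson test against a fully specified normal.
pearson <- function(x, breaks, mu, sigma) {
  O  <- as.vector(table(cut(x, breaks)))
  E  <- length(x) * diff(pnorm(breaks, mu, sigma))
  X2 <- sum((O - E)^2 / E)
  df <- length(O) - 1
  c(X2 = X2, df = df, p = pchisq(X2, df, lower.tail = FALSE))
}

# Narrow 0.25 kg bins: every other bin contains a half-kg value,
# so the heaping shows up as a sawtooth in the counts.
fine_breaks <- c(-Inf, seq(50.125, 90.125, by = 0.25), Inf)
fine <- pearson(w, fine_breaks, mu = 70, sigma = 10)

# Wide 5 kg bins whose edges sit on the rounding boundaries,
# so the heaping does not move any observation across an edge.
coarse_breaks <- c(-Inf, seq(52.25, 87.25, by = 5), Inf)
coarse <- pearson(w, coarse_breaks, mu = 70, sigma = 10)

rbind(fine = fine, coarse = coarse)
c(mean_shift = mean(w) - mean(w_raw), sd_ratio = sd(w) / sd(w_raw))
```

With the narrow bins the omnibus X2 reacts strongly to the heaping, while the mean and standard deviation barely move and the wide-bin counts are unchanged from those of the unrounded data.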

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907