[Rd] Yates' correction for continuity in chisq.test (PR#8265)
Prof Brian Ripley
ripley at stats.ox.ac.uk
Tue Nov 1 11:58:20 CET 2005
On Mon, 31 Oct 2005, Prof Brian Ripley wrote:
> On Sun, 30 Oct 2005, P Ehlers wrote:
>
>> dih69530 at syd.odn.ne.jp wrote:
>>> Full_Name: foo ba baz
>>> Version: R2.2.0
>>> OS: Mac OS X (10.4)
>>> Submission from: (NULL) (219.66.32.183)
>>>
>>>
>>> chisq.test(matrix(c(9,10,9,11),2,2))
>>>
>>> Chi-square value must be 0, and, P value must be 0
>>> R does over correction
>>>
>>> when | a d - b c | < n / 2, chi-sq must be 0
>>
>> (Presumably, you mean P-value = 1.)
>> If you don't want the correction, set correct=FALSE. (The
>> results won't differ much.)
>>
>> A better example is
>>
>> chisq.test(matrix(c(9,10,9,10),2,2))
>>
>> for which R probably should return X-squared = 0.
>
> R is using the correction that almost all the sources I looked at suggest.
> You can't go around adjusting X^2 for just some values of the data: the claim
> is that the adjusted statistic has a more accurate chisq distribution under
> the null.
>
> I think at this remove it does not matter what Yates suggested (although if
> I were writing a textbook I would find out), especially as the R
> documentation does not mention Yates.
I have now checked Yates (1934), Fisher's 'Statistical Methods for
Research Workers' and the Encyclopedia of Statistical Sciences. The first
two are vague (and could perhaps be read as not correcting when O-E = 0), but
the last agrees with R in giving a formula which always subtracts 1/2.
Also, it mentions that Pearson stated that the formula for the continuity
correction long preceded Yates' publication, so it is perhaps reasonable
not to mention Yates.
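
For concreteness, here is a minimal sketch of the two readings discussed
above, computed by hand rather than through chisq.test (whose output may
depend on the R version in use). The helper names yates_always_half and
yates_capped are hypothetical and exist only for this illustration; the
second is one possible reading of "do not correct past zero", not a formula
taken from any of the sources cited.

```r
# Pearson statistic for a 2x2 table, always subtracting 1/2 from |O - E|.
# (Hypothetical helper for illustration; not a function provided by R.)
yates_always_half <- function(x) {
  n <- sum(x)
  E <- outer(rowSums(x), colSums(x)) / n   # expected counts under independence
  sum((abs(x - E) - 0.5)^2 / E)
}

# Assumed alternative: subtract min(1/2, |O - E|), so the corrected
# deviation is never pushed past zero. (Also hypothetical.)
yates_capped <- function(x) {
  n <- sum(x)
  E <- outer(rowSums(x), colSums(x)) / n
  d <- abs(x - E)
  sum((d - pmin(0.5, d))^2 / E)
}

tab1 <- matrix(c(9, 10, 9, 11), 2, 2)   # table from the original report
tab2 <- matrix(c(9, 10, 9, 10), 2, 2)   # table with O - E = 0 in every cell

yates_always_half(tab1)   # positive, although |ad - bc| = 9 < n/2 = 19.5
yates_always_half(tab2)   # positive: each cell contributes (0 - 1/2)^2 / E
yates_capped(tab1)        # 0
yates_capped(tab2)        # 0
```

For a 2x2 table every cell has |O - E| = |ad - bc| / n, so the condition
|ad - bc| < n/2 in the report is the same as |O - E| < 1/2, which is exactly
where the two versions differ.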
