[R] fisher exact vs. simulated chi-square
Thomas Lumley
tlumley at u.washington.edu
Tue Apr 22 16:00:54 CEST 2003
On Tue, 22 Apr 2003, Dirk Janssen wrote:
>
> Dear All,
>
> I have a problem understanding the difference between the outcome of a
> fisher exact test and a chi-square test (with simulated p.value).
>
> For some sample data (see below), fisher reports p=.02337. The normal
> chi-square test complains about "approximation may be incorrect",
> because there is a column with cells with very small values. I
> therefore tried the chi-square with simulated p-values, but this still
> gives p=.04037. I also simulated the p-value myself, using r2dtable,
> getting the same result, p=0.04 (approx).
>
> Why is this substantially higher than what the fisher exact says? Do
> the two tests make different assumptions? I noticed that the
> discrepancy gets smaller when I increase the number of observations
> for column A3. Does this mean that the simulated chi-square is still
> sensitive to cells with small counts, even though it does not give me
> the warning?
Both are exact. I believe the difference is just the test statistic.

Imagine listing all the possible 3x3 tables with the same margins as
yours. A test has to sort them into some ordering of distance from the
null and then add up the probabilities for all possible tables further
from the null than yours.

There's more than one way to do this. Even in the 2x2 case this leads to
ambiguity about how to define the two-sided test. In the 3x3 case it is
worse since there are so many more ways for tables to differ.

chisq.test orders tables according to the chi-square statistic and I think
fisher.test orders them according to their probability under the null
hypothesis.
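
To make the two orderings concrete, here is a rough sketch that enumerates
every 3x3 table with the same margins as an observed table and adds up the
null probabilities in both ways. The table x is made up for illustration (it
is not the data from the question), and the helper names log_prob and
chisq_stat are mine. The 1e-7 tolerance for ties is also an assumption on my
part, so treat the comparison with fisher.test and chisq.test as a sanity
check rather than a description of their internals.

```r
# Hypothetical 3x3 table, for illustration only.
x <- matrix(c(5, 1, 0,
              2, 4, 1,
              0, 2, 5), nrow = 3, byrow = TRUE)
rs <- rowSums(x); cs <- colSums(x); n <- sum(x)
expected <- outer(rs, cs) / n

# Log-probability of a table given the margins (multivariate hypergeometric).
log_prob <- function(m) {
  sum(lfactorial(rs)) + sum(lfactorial(cs)) - lfactorial(n) - sum(lfactorial(m))
}
# Pearson chi-square statistic of a table.
chisq_stat <- function(m) sum((m - expected)^2 / expected)

# Enumerate all tables with these margins: the top-left 2x2 block is free,
# the remaining cells follow from the row and column totals.
tabs <- list()
for (a in 0:min(rs[1], cs[1]))
  for (b in 0:min(rs[1] - a, cs[2]))
    for (d in 0:min(rs[2], cs[1] - a))
      for (e in 0:min(rs[2] - d, cs[2] - b)) {
        m <- matrix(c(a, b, rs[1] - a - b,
                      d, e, rs[2] - d - e,
                      cs[1] - a - d, cs[2] - b - e, 0),
                    nrow = 3, byrow = TRUE)
        m[3, 3] <- rs[3] - m[3, 1] - m[3, 2]
        if (m[3, 3] >= 0) tabs[[length(tabs) + 1]] <- m
      }

prob <- exp(sapply(tabs, log_prob))
stat <- sapply(tabs, chisq_stat)

# Ordering 1: "further from the null" = as or less probable than observed.
p_by_prob <- sum(prob[prob <= exp(log_prob(x)) * (1 + 1e-7)])
# Ordering 2: "further from the null" = chi-square statistic at least as large.
p_by_chisq <- sum(prob[stat >= chisq_stat(x) * (1 - 1e-7)])

set.seed(1)
p_sim <- chisq.test(x, simulate.p.value = TRUE, B = 1e5)$p.value

cat("sum by probability ordering:", p_by_prob, "\n")
cat("fisher.test:                ", fisher.test(x)$p.value, "\n")
cat("sum by chi-square ordering: ", p_by_chisq, "\n")
cat("chisq.test (simulated):     ", p_sim, "\n")
```

The two sums run over the same set of tables with the same probabilities;
they differ only in which tables count as at least as extreme as the observed
one, so they need not agree.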
-thomas
"An hypothesis that may be true is rejected because it has failed to
predict observable results that have not occurred." Jeffreys (1939)