[R] fisher exact vs. simulated chi-square

Wed Apr 23 13:24:12 CEST 2003

The Chi-square test is based upon the assumption that the sample is large enough
to allow approximation of a (nearly symmetric) binomial by a normal distribution.
(With one degree of freedom, Chi-square is z^2.)  When expected (NOT observed)
cell counts are too small, that suggests a very asymmetric binomial and,
consequently, a poor fit for the assumption.  The exact test calculates the exact
probability of the observed values, or more extreme ones, given the assumed
probabilities generating the expected values.  As someone else noted, exact is
exact, Chi-square is not (unless, of course, assumptions are exactly met).
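
To make the contrast concrete, here is a minimal sketch for a 2x2 table, written in plain Python only so that the arithmetic is visible (in R one would simply call fisher.test and chisq.test). The function names and the example table are hypothetical, not taken from this thread. The Chi-square p-value comes from the normal approximation (Chi-square = z^2 with one degree of freedom); the exact p-value sums hypergeometric probabilities over all tables with the same margins that are no more probable than the observed one.

```python
from math import comb, erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and its 1-df p-value."""
    n = a + b + c + d
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (r1 * r2 * c1 * c2)
    # With 1 df, chi-square = z^2, so P(chi2 > stat) = P(|z| > sqrt(stat)).
    p = erfc(sqrt(stat / 2))
    return stat, p


def fisher_2x2(a, b, c, d):
    """Two-sided exact p-value, conditioning on the observed margins."""
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    lo, hi = max(0, c1 - r2), min(r1, c1)
    p_obs = prob(a)
    # Sum over tables that are no more probable than the observed one
    # (small tolerance guards against floating-point ties).
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-7))


# Hypothetical table with every expected count equal to 2.
stat, p_approx = chi_square_2x2(3, 1, 1, 3)
p_exact = fisher_2x2(3, 1, 1, 3)
print(stat, p_approx, p_exact)
```

With expected counts this small, the approximate and the exact p-values for this made-up table are far apart, which is the situation the warning is about.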
Bob Porter,

Robert J. Porter, Ph.D.
Clinical and Consulting Psychologist
308 East Oak Street
Tampa, FL, 33603
Office Phone: 813-810-8110
813-225-5678 FAX
www.mindspring.com/~rjporter

-----Original Message-----
From: Dirk Janssen [mailto:dirkj at rz.uni-leipzig.de]
Sent: Tuesday, April 22, 2003 8:08 PM
To: r-help at stat.math.ethz.ch
Subject: [R] fisher exact vs. simulated chi-square

Dear All,

I have a problem understanding the difference between the outcome of a
Fisher exact test and a chi-square test (with simulated p-value).

For some sample data (see below), the Fisher test reports p=.02337. The normal
chi-square test complains that the "approximation may be incorrect",
because there is a column with cells with very small values. I therefore
tried the chi-square with simulated p-values, but this still gives
p=.04037. I also simulated the p-value myself, using r2dtable, getting
the same result, p=0.04 (approx).
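
The do-it-yourself simulation described above can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the code actually used: the helper names, the seed, and the example table are hypothetical, and random tables are drawn by shuffling column labels against fixed row labels, which samples tables with the observed margins under independence (the role r2dtable plays in R). The p-value uses the add-one convention, (hits + 1) / (B + 1).

```python
import random


def pearson_stat(table):
    """Pearson chi-square statistic of an r x c table (all margins > 0)."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    total = 0.0
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            expected = r * c / n
            total += (table[i][j] - expected) ** 2 / expected
    return total


def random_table(rows, cols, rng):
    """Draw one table with the given margins by shuffling column labels."""
    labels = [j for j, c in enumerate(cols) for _ in range(c)]
    rng.shuffle(labels)
    table = [[0] * len(cols) for _ in rows]
    pos = 0
    for i, r in enumerate(rows):
        # The next r shuffled observations fall into row i.
        for j in labels[pos:pos + r]:
            table[i][j] += 1
        pos += r
    return table


def simulated_p(table, B=2000, seed=1):
    """Monte Carlo p-value of the Pearson statistic with fixed margins."""
    rng = random.Random(seed)
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    observed = pearson_stat(table)
    # Count simulated tables at least as extreme as the observed one.
    hits = sum(pearson_stat(random_table(rows, cols, rng)) >= observed - 1e-9
               for _ in range(B))
    return (hits + 1) / (B + 1)


# Hypothetical 2 x 3 table with one sparse column (not the poster's data).
print(simulated_p([[8, 7, 1], [3, 9, 4]]))
```

Note that the simulation replaces only the reference distribution; the quantity being compared is still the Pearson chi-square statistic.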

Why is this substantially higher than what the Fisher exact test says? Do the
two tests make different assumptions? I noticed that the discrepancy
gets smaller when I increase the number of observations for column A3.
Does this mean that the simulated chi-square is still sensitive to cells
with small counts, even though it does not give me the warning?
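
One way to probe the questions above is to enumerate every table that shares the observed margins and compute two exact p-values over that same set: one that counts tables no more probable than the observed table (the ordering fisher.test uses), and one that counts tables whose Pearson statistic is at least as large (the quantity a simulated chi-square estimates). The sketch below does this for a 2 x k table in plain Python; the function name and the example data are hypothetical, and brute-force enumeration is only practical for small tables. The two numbers need not agree, because the tests rank "more extreme" tables differently even though both condition on the margins.

```python
from itertools import product
from math import comb, prod


def exact_p_values(table):
    """Exact p-values for a 2 x k table under two orderings of tables.

    Returns (p_by_probability, p_by_pearson_statistic).
    """
    top, bottom = table
    cols = [a + b for a, b in zip(top, bottom)]
    r1 = sum(top)
    n = sum(cols)
    r2 = n - r1

    def prob(x):
        # Multivariate hypergeometric probability of first row x.
        return prod(comb(c, v) for c, v in zip(cols, x)) / comb(n, r1)

    def stat(x):
        # Pearson statistic of the table whose first row is x.
        s = 0.0
        for c, v in zip(cols, x):
            e1, e2 = r1 * c / n, r2 * c / n
            s += (v - e1) ** 2 / e1 + ((c - v) - e2) ** 2 / e2
        return s

    p_obs, s_obs = prob(top), stat(top)
    p_prob = p_stat = 0.0
    for x in product(*(range(c + 1) for c in cols)):
        if sum(x) != r1:
            continue  # keep only tables with the observed row totals
        p = prob(x)
        if p <= p_obs * (1 + 1e-7):
            p_prob += p   # no more probable than the observed table
        if stat(x) >= s_obs * (1 - 1e-7):
            p_stat += p   # Pearson statistic at least as large
    return p_prob, p_stat


# Hypothetical 2 x 3 table with one sparse column (not the poster's data).
print(exact_p_values([[8, 7, 1], [3, 9, 4]]))
```

For a 2x2 table with symmetric margins the two orderings coincide; with more columns and a sparse column they can rank tables differently, which is one possible source of a gap like the one described.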

Thanks in advance,
Dirk Janssen