[R] Basis of fisher.test

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Jan 13 03:00:37 CET 2006

```On Thu, 12 Jan 2006 Ted.Harding at nessie.mcc.ac.uk wrote:

> I want to ascertain the basis of the table ranking,
> i.e. the meaning of "extreme", in Fisher's Exact Test
> as implemented in 'fisher.test', when applied to RxC
> tables which are larger than 2x2.
>
> One can summarise a strategy for the test as
>
> 1) For each table compatible with the margins
>   of the observed table, compute the probability
>   of this table conditional on the marginal totals.
>
> 2) Rank the possible tables in order of a measure
>   of discrepancy between the table and the null
>   hypothesis of "no association".
>
> 3) Locate the observed table, and compute the sum
>   of the probabilities, computed in (1), for this
>   table and more "extreme" tables in the sense of
>   the ranking in (2).
>
> The question is: what "measure of discrepancy" is
> used in 'fisher.test' corresponding to stage (2)?
>
> (There are in principle several possibilities, e.g.
> value of a Pearson chi-squared, large values being
> discrepant; the probability calculated in (1),
> small values being discrepant; ... )
>
> "?fisher.test" says only:

[The following is not a quote from a current version of R.]

>     In the one-sided 2 by 2 cases, p-values are obtained
>     directly using the hypergeometric distribution.
>     Otherwise, computations are based on a C version of
>     the FORTRAN subroutine FEXACT which implements the
>     network developed by Mehta and Patel (1986) and
>     improved by Clarkson, Fan & Joe (1993). The FORTRAN
>     code can be obtained from
>     <URL: http://www.netlib.org/toms/643>.

No, it *also* says

Two-sided tests are based on the probabilities of the tables, and
take as 'more extreme' all tables with probabilities less than or
equal to that of the observed table, the p-value being the sum of
such probabilities.

which answers the question (there are only two-sided tests for such
tables).

Now, what does the posting guide say about stating the R version and
updating before posting?

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

```
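The ranking rule quoted from the help page can be spelled out by brute force for a small table. The following is only a sketch in Python, not the network algorithm that FEXACT implements: it enumerates every table with the observed margins, computes each table's conditional probability, and sums the probabilities that are less than or equal to that of the observed table. The function names (`compositions`, `enumerate_tables`, `table_probability`, `fisher_two_sided`) are made up for this illustration, and exact rational arithmetic is used so that ties in probability are compared without rounding error.

```python
from fractions import Fraction
from math import factorial, prod


def compositions(total, caps):
    """Yield tuples x with sum(x) == total and 0 <= x[j] <= caps[j]."""
    if len(caps) == 1:
        if 0 <= total <= caps[0]:
            yield (total,)
        return
    for x in range(min(total, caps[0]) + 1):
        for tail in compositions(total - x, caps[1:]):
            yield (x,) + tail


def enumerate_tables(row_sums, col_sums):
    """Yield every non-negative integer table with the given margins."""
    if len(row_sums) == 1:
        # The last row is forced by the remaining column totals.
        yield [list(col_sums)]
        return
    for first_row in compositions(row_sums[0], col_sums):
        remaining = [c - x for c, x in zip(col_sums, first_row)]
        for rest in enumerate_tables(row_sums[1:], remaining):
            yield [list(first_row)] + rest


def table_probability(table):
    """Probability of a table conditional on its row and column totals."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    numerator = prod(factorial(m) for m in rows + cols)
    denominator = factorial(n) * prod(factorial(x) for r in table for x in r)
    return Fraction(numerator, denominator)


def fisher_two_sided(observed):
    """Sum the probabilities of all tables no more probable than the observed one."""
    rows = [sum(r) for r in observed]
    cols = [sum(c) for c in zip(*observed)]
    p_observed = table_probability(observed)
    p_value = Fraction(0)
    for table in enumerate_tables(rows, cols):
        p = table_probability(table)
        if p <= p_observed:  # "more extreme" means probability <= observed
            p_value += p
    return p_value


print(fisher_two_sided([[3, 1], [1, 3]]))        # 17/35
print(fisher_two_sided([[2, 1, 0], [0, 1, 2]]))  # 3/5
```

Full enumeration grows very quickly with the table size and the total count, so this is suitable only for checking the definition on tiny examples. A floating-point version would probably need a small tolerance when comparing probabilities; that detail is sidestepped here by using `Fraction`.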