[R] Basis of fisher.test
(Ted Harding)
Ted.Harding at nessie.mcc.ac.uk
Fri Jan 13 09:55:14 CET 2006
On 13-Jan-06 Prof Brian Ripley wrote:
> On Thu, 12 Jan 2006 Ted.Harding at nessie.mcc.ac.uk wrote:
>>[...]
>> "?fisher.test" says only:
>
> [The following is not a quote from a current version of R.]
>
>> In the one-sided 2 by 2 cases, p-values are obtained
>> directly using the hypergeometric distribution.
>> Otherwise, computations are based on a C version of
>> the FORTRAN subroutine FEXACT which implements the
>> network developed by Mehta and Patel (1986) and
>> improved by Clarkson, Fan & Joe (1993). The FORTRAN
>> code can be obtained from
>> <URL: http://www.netlib.org/toms/643>.
>
> No, it *also* says
>
> Two-sided tests are based on the probabilities of the tables, and
> take as 'more extreme' all tables with probabilities less than or
> equal to that of the observed table, the p-value being the sum of
> such probabilities.
>
> which answers the question (there are only two-sided tests for such
> tables).
Thanks for the above information, which is indeed the definitive
straightforward answer to my question!
(Not sure that I quite agree with the "two-sided" terminology, though,
since the ranking is unidirectional based on decreasing probability,
and the P-value is that of the least-probability tail -- i.e. analogous
to the "large (-2*loglik)" tail of a likelihood-ratio test -- which
I've always visualised as a 1-tailed test (despite the fact that
the "other tail" can on occasion be indicative of a fit "too good to
be true").)
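For concreteness, the rule quoted above (sum the probabilities of all
tables no more probable than the observed one) can be sketched by hand
for a 2 by 2 table, where every table with the given margins is fixed
by its top-left cell. The table below is made up purely for
illustration, and the small tolerance on the comparison is my own
guard against floating-point ties, not something stated in the help
text quoted here.

```r
# Hypothetical 2 by 2 table, for illustration only
tab <- matrix(c(3, 1, 1, 3), nrow = 2)

m <- sum(tab[, 1])   # first column total
n <- sum(tab[, 2])   # second column total
k <- sum(tab[1, ])   # first row total
x <- tab[1, 1]       # observed top-left cell

# All values the top-left cell can take with the margins held fixed
support <- max(0, k - n):min(k, m)

# Hypergeometric probability of each possible table, and of the observed one
probs <- dhyper(support, m, n, k)
p_obs <- dhyper(x, m, n, k)

# Rank tables by probability only: sum those no more probable than the
# observed table (the tolerance is an assumption, to absorb rounding in ties)
p_manual <- sum(probs[probs <= p_obs * (1 + 1e-7)])

p_manual
fisher.test(tab)$p.value
```

The two values printed at the end should agree. Note that the ranking
uses only the table probabilities, so tables from either end of the
support can be included in the sum.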
> Now, what does the posting guide say about stating the R version and
> updating before posting?
Well, I plead that in practice there is necessarily a grey area
here! My quotation was from "?fisher.test" in R-2.1.0beta of
2004/04/08, the most recent version installed on any of my machines.
Admittedly a bit behind the times, but not grossly; and that help
page has not changed in this respect since the earliest version I
have installed, which is R-1.2.3 of 2001/04/26.
Contents of help pages can change overnight as R evolves.
While it is better to be up-to-date than behind the times (even
slightly), there is a compromise to be struck between upgrading
to the latest R every time one has a question which might be
answered thereby, or going on-line to read the latest PDF
documentation from CRAN, on the one hand, and on the other asking
a straightforward question to the list.
Thanks again, and best wishes,
Ted.
