[R] Basis of fisher.test

Fri Jan 13 10:44:48 CET 2006

On Fri, 13 Jan 2006 Ted.Harding at nessie.mcc.ac.uk wrote:

> On 13-Jan-06 Prof Brian Ripley wrote:
>> On Thu, 12 Jan 2006 Ted.Harding at nessie.mcc.ac.uk wrote:
>>> [...]
>>> "?fisher.test" says only:
>>
>> [The following is not a quote from a current version of R.]
>>
>>>     In the one-sided 2 by 2 cases, p-values are obtained
>>>     directly using the hypergeometric distribution.
>>>     Otherwise, computations are based on a C version of
>>>     the FORTRAN subroutine FEXACT which implements the
>>>     network developed by Mehta and Patel (1986) and
>>>     improved by Clarkson, Fan & Joe (1993). The FORTRAN
>>>     code can be obtained from
>>>     <URL: http://www.netlib.org/toms/643>.
>>
>> No, it *also* says
>>
>>       Two-sided tests are based on the probabilities of the tables, and
>>       take as 'more extreme' all tables with probabilities less than or
>>       equal to that of the observed table, the p-value being the sum of
>>       such probabilities.
>>
>> which answers the question (there are only two-sided tests for such
>> tables).
>
> Thanks for the above information, which is indeed the definitive
> straightforward answer to my question!
>
> (Not sure that I quite agree with the "two-sided" terminology, though,
> since the ranking is unidirectional based on decreasing probability,
> and the P-value is that of the least-probability tail -- i.e. analogous
> to the "large (-2*loglik)" tail of a likelihood-ratio test -- which
> I've always visualised as a 1-tailed test (despite the fact that
> the "other tail" can on occasion be indicative of a fit "too good to
> be true").)

As statistics is usually taught, significance tests are always one-tailed. 
The two-sided t-test is one-tailed, the test statistic being |T|.
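
To spell that remark out (a sketch, assuming the null distribution of T is symmetric about zero and writing t_obs for the observed value): the two-sided p-value is the single upper tail of |T|, which is the same number as the sum of the two tails of T.

```latex
p \;=\; P\bigl(|T| \ge |t_{\mathrm{obs}}|\bigr)
  \;=\; P\bigl(T \ge |t_{\mathrm{obs}}|\bigr) + P\bigl(T \le -|t_{\mathrm{obs}}|\bigr)
  \;=\; 2\,P\bigl(T \ge |t_{\mathrm{obs}}|\bigr)
```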

In any case, `two-sided' is one of the values of the 'alternative' argument 
of the function, so that paragraph of the help page is just using the 
already-established terminology.
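
For readers who want to see the two-sided rule from the help page in action, here is a minimal sketch for a 2 by 2 table. It is written in Python with only the standard library and is not the code R uses: the function names (table_prob, fisher_two_sided, fisher_greater) and the small relative tolerance for floating-point ties are assumptions made for this illustration.

```python
from math import comb


def table_prob(x, row1, row2, col1):
    """Hypergeometric probability that the top-left cell equals x,
    given fixed row totals (row1, row2) and first column total col1."""
    return comb(row1, x) * comb(row2, col1 - x) / comb(row1 + row2, col1)


def fisher_two_sided(a, b, c, d, rel_tol=1e-7):
    """Sum the probabilities of all tables that are no more probable
    than the observed table [[a, b], [c, d]] (margins held fixed).
    rel_tol is an assumed guard against floating-point near-ties."""
    row1, row2, col1 = a + b, c + d, a + c
    lo, hi = max(0, col1 - row2), min(row1, col1)  # feasible top-left values
    p_obs = table_prob(a, row1, row2, col1)
    probs = (table_prob(x, row1, row2, col1) for x in range(lo, hi + 1))
    return sum(p for p in probs if p <= p_obs * (1 + rel_tol))


def fisher_greater(a, b, c, d):
    """One-sided p-value: probability of a top-left cell >= a,
    taken directly from the hypergeometric distribution."""
    row1, row2, col1 = a + b, c + d, a + c
    hi = min(row1, col1)
    return sum(table_prob(x, row1, row2, col1) for x in range(a, hi + 1))


# Table [[3, 1], [1, 3]]: the five possible tables have
# probabilities 1/70, 16/70, 36/70, 16/70, 1/70.
print(round(fisher_two_sided(3, 1, 1, 3), 4))  # 0.4857  (= 34/70)
print(round(fisher_greater(3, 1, 1, 3), 4))    # 0.2429  (= 17/70)
```

Note that the tables are ranked only by their probability, so the "two-sided" p-value is a single sum over the least probable tables; with unequal margins it need not be twice the one-sided value.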

>> Now, what does the posting guide say about stating the R version and
>> updating before posting?
>
> Well, I plead that in practice there is necessarily a grey area
> here! My quotation was from "?fisher.test" in R-2.1.0beta of
> 2005/04/08, the most recent version installed on any of my machines.
> Admittedly a bit behind the times, but not grossly; and that help
> page has not changed in this respect since the earliest version I
> have installed, which is R-1.2.3 of 2001/04/26.
>
> Contents of help pages can change overnight as R evolves.
> While it is better to be up-to-date than behind the times (even
> slightly), there is a compromise to be struck between upgrading
> to the latest R every time one has a question which might be
> answered thereby, or going on-line to read the latest PDF
> documentation from CRAN, on the one hand, and on the other asking
> a straightforward question to the list.

Well, if you had given the R version number the problem would have been 
much more obvious.

> Thanks again, and best wishes,
> Ted.
>
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
> Fax-to-email: +44 (0)870 094 0861
> Date: 13-Jan-06                                       Time: 08:55:11
> ------------------------------ XFMail ------------------------------

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595