[R] Millions of calls to fisher.test (was (no subject))

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Nov 18 18:20:54 CET 2005


Setting conf.int=FALSE will help.  Looking at the code of fisher.test and 
extracting just the bit you need will help more.
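
A rough sketch of both suggestions for a single 2x2 table. The function 
names fisherPvalFast and fisherPval2x2 are made up for illustration. The 
second one assumes the two-sided p-value at odds ratio 1 is the sum of the 
hypergeometric probabilities that do not exceed that of the observed table, 
with a small relative tolerance for ties; that is my reading of the 
fisher.test source, so check it against your version of R.

```r
# Suggestion 1: skip the confidence interval for the odds ratio.
fisherPvalFast <- function(x) {
  fisher.test(x, conf.int = FALSE)$p.value
}

# Suggestion 2: compute only the two-sided p-value for a 2x2 table.
# Assumption: this mirrors the odds-ratio-1 branch of fisher.test.
fisherPval2x2 <- function(x) {
  m <- sum(x[, 1])          # first column total
  n <- sum(x[, 2])          # second column total
  k <- sum(x[1, ])          # first row total
  lo <- max(0, k - n)       # smallest possible value of x[1, 1]
  hi <- min(k, m)           # largest possible value of x[1, 1]
  d <- dhyper(lo:hi, m, n, k)
  relErr <- 1 + 1e-7        # tolerance so that ties count as "as extreme"
  sum(d[d <= d[x[1, 1] - lo + 1] * relErr])
}

tab <- matrix(c(3, 1, 1, 3), nrow = 2)
fisherPval2x2(tab)
fisherPvalFast(tab)
```

The two calls should agree up to floating-point error; the second avoids 
the argument checking and object construction that fisher.test does on 
every call.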

Do you actually need a two-sided test?  Fisher did not, and if you do not 
either, the computation reduces to a call to phyper, which is vectorized.
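
A minimal sketch of the one-sided case, assuming the tables are stored in a 
2 x 2 x N array as described in the question below. The function name 
fisherOneSided and the simulated data are hypothetical; the margins are 
taken in the same orientation fisher.test uses (column totals m and n, 
first row total k).

```r
# One-sided Fisher p-values for all tables in a 2 x 2 x N array at once.
fisherOneSided <- function(tab, alternative = c("greater", "less")) {
  alternative <- match.arg(alternative)
  x <- tab[1, 1, ]                  # observed count in cell [1, 1]
  m <- tab[1, 1, ] + tab[2, 1, ]    # first column totals
  n <- tab[1, 2, ] + tab[2, 2, ]    # second column totals
  k <- tab[1, 1, ] + tab[1, 2, ]    # first row totals
  if (alternative == "greater") {
    phyper(x - 1, m, n, k, lower.tail = FALSE)   # P(X >= x)
  } else {
    phyper(x, m, n, k)                           # P(X <= x)
  }
}

# Simulated data purely for illustration.
set.seed(1)
tab <- array(rpois(4 * 1000, lambda = 10), dim = c(2, 2, 1000))
p <- fisherOneSided(tab, "greater")
```

Because phyper is vectorized over all its arguments, this is one call for 
all N tables instead of N calls to fisher.test.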

On Fri, 18 Nov 2005, Anna Pluzhnikov wrote:

> Hi,
> I need to run a Fisher's exact test on thousands of 2x2 contingency tables, and
> repeat this process several thousand times (this is a part of the permutation
> test for a genome-wide association study).
>
> How can I run this process most efficiently? Is there any way to optimize R code?
>
> I have my data in a 2x2xN array (N ~ 5 K; eventually N will be ~ 500 K), and use
> apply inside the loop:
>> for (iter in 1:1000) {
>    apply(data,3,fisherPval)
>  }

Why are you calling the same thing 1000 times?

>  fisherPval <- function(x) {
>     fisher.test(x)$p.value
>  }
> Right now, it takes about 30 sec per iteration on an Intel Xeon 3.06GHz processor.

[Disclaimer etc removed]

> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

PLEASE do, and use a meaningful subject line.


-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



