[R] Millions of calls to fisher.test (was (no subject))
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri Nov 18 18:20:54 CET 2005
Setting conf.int=FALSE will help. Looking at the code of fisher.test and
extracting just the bit you need will help more.
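
As a minimal sketch of the first suggestion: the function below is the fisherPval from the quoted message with conf.int = FALSE added. The small array `tabs` is made-up illustration data standing in for the 2x2xN array; it is not from the original post.

```r
# Made-up example data: three 2x2 tables stacked in a 2x2x3 array
# (a stand-in for the poster's 2x2xN array).
tabs <- array(c(3, 1, 1, 3,
                10, 2, 3, 15,
                5, 5, 5, 5), dim = c(2, 2, 3))

# Same wrapper as in the quoted message, but skipping the confidence
# interval for the odds ratio, since only the p-value is kept.
fisherPval <- function(x) {
  fisher.test(x, conf.int = FALSE)$p.value
}

pvals <- apply(tabs, 3, fisherPval)
```

The p-values should be unchanged by this option; how much time it saves will depend on the tables.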
Do you actually need a two-sided test? Fisher did not, and if not, the
computations can be reduced to a call to phyper which is vectorized.
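
A sketch of the one-sided case, assuming the tables sit in a 2x2xN array and that the alternative of interest is "greater" (odds ratio above 1). The array `tabs` and the variable names are illustrative only; the phyper arguments follow the parametrization fisher.test uses for 2x2 tables (first-column total, second-column total, first-row total).

```r
# Made-up example data: three 2x2 tables in a 2x2x3 array.
tabs <- array(c(3, 1, 1, 3,
                10, 2, 3, 15,
                5, 5, 5, 5), dim = c(2, 2, 3))

x11 <- tabs[1, 1, ]                 # top-left cell of every table
m   <- tabs[1, 1, ] + tabs[2, 1, ]  # first-column totals
n   <- tabs[1, 2, ] + tabs[2, 2, ]  # second-column totals
k   <- tabs[1, 1, ] + tabs[1, 2, ]  # first-row totals

# One-sided p-values for all tables in a single vectorized call:
# P(X >= x11) under the hypergeometric null.
p_greater <- phyper(x11 - 1, m, n, k, lower.tail = FALSE)

# For the opposite alternative ("less"), P(X <= x11):
p_less <- phyper(x11, m, n, k)
```

On a few tables these can be checked against fisher.test(x, alternative = "greater")$p.value before running the full permutation loop.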
On Fri, 18 Nov 2005, Anna Pluzhnikov wrote:
> Hi,
> I need to run a Fisher's exact test on thousands of 2x2 contingency tables, and
> repeat this process several thousand times (this is a part of the permutation
> test for a genome-wide association study).
>
> How can I run this process most efficiently? Is there any way to optimize R code?
>
> I have my data in a 2x2xN array (N ~ 5 K; eventually N will be ~ 500 K), and use
> apply inside the loop:
> for (iter in 1:1000) {
> apply(data,3,fisherPval)
> }
Why are you calling the same thing 1000 times?
> fisherPval <- function(x) {
> fisher.test(x)$p.value
> }
> Right now, it takes about 30 sec per iteration on an Intel Xeon 3.06GHz processor.
[Disclaimer etc removed]
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
PLEASE do, and use a meaningful subject line.
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595