[R] Is there a fast way to do several hundred thousand ANOVA tests?

hadley wickham h.wickham at gmail.com
Mon Aug 24 04:20:48 CEST 2009


You might find the article "Computing Thousands of Test Statistics
Simultaneously in R" in
http://stat-computing.org/newsletter/issues/scgn-18-1.pdf helpful.
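
As a rough illustration of the general idea (this is a sketch of mine, not
code taken from that article), the one-way ANOVA F statistic can be computed
for all columns at once from sums of squares, without calling aov() in a
loop. The function name col_f_stats is made up for this example, and the
sketch assumes a has no missing values and b has length nrow(a):

```r
# Hypothetical helper: one-way ANOVA F statistic for every column of a
# numeric matrix `a`, given a grouping factor `b` (one entry per row of `a`).
# Assumes no NA values in `a`.
col_f_stats <- function(a, b) {
  b <- droplevels(as.factor(b))
  n <- nrow(a)                                  # observations per column
  k <- nlevels(b)                               # number of groups

  group_sums <- rowsum(a, b)                    # k x ncol(a) per-group column sums
  n_g <- as.vector(table(b)[rownames(group_sums)])  # group sizes, same row order
  group_means <- group_sums / n_g               # row g is divided by n_g[g]
  grand_mean <- colMeans(a)

  # Between-group and total sums of squares, one value per column
  ss_between <- colSums(n_g * sweep(group_means, 2, grand_mean)^2)
  ss_total <- colSums(sweep(a, 2, grand_mean)^2)
  ss_within <- ss_total - ss_between

  (ss_between / (k - 1)) / (ss_within / (n - k))
}

# Small check against aov() on simulated data
set.seed(1)
a <- matrix(rnorm(50 * 20), nrow = 50)
b <- factor(rep(c("cond1", "cond2", "cond3"), length.out = 50))
fast <- col_f_stats(a, b)
slow <- sapply(seq_len(ncol(a)),
               function(i) summary(aov(a[, i] ~ b))[[1]][1, 4])
all.equal(fast, slow)
```

The sweep() calls make temporary copies of the matrix, so with 800000
columns it may be worth running this on blocks of columns if memory is tight.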

Hadley

On Sun, Aug 23, 2009 at 7:55 PM, big permie<bigpermie at gmail.com> wrote:
> Dear R users,
>
> I have a matrix a and a classification vector b such that
>
>> str(a)
> num [1:50, 1:800000]
> and
>> str(b)
> Factor w/ 3 levels "cond1","cond2","cond3"
>
> I'd like to do an anova on all 800000 columns and record the F statistic for
> each test; I currently do this using
>
> f.stat.vec <- numeric(length(a[1,]))
>
> for (i in 1:length(a[1,])) {
>  f.test.frame <- data.frame(nums = a[,i], cond = b)
>  aov.vox <- aov(nums ~ cond, data = f.test.frame)
>  f.stat <- summary(aov.vox)[[1]][1,4]
>  f.stat.vec[i] <- f.stat
> }
>
> The problem is that this code takes about 70 minutes to run.
>
> Is there a faster way to do an anova & record the F stat for each column?
>
> Any help would be appreciated.
>
> Thanks
> Heath
>



-- 
http://had.co.nz/



