[R] Quickly calculating the mean results over a collection of data sets?

Michael R. Head burner at suppressingfire.org
Tue Aug 12 13:44:40 CEST 2008


On Tue, 2008-08-12 at 12:04 +0100, Dan Davison wrote:
> > testRuns <- lapply(inputFiles, 
> > 		function(x) {
> > 			read.table(x, header=TRUE)})
> 
> (Just BTW lapply(inputFiles, read.table, header=TRUE) is slightly nicer to look at)

Yes, that does look much nicer :-)
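
For anyone reading along, here is a small sketch of why the two forms are equivalent: any extra arguments given to lapply() after the function are passed through to that function on each call. The temporary files and their contents below are made up for illustration and are not my real test data.

```r
# Hypothetical input: two small whitespace-delimited files in a temp directory.
tmp <- tempfile("runs")
dir.create(tmp)
inputFiles <- file.path(tmp, c("run1.txt", "run2.txt"))
write.table(data.frame(W = 1:2, Z = c(0.5, 1.5)), inputFiles[1], row.names = FALSE)
write.table(data.frame(W = 1:2, Z = c(2.5, 3.5)), inputFiles[2], row.names = FALSE)

# Anonymous-function form, as in my original code.
viaWrapper <- lapply(inputFiles, function(x) read.table(x, header = TRUE))

# Shorter form: header = TRUE is forwarded to read.table() for every file.
viaDots <- lapply(inputFiles, read.table, header = TRUE)

# Both calls should produce the same list of data frames.
print(identical(viaWrapper, viaDots))
```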

> How about rbind()ing all the data frames together, and working with
> the combined data frame? Say that testRuns is
> 
> > testRuns
> > allRuns <- do.call("rbind", testRuns)
> > aggregate(allRuns$Z, by=allRuns[c("W","X","Y")], mean)

Oh, that does simplify things quite a bit.

I just timed your version against mine on one of my larger data sets.

My version takes about 2 minutes; yours takes about 1 second.
Fantastic!

I'll have to learn about rbind and aggregate...
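
As a note to self, a minimal sketch of what those two calls do, using two tiny made-up data frames in place of my real test runs (the column names W, X, Y, Z follow Dan's example; the values are invented):

```r
# Made-up stand-ins for the data frames read from each input file.
testRuns <- list(
  data.frame(W = c(1, 1), X = c("a", "b"), Y = c(10, 10), Z = c(1, 2)),
  data.frame(W = c(1, 1), X = c("a", "b"), Y = c(10, 10), Z = c(3, 6))
)

# do.call() calls rbind() with every list element as an argument,
# stacking the rows of all runs into one data frame.
allRuns <- do.call("rbind", testRuns)

# Group rows by each distinct (W, X, Y) combination and average Z within
# each group. The averaged column in the result is named "x".
meanZ <- aggregate(allRuns$Z, by = allRuns[c("W", "X", "Y")], mean)
print(meanZ)
```

The point, as far as I can tell, is that the grouping and averaging happen in one call over the combined data frame rather than in a loop over the individual runs.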

Thanks,
mike

> Dan

-- 
Michael R. Head <burner at suppressingfire.org>
http://www.cs.binghamton.edu/~mike/


