[R] Reading and coalescing many datafiles.
Roger D. Peng
rpeng at jhsph.edu
Thu Apr 14 18:20:21 CEST 2005
In my experience, reading all the data files into a list and then calling
'do.call("rbind", ...)' once is much faster than 'rbind'-ing on the fly.
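
A minimal sketch of that pattern, in case it helps. The reader function,
the CSV layout, and the temporary file names below are assumptions made up
for illustration; substitute whatever actually reads one of your files.

```r
# Hypothetical reader: each file is assumed to be a CSV with a header row.
my_read_function <- function(path) read.csv(path, stringsAsFactors = FALSE)

# Write three small example files to a temporary directory (illustration only).
tmp_dir <- tempfile("datafiles")
dir.create(tmp_dir)
filenames <- file.path(tmp_dir, sprintf("2005-01-%02d-00-00-02", 1:3))
for (i in seq_along(filenames)) {
  write.csv(data.frame(id = i, value = c(10, 20) * i),
            filenames[i], row.names = FALSE)
}

# Read every file into a list first ...
pieces <- lapply(filenames, my_read_function)

# ... then bind all the pieces with a single rbind call.
dat <- do.call("rbind", pieces)

# Reset the row names to the default 1..n numbering.
rownames(dat) <- NULL

print(dim(dat))
```

The loop version copies everything accumulated so far each time it adds a
file, which is where the O(N^2) behaviour comes from; here the combined
data frame is built only once.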
-roger
asr at ufl.edu wrote:
> Greetings.
>
>
> I've got some analysis problems I'm trying to solve, the raw data for which
> are accumulated in a bunch of time-and-date-based files.
>
> /some/path/2005-01-02-00-00-02
>
> etc.
>
>
> The best 'read all these files' method I've seen in the r-help archives comes
> down to
>
> dat <- NULL
> for (df in my_list_of_filenames)
> {
>     dat <- rbind(dat, my_read_function(df))
> }
>
> which, unpleasantly, is O(N^2) w.r.t. the number of files.
>
> I'm fiddling with other idioms to accomplish the same goal. Best I've come up
> with so far, after extensive reference to the mailing list archives, is
>
>
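> my_read_function.many <- function(filenames)
> {
>     # Skip files that do not exist.
>     filenames <- filenames[file.exists(filenames)]
>     # Read every file, then bind all the pieces in a single call.
>     rv <- do.call("rbind", lapply(filenames, my_read_function))
>     # Renumber the rows 1..n; seq_len() also handles a zero-row result.
>     row.names(rv) <- seq_len(nrow(rv))
>     rv
> }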
>
>
> I'd love to have some stupid omission pointed out.
>
>
> - Allen S. Rout
--
Roger D. Peng
http://www.biostat.jhsph.edu/~rpeng/