[R] Reading large files quickly; resolved

Rob Steele freenx.10.robsteele at xoxy.net
Mon May 11 17:54:44 CEST 2009


Rob Steele wrote:
> I'm finding that readLines() and read.fwf() take nearly two hours to
> work through a 3.5 GB file, even when reading in large (100 MB) chunks.
>  The unix command wc by contrast processes the same file in three
> minutes.  Is there a faster way to read files in R?
> 
> Thanks!
> 

readChar() is fast.  I use strsplit(..., fixed = TRUE) to separate the
input data into lines and then use substr() to separate the lines into
fields.  I do a little light processing and write the result back out
with writeChar().  The whole thing takes thirty minutes where read.fwf()
took nearly two hours just to read the data.

Thanks for the help!




More information about the R-help mailing list