R-beta: read.table and large datasets
Douglas Bates
bates at stat.wisc.edu
Mon Mar 9 19:56:02 CET 1998
Rick White <rick at stat.ubc.ca> writes:
> I find that read.table cannot handle large datasets. Suppose data is a
> 40000 x 6 dataset
>
> R -v 100
>
> x_read.table("data") gives
> Error: memory exhausted
> but
> x_as.data.frame(matrix(scan("data"),byrow=T,ncol=6))
> works fine.
>
> read.table is less typing, I can include the variable names in the
> first line, and in S-PLUS it executes faster. Is there a fix for
> read.table on the way?
You probably need to increase -n as well as -v to read in this table.
Try setting
gcinfo(TRUE)
to see what is happening with the garbage collector. Most likely it
is running out of cons cells long before it runs out of heap storage.
I suspect this because I encountered exactly the same situation
several weeks ago and Thomas Lumley pointed this out to me.
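
As a rough sketch (the -n value below is only an illustration, not a
tested figure for a 40000 x 6 table), something along the lines of

R -v 100 -n 1000000

and then, inside R,

gcinfo(TRUE)              # report each garbage collection as it happens
x <- read.table("data")

should show how close each collection comes to the cons-cell and heap
limits, so you can see which of the two read.table is actually
exhausting before adjusting the numbers further.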