[R] scan

Adrian Trapletti Adrian.Trapletti at wu-wien.ac.at
Thu Jul 29 12:00:22 CEST 1999

Is there a way to read large datasets efficiently and directly into a matrix,
by row? I know about read.table() and data.frame, but for large datasets it
does not work efficiently, even if I increase the cons memory.

R --nsize 1000k --vsize 90M
> x<-read.table("pendler.luft.txt")
Error: cons memory (1000000 cells) exhausted
       See "help(Memory)" on how to increase the number of cons cells.

Also the following is problematic:

R --nsize 1000k --vsize 90M
> x<-scan("pendler.luft.txt",skip=1)
Read 3164832 items
> x<-matrix(x,nrow=3164832/6,ncol=6,byrow=T)
Error: heap memory (92160 Kb) exhausted [needed 24725 Kb more]
       See "help(Memory)" on how to increase the heap size.
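The extra heap needed here is roughly the size of a second copy of x: matrix(..., byrow = TRUE) cannot reuse the scanned vector, because the elements have to be permuted, so the input vector and the reshaped copy are both live at once. A rough back-of-envelope check (assuming double precision, 8 bytes per element):

```r
n <- 3164832          # items read by scan() above
bytes <- n * 8        # doubles take 8 bytes each
round(bytes / 1024)   # ~24725 Kb, matching the "needed 24725 Kb more" message
```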

The following works, but I think it is not very elegant:

> x<-matrix(NA,nrow=6,ncol=3164832/6)
> x[,]<-scan("pendler.luft.txt",skip=1)
Read 3164832 items
> x<-t(x)
> x[1,]
[1] 10101 10405 10349  3945    89     0

Is there a better way to do this? How can I avoid copying such large
objects? E.g., does x <- t(x) copy x or not?
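One possible alternative (a sketch, not a definitive answer): scan() accepts what = list(...), which reads the file column-wise into a list of vectors, so no byrow reshuffle and no explicit transpose is needed. The small demo file below is a stand-in for pendler.luft.txt (six numeric columns plus one header line, first row taken from the output above):

```r
# Stand-in for the 6-column file from the post (header line + numeric rows)
writeLines(c("c1 c2 c3 c4 c5 c6",
             "10101 10405 10349 3945 89 0",
             "10102 10406 10350 3946 90 1"), "demo.txt")

# what = rep(list(0), 6) tells scan() to expect six numeric fields per line
# and return them as a list of six column vectors.
cols <- scan("demo.txt", skip = 1, what = rep(list(0), 6))

# Bind the six column vectors into a matrix; each list element is one column.
x <- do.call("cbind", cols)
```

Whether this actually uses less peak memory than the matrix(NA, ...) workaround depends on how the list columns are allocated, so it would need to be tried on the real file.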


Adrian Trapletti, Vienna University of Economics and Business
Administration, Augasse 2-6, A-1090 Vienna, Austria
Phone: ++43 1 31336 4561, Fax: ++43 1 31336 708,
Email: adrian.trapletti at wu-wien.ac.at

r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
