[R] package for saving large datasets in ASCII

Ott Toomet siim at obs.ee
Sat Aug 10 10:28:22 CEST 2002


Hi,

I have made a tiny package for saving dataframes in ASCII format.  The
package contains two functions, save.table() and save.delim().  The
first mimics write.table(), though not completely; the second differs
only in its default values, which are chosen to suit read.delim().

The reason I wrote the functions is that I have had problems
saving large dataframes in ASCII form.  write.table() essentially
builds a huge string in memory from the dataframe.  I am not sure about
write.matrix() (in MASS), but in my experience it is too
memory-intensive as well.  My approach was to write the whole thing in
C, so that the function takes the values from the dataframe one
scalar value at a time and writes each one immediately to the file.
This, of course, puts certain limitations on the contents of the
dataframe and on the output format.
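
To illustrate the idea, here is a minimal sketch in plain C.  It is not
the code from the package: the types column and table and the function
stream_table() are hypothetical stand-ins for an R dataframe and for
the real writer, which goes through R's C interface instead.  The point
is only that each cell goes to the file as soon as it is formatted, so
the memory needed does not grow with the size of the table.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for a dataframe: a list of typed columns. */
typedef enum { COL_REAL, COL_INT, COL_STRING } col_type;

typedef struct {
    const char *name;
    col_type    type;
    const void *data;  /* double[], int[] or const char *[] of length nrow */
} column;

typedef struct {
    size_t        nrow, ncol;
    const column *cols;
} table;

/* Format one scalar and hand it to the stream at once.
   Returns a negative value on an output error. */
static int write_cell(FILE *out, const column *c, size_t row)
{
    switch (c->type) {
    case COL_REAL:
        return fprintf(out, "%.15g", ((const double *)c->data)[row]);
    case COL_INT:
        return fprintf(out, "%d", ((const int *)c->data)[row]);
    case COL_STRING:
        /* Limitation of this sketch: embedded quotes are not escaped. */
        return fprintf(out, "\"%s\"", ((const char *const *)c->data)[row]);
    }
    return -1;
}

/* Write a header line, then the table row by row, cell by cell.
   No string for a whole row or a whole table is ever built.
   Returns 0 on success, -1 on an output error. */
int stream_table(const table *t, FILE *out, char sep)
{
    for (size_t j = 0; j < t->ncol; j++) {
        if (j > 0 && fputc(sep, out) == EOF) return -1;
        if (fprintf(out, "\"%s\"", t->cols[j].name) < 0) return -1;
    }
    if (fputc('\n', out) == EOF) return -1;

    for (size_t i = 0; i < t->nrow; i++) {
        for (size_t j = 0; j < t->ncol; j++) {
            if (j > 0 && fputc(sep, out) == EOF) return -1;
            if (write_cell(out, &t->cols[j], i) < 0) return -1;
        }
        if (fputc('\n', out) == EOF) return -1;
    }
    return 0;
}
```

A caller would fill a table, open a file with fopen() and call
stream_table(&t, f, ',').  Only the stdio buffer is held in memory
besides the data itself; the price is that every column type needs its
own fixed output format, which is the kind of limitation mentioned
above.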

Here is an example of the result:

> dim(e2000)
[1] 7505 1197
> library(savetable)
> system.time(save.table(e2000, "e2000"))
[1] 38.04  0.48 48.75  0.00  0.00
> library(MASS)
> system.time(write.matrix(e2000, "e2000", sep=",", 1))

 -- killed after 10 minutes swapping.

And now a smaller example:

> dim(e2000s)
[1]  100 1197
> library(savetable)
> system.time(save.table(e2000s, "e2000s"))
[1] 0.45 0.00 0.56 0.00 0.00
> system.time(write.table(e2000s, "e2000s"))
[1] 31.21  0.11 38.99  0.00  0.00
> library(MASS)
> system.time(write.matrix(e2000s, "e2000s", sep=",", 1))
[1] 4.01 0.66 5.45 0.00 0.00

None of the functions started swapping this time, but as you can see,
save.table() is still around 10 times as fast as write.matrix() and
roughly 70 times as fast as write.table().  The examples were run on
my 128 MB PII-400 Linux system with R 1.4.0.

I am not sure if there is much interest in such a package, so I have
put it on my own website instead of CRAN
(http://www.obs.ee/~siim/savetable_0.1.0.tar.gz).  Any feedback is
appreciated.

Many thanks to Brian Ripley and the others who helped me with
accessing R objects in C.


Best wishes,

Ott Toomet


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._


