[R] package for saving large datasets in ASCII

ripley at stats.ox.ac.uk
Sat Aug 10 20:54:05 CEST 2002


?write.matrix will tell you what you have overlooked: a sensible
blocksize.

If `I am not sure about write.matrix()', surely reading the help page is a
first step?
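
As a rough sketch of what the help page points to (the data frame `d`, the
temporary file, and the block size of 500 are made up for illustration and
are not taken from the original post), write.matrix() can be asked to
format and write a few hundred rows per pass. If the trailing 1 in the
quoted calls below is matched to blocksize, those calls format and write
one row at a time.

```r
library(MASS)

# Hypothetical all-numeric data frame standing in for the real data.
d <- data.frame(a = 1:2000, b = (1:2000) / 4)

# Hypothetical output location; any writable path would do.
out <- tempfile(fileext = ".csv")

# Format and write 500 rows per pass instead of one row per pass.
# A larger blocksize needs more memory per pass but fewer passes.
write.matrix(d, file = out, sep = ",", blocksize = 500)

# The file holds one header line plus one line per row.
n_lines <- length(readLines(out))
```

The best block size depends on the number of columns and the available
memory, so 500 is only a starting point to experiment with.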

On Sat, 10 Aug 2002, Ott Toomet wrote:

> Hi,
>
> I have made a tiny package for saving dataframes in ASCII format.  The
> package contains functions save.table() and save.delim(), the first
> mimics (not completely) write.table() and the second uses just
> different default values, suitable for read.delim().
>
> The reason I have written the functions is that I have had problems
> with saving large dataframes in ASCII form.  write.table() essentially
> makes a huge string in memory from the dataframe.  I am not sure about
> write.matrix() (in MASS), but in my practice it is too
> memory-intensive also.  My approach was to write the whole thing in C
> in such a way that the function takes the values from the dataframe,
> one scalar value at a time, and writes them immediately to the file.
> This, of course, puts certain limitations on the contents of the
> dataframe and the output format.
>
> Here is an example of the result:
>
> > dim(e2000)
> [1] 7505 1197
> > library(savetable)
> > system.time(save.table(e2000, "e2000"))
> [1] 38.04  0.48 48.75  0.00  0.00
> > library(MASS)
> > system.time(write.matrix(e2000, "e2000", sep=",", 1))
>
>  -- killed after 10 minutes swapping.
>
> And now a smaller example:
>
> > dim(e2000s)
> [1]  100 1197
> > library(savetable)
> > system.time(save.table(e2000s, "e2000s"))
> [1] 0.45 0.00 0.56 0.00 0.00
> > system.time(write.table(e2000s, "e2000s"))
> [1] 31.21  0.11 38.99  0.00  0.00
> > library(MASS)
> > system.time(write.matrix(e2000s, "e2000s", sep=",", 1))
> [1] 4.01 0.66 5.45 0.00 0.00
>
> None of the functions started swapping now, but as you can see,
> save.table() is still around 10 times as fast as write.matrix().
> Examples are on my 128MB PII-400 linux system and R 1.4.0.
>
> I am not sure if there is much interest in such a package, so I put
> it on my own website instead of CRAN
> (http://www.obs.ee/~siim/savetable_0.1.0.tar.gz).  Any feedback is
> appreciated.
>
> Many thanks to Brian Ripley and the others, who helped me with
> accessing R objects in C.
>
>
> Best wishes,
>
> Ott Toomet
>
>
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595