[R] package for saving large datasets in ASCII
ripley@stats.ox.ac.uk
Sat Aug 10 20:54:05 CEST 2002
?write.matrix will tell you what you have overlooked: a sensible
blocksize.
If `I am not sure about write.matrix()', surely reading the help page is a
first step?
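
For illustration, a minimal sketch of a call with a larger block size. The
matrix, the temporary file, and the value 500 are made up for this example
and are not taken from the quoted message, whose calls appear to pass 1 as
the block size, so that each row is formatted and written separately.

```r
library(MASS)

# Hypothetical test data: 2000 rows, 50 named numeric columns.
x <- matrix(rnorm(2000 * 50), nrow = 2000,
            dimnames = list(NULL, paste0("V", 1:50)))

# Illustrative output file in the session's temporary directory.
out <- tempfile(fileext = ".csv")

# Format and write 500 rows per pass rather than one row at a time
# or the whole matrix at once.
write.matrix(x, file = out, sep = ",", blocksize = 500)

# One header line plus one line per row.
lines <- readLines(out)
length(lines)
```

Which block size counts as sensible depends on the available memory and
the number of columns; 500 is only a placeholder here.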
On Sat, 10 Aug 2002, Ott Toomet wrote:
> Hi,
>
> I have made a tiny package for saving dataframes in ASCII format. The
> package contains functions save.table() and save.delim(), the first
> mimics (not completely) write.table() and the second uses just
> different default values, suitable for read.delim().
>
> The reason I have written the functions is that I have had problems
> with saving large dataframes in ASCII form. write.table() essentially
> makes a huge string in memory from the dataframe. I am not sure about
> write.matrix() (in MASS), but in my experience it is also too
> memory-intensive. My approach was to write the whole thing in C
> in such a way that the function takes the values from the dataframe,
> one scalar value at a time, and writes them immediately to the file.
> This, of course, puts certain limitations on the contents of the
> dataframe and the output format.
>
> Here is an example of the result:
>
> > dim(e2000)
> [1] 7505 1197
> > library(savetable)
> > system.time(save.table(e2000, "e2000"))
> [1] 38.04 0.48 48.75 0.00 0.00
> > library(MASS)
> > system.time(write.matrix(e2000, "e2000", sep=",", 1))
>
> -- killed after 10 minutes swapping.
>
> And now a smaller example:
>
> > dim(e2000s)
> [1] 100 1197
> > library(savetable)
> > system.time(save.table(e2000s, "e2000s"))
> [1] 0.45 0.00 0.56 0.00 0.00
> > system.time(write.table(e2000s, "e2000s"))
> [1] 31.21 0.11 38.99 0.00 0.00
> > library(MASS)
> > system.time(write.matrix(e2000s, "e2000s", sep=",", 1))
> [1] 4.01 0.66 5.45 0.00 0.00
>
> None of the functions started swapping now, but as you can see,
> save.table() is still around 10 times as fast as write.matrix().
> The examples were run on my 128MB PII-400 Linux system with R 1.4.0.
>
> I am not sure if there is much interest in such a package, so I put
> it on my own website instead of CRAN
> (http://www.obs.ee/~siim/savetable_0.1.0.tar.gz). Any feedback is
> appreciated.
>
> Many thanks to Brian Ripley and the others, who helped me access R
> objects in C.
>
>
> Best wishes,
>
> Ott Toomet
>
>
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595