[R] Data storage in R vs. S

Thomas Lumley thomas at biostat.washington.edu
Tue Jul 20 17:41:26 CEST 1999

On Tue, 20 Jul 1999, Brahm, David wrote:

> Hi,
>    Another newbie question; I only just now downloaded and tried the R
> software.  My big initial objection is the way R stores the entire working
> database in one file between sessions.  In S, I have always stored many
> large parallel objects in a database (e.g. stock-by-date matrices of close,
> high, low, volume, ...), knowing that S would only read the one or two files
> I wanted to work with at any given time.  Also, I'd attach and detach many
> different databases (e.g. one per year's worth of data).  I don't think R
> can handle that, and I'd guess that scanning a lot of ASCII files each time
> would be slow.  Can anybody share their experience with large datasets?  Are
> the developers of R planning to eventually change the way data is stored to
> be more like S?

I think in the moderately distant future we would like to save things
automatically in separate files with some degree of caching, but this is
not at all straightforward because of the lexical scoping rules. 

At the moment you can do this by hand: save the data sets in binary
format with save(), load them with load(), and remove them with rm() when
not in use. It should also be possible to put them in a package so that
they never end up in the global environment and won't accidentally be
saved in your workspace. Making each one into a package would also allow
autoload()ing.
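A minimal sketch of that by-hand workflow (the object and file names here
are illustrative, not from the original post):

```r
## Hypothetical example: keep large objects in separate binary files
## and bring them into memory only when needed.

## Stand-ins for large stock-by-date matrices
close  <- matrix(rnorm(4), nrow = 2)
volume <- matrix(rpois(4, 100), nrow = 2)

## Save each object to its own file in binary format
save(close,  file = "close.RData")
save(volume, file = "volume.RData")

## Free the memory until the data are needed again
rm(close, volume)

## Later, load just the one object you want to work with
load("close.RData")
mean(close)
```

Each object lives in its own file, so a session only pays the cost of the
one or two objects actually loaded, much like attaching a subset of an S
database.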

Thomas Lumley
Assistant Professor, Biostatistics
University of Washington, Seattle

r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
