[R] vsize and nsize

Z. Todd Taylor Todd.Taylor at pnl.gov
Tue May 18 16:41:44 CEST 1999

jlindsey at alpha.luc.ac.be wrote:

> > R's poor handling of large datasets is half the reason I have
> > not moved more of my work from S(plus) to R (the other half
> > being the absence of trellis).  I love its lexical closures, but
> > they're not worth the memory penalty if you have huge datasets.
>
> I am wondering what you mean by "R's poor handling of large datasets".
> How large is large? I have often worked simultaneously with a
> fair number of vectors of length, say, 40,000 using my libraries (data
> objects and functions) with no problems. They use the R scoping rules. On the
> other hand, if you use dataframes and/or standard functions like glm,
> then you are restricted to extremely small (toy) data sets. But then
> maybe you are thinking of gigabytes of data.

My "large" datasets generally consist of high-time-resolution
energy consumption data.  One example is a dataset of electric
utility distribution feeder measurements I analyzed a few years
ago.  It was two years' worth of half-hourly data (about 17,500
rows) from over 3000 metering points (columns).  At double
precision, that's almost a half-Gig of data.  I've had other
datasets that totalled several Gigs.
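
As a rough check on that size estimate, the arithmetic can be sketched as follows. It uses only the figures stated above (about 17,500 rows, 3000 columns) and assumes 8 bytes per double-precision value; the variable names are illustrative.

```python
# Size of a dense table of doubles, using the figures quoted above.
rows = 17_500            # approximate number of readings per metering point
cols = 3_000             # "over 3000" metering points, taken here as 3000
bytes_per_double = 8     # assumption: IEEE 754 double precision

total_bytes = rows * cols * bytes_per_double

print(total_bytes / 1e6)       # size in megabytes: 420.0
print(total_bytes / 2 ** 30)   # size in binary gigabytes (GiB)
```

That is a little under half a gigabyte before any copies made during analysis are counted.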

In S I can't put all that into a data frame, but I *can* create
3000 independent (but parallel) vectors and toss them in a
separate directory that I attach as needed.  Since I tend to
analyze the columns sequentially (or a few at a time) rather
than all at once, the memory penalty for having lots of columns
in the dataset is basically negligible.  R, on the other hand,
wants to hold everything in memory unless I really work hard
to break up the dataset into smaller chunks.
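
The column-per-object layout described above can be sketched roughly as follows. This is not S or R code: it is a minimal Python illustration, assuming each column is stored as its own file of raw doubles in one directory. The function names (`write_columns`, `read_column`, `column_means`) and the `.bin` suffix are hypothetical, chosen only for this example.

```python
import os
import tempfile
from array import array


def write_columns(directory, columns):
    """Store each named column as its own file of raw doubles."""
    for name, values in columns.items():
        with open(os.path.join(directory, name + ".bin"), "wb") as fh:
            array("d", values).tofile(fh)


def read_column(directory, name):
    """Load a single column; all other columns stay on disk."""
    path = os.path.join(directory, name + ".bin")
    count = os.path.getsize(path) // array("d").itemsize
    col = array("d")
    with open(path, "rb") as fh:
        col.fromfile(fh, count)
    return col


def column_means(directory, names):
    """Process columns sequentially, holding one in memory at a time."""
    means = {}
    for name in names:
        col = read_column(directory, name)
        means[name] = sum(col) / len(col)
    return means


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # Tiny stand-in data; a real dataset would have thousands of columns.
        write_columns(d, {"meter1": [1.0, 2.0, 3.0], "meter2": [10.0, 20.0]})
        print(column_means(d, ["meter1", "meter2"]))
```

With this layout, peak memory depends on the length of one column rather than on the number of columns, which is the property that matters when the columns are analyzed one or a few at a time.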

I'm very glad to hear that the R developers are working on a
mechanism to get around R's memory hunger.

Z. Todd Taylor
Pacific Northwest National Laboratory
Todd.Taylor at pnl.gov
Why do you say the alarm went off, when really it came on?