R-beta: directory of functions

Z. Todd Taylor zt_taylor at pnl.gov
Fri Jun 20 18:59:02 CEST 1997

Thomas Lumley <thomas at biostat.washington.edu> wrote:

> > Accommodating the new scoping rules has required R to completely
> > take over the administration of "databases."  It is no longer
> > easy to maintain a "directory" of similar objects.  And if I do
> > have such a collection, R must load *all* of them into memory in
> > order to use just one of them.  Also, I can no longer have
> > transparent access to foreign data via S's user-defined database
> > mechanism.
>
> This is not a necessary consequence of the scoping rules -- they require
> only that R has access to the objects, not that they reside in memory. The
> S scoping rules would similarly require that all of an attached directory
> is *available* but not that it actually resides in memory.  

That's good to know.
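
To make the "available but not resident" distinction concrete, here is a
minimal sketch in present-day R. It assumes the active-binding facility
(makeActiveBinding), which did not exist when this thread was written;
bind_file is a hypothetical helper name, not part of R. The object is
visible to the scoping rules, but its value is re-read from disk on every
access instead of being held in the environment.

```r
# Sketch only: bind_file is a hypothetical helper, not an R function.
# It makes `name` visible in `env`, but the value is read from `path`
# each time the name is used, so no copy is kept in `env`.
bind_file <- function(name, path, env) {
  force(path)  # fix the file path now rather than at first access
  makeActiveBinding(name, function() readRDS(path), env)
  invisible(env)
}

# Usage: write an object to a temporary file and expose it by name.
f <- tempfile(fileext = ".rds")
saveRDS(c(1, 2, 3), f)
e <- new.env()
bind_file("x", f, e)
sum(e$x)  # the file is read here, at the point of use
```

The cost of this design is a disk read on every access, so it only suits
objects that are used rarely or are too large to keep around.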

> As I think has been pointed out in the past when this issue was raised, it
> is theoretically possible to implement a very similar database mechanism
> where the scoping availability is handled by storing "promises to load"
> rather than by loading the data directly. This could be implemented in a
> variety of ways, including the user-defined methods that S-PLUS provides. 

I very much hope that happens (hence I bring this topic
up from time to time... :-)
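
For illustration, the "promises to load" idea can be sketched in
present-day R with delayedAssign, which creates exactly such a promise.
This is only a sketch under stated assumptions, not how R or S-PLUS
implements databases: lazy_dir is a hypothetical helper, and the
one-object-per-file ".rds" layout is an assumption made for the example.

```r
# Sketch only: lazy_dir is a hypothetical helper, not an R function.
# Assumes `dir` holds one serialized object per file, named <object>.rds.
lazy_dir <- function(dir) {
  env <- new.env()
  register <- function(path) {
    name <- sub("\\.rds$", "", basename(path))
    # Store a promise: readRDS(path) is not evaluated until `name`
    # is first used, so nothing is loaded at registration time.
    delayedAssign(name, readRDS(path), assign.env = env)
  }
  for (f in list.files(dir, pattern = "\\.rds$", full.names = TRUE)) {
    register(f)
  }
  env
}

# Usage: build a small "directory" of objects, then touch only one.
d <- tempfile("objs")
dir.create(d)
saveRDS(1:10, file.path(d, "small.rds"))
saveRDS(letters, file.path(d, "alpha.rds"))
db <- lazy_dir(d)
ls(db)          # both names are available; neither file has been read
mean(db$small)  # only small.rds is loaded, on first use
```

Unlike re-reading on every access, a promise is evaluated once and the
value then stays in memory, so this trades memory for speed after the
first use.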

> > The change was made in the interest of speed, but that will only
> > happen for "small" datasets.  I'm not sure the benefits of cleaner
> > random number generator functions are worth what we lost.
>
> This is only true when "small" is interpreted in a fairly liberal fashion.
> For data sets I consider at least moderate in size (a few thousands of
> records) R is still substantially faster.  In fact the speedup is greatest
> when the data set occupies a non-negligible fraction of available memory,
> which on today's computers can easily be 16 or more megabytes even in a
> multi-user system. For really big datasets you may be right.

Yes, my definition of small is liberal.  My "large" datasets are
sometimes measured in Gigabytes rather than Megabytes.

Z. Todd Taylor
Pacific Northwest National Laboratory
zt_taylor at pnl.gov
