[R] Reading and writing to S-like databases

David Brahm a215020 at agate.fmr.com
Mon Oct 1 21:02:04 CEST 2001


On 9/27 I asked:
> In S-Plus, I build databases of many large objects.  In any given analysis,
> I only need a few of those objects, but attach'ing the whole database is fine
> since objects are only read as needed.  How can I do the same thing in R,
> without reading the entire database?

Here are the latest versions of my functions to accomplish this.  Note that I
now build file names system-independently with file.path() (as suggested by
Martin Maechler), I no longer need to remove conflicting objects first (I
"eval" in the proper environment), and I now store the loaded objects in the
package (overwriting the promise objects) rather than in the global
environment.
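
A note for readers on current versions of R: delay() was later removed from
the language, and delayedAssign() is its documented replacement.  The sketch
below illustrates the same load-on-first-use idea under that assumption.  The
environment `db`, the counter `loads` and the loader `load_x` are hypothetical
names made up for the illustration; they are not part of the functions in this
post.

```r
# Hypothetical stand-in for an attached data package
db <- new.env()
loads <- 0                       # counts how many times the loader runs

# Hypothetical loader: builds the object and records that it ran
load_x <- function() {
  loads <<- loads + 1
  matrix(1, 100, 100)
}

# Bind "x" in db to a promise; load_x() is not called at this point
delayedAssign("x", load_x(), assign.env = db)
before <- loads                  # still 0: nothing has been loaded yet

d1 <- dim(get("x", envir = db))  # first access forces the promise
d2 <- dim(get("x", envir = db))  # second access reuses the stored value
```

Once a promise has been forced it keeps its value, so the loader runs only on
the first access.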

##### Code: #####

# Save objects in position "pos" to a delayed-evaluation data package.
# The default obj=z works because defaults are evaluated lazily, after z is set.
g.data.save <- function(dir, obj=z, pos=2) {
  z <- objects(pos, all.names=TRUE)
  if (is.character(pos)) pos <- match(pos, search())
  pkg <- basename(dir)
  for (i in file.path(dir,c("","data","R"))) if (!file.exists(i)) dir.create(i)
  for (i in obj) {
    file <- file.path(dir, "data", paste(i,"RData",sep="."))
    expr <- parse(text=paste("save(list=\"",i,"\", file=\"",file,"\")",sep=""))
    eval(expr, pos.to.env(pos))
  }
  code <- paste(z," <- delay(g.data.load(\"", z, "\", \"", pkg, "\"))", sep="")
  cat(code, file=file.path(dir, "R", pkg), sep="\n")
}

# Routine used in data packages, e.g.  x <- delay(g.data.load("x", "newdata"))
g.data.load <- function(i, pkg) {
  load(system.file("data", paste(i,"RData",sep="."), package=pkg),
       pos.to.env(match(paste("package",pkg,sep=":"), search())))
  get(i)
}

# Attach a delayed-evaluation data package:
g.data.attach <- function(dir)
  library(basename(dir), lib.loc=dirname(dir), character.only=TRUE)

# Get data from an unattached package (like get(item,dir) in S-plus):
g.data.get <- function(item, dir) {
  env <- new.env()
  load(file.path(dir, "data", paste(item,"RData",sep=".")), env)
  get(item, envir=env)
}
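
The one-file-per-object layout can be tried in isolation.  The following is a
minimal sketch, assuming a current R in which save() accepts an `envir`
argument, so the parse/eval step is not needed there.  The scratch directory
`demo_pkg`, the environment `src` and the helper `fetch` are hypothetical
names for illustration; `fetch` mirrors g.data.get above.

```r
# Hypothetical scratch directory standing in for a data package
dir <- file.path(tempdir(), "demo_pkg")
dir.create(file.path(dir, "data"), recursive = TRUE, showWarnings = FALSE)

# Source environment holding two objects
src <- new.env()
assign("x1", matrix(1, 10, 10), envir = src)
assign("x2", matrix(2, 10, 10), envir = src)

# Write one .RData file per object, taking each object from src
for (i in ls(src)) {
  save(list = i, envir = src,
       file = file.path(dir, "data", paste(i, "RData", sep = ".")))
}

# Fetch a single object without reading the other file
fetch <- function(item, dir) {
  env <- new.env()
  load(file.path(dir, "data", paste(item, "RData", sep = ".")), envir = env)
  get(item, envir = env)
}

x2_copy <- fetch("x2", dir)
files <- sort(list.files(file.path(dir, "data")))
unlink(dir, recursive = TRUE)    # clean up
```

Because each object lives in its own file, fetching `x2` never reads
`x1.RData`.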


##### Example: #####

attach(NULL, name="newdata")
assign("x1", matrix(1, 1000, 1000), 2)
assign("x2", matrix(2, 1000, 1000), 2)
g.data.save("/tmp/newdata")
detach(2)
g.data.attach("/tmp/newdata")
objects(2)                           # These are promise objects
system.time(print(dim(x1)))          # Takes time to load up
system.time(print(dim(x1)))          # Second time is faster!
objects(2)                           # Now x1 is a real object
find("x1")                           # It's in package:newdata
detach(2)
unlink("/tmp/newdata", recursive=TRUE)  # Clean up

#####################

			-- David Brahm (a215020 at agate.fmr.com)