[R] Reading through a group of .RData files

Henrik Bengtsson hb at stat.berkeley.edu
Tue Dec 11 21:51:09 CET 2007


Hi,

depending on what you do and how (and why) you save objects in RData
files in the first place, you might be interested in knowing of the
loadObject()/saveObject() methods of R.utils, as well as
loadCache()/saveCache() in R.cache.

The R.utils methods are basically "clever" wrappers around
load()/save() in the 'base' package that do not rely on saving and
loading the variable name, but rather the object itself.  To save
multiple objects you have to wrap them up in a list structure or in an
environment.  Example:

library(R.utils)

x <- 1:100
saveObject(x, file="foo.RData")
y <- loadObject("foo.RData")
stopifnot(identical(x,y))

u <- list(x=x, y=y)
saveObject(u, file="bar.RData")
v <- loadObject("bar.RData")
stopifnot(identical(u,v))
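
For files that have already been written with plain save(), a similar
effect can be had in base R by loading each file into a fresh
environment, so that nothing in the calling environment is overwritten
and the object names do not have to be known in advance.  The following
is only a sketch: loadToList() is a made-up helper (it is not part of
R.utils), and the file names and the "stub" prefix are placeholders for
the demo.

```r
# Hypothetical helper: load an .RData file into its own environment
# and return its contents as a named list.
loadToList <- function(pathname) {
  env <- new.env()
  load(pathname, envir=env)
  as.list(env)
}

# Self-contained demo: write two .RData files to a temporary directory
fnDir <- tempfile("rdata")
dir.create(fnDir)
rliObject <- 1:10
save(rliObject, file=file.path(fnDir, "stubA.RData"))
other <- letters
save(other, file=file.path(fnDir, "stubB.RData"))
rm(rliObject, other)

# Loop over the matching files, whatever the objects inside are called
pathnames <- list.files(fnDir, pattern="^stub.*\\.RData$", full.names=TRUE)
for (pathname in pathnames) {
  objs <- loadToList(pathname)
  for (name in names(objs)) {
    str(objs[[name]])  # replace str() with your own function
  }
}
```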

The R.cache methods let you store objects/results to a file cache
without having to worry about filenames.  Instead the objects are
identified by lookup keys generated from other R objects.  This is
useful for temporary/semi-temporary storing of results, especially
computationally expensive results.  The file cache is persistent
between sessions.  Example:

library(R.cache)

x <- 1:100
key <- list("x")
saveCache(x, key=key)
y <- loadCache(key)
stopifnot(identical(x,y))

u <- list(x=x, y=y)
key <- list("u")
saveCache(u, key=key)
v <- loadCache(key)
stopifnot(identical(u,v))
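
To give a feeling for what "lookup keys generated from other R objects"
means, here is a rough base-R sketch of the idea: the key object is
serialized and hashed, and the hash becomes the filename.  This is not
how R.cache is implemented; the names keyToPathname(), miniSaveCache()
and miniLoadCache() are made up for illustration, and it assumes an R
version that has saveRDS()/readRDS().

```r
# Hypothetical helper: map a key object to a pathname via an MD5 checksum
# of its serialized form.
keyToPathname <- function(key, cacheDir) {
  tmp <- tempfile()
  on.exit(unlink(tmp))
  saveRDS(key, file=tmp, compress=FALSE)
  checksum <- unname(tools::md5sum(tmp))
  file.path(cacheDir, paste0(checksum, ".rds"))
}

# Store an object under the pathname derived from the key
miniSaveCache <- function(object, key, cacheDir) {
  saveRDS(object, file=keyToPathname(key, cacheDir))
}

# Return the cached object, or NULL if nothing is cached for this key
miniLoadCache <- function(key, cacheDir) {
  pathname <- keyToPathname(key, cacheDir)
  if (!file.exists(pathname)) return(NULL)
  readRDS(pathname)
}

# Demo in a temporary cache directory
cacheDir <- tempfile("minicache")
dir.create(cacheDir)
key <- list(x=1, y=2)
res <- list(x=1, y=2, xy=2)
miniSaveCache(res, key=key, cacheDir=cacheDir)
stopifnot(identical(miniLoadCache(key, cacheDir), res))
```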

Although not of immediate interest, the pathname of a cache file can
be found with findCache(key), e.g.
"~/.Rcache/78488a47006df5d333db9e200fc539c5.Rcache".  There are also
methods for specifying the root of the file cache, and for using
different subdirectories for different projects.
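
As a small sketch of that, assuming a reasonably recent R.cache where
setCacheRootPath() is available and saveCache()/loadCache()/findCache()
accept a 'dirs' argument; the directory names below are just
placeholders:

```r
library(R.cache)

# Use a throw-away cache root for this demo (placeholder location)
rootPath <- file.path(tempdir(), "Rcache-demo")
dir.create(rootPath, recursive=TRUE, showWarnings=FALSE)
setCacheRootPath(rootPath)

# Keep this project's cache files in their own subdirectory
dirs <- c("myProject", "step1")

x <- 1:100
key <- list("x")
saveCache(x, key=key, dirs=dirs)
y <- loadCache(key, dirs=dirs)
stopifnot(identical(x, y))

# Where did the cache file end up?
print(findCache(key, dirs=dirs))
```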

The examples above do not show the full power of R.cache.  Instead,
consider this example, where the function arguments themselves are
used as the lookup key:

library(R.cache)

slowFcn <- function(x, y, force=FALSE) {
  # Return cached results, if any (unless force=TRUE)
  key <- list(x=x, y=y)
  if (!force) {
    res <- loadCache(key=key)
    if (!is.null(res))
      return(res)
  }

  # Emulate a computationally expensive calculation
  Sys.sleep(10)

  res <- list(x=x, y=y, xy=x*y)

  # Save to cache
  saveCache(res, key=key)

  res
}

# First call takes time
> system.time(res1 <- slowFcn(x=1, y=2))
   user  system elapsed
      0       0      10

# All successive calls with the same arguments are instant
> system.time(res2 <- slowFcn(x=1, y=2))
   user  system elapsed
   0.02    0.00    0.01

> stopifnot(identical(res1, res2))
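
If you do not want to write the load/save logic by hand, my
understanding is that newer versions of R.cache also provide
memoizedCall(), which does the key generation and the cache lookup for
you.  Treat the following as a sketch under that assumption;
slowProduct() and the cache location are made up for the demo.

```r
library(R.cache)

# Use a throw-away cache root for this demo (placeholder location)
rootPath <- file.path(tempdir(), "Rcache-memo-demo")
dir.create(rootPath, recursive=TRUE, showWarnings=FALSE)
setCacheRootPath(rootPath)

# A plain function with no caching logic of its own
slowProduct <- function(x, y) {
  Sys.sleep(1)  # emulate an expensive calculation
  list(x=x, y=y, xy=x*y)
}

# First call evaluates the function and caches the result
res1 <- memoizedCall(slowProduct, x=1, y=2)

# Second call with the same arguments is served from the file cache
res2 <- memoizedCall(slowProduct, x=1, y=2)

stopifnot(identical(res1, res2))
```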

Cheers

Henrik

On 10/12/2007, Talbot Katz <topkatz at msn.com> wrote:
>
> Hi.
>
> I have a procedure that reads a directory, loops through a set of particular .RData files, loading each one, and feeding its object(s) into a function, as follows:
>
> cvListFiles<-list.files(fnDir);
> for(i in grep(paste("^",pfnStub,".*\\.RData$",sep=""),cvListFiles)){
> load(paste(fnDir,cvListFiles[i],sep="/"));
> myFunction(rliObject);
> rm(rliObject);
> };
>
> where fnDir is the directory I'm reading, and pfnStub is a string that begins the name of each of the files I want to load.  As you can see, I'm assuming that each of the selected .RData files contains an object named "rliObject" and I'm hoping that nothing in any of the files I'm loading overwrites an object in my environment.  I'd like to clean this up so that I can extract the object(s) from each data file, and feed them to my function, whatever their names are, without corrupting my environment.  I'd appreciate any assistance.  Thanks!
>
> --  TMK  --
> 212-460-5430 home
> 917-656-5351 cell


