[R] Dynamic Dictionary Data Type?

Gabor Grothendieck ggrothendieck at gmail.com
Thu Jun 2 20:36:45 CEST 2005


On 6/2/05, Kjetil Brinchmann Halvorsen <kjetil at acelerate.com> wrote:
> Prof Brian Ripley wrote:
> 
> > On Thu, 2 Jun 2005, hadley wickham wrote:
> >
> >>> An environment is a hash table, and copying occurs only on
> >>> modification,
> >>> when any language would have to copy in this context.
> >>
> >
> > Caveat: the default is new.env(hash=FALSE), so an environment is a
> > hash table in one sense but not necessarily so in the strict sense.
> >
> >> Yes, I'm aware that copying only occurs on modification.  However, it
> >> was my understanding that
> >>
> >> a <- list(a =1)
> >> a$b <- 4
> >>
> >> would create a new copy of a,
> >
> >
> > Thomas was talking about environments, not lists (and $ works
> > differently for them).
> >
> >> whereas in Java say
> >>
> >> HashMap a = new HashMap();
> >> a.put("a", 1);
> >> a.put("b", 2);
> >>
> >> wouldn't create a copy of a. (please bear in mind my java syntax is
> >> very rusty!)
> >>
> >> Caching data implies updating at each step, thus possibly creating n
> >> copies of the list.  Is that wrong?
> >
> >
> > It depends what you mean by `copy'.  If you expand a hash table you at
> > some point need to re-arrange the hashing of the entries.  That's what
> > will happen with an R environment or an R list too.  The possible
> > difference is that you might expect the table to have some room for
> > expansion, and in your list example you did not give any.
> >
> > R's operations on lists make more copies than are strictly necessary,
> > but it is not clear that this is more costly than the housekeeping
> > that would otherwise be necessary.  In a$b <- 4, the wrapper VECSXP is
> > recreated and the pointers copied across, but the list elements are
> > not copied.  For
> > a$a <- 4 it is probable that no copying is done (although if a2 <- a
> > had been done previously, the pending recursive copy would then be done).
> > (It is also possible that I have overlooked something in the rather
> > complex code used to do these subassignments.)
> >
> The original poster asked for caching of results. here is an example
> using new.env():
> 
> memo <- function(fun) {
>   mem <- new.env()
>   function(x) {
>       if (exists(as.character(x), envir=mem)) get(as.character(x),
> envir=mem, inherits=FALSE) else {
>       val <- fun(x)
>       assign(as.character(x), value=val, envir=mem)
>       val }
>   }
> }
> 
>  > fib <- memo(function(n) if(n<=1) 1 else fib(n-1)+fib(n-2))
>  > system.time( fib(300) )
> [1] 0.01 0.00 0.02   NA   NA
> 
> ls(get("mem", env=environment(fib)))
>   *output supressed*
> 
> To compare:
> system.time( {fib2 <- function(n)if(n<=1)1 else
> fib2(n-1)+fib2(n-2);fib2(30)})
> [1]  8.07  0.08 12.75    NA    NA
> 
> 
> (there is (at least) one problem with this solution: if the global
> workspace contains a
>  variable `6`, it gives error. Why?)

I think you need to add the argument inherits = FALSE to the call
to exists.




More information about the R-help mailing list