[Rd] unset() function?

Henrik Bengtsson henrik.bengtsson at ucsf.edu
Sat Aug 22 19:15:55 CEST 2015


Hi,

I was playing around with this idea earlier this year. This would
allow you to remove a variable with NAMED<2 while still passing its
value, e.g.

    x1 <- log(r(x1))

where the returned value/variable has NAMED<=1.  At first I was quite
excited about the results, but it turned out that it only worked for a
few functions.  If you want to play around with it, I've created the
'recycle' package:

    https://github.com/HenrikBengtsson/recycle

Have a look at the package tests for examples and what works and what
doesn't work:

    https://github.com/HenrikBengtsson/recycle/tree/master/tests

However, basically due to what Luke says, I've decided not to pursue
this any further for now.
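
A rough way to see why the idea only works in a few cases, and why
reference counting would change that, is a toy model in plain C.  This is
only a sketch: it does not use R's C API, and toy_vec, toy_add1,
toy_force_promise, toy_bind and toy_unset are made-up names.  The 'refs'
field stands in for NAMED (or a reference count), and toy_force_promise
mimics the behaviour Luke describes, where evaluating a promise sets
NAMED to 2.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an R numeric vector; NOT R's real SEXP layout. */
typedef struct {
    int     refs;  /* models NAMED / a reference count (hypothetical) */
    size_t  n;     /* number of elements */
    double *data;  /* element storage */
} toy_vec;

/* Allocate a fresh, unreferenced vector holding a copy of src[0..n-1]. */
toy_vec *toy_new(const double *src, size_t n)
{
    toy_vec *v = malloc(sizeof *v);
    if (v == NULL) return NULL;
    v->data = malloc(n * sizeof *v->data);
    if (v->data == NULL) { free(v); return NULL; }
    memcpy(v->data, src, n * sizeof *v->data);
    v->refs = 0;
    v->n = n;
    return v;
}

void toy_free(toy_vec *v)
{
    if (v != NULL) { free(v->data); free(v); }
}

/* Analogue of add1(): duplicate only if the value may still be
 * referenced elsewhere, otherwise modify it in place. */
toy_vec *toy_add1(toy_vec *v)
{
    assert(v != NULL);
    toy_vec *out = (v->refs > 0) ? toy_new(v->data, v->n) : v;
    if (out == NULL) return NULL;
    for (size_t i = 0; i < out->n; i++)
        out->data[i] += 1.0;
    return out;
}

/* NAMED scheme: passing a value through a function argument (a promise)
 * marks it as shared, and nothing ever lowers the mark again. */
void toy_force_promise(toy_vec *v) { v->refs = 2; }

/* Reference-counting scheme: a binding adds one reference ... */
void toy_bind(toy_vec *v) { v->refs++; }

/* ... and removing the binding, as r()/unset() would, drops it again. */
toy_vec *toy_unset(toy_vec *v)
{
    if (v->refs > 0) v->refs--;
    return v;
}
```

In this model a freshly created vector is modified in place, a vector
that has gone through toy_force_promise is always copied, and a vector
whose only binding was dropped via toy_unset is modified in place again,
which is the case r() is aiming for.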

That said, I agree that if the internals of R could be made less
conservative (not forcing NAMED=2), this idea would be worth
pursuing and could save quite a bit of memory.  The downside would be
that code would be cluttered up with lots of explicit r() statements.
On the other hand, maybe those could be added automatically by code
compilers, e.g.

    x1 <- log(x1)

would become

    x1 <- log(r(x1))

/Henrik

On Sat, Aug 22, 2015 at 4:50 PM,  <luke-tierney at uiowa.edu> wrote:
> This wouldn't actually work at present as evaluating a promise always
> sets NAMED to 2. With reference counting it would work so might be
> worth considering when we switch.
>
> Going forward it would be best to use MAYBE_REFERENCED to test whether
> a duplicate is needed -- this macro is defined appropriately whether R
> is compiled to use NAMED or reference counting.
>
> Best,
>
> luke
>
>
> On Fri, 21 Aug 2015, William Dunlap wrote:
>
>> Does R have a function like the S/S++ unset() function?
>> unset(name) would remove 'name' from the current evaluation
>> frame and return its value.  It allowed you to safely avoid
>> some memory copying when calling .C or .Call.
>>
>> E.g., suppose you had C code like
>>  #include <R.h>
>>  #include <Rinternals.h>
>>  SEXP add1(SEXP pX)
>>  {
>>      int nProtected = 0;
>>      int n = Rf_length(pX);
>>      int i;
>>      double* x;
>>      Rprintf("NAMED(pX)=%d: ", NAMED(pX));
>>      if (NAMED(pX)) {
>>          Rprintf("Copying pX before adding 1\n");
>>          PROTECT(pX = duplicate(pX)); nProtected++;
>>      } else {
>>          Rprintf("Changing pX in place\n");
>>      }
>>      x = REAL(pX);
>>      for(i=0 ; i<n ; i++) {
>>        x[i] = x[i] + 1.0;
>>      }
>>      UNPROTECT(nProtected);
>>      return pX;
>>  }
>>
>> If I call this from an R function
>>  add1 <- function(x) {
>>      stopifnot(inherits(x, "numeric"))
>>      .Call("add1", x)
>>  }
>> it will always copy 'x', even though not copying would
>> be safe (since add1 doesn't use 'x' after calling .Call()).
>>  > add1(c(1.2, 3.4))
>>  NAMED(pX)=2: Copying pX before adding 1
>>  [1] 2.2 4.4
>> If I make the .Call directly, without a nice R function around it
>> then I can avoid the copy
>>  > .Call("add1", c(1.2, 3.4))
>>  NAMED(pX)=0: Changing pX in place
>>  [1] 2.2 4.4
>>
>> If something like S's unset() were available I could avoid the copy,
>> when safe to do so, by making the .Call in add1
>>   .Call("add1", unset(x))
>>
>> If you called this new add1 with a named variable from another
>> function the copying would be done, since NAMED(x) would be
>> 2 even after the local binding was removed.  It actually requires some
>> care to eliminate the copying, as all the functions in the call
>> chain would have to use unset() when possible.
>>
>> I ask this because I ran across a function in the 'bit' package that
>> does not have its C code call duplicate but instead assumes that
>> 'x[1] <- x[1]' will force x to be copied:
>>  "!.bit" <- function(x){
>>    if (length(x)){
>>      ret <- x
>>      ret[1] <- ret[1]  # force duplication
>>      .Call("R_bit_not", ret, PACKAGE="bit")
>>    }else{
>>      x
>>    }
>>  }
>> If you optimize things so that 'ret[1] <- ret[1]' does not copy 'ret',
>> then this function alters its input.  If a function like unset()
>> were there, then the .Call could be
>>     .Call("R_bit_not", unset(x))
>>
>> I suppose the compiler could analyze the code and see that
>> x was not used after the .Call and thus feel free to avoid the
>> copy.
>>
>> In any case bit's maintainer should add something like
>>    if (NAMED(x)) {
>>        PROTECT(x=duplicate(x));
>>        nProtect++;
>>    }
>>    ...
>>    UNPROTECT(nProtect);
>> in the C code, but unset() would help avoid unneeded duplications.
>>
>>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone:             319-335-3386
> Department of Statistics and        Fax:               319-335-3017
>    Actuarial Science
> 241 Schaeffer Hall                  email:   luke-tierney at uiowa.edu
> Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
>