[Rd] Changing arguments inside .Call. Wise to encourage "const" on all arguments?

Paul Johnson pauljohn32 at gmail.com
Mon Dec 10 07:51:10 CET 2012


I'm continuing my work on finding speedups in generalized inverse
calculations in some simulations.  It leads me back to .C and .Call,
and some questions I've never been able to answer for myself.  It may
be I can push some calculations to LAPACK in or C BLAS, that's why I
realized again I don't understand the call by reference or value
semantics of .Call

Why aren't users of .Call encouraged to "const" their arguments, and
why doesn't .Call do this for them (if we really believe in return by
value)?

R Gentleman's R Programming for Bioinformatics is the most
understandable treatment I've found on .Call. It appears to me .Call
leaves "wiggle room" where there should be none.  Here's Gentleman on
p. 185. "For .Call and .External, the return value is an R object (the
C functions must return a SEXP), and for these functions the values
that were passed are typically not modified.  If they must be
modified, then making a copy in R, prior to invoking the C code, is
necessary."

I *think* that means:

.Call allows return by reference, BUT we really wish users would not
use it. Users can damage R data structures that are pointed to unless
they really truly know what they are doing on the C side. ??

This seems dangerous. Why allow return  by reference at all?

On p. 197, there's a similar comment  "Any function that has been
invoked by either .External or .Call will have all of its arguments
protected already. You do not need to protect them. .... [T]hey were
not duplicated and should be treated as read-only values."

"should be ... read-only" concerns me. They are "protected" in the
garbage collector sense, but they are not protected from "return by
reference" damage. Right?

Why doesn't the documentation recommend function writers to mark
arguments to C functions as const?  Isn't that what the return by
value policy would suggest?

Here's a troublesome example in  R src/main/array.c:

/* DropDims strips away redundant dimensioning information. */
/* If there is an appropriate dimnames attribute the correct */
/* element is extracted and attached to the vector as a names */
/* attribute.  Note that this function mutates x. */
/* Duplication should occur before this is called. */

SEXP DropDims(SEXP x)
{
    SEXP dims, dimnames, newnames = R_NilValue;
    int i, n, ndims;

   PROTECT(x);
   dims = getAttrib(x, R_DimSymbol);
[... SNIP ....]
    setAttrib(x, R_DimNamesSymbol, R_NilValue);
    setAttrib(x, R_DimSymbol, R_NilValue);
    setAttrib(x, R_NamesSymbol, newnames);
[... SNIP ....]

return x;
}

Well, at least there's a warning comment with that one.  But it does
show .Call allows call by reference.

Why allow it, though? DropDims could copy x, modify the copy, and return it.

I wondered why DropDims bothers to return x at all. We seem to be
using modify and return by reference there.

I also wondered why x is PROTECTED, actually. Its an argument, wasn't
it automatically protected? Is it no longer  protected after the
function starts modifying it?

Here's an  example with similar usage in Writing R Extensions, section
5.10.1 "Calling .Call".  It protects the arguments a and b (needed
??), then changes them.

#include <R.h>
#include <Rdefines.h>

     SEXP convolve2(SEXP a, SEXP b)
     {
         R_len_t i, j, na, nb, nab;
         double *xa, *xb, *xab;
         SEXP ab;

         PROTECT(a = AS_NUMERIC(a)); /* PJ wonders, doesn't this alter
"a"  in calling code*/
         PROTECT(b = AS_NUMERIC(b));
         na = LENGTH(a); nb = LENGTH(b); nab = na + nb - 1;
         PROTECT(ab = NEW_NUMERIC(nab));
         xa = NUMERIC_POINTER(a); xb = NUMERIC_POINTER(b);
         xab = NUMERIC_POINTER(ab);
         for(i = 0; i < nab; i++) xab[i] = 0.0;
         for(i = 0; i < na; i++)
              for(j = 0; j < nb; j++) xab[i + j] += xa[i] * xb[j];
         UNPROTECT(3);
         return(ab);
     }


Doesn't

        PROTECT(a = AS_NUMERIC(a));

have the alter the data structure "a" both inside the C function and
in the calling R code as well? And, if a was PROTECTED automatically,
could we do without that PROTECT()?

pj

-- 
Paul E. Johnson
Professor, Political Science      Assoc. Director
1541 Lilac Lane, Room 504      Center for Research Methods
University of Kansas                 University of Kansas
http://pj.freefaculty.org               http://quant.ku.edu



More information about the R-devel mailing list