[Rd] Changing arguments inside .Call. Wise to encourage "const" on all arguments?
Simon Urbanek
simon.urbanek at r-project.org
Mon Dec 10 20:05:23 CET 2012
On Dec 10, 2012, at 1:51 AM, Paul Johnson wrote:
> I'm continuing my work on finding speedups in generalized inverse
> calculations in some simulations. It leads me back to .C and .Call,
> and some questions I've never been able to answer for myself. It may
> be I can push some calculations to LAPACK in or C BLAS, that's why I
> realized again I don't understand the call by reference or value
> semantics of .Call
>
> Why aren't users of .Call encouraged to "const" their arguments, and
> why doesn't .Call do this for them (if we really believe in return by
> value)?
>
Because there is a difference between the *data* part of the SEXP and the object itself. Internal structure of the object may need to be modified (e.g. the NAMED ref counting increased when you assign it) in a call to R API. You can't flag the data part as const separately, so you have to use non-const SEXP.
> R Gentleman's R Programming for Bioinformatics is the most
> understandable treatment I've found on .Call. It appears to me .Call
> leaves "wiggle room" where there should be none. Here's Gentleman on
> p. 185. "For .Call and .External, the return value is an R object (the
> C functions must return a SEXP), and for these functions the values
> that were passed are typically not modified. If they must be
> modified, then making a copy in R, prior to invoking the C code, is
> necessary."
>
> I *think* that means:
>
> .Call allows return by reference, BUT we really wish users would not
> use it. Users can damage R data structures that are pointed to unless
> they really truly know what they are doing on the C side. ??
>
> This seems dangerous. Why allow return by reference at all?
>
Because it is completely legal to do things like
SEXP last(SEXP bar) {
if (TYPEOF(bar) = VECSXP && LENGTH(bar) > 0)
return VECTOR_ELT(bar, LENGTH(bar) - 1);
Rf_error("sorry, I only work on lists");
}
There is no point in duplicating the element.
> On p. 197, there's a similar comment "Any function that has been
> invoked by either .External or .Call will have all of its arguments
> protected already. You do not need to protect them. .... [T]hey were
> not duplicated and should be treated as read-only values."
>
> "should be ... read-only" concerns me. They are "protected" in the
> garbage collector sense,
Yes
> but they are not protected from "return by
> reference" damage. Right?
>
There is no "return by reference damage".
The only problem is if you modify input arguments while someone else holds a reference, but there is no way in C to prevent that while still allowing them to be useful. Note that it is legal to modify input arguments if there are no references to it.
Cheers,
Simon
> Why doesn't the documentation recommend function writers to mark
> arguments to C functions as const? Isn't that what the return by
> value policy would suggest?
>
> Here's a troublesome example in R src/main/array.c:
>
> /* DropDims strips away redundant dimensioning information. */
> /* If there is an appropriate dimnames attribute the correct */
> /* element is extracted and attached to the vector as a names */
> /* attribute. Note that this function mutates x. */
> /* Duplication should occur before this is called. */
>
> SEXP DropDims(SEXP x)
> {
> SEXP dims, dimnames, newnames = R_NilValue;
> int i, n, ndims;
>
> PROTECT(x);
> dims = getAttrib(x, R_DimSymbol);
> [... SNIP ....]
> setAttrib(x, R_DimNamesSymbol, R_NilValue);
> setAttrib(x, R_DimSymbol, R_NilValue);
> setAttrib(x, R_NamesSymbol, newnames);
> [... SNIP ....]
>
> return x;
> }
>
> Well, at least there's a warning comment with that one. But it does
> show .Call allows call by reference.
>
> Why allow it, though? DropDims could copy x, modify the copy, and return it.
>
> I wondered why DropDims bothers to return x at all. We seem to be
> using modify and return by reference there.
>
> I also wondered why x is PROTECTED, actually. Its an argument, wasn't
> it automatically protected? Is it no longer protected after the
> function starts modifying it?
>
> Here's an example with similar usage in Writing R Extensions, section
> 5.10.1 "Calling .Call". It protects the arguments a and b (needed
> ??), then changes them.
>
> #include <R.h>
> #include <Rdefines.h>
>
> SEXP convolve2(SEXP a, SEXP b)
> {
> R_len_t i, j, na, nb, nab;
> double *xa, *xb, *xab;
> SEXP ab;
>
> PROTECT(a = AS_NUMERIC(a)); /* PJ wonders, doesn't this alter
> "a" in calling code*/
> PROTECT(b = AS_NUMERIC(b));
> na = LENGTH(a); nb = LENGTH(b); nab = na + nb - 1;
> PROTECT(ab = NEW_NUMERIC(nab));
> xa = NUMERIC_POINTER(a); xb = NUMERIC_POINTER(b);
> xab = NUMERIC_POINTER(ab);
> for(i = 0; i < nab; i++) xab[i] = 0.0;
> for(i = 0; i < na; i++)
> for(j = 0; j < nb; j++) xab[i + j] += xa[i] * xb[j];
> UNPROTECT(3);
> return(ab);
> }
>
>
> Doesn't
>
> PROTECT(a = AS_NUMERIC(a));
>
> have the alter the data structure "a" both inside the C function and
> in the calling R code as well? And, if a was PROTECTED automatically,
> could we do without that PROTECT()?
>
> pj
>
> --
> Paul E. Johnson
> Professor, Political Science Assoc. Director
> 1541 Lilac Lane, Room 504 Center for Research Methods
> University of Kansas University of Kansas
> http://pj.freefaculty.org http://quant.ku.edu
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
More information about the R-devel
mailing list