[Rd] must .Call C functions return SEXP?
Andrew Piskorski
atp at piskorski.com
Thu Oct 28 15:48:21 CEST 2010
On Thu, Oct 28, 2010 at 12:15:56AM -0400, Simon Urbanek wrote:
> > Reason I ask, is I've written some R code which allocates two long
> > lists, and then calls a C function with .Call. My C code writes to
> > those two pre-allocated lists,
> That's bad! All arguments are essentially read-only so you should
> never write into them!
I don't see how. (So, what am I missing?) The R docs themselves
state that the main point of using .Call rather than .C is that .Call
does not do any extra copying and gives one direct access to the R
objects. (This is indeed very useful, e.g. to reorder a large matrix
in seconds rather than hours.)
I could allocate the two lists in my C code, but so far it was more
convenient to do so in R. What possible difference in behavior can there
be between the two approaches?
> R has pass-by-value(!) semantics, so semantically your code has
> nothing to do with the result.1 and result.2 variables since only
> their *values* are guaranteed to be passed (possibly a copy).
Clearly C code called from .Call must be allowed to construct R
objects, as that's how much of R itself is implemented, and further
down, it's what you recommend I should do instead.
But why does it follow that C code must never modify an object
initially allocated by R code? Are you saying there is some special
magic difference in the state of an object allocated by R's C code
vs. one allocated by R code? If so, what is it?
What is the potential problem here, that the garbage collector will
suddenly run while my C code is in the middle of writing to an R list?
Yes, if the gc is going to move the object elsewhere, that would be
very bad. But it looks to me like that cannot happen, because lots of
the R implementation itself would fail badly if it did.
E.g.: The PROTECT call is used to increment reference counts, but I
see no guarantees that it is atomic with the operations that allocate
objects. I see no mutexes or other barriers in C code to prevent the
gc from running, thus implying that it *can't* run until the C
function completes.
And R is single threaded, of course. But what about signal handlers,
could they ever invoke R's gc?
Also, I was initially surprised not to find any matrix C APIs, but
grepping for examples (sorry, I don't remember exactly which
functions) showed me that the apparently accepted way to do matrix
operations from C is to simply assume R's column-major dense matrix
order, and access the 2D matrix as a flat 1D vector. (Which is easy.)
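To make the flat-vector access concrete, here is a minimal sketch in plain C, with no R headers. It assumes the column-major (column-first) layout described above. The helper names colmajor_index and row_sum are made up for illustration and are not part of R's C API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: flat, zero-based index of element (i, j) in a
   column-major matrix with nrow rows. Columns are stored one after
   another, so moving one column to the right skips nrow elements. */
static size_t colmajor_index(size_t i, size_t j, size_t nrow)
{
    assert(i < nrow);
    return i + j * nrow;
}

/* Hypothetical helper: sum of row i of an nrow x ncol column-major
   matrix stored in the flat array x. */
static double row_sum(const double *x, size_t nrow, size_t ncol, size_t i)
{
    double total = 0.0;
    for (size_t j = 0; j < ncol; j++)
        total += x[colmajor_index(i, j, nrow)];
    return total;
}
```

In a real .Call function the pointer x would presumably come from REAL() on the matrix argument, and the dimensions from its dim attribute. That part is left out here so the sketch stays self-contained.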
> The fact that internally R attempts to avoid copying for performance
> reasons is the only reason why your code may have appeared to work,
> but it's invalid!
I will probably change my code to allocate a new list from the C code
and return that, as you recommend. My main reason for doing the
allocation in R was just that it was simpler, especially given the
very limited documentation of R's C API.
But, I didn't see anything in the "Writing R Extensions" doc saying
that what my code is doing is "invalid", and more importantly, I don't
see why it would or should be invalid...
I'd still like to better understand why you think doing the initial
allocation of an object in R rather than C code is such a problem. So
far, I don't see any way that the R interpreter could ever tell the
difference.
Wait, or is the only objection here that I'm using C in a way that
makes pass-by-reference semantics visible to my R code? Which will
work completely correctly, but is not The Proper R Way?
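To pin down what I mean by pass-by-reference semantics becoming visible, here is a toy model in plain C. It is only a sketch and says nothing about R's actual internals. The toy_vec type and every function name below are hypothetical stand-ins, not R API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an R vector: a length plus a data buffer. */
typedef struct {
    size_t len;
    double *data;
} toy_vec;

/* Allocate a zero-filled vector; returns NULL on allocation failure. */
static toy_vec *toy_alloc(size_t len)
{
    toy_vec *v = malloc(sizeof *v);
    if (v == NULL)
        return NULL;
    /* Request at least one element so a zero-length vector is not
       mistaken for an allocation failure. */
    v->data = calloc(len ? len : 1, sizeof *v->data);
    if (v->data == NULL) {
        free(v);
        return NULL;
    }
    v->len = len;
    return v;
}

/* Release a vector created by toy_alloc or fill_copy. */
static void toy_free(toy_vec *v)
{
    if (v != NULL) {
        free(v->data);
        free(v);
    }
}

/* In-place style: writes straight into the caller's buffer, so every
   name that refers to this object sees the change. */
static void fill_in_place(toy_vec *v, double value)
{
    for (size_t i = 0; i < v->len; i++)
        v->data[i] = value;
}

/* By-value style: leaves the argument untouched and returns a freshly
   allocated result, or NULL on allocation failure. */
static toy_vec *fill_copy(const toy_vec *v, double value)
{
    toy_vec *out = toy_alloc(v->len);
    if (out == NULL)
        return NULL;
    memcpy(out->data, v->data, v->len * sizeof *v->data);
    fill_in_place(out, value);
    return out;
}
```

If two names point at the same toy_vec, fill_in_place changes what both of them see, while fill_copy leaves the original alone. The first behavior is what my current code exposes to R, and the second is what allocating and returning a new list would give.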
I don't actually need pass-by-reference behavior here at all, but I
can imagine cases where I might want it, so I'd like to understand
your objections better. Is using C to implement pass-by-reference
actually Broken, or merely Ugly? From my reasons above, I think it
will always work correctly and thus is not Broken. But of course
given R's devotion to pass-by-value, it could be considered
unacceptably Ugly.
--
Andrew Piskorski <atp at piskorski.com>
http://www.piskorski.com/