[Rd] number of copies
Simon Urbanek
simon.urbanek at r-project.org
Mon Oct 3 17:38:54 CEST 2011
Terry,
On Oct 3, 2011, at 10:32 AM, Terry Therneau wrote:
> I'm looking at memory efficiency for some of the survival code. The
> following fragment appears in coxph.fit
>   coxfit <- .C("coxfit2", iter=as.integer(maxiter),
>                as.integer(n),
>                as.integer(nvar), stime,
>                sstat,
>                x= x[sorted,] ,
>                ...
>
> Does this make a second copy of x to pass to the routine (my
> expectation) or will I end up with 3: x and x[sorted,] in the local
> frame of reference, and another due to dup=TRUE?
>
I'm not sure I'm counting your copies right, but I'd say the latter (although the sorting cannot be technically called a copy ;)).
There are 4 distinct, separate objects:
x -> x[sorted,] -> double-array to pass to C -> result vector
If you care about speed, you should definitely use .Call().
Note for debugging: tracemem is actually smart here. It flags the intermediate memory object created inside .C for passing to the routine as a proper duplication, even though it is not a real one (no duplicate() is involved, since that object is not an R object at all). It then also flags the allocation of the result object as a duplication from the intermediate object, so in summary tracemem gives you the true number of copies.
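A minimal sketch of the first link in that chain (x -> x[sorted,]), runnable without any compiled code. The data here is made up; only the names x and sorted follow the fragment above. It assumes R was built with memory profiling (which tracemem needs), so the tracemem part is guarded by capabilities("profmem"). The later links involve .C itself and are not reproduced here.

```r
# Hypothetical data; `x` and `sorted` only mirror the names in the quoted fragment.
set.seed(1)
x <- matrix(rnorm(12), nrow = 4)
sorted <- order(x[, 1])

# Row subsetting allocates a new matrix: this is the second object in the chain.
xs <- x[sorted, ]

distinct <- NA
if (capabilities("profmem")) {
  # tracemem() returns a string identifying the object it was asked to trace.
  id_x  <- tracemem(x)
  id_xs <- tracemem(xs)
  # Two live objects get two different identifiers.
  distinct <- !identical(id_x, id_xs)
  untracemem(x)
  untracemem(xs)
}

print(distinct)
```

With tracing left on, any later duplication of x or xs would be reported on the console, which is how the count above can be checked against a real .C call.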
As far as I remember, .C is a legacy leftover from the ancient Fortran interface in the original S. It is not really a C interface at all: it is a Fortran interface that happens not to care about the source language, and C can be used to create Fortran-looking object code. So unless one needs Fortran, one should not be using .C ;). It can be used, but IMHO it should not be used for anything except maybe didactic purposes.
Cheers,
Simon