[Rd] Return values from .Call and garbage collection
Sklyar, Oleg (London)
osklyar at maninvestments.com
Tue Jan 27 13:25:12 CET 2009
- R is not multithreaded (or at least it was not), so a race condition
cannot occur.
- I do not think the GC can run at the point where the return value is
assigned to a variable. Because R is single-threaded, the GC is only
triggered from within other R calls, typically ones that allocate memory.
The most likely culprit is your own code: out-of-range indexing, failure
to initialise all elements of the allocated structure, 1-based instead of
0-based indexing, or using other R variables for initialisation that
should have been protected but were not.
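
To illustrate the indexing and initialisation points, here is a minimal
sketch in plain C (no R headers, so it can be compiled on its own). The
helper names matrix_offset, fill_matrix and set_cell are hypothetical and
not part of R's API. In the real code, data would presumably be
INTEGER(retVal), and nrow and ncol the x and y passed to allocMatrix. As
far as I know, allocMatrix does not zero the memory of an INTSXP matrix,
so any cell that is never written holds arbitrary values.

```c
#include <assert.h>
#include <stddef.h>

/* Column-major offset of element (row, col), both 0-based, in a matrix
 * with nrow rows and ncol columns. Returns -1 if the pair is out of
 * range. Hypothetical helper, not part of R's API. */
long matrix_offset(long row, long col, long nrow, long ncol)
{
    if (row < 0 || row >= nrow || col < 0 || col >= ncol)
        return -1;
    return row + col * nrow;
}

/* Set every cell to a known value so that cells which are never
 * written later do not contain uninitialised memory. */
void fill_matrix(int *data, long nrow, long ncol, int value)
{
    assert(data != NULL);
    for (long i = 0; i < nrow * ncol; i++)
        data[i] = value;
}

/* Bounds-checked write. Returns 0 on success and -1 if (row, col)
 * lies outside the matrix, in which case nothing is written. */
int set_cell(int *data, long row, long col, long nrow, long ncol, int z)
{
    long off = matrix_offset(row, col, nrow, ncol);
    if (off < 0)
        return -1;
    data[off] = z;
    return 0;
}
```

Calling fill_matrix once right after the allocation, and routing every
write through set_cell while debugging, should make a 1-based or
out-of-range index show up as a rejected write rather than as silent
corruption.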
Dr Oleg Sklyar
Research Technologist
AHL / Man Investments Ltd
+44 (0)20 7144 3107
osklyar at maninvestments.com
> -----Original Message-----
> From: r-devel-bounces at r-project.org
> [mailto:r-devel-bounces at r-project.org] On Behalf Of Jon Senior
> Sent: 27 January 2009 12:09
> To: r-devel at r-project.org
> Subject: [Rd] Return values from .Call and garbage collection
>
> Hi all,
>
> I'm posting this here as it discusses an issue with an
> external C library. If it would be better in R-Help, then I'll repost.
>
> I'm using an external library which I've written, which
> provides a large set of data (>500MB in a highly condensed
> format) and the tools to return values from the data. The
> functionality has been tested call by call and using valgrind
> and works fine, with no memory leaks. After retrieval, I
> process the data in R. A specific function is causing a
> problem that appears to be related to the garbage collector
> (judging by symptoms).
>
> In the C code, a Matrix is created using
>
> PROTECT(retVal = allocMatrix(INTSXP, x, y));
>
> Values are written into this matrix using
>
> INTEGER(retVal)[translatedOffset]=z;
>
> where "translatedOffset" is a conversion from a row/column
> pair to an offset as shown in R-exts.pdf.
>
> The last two lines of the function call are:
>
> UNPROTECT(1);
> return retVal;
>
> The shared library was compiled with R CMD SHLIB and is
> called using .Call.
>
> Which returns our completed SEXP object to R where processing
> continues.
>
> In R, we continue to process the data, replacing -1s with NAs
> (I couldn't find a way to do that in C that would make it back
> into R), sorting it, and trimming it. All of these operations
> are carried out on the original data.
>
> If I carry out the processing step by step from the
> interpreter, everything is fine and the data comes out how I
> would expect. But when I run the R code to carry out those
> steps, every now and again (around 1/5th of the time), the
> returned data is garbage. I'm expecting to receive a bias per
> iteration that should be -5 <= bias <= 5, but for the
> garbaged data, I'm getting results of the order of 100s of
> thousands out (e.g. -220627.7). If I call the routine which
> carries out the processing for one iteration from the
> interpreter, sometimes I get the correct data, sometimes (with
> the same frequency) I get garbage.
>
> There are two possibilities that I can envisage.
> 1) Race condition: R is starting to execute the R code after
> the .Call before the .Call has returned, thus the data is corrupted.
> 2) Garbage collector: the GC is collecting my data between
> the UNPROTECT(1); call and the assignment to an R variable.
>
> The created matrices can be large (where x > 1000, y >
> 100000), but the garbage doesn't appear to be related to the
> size of the matrix.
>
> Any ideas what steps I could take to proceed with this? Or
> other possibilities than those I've suggested? For reasons of
> confidentiality I'm unable to release test code, and the
> large dataset might make testing difficult.
>
> Thanks in advance
>
> --
> Jon Senior <jon at restlesslemon.co.uk>
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>