[Rd] Inconsistent results from .C()
Prof Brian Ripley
ripley at stats.ox.ac.uk
Thu May 23 08:08:17 CEST 2013
lev2 is malloc-ed in the code you supplied. The new option Bill refers
to is about input variables in .C, i.e. lev and not lev2. Other array
overruns can be found by other tools: Bill mentioned valgrind, and
AddressSanitizer finds a different set of overruns. See
http://cran.r-project.org/doc/manuals/r-release/R-exts.html#Using-gctorture-and-memory-access
.
On 23/05/2013 01:18, William Dunlap wrote:
> Had you tried using the new-to-3.0.0 options(CBoundCheck=TRUE)?
>
> [from the NEWS file]
> There is a new option, options(CBoundsCheck=), which controls how .C()
> and .Fortran() pass arguments to compiled code. If true (which can be
> enabled by setting the environment variable R_C_BOUNDS_CHECK to yes),
> raw, integer, double and complex arguments are always copied, and checked
> for writing off either end of the array on return from the compiled code
> (when a second copy is made). This also checks individual elements of character
> vectors passed to .C().
>
> valgrind helps find memory misuse also.
>
> When you write on memory that you didn't allocate, expect to get mystifying
> behavior. Nonrepeatable results are a sign of memory misuse.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
>> -----Original Message-----
>> From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On Behalf
>> Of Robin Evans
>> Sent: Wednesday, May 22, 2013 4:16 PM
>> To: R Devel List
>> Subject: Re: [Rd] Inconsistent results from .C()
>>
>> Update: I did eventually discover an error in the C code which is probably
>> ultimately responsible for the bug (the variable eff_size is allowed to get
>> too large, and overrun the end of lev2[]). However the behaviour of R (or
>> Rstudio) in response to this is still somewhat mystifying to me!
>>
>> Robin
>>
>> On 21 May 2013 14:10, Robin Evans <rje42 at cam.ac.uk> wrote:
>>
>>> Sure! C code is below if it helps. The gist is that the function
>>> oneMargin forms two matrices M and C, mostly by repeatedly taking
>>> Kronecker products.
>>>
>>> Robin
>>>
>>> void kronecker (int *A, int *B, int *dima, int *dimb, int *C) {
>>> int k = 0;
>>> int i1,i2,j1,j2;
>>>
>>> for (i2 = 0; i2 < dima[1]; i2++) {
>>> for (j2 = 0; j2 < dimb[1]; j2++) {
>>> for (i1 = 0; i1 < dima[0]; i1++) {
>>> for (j1 = 0; j1 < dimb[0]; j1++) {
>>> C[k] = A[i1 + dima[0]*i2]*B[j1 + dimb[0]*j2];
>>> k++;
>>> }}}}
>>> }
>>>
>>> void iterate (int *x, int *lev) {
>>> bool ok = FALSE;
>>> int i=0;
>>>
>>> do {
>>> if(x[i] < lev[i]-1) {
>>> x[i]++;
>>> ok = TRUE;
>>> }
>>> else {
>>> x[i] = 0;
>>> }
>>> i++;
>>> }
>>> while (!ok);
>>> }
>>>
>>> void oneMargin(int *Mar, int *Eff, int *neff, int *lev, int *nvar, int
>>> *M, int *C) {
>>>
>>> int *M2, *mult, *Cj, *Cj2;
>>> int i,j=1,k,l,m;
>>>
>>> /* determine size of M to output */
>>> for (i = 0; i < nvar[0]; i++) {
>>> j *= (Mar[i] == 1 ? lev[i]*lev[i] : lev[i]);
>>> }
>>> M2 = malloc(sizeof(int)*j);
>>> mult = malloc(sizeof(int)*j);
>>>
>>> M[0] = 1;
>>> int M_size[2] = {1,1}, mult_size[2], Cj_size[2], C_size[2] = {0,0};
>>>
>>> for (i = nvar[0]-1; i >= 0; i--) {
>>> if (Mar[i] == 1) {
>>> for (j = 0; j < lev[i]*lev[i]; j++) {
>>> mult[j] = ((j % (lev[i]+1)) == 0) ? 1 : 0;
>>> }
>>> mult_size[0] = mult_size[1] = lev[i];
>>> }
>>> else {
>>> for (j = 0; j < lev[i]; j++) {
>>> mult[j] = 1;
>>> }
>>> mult_size[0] = 1;
>>> mult_size[1] = lev[i];
>>> }
>>>
>>> kronecker(M, mult, M_size, mult_size, M2);
>>> M_size[0] *= mult_size[0];
>>> M_size[1] *= mult_size[1];
>>>
>>> /* copy M2 over to M */
>>> for (j=0; j < M_size[0]*M_size[1]; j++) {
>>> M[j] = M2[j];
>>> }
>>> }
>>>
>>> free(M2);
>>>
>>> /* Create C matrix */
>>> int index[nvar[0]];
>>> int prod;
>>> int eff_size = 0;
>>> /*swapped = 0; */
>>> mult_size[0] = 1;
>>>
>>> C_size[0] = 1;
>>> for (j = 0; j < nvar[0]; j++) {
>>> C_size[0] *= (Mar[j] == 1 ? lev[j] : 1);
>>> }
>>>
>>> Cj = malloc(sizeof(int)*C_size[0]);
>>> Cj2 = malloc(sizeof(int)*C_size[0]);
>>> int *lev2;
>>> lev2 = malloc(sizeof(int)*nvar[0]);
>>>
>>> /* loop over effects */
>>> for (i = 0; i < neff[0]; i++) {
>>> prod = 1;
>>> for (j = 0; j < nvar[0]; j++) {
>>> if (Eff[j + i*nvar[0]]) {
>>> prod *= lev[j]-1;
>>> lev2[eff_size] = lev[j]-1;
>>> index[eff_size] = 0;
>>> eff_size++;
>>> }
>>> }
>>>
>>> /* loop over states of effect (excluding corner points) */
>>> for (j = 0; j < prod; j++) {
>>> Cj[0] = 1;
>>> Cj_size[0] = Cj_size[1] = 1;
>>>
>>> k = eff_size;
>>> /* loop over variables */
>>> for (l = nvar[0]-1; l >= 0; l--) {
>>> /* skip variables not in margin */
>>> if (Mar[l] == 0) continue;
>>> mult_size[1] = lev[l];
>>>
>>> /* kronecker factor depends on whether or not variable is in
>>> effect */
>>> if (Eff[i*nvar[0]+l] == 0) {
>>> /* for variables not in effect, just repeat matrix */
>>> for (m = 0; m < lev[l]; m++) {
>>> mult[m] = 1;
>>> }
>>> }
>>> else {
>>> /* otherwise multiply based on state */
>>> k--;
>>> for (m = 0; m < lev[l]; m++) {
>>> mult[m] = (m == index[k]) ? lev[l] - 1 : -1;
>>> }
>>> }
>>> kronecker(Cj, mult, Cj_size, mult_size, Cj2);
>>>
>>> Cj_size[1] *= mult_size[1];
>>>
>>> /* copy Cj2 over to Cj */
>>> for (k=0; k < Cj_size[0]*Cj_size[1]; k++) {
>>> Cj[k] = Cj2[k];
>>> }
>>> if (Cj_size[0]*Cj_size[1] > C_size[0]) Rprintf("pointer screwup");
>>> }
>>>
>>> /* copy row over to C */
>>> for (m = 0; m < C_size[0]; m++) C[C_size[0]*C_size[1]+m] = Cj[m];
>>> C_size[1] += 1;
>>> iterate(index, lev2);
>>> }
>>> }
>>>
>>> free(lev2);
>>> free(Cj);
>>> free(Cj2);
>>> free(mult);
>>> }
>>>
>>>
>>> On 21 May 2013 14:04, R. Michael Weylandt <michael.weylandt at gmail.com>
>>> wrote:
>>>>
>>>> It might also help if you can point us to the C code to help debug.
>>>>
>>>> MW
>>>>
>>>> On Tue, May 21, 2013 at 10:53 AM, Robin Evans <rje42 at cam.ac.uk> wrote:
>>>>> I should add to this that I'm running on Scientific Linux 6. I later
>>>>> noticed that the bug only seems to occur when I run the code from
>>> Rstudio,
>>>>> and not if I use the terminal directly, so this may be the key to the
>>>>> problem.
>>>>>
>>>>> Robin
>>>>>
>>>>> On 20 May 2013 16:12, Robin Evans <rje42 at cam.ac.uk> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I've run into a problem which is both maddening and rather hard to
>>>>>> replicate, therefore I wondered if someone might know of a plausible
>>>>>> explanation for it. I couldn't find anything in the archives, though
>>>>>> maybe I'm searching for the wrong thing.
>>>>>>
>>>>>> I'm calling some C code using .C, and get the vector I'm interested in
>>>>>> back as the 7th location in the returned list. However I find that if
>>>>>> I try to inspect the contents of this entry in the list in some ways,
>>>>>> I get one answer, and if I look at it in others I get a different
>>>>>> answer. It's quite possible that there's something wrong with the C
>>>>>> code, but this doesn't seem to explain why this phenomenon would occur
>>>>>> in R.
>>>>>>
>>>>>> The problem does not always occur - I have to run the code a few times
>>>>>> and then call the console when it does, but the commands below show
>>>>>> what can happen when it does. I apologise for not being able to get a
>>>>>> cleaner example. Full code and output is below, but here is a
>>>>>> stylised version:
>>>>>>
>>>>>> The following all give one answer (which is the wrong answer as far as
>>>>>> I'm concerned) :
>>>>>> * printing the whole list :
>>>>>> .C(...) # and looking at the 7th entry
>>>>>> * applying c() to the 7th element of the list
>>>>>> c(.C(...)[[7]])
>>>>>> * assigning the 7th element to a vector:
>>>>>> x = .C(...)[[7]];
>>>>>> x
>>>>>>
>>>>>> these give a different answer (which is the answer I would hope the C
>>>>>> code returns):
>>>>>> * using dput on the 7th entry:
>>>>>> dput(.C(...)[[7]])
>>>>>> * applying c() and then dput()
>>>>>> dput(c(.C(...)[[7]]))
>>>>>> * just printing the 7th entry of the list
>>>>>> .C(...)[[7]]
>>>>>>
>>>>>> The answers are consistent in the sense that I always get the same
>>>>>> answers from running the same command in the console. I have tried
>>>>>> inspecting the returned objects to see if the objects are somehow of a
>>>>>> different class than I expect, or are just being printed oddly, but
>>>>>> have not found anything untoward.
>>>>>>
>>>>>> Any suggestions would be much appreciated!
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Robin
>>>>>>
>>>>>>
>>>>>> # THESE COMMANDS GIVE ONE ANSWER
>>>>>> # [the correct answer always begins with 1, the incorrect with -1]
>>>>>>
>>>>>>> .C("oneMargin", c(1L,1L,1L,1L,1L), c(1L,1L,1L,1L,1L), 1L,
>>>>>> c(2L,2L,2L,2L,2L), 5L, ptr1, ptr2)[[7]]
>>>>>> [1] 1 -1 -1 1 -1 1 1 -1 -1 1 1 -1 1 -1 -1 1 -1 1 1 -1 1 -1
>>>>>> -1 1 1 -1 -1 1 -1 1 1 -1
>>>>>>
>>>>>>> dput(.C("oneMargin", c(1L,1L,1L,1L,1L), c(1L,1L,1L,1L,1L), 1L,
>>>>>> c(2L,2L,2L,2L,2L), 5L, ptr1, ptr2)[[7]])
>>>>>> c(1L, -1L, -1L, 1L, -1L, 1L, 1L, -1L, -1L, 1L, 1L, -1L, 1L, -1L,
>>>>>> -1L, 1L, -1L, 1L, 1L, -1L, 1L, -1L, -1L, 1L, 1L, -1L, -1L, 1L,
>>>>>> -1L, 1L, 1L, -1L)
>>>>>>
>>>>>>> x=dput(c(.C("oneMargin", c(1L,1L,1L,1L,1L), c(1L,1L,1L,1L,1L), 1L,
>>>>>> c(2L,2L,2L,2L,2L), 5L, ptr1, ptr2)[[7]]))
>>>>>> c(1L, -1L, -1L, 1L, -1L, 1L, 1L, -1L, -1L, 1L, 1L, -1L, 1L, -1L,
>>>>>> -1L, 1L, -1L, 1L, 1L, -1L, 1L, -1L, -1L, 1L, 1L, -1L, -1L, 1L,
>>>>>> -1L, 1L, 1L, -1L)
>>>>>>> x
>>>>>> [1] 1 -1 -1 1 -1 1 1 -1 -1 1 1 -1 1 -1 -1 1 -1 1 1 -1 1 -1
>>>>>> -1 1 1 -1 -1 1 -1 1 1 -1
>>>>>>
>>>>>> # THESE ALL GIVE A DIFFERENT ONE!
>>>>>>
>>>>>>> .C("oneMargin", c(1L,1L,1L,1L,1L), c(1L,1L,1L,1L,1L), 1L,
>>>>>> c(2L,2L,2L,2L,2L), 5L, ptr1, ptr2)
>>>>>>
>>>>>> # (OTHER ELEMENTS OF LIST REMOVED)
>>>>>> [[7]]
>>>>>> [1] -1 -1 1 1 1 1 -1 -1 1 1 -1 -1 -1 -1 1 1 1 1 -1 -1 -1 -1
>>>>>> 1 1 -1 -1 1 1 1 1 -1 -1
>>>>>>
>>>>>>> c(.C("oneMargin", c(1L,1L,1L,1L,1L), c(1L,1L,1L,1L,1L), 1L,
>>>>>> c(2L,2L,2L,2L,2L), 5L, ptr1, ptr2)[[7]])
>>>>>> [1] -1 -1 1 1 1 1 -1 -1 1 1 -1 -1 -1 -1 1 1 1 1 -1 -1 -1 -1
>>>>>> 1 1 -1 -1 1 1 1 1 -1 -1
>>>>>>> x = .C("oneMargin", c(1L,1L,1L,1L,1L), c(1L,1L,1L,1L,1L), 1L,
>>>>>> c(2L,2L,2L,2L,2L), 5L, ptr1, ptr2)[[7]]
>>>>>>> x
>>>>>> [1] -1 -1 1 1 1 1 -1 -1 1 1 -1 -1 -1 -1 1 1 1 1 -1 -1 -1 -1
>>>>>> 1 1 -1 -1 1 1 1 1 -1 -1
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Robin Evans
>>>>>> Statistical Laboratory
>>>>>> University of Cambridge
>>>>>> blog: itsastatlife.blogspot.com
>>>>>> web: www.statslab.cam.ac.uk/~rje42
>>>>>>
>>>>>> Causal Inference Workshop July 15th:
>>>>>> http://www.statslab.cam.ac.uk/~rje42/uai13/main.htm
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Robin Evans
>>>>> Statistical Laboratory
>>>>> University of Cambridge
>>>>> blog: itsastatlife.blogspot.com
>>>>> web: www.statslab.cam.ac.uk/~rje42 <
>>> http://www.stat.washington.edu/~rje42>
>>>>>
>>>>> Causal Inference Workshop July 15th:
>>>>> http://www.statslab.cam.ac.uk/~rje42/uai13/main.htm
>>>>>
>>>>> [[alternative HTML version deleted]]
>>>>>
>>>>> ______________________________________________
>>>>> R-devel at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
More information about the R-devel
mailing list