[Rd] Garbage collection of seemingly PROTECTed pairlist

Martin Morgan mtmorg@n@b|oc @end|ng |rom gm@||@com
Sat Sep 12 12:44:14 CEST 2020


I put your code into a file tmp.R and eliminated the need for a package by compiling this to a shared object

  R CMD SHLIB tmp.c

I'm then able to use a simple script 'tmp.R'

  dyn.load("/tmp/tmp.so")

  fullocate <- function(int_mat)
    .Call("C_fullocate", int_mat)

  int_mat <- rbind(c(5L, 6L), c(7L, 10L), c(20L, 30L))

  while(TRUE)
    res <- fullocate(int_mat)

to generate a segfault.

Looking at your code, it seemed like I could get towards a simpler reproducible example by eliminating most of the 'while' loop and then functions and code branches that are not used

  #include <Rinternals.h>
  
  SEXP C_int_mat_nth_row_nrnc(int *int_mat_int, int nr, int nc, int n) {
      SEXP out = PROTECT(Rf_allocVector(INTSXP, nc));
      int *out_int = INTEGER(out);
      for (int i = 0; i != nr; ++i) {
         out_int[i] = int_mat_int[n - 1 + i * nr];
      }
      UNPROTECT(1);
      return out;}
  
  SEXP C_fullocate(SEXP int_mat) {
      int nr = Rf_nrows(int_mat), *int_mat_int = INTEGER(int_mat);
      int row_num = 2;  // row_num will be 1-indexed
      SEXP prlst0cdr = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, 1));
      SEXP prlst = PROTECT(Rf_list1(prlst0cdr));
      SEXP row = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, row_num));
      Rf_PrintValue(prlst);  // This is where the error occurs
      UNPROTECT(3);
  
      return R_NilValue;
  }
  
my script still gives an error, but not a segfault, and the values printed sometimes differ between calls

...

[[1]]
[1] 5 6

.
[[1]]
NULL

...

Error in FUN(X[[i]], ...) :
  cannot coerce type 'NULL' to vector of type 'character'
Calls: message -> .makeMessage -> lapply
Execution halted

The differing values in particular, and the limited PROTECTion in the call and small allocations (hence limited need / opportunity for garbage collection), suggest that you're corrupting memory, rather than having a problem with garbage collection. Indeed,

        SEXP prlst0cdr = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, 1));

allocates a vector of length 2 at

        SEXP out = PROTECT(Rf_allocVector(INTSXP, nc));

but writes three elements (the 0th, 1st, and 2nd) at

          for (int i = 0; i != nr; ++i) {
              out_int[i] = int_mat_int[n - 1 + i * nr];
          }

Martin Morgan

On 9/11/20, 9:30 PM, "R-devel on behalf of Rory Nolan" <r-devel-bounces using r-project.org on behalf of rorynoolan using gmail.com> wrote:

    I want to write an R function using R's C interface that takes a 2-column
    matrix of increasing, non-overlapping integer intervals and returns a list
    with those intervals plus some added intervals, such that there are no
    gaps. For example, it should take the matrix rbind(c(5L, 6L), c(7L, 10L),
    c(20L, 30L)) and return list(c(5L, 6L), c(7L, 10L), c(11L, 19L), c(20L,
    30L)). Because the output is of variable length, I use a pairlist (because
    it is growable) and then I call Rf_PairToVectorList() at the end to make it
    into a regular list.

    I'm getting a strange garbage collection error. My PROTECTed pairlist prlst
    gets garbage collected away and causes a memory leak error when I try to
    access it.

    Here's my code.

    #include <Rinternals.h>


    SEXP C_int_mat_nth_row_nrnc(int *int_mat_int, int nr, int nc, int n) {
      SEXP out = PROTECT(Rf_allocVector(INTSXP, nc));
      int *out_int = INTEGER(out);
      if (n <= 0 | n > nr) {
        for (int i = 0; i != nc; ++i) {
          out_int[i] = NA_INTEGER;
        }
      } else {
        for (int i = 0; i != nr; ++i) {
          out_int[i] = int_mat_int[n - 1 + i * nr];
        }
      }
      UNPROTECT(1);
      return out;}

    SEXP C_make_len2_int_vec(int first, int second) {
      SEXP out = PROTECT(Rf_allocVector(INTSXP, 2));
      int *out_int = INTEGER(out);
      out_int[0] = first;
      out_int[1] = second;
      UNPROTECT(1);
      return out;}

    SEXP C_fullocate(SEXP int_mat) {
      int nr = Rf_nrows(int_mat), *int_mat_int = INTEGER(int_mat);
      int last, row_num;  // row_num will be 1-indexed
      SEXP prlst0cdr = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, 1));
      SEXP prlst = PROTECT(Rf_list1(prlst0cdr));
      SEXP prlst_tail = prlst;
      last = INTEGER(prlst0cdr)[1];
      row_num = 2;
      while (row_num <= nr) {
        Rprintf("row_num: %i\n", row_num);
        SEXP row = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, row_num));
        Rf_PrintValue(prlst);  // This is where the error occurs
        int *row_int = INTEGER(row);
        if (row_int[0] == last + 1) {
          Rprintf("here1");
          SEXP next = PROTECT(Rf_list1(row));
          prlst_tail = SETCDR(prlst_tail, next);
          last = row_int[1];
          UNPROTECT(1);
          ++row_num;
        } else {
          Rprintf("here2");
          SEXP next_car = PROTECT(C_make_len2_int_vec(last + 1, row_int[0] - 1));
          SEXP next = PROTECT(Rf_list1(next_car));
          prlst_tail = SETCDR(prlst_tail, next);
          last = row_int[0] - 1;
          UNPROTECT(2);
        }
        UNPROTECT(1);
      }
      SEXP out = PROTECT(Rf_PairToVectorList(prlst));
      UNPROTECT(3);
      return out;}

    As you can see, I have some diagnostic print statements in there. The
    offending line is line 40, which I have marked with a comment of // This is
    where the error occurs. I have a minimal reproducible package at
    https://github.com/rorynolan/testpkg and I have run R CMD CHECK with
    valgrind using GitHub actions, the results of which are at
    https://github.com/rorynolan/testpkg/runs/1076595757?check_suite_focus=true.
    That's where I found out which line is causing the error. This function
    works as expected sometimes, and then sometimes this issue appears. This
    lends weight to the suspicion that it's a garbage collection issue.

    I really want to know what my mistake is. I'm not that interested in
    alternative implementations; I want to understand the mistake that I'm
    making so that I can avoid making it in future.

    I have asked the question on stackoverflow to little avail, but the
    discussion there may prove helpful.
    https://stackoverflow.com/questions/63759604/garbage-collection-of-seemingly-protected-pairlist.



    Thanks,

    Rory

    	[[alternative HTML version deleted]]

    ______________________________________________
    R-devel using r-project.org mailing list
    https://stat.ethz.ch/mailman/listinfo/r-devel


More information about the R-devel mailing list