[Bioc-devel] PROTECT errors in Bioconductor packages

luke-tierney at uiowa.edu luke-tierney at uiowa.edu
Fri Apr 7 21:46:15 CEST 2017


On Fri, 7 Apr 2017, Hervé Pagès wrote:

> On 04/07/2017 05:37 AM, luke-tierney at uiowa.edu wrote:
>>  On Fri, 7 Apr 2017, Hervé Pagès wrote:
>> 
>> >  On 04/06/2017 03:29 AM, Michael Lawrence wrote:
>> > >  On Thu, Apr 6, 2017 at 2:59 AM, Martin Morgan
>> > >  <martin.morgan at roswellpark.org> wrote:
>> > > >  On 04/06/2017 05:33 AM, Aaron Lun wrote:
>> > > > > > 
>> > > > > >  The tool is not perfect, so assess each report carefully.
>> > 
>> >  I get a lot of warnings because the tool seems to consider that
>> >  extracting an attribute (with getAttrib(x, ...)) or extracting the
>> >  slot of an S4 object (with GET_SLOT(x, ...) or R_do_slot(x, ...))
>> >  returns an SEXP that needs protection. I always assumed that it
>> >  didn't because my understanding is that the returned SEXP is pointing
>> >  to a part of a pre-existing object ('x') and not to a newly created
>> >  one. So I decided I could treat it like the SEXP returned by
>> >  VECTOR_ELT(), which, AFAIK, doesn't need protection.
>> > 
>> >  So I suspect these warnings are false positives but I'm not 100% sure.
>>
>>  If you are not 100% sure then you should protect :-)
>>
>>  There are some cases, in particular related to compact row names on
>>  data frames, where getAttrib will allocate.
>
> Seriously? So setAttrib(x, ..., getAttrib) is not going to be a no-op
> anymore? Should I worry that VECTOR_ELT() will also expand some sort
> of compact list element? Why not keep these things low-level
> getters/setters that return whatever the real thing is and use
> higher-level accessors for returning the expanded version of the thing?

Seriously: it's	been that way since r37807 in 2006.

If you want to read about some related future directions you can look at
https://svn.r-project.org/R/branches/ALTREP/ALTREP.html.

luke


>
> Thanks,
> H.
>
>>
>>  Best,
>>
>>  luke
>> 
>> > 
>> > > > > 
>> > > > > 
>> > > > >  I also get a warning on almost every C++ function I've written,
>> > > > >  because
>> > > > >  I use the following code to handle exceptions:
>> > > > > 
>> > > > >       SEXP output=PROTECT(allocVector(...));
>> > > > >       try {
>> > > > >           // do something that might raise an exception
>> > > > >       } catch (std::exception& e) {
>> > > > >           UNPROTECT(1);
>> > > > >           throw; // break out of this part of the function
>> > > > >       }
>> > > > >       UNPROTECT(1);
>> > > > >       return output;
>> > > > > 
>> > > > >  Presumably the check doesn't account for transfer of control to 
>> > > > >  the
>> > > > >  catch block. I find that R itself is pretty good at complaining 
>> > > > >  about
>> > > > >  stack imbalances during execution of tests, examples, etc.
>> > > > > 
>> > > > > >  'My' packages
>> > > > > >  (Rsamtools, DirichletMultinomial) had several false positives 
>> > > > > >  (all
>> > > > > >  associated with use of an attribute of a protected SEXP), one 
>> > > > > >  subtle
>> > > > > >  problem (a symbol from a PROTECT'ed package name space; the 
>> > > > > >  symbol
>> > > > > >  could
>> > > > > >  in theory be an active binding and the value obtained not
>> > > > > >  PROTECTed by
>> > > > > >  the name space), and a genuine bug
>> > > > > > 
>> > > > > >                  tag = NEW_CHARACTER(n);
>> > > > > >                  for (int j = 0; j < n; ++j)
>> > > > > >                      SET_STRING_ELT(tag, j, NA_STRING);
>> > > > > >                  if ('A' == aux[0]) {
>> > > > > >                      buf_A = R_alloc(2, sizeof(char));  # <<- bug
>> > > > > >                      buf_A[1] = '\0';
>> > > > > >                  }
>> > > > > >                  ...
>> > > > > >                  SET_VECTOR_ELT(tags, i, tag); # PROTECT tag, too
>> > > > > >  late!
>> > > > > 
>> > > > > 
>> > > > >  I assume the bug refers to the un-PROTECT'd NEW_CHARACTER here - 
>> > > > >  the
>> > > > >  R_alloc call looks okay to me...
>> > > > 
>> > > > 
>> > > >  yes, tag needs protection.
>> > > > 
>> > > >  I attributed the bug to R_alloc because I failed to reason that 
>> > > >  R_alloc
>> > > >  (obviously) allocates and hence can trigger a garbage collection.
>> > > > 
>> > > >  Somehow it reflects my approach to PROTECTion, probably not shared 
>> > > >  by
>> > > >  everyone. I like to PROTECT only when necessary, rather than
>> > > >  indiscriminately. Probably this has no practical consequence in
>> > > >  terms of
>> > > >  performance, making the code a little easier to read at the expense 
>> > > >  of
>> > > >  exposing me to bugs like this.
>> > > > 
>> > > 
>> > >  I guess it's a tradeoff between syntactic complexity and logical
>> > >  complexity. You have to think pretty hard to minimize use of the
>> > >  protect stack.
>> > 
>> >  I prefer to call it logical obscurity ;-)
>> > 
>> >  The hard thinking consists in assessing whether or not the code between
>> >  the line where a new SEXP is allocated (line X) and the line where
>> >  it's put in a safe place (line Y) can trigger garbage collection.
>> >  Hard to figure out in general indeed, but not impossible! Problem
>> >  is that the result of this assessment is valid at a certain point
>> >  in time but might change in the future, even if your code has not
>> >  changed.
>> > 
>> >  So a dangerous game for virtually zero benefits.
>> > 
>> > > 
>> > >  One recommendation might be to UNPROTECT() as soon as the pointer on
>> > >  the top is unneeded, rather than trying to figure out the number to
>> > >  pop just before returning to R.
>> > 
>> >  If you PROTECT() in a loop, you definitely want to do that. Otherwise,
>> >  does it make a big difference?
>> > 
>> > > 
>> > >  One thing that got me is that the order in which C evaluates function
>> > >  call arguments is undefined. I did a lot of R_setAttrib(x,
>> > >  install("foo"), allocBar()), thinking that the symbol would be
>> > >  automatically protected, and allocBar() would not need protection,
>> > >  since it happened last. Unfortunately, it might be evaluated first.
>> > 
>> >  I got hit by this too long time ago but with defineVar() instead of
>> >  R_setAttrib():
>> > 
>> >   https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_pipermail_r-2Ddevel_2008-2DJanuary_048040.html&d=DwID-g&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=FscW1HcPCwUqtMwKVFDfd1NyW0oHh0tJOPdFb3C1IWk&s=O3CcB-Z_OkVKaC1aV0aIc5SCDNqGQrkvGSmPf0TRAsw&e=
>> > 
>> >  H.
>> > 
>> > > 
>> > >  Btw, I think my package RGtk2 got the record: 1952 errors. Luckily
>> > >  almost all of them happened inside a few macros and autogenerated
>> > >  code.
>> > > 
>> > > >  Martin
>> > > > 
>> > > > > 
>> > > > >  Cheers,
>> > > > > 
>> > > > >  Aaron
>> > > > >  _______________________________________________
>> > > > >  Bioc-devel at r-project.org mailing list
>> > > > >  https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_bioc-2Ddevel&d=DwICAg&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=su20YzStywMa_pdWblzF0RnK8ATRw-t61lOIOsi0xTU&s=JhBOw1ac5wXfV1BSjFuidxFiBTx43J7iEvZG4G0_0uU&e=
>> > > > > 
>> > > > > 
>> > > > 
>> > > > 
>> > > >  This email message may contain legally privileged
>> > > >  and/or...{{dropped:2}}
>> > > > 
>> > > >  _______________________________________________
>> > > >  Bioc-devel at r-project.org mailing list
>> > > >  https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_bioc-2Ddevel&d=DwICAg&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=su20YzStywMa_pdWblzF0RnK8ATRw-t61lOIOsi0xTU&s=JhBOw1ac5wXfV1BSjFuidxFiBTx43J7iEvZG4G0_0uU&e=
>> > > > 
>> > > 
>> > >  _______________________________________________
>> > >  Bioc-devel at r-project.org mailing list
>> > >  https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_bioc-2Ddevel&d=DwICAg&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=su20YzStywMa_pdWblzF0RnK8ATRw-t61lOIOsi0xTU&s=JhBOw1ac5wXfV1BSjFuidxFiBTx43J7iEvZG4G0_0uU&e=
>> > > 
>> > > 
>> > 
>> > 
>> 
>
>

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney at uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu


More information about the Bioc-devel mailing list