[Rd] [External] Mitigating Stalls Caused by Call Deparse on Error

brodie gaslam brod|e@g@@|@m @end|ng |rom y@hoo@com
Wed Jul 17 02:55:56 CEST 2019


As per Luke's instructions I've updated the [wishlist item][1]
 to include the deparse-on-error issue, and also renamed it to
something more appropriate for its broader scope.

I do share Lionel's concern that the deparse-on-error issue
could get unnecessarily delayed in the hopes of finding a more
comprehensive issue for all the problems. I would really appreciate
 it if and when you all have a chance to look at this issue 
you could consider the possibility of resolving the deparse-on-error
 one separately if a global solution seems out of reach.

Thank you for all your work making R better and more
performant.

Best,

Brodie.


[1]: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17580


On Tuesday, July 16, 2019, 4:53:37 AM EDT, Lionel Henry <lionel using rstudio.com> wrote: 





We also have a few other suggestions and wishes about backtrace
storage and display on the one hand, and display of constructed calls
on the other hand. Perhaps it would be better to open a different
wishlist item for traceback() to keep the discussions focused?

FWIW I think deparsing backtraces lazily is a great idea. Displaying 1
line per call by default in interactive sessions, while being able to
get a more exhaustive output by calling `traceback(max.lines = Inf)`,
offers the best of both worlds. Understanding the structure of the
backtrace quickly is often more helpful than having immediate access
to all the information.

Best,
Lionel


> On 15 Jul 2019, at 16:04, Tierney, Luke <luke-tierney using uiowa.edu> wrote:
> 
> Better to add this to the wishlist item. This all needs to be looked
> at together, and nothing is likely to happen until after
> vacation/conference season.  It will disappear from everyone's radar
> if it is just in R_devel.
> 
> Best,
> 
> luke
> 
> On Sun, 14 Jul 2019, brodie gaslam wrote:
> 
>> Luke, thanks for considering the issue.  I would like to
>> try to separate the problem into two parts, as I _think_
>> your comments address primarily part 2 below:
>> 
>> 1. How can we avoid significant and possibly crippling
>>    stalls on error with these non-standard calls.
>> 2. What is the best way to view these non-standard calls.
>> 
>> I agree that issue 2. requires further thought and
>> discussion under a wishlist issue ([on bugzilla now][1]). 
>> While I did raise issue 2., the patch itself makes no
>> attempt to resolve it.
>> 
>> The proposed patch resolves issue 1., which is a big
>> usability problem.  Right now if you have the misfortune of
>> using `do.call` with a big object and trigger an error, you
>> have the choice of waiting a possibly long time for
>> the deparse to complete, or killing your entire R session
>> externally.
>> 
>> It seems a shame to allow a big usability issue for `do.call`
>> to remain when there is a simple solution at hand, especially
>> since the complete deparse of large objects likely serves no
>> purpose in this case. Obviously, if storing the actual calls
>> instead of their deparsed equivalents in .Traceback causes
>> problems I'm not anticipating, then that's different. 
>> Is that the case?
>> 
>> Best,
>> 
>> Brodie.
>> 
>> [1]: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17580
>> 
>> On Sunday, July 14, 2019, 8:52:45 AM EDT, Tierney, Luke <luke-tierney using uiowa.edu> wrote:
>> 
>> 
>> 
>> 
>> 
>> This is probably best viewed in the context of other issue with
>> displaying calls, such as issues arising from calls constructed in
>> non-standard evaluation contexts. Might be good to move to a wishlist
>> item in bugzilla.
>> 
>> Best,
>> 
>> luke
>> 
>> On Sat, 13 Jul 2019, brodie gaslam via R-devel wrote:
>> 
>>> When large calls cause errors R may stall for extended periods.  This
>>> is particularly likely to happen with `do.call`, as in this example
>>> with a 24 second stall:
>>> 
>>>    x <- runif(1e7)
>>>    system.time(do.call(paste0, list(abs, x)))  # intentional error
>>>    ## Error in (function (..., collapse = NULL)  :
>>>    ##  cannot coerce type 'builtin' to vector of type 'character'
>>>    ## Calls: system.time -> do.call -> <Anonymous>
>>>    ## Timing stopped at: 23.81 0.149 24.04
>>> 
>>>    str(.Traceback)
>>>    ## Dotted pair list of 3
>>>    ##  $ : chr [1:2500488] "(function (..., collapse = NULL) " ".Internal(paste0(list(...), collapse)))(.Primitive(\"abs\"), c(0.718117154669017, " "0.494785501621664, 0.1453434410505, 0.635028422810137, 0.0353180423844606, " "0.688418723642826, 0.889682895969599, 0.728154224809259, 0.292572240810841, " ...
>>>    ##  $ : chr "do.call(paste0, list(abs, x))"
>>>    ##  $ : chr "system.time(do.call(paste0, list(abs, x)))"
>>> 
>>> The first time I noticed this I thought my session had frozen/crashed
>>> as the standard interrupt ^C does not work during the deparse.  The
>>> stall happens when on error the call stack is deparsed prior to being
>>> saved to `.Traceback`.  The deparsing is done by `deparse1m` in native
>>> code, with the value of `getOption('deparse.max.lines')` which
>>> defaults to all lines.
>>> 
>>> Since there is little value to seeing millions of lines of deparsed
>>> objects in `traceback()`, a simple work-around is to change the
>>> `deparse.max.lines` value:
>>> 
>>>    options(deparse.max.lines=1)
>>>    system.time(do.call(paste0, list(abs, x)))
>>>    ## Error in (function (..., collapse = NULL)  :
>>>    ##  cannot coerce type 'builtin' to vector of type 'character'
>>>    ## Calls: system.time -> do.call -> <Anonymous>
>>>    ## Timing stopped at: 0 0 0
>>> 
>>> Unfortunately this will affect all `deparse` calls, and it seems
>>> undesirable to pre-emptively enable it just for calls that might cause
>>> large deparses on error.
>>> 
>>> An alternative is to store the actual calls instead of their deparsed
>>> character equivalents in `.Traceback`.  This defers the deparsing to
>>> when `traceback()` is used.  As per `?traceback`, it should be
>>> relatively safe to modify `.Traceback` in this way:
>>> 
>>>> It is undocumented where .Traceback is stored nor that it is
>>>> visible, and this is subject to change.
>>> 
>>> Deferring the deparsing to `traceback()` will give us the 
>>> opportunity to use a different `max.lines` setting as we do here
>>> with the patch applied:
>>> 
>>>    system.time(do.call(paste0, list(abs, x)))
>>>    ## Error in (function (..., collapse = NULL)  :
>>>    ##  cannot coerce type 'builtin' to vector of type 'character'
>>>    ## Timing stopped at: 0.028 0 0.029
>>> 
>>>    system.time(traceback(max.lines=3))
>>>    ## 3: (function (..., collapse = NULL)
>>>    ##    .Internal(paste0(list(...), collapse)))(.Primitive("abs"), c(0.535468587651849,
>>>    ##    0.0540027911774814, 0.732930393889546, 0.565360915614292, 0.544816034380347,
>>>    ##    ...
>>>    ## 2: do.call(paste0, list(abs, x))
>>>    ## 1: system.time(do.call(paste0, list(abs, x)))
>>>    ##    user  system elapsed
>>>    ##  0.000  0.000  0.003
>>> 
>>> 
>>> More generally, it might be better to have a different smaller default
>>> value for the lines to deparse when calls  are _displayed_ as parts of
>>> lists, as is the case with `traceback()`, or in `print(sys.calls())` and
>>> similar.
>>> 
>>> I attach a patch that does this.  I have run some basic tests
>>> and `make check-devel` passes. I can file an issue on bugzilla
>>> if that is a better place to have this conversation (assuming there
>>> is interest in it).
>>> 
>>> Best,
>>> 
>>> Brodie
>>> 
>>> PS: for some reason my mail client is refusing to attach the patch so I paste it
>>> starting on the next line.
>>> Index: src/gnuwin32/Rdll.hide
>>> ===================================================================
>>> --- src/gnuwin32/Rdll.hide    (revision 76827)
>>> +++ src/gnuwin32/Rdll.hide    (working copy)
>>> @@ -94,6 +94,7 @@
>>>  R_GetMaxNSize
>>>  R_GetMaxVSize
>>>  R_GetTraceback
>>> + R_GetTracebackParsed
>>>  R_GetVarLocSymbol
>>>  R_GetVarLocValue
>>>  R_HandlerStack
>>> Index: src/include/Defn.h
>>> ===================================================================
>>> --- src/include/Defn.h    (revision 76827)
>>> +++ src/include/Defn.h    (working copy)
>>> @@ -1296,6 +1296,7 @@
>>>  void NORET ErrorMessage(SEXP, int, ...);
>>>  void WarningMessage(SEXP, R_WARNING, ...);
>>>  SEXP R_GetTraceback(int);
>>> +SEXP R_GetTracebackParsed(int);
>>>  
>>>  R_size_t R_GetMaxVSize(void);
>>>  void R_SetMaxVSize(R_size_t);
>>> Index: src/library/base/R/traceback.R
>>> ===================================================================
>>> --- src/library/base/R/traceback.R    (revision 76827)
>>> +++ src/library/base/R/traceback.R    (working copy)
>>> @@ -16,9 +16,19 @@
>>>  #  A copy of the GNU General Public License is available at
>>>  #  https://www.R-project.org/Licenses/
>>>  
>>> -.traceback <- function(x = NULL) {
>>> -    if(is.null(x) && !is.null(x <- get0(".Traceback", envir = baseenv())))
>>> -    {}
>>> +.traceback <- function(x = NULL, max.lines=getOption("deparse.max.lines")) {
>>> +    if(!(is.numeric(max.lines) && !is.na(max.lines) &&
>>> +        as.integer(max.lines) > 0L)
>>> +    ) {
>>> +        max.lines <- -1L
>>> +    }
>>> +    if(is.null(x) && !is.null(x <- get0(".Traceback", envir = baseenv()))) {
>>> +        for(i in seq_along(x)) {
>>> +            srcref <- attr(x[[i]], 'srcref')
>>> +            x[[i]] <- deparse(x[[i]], nlines=max.lines)
>>> +            attr(x[[i]], 'srcref') <- srcref
>>> +        }
>>> +    }
>>>      else if (is.numeric(x))
>>>          x <- .Internal(traceback(x))
>>>      x
>>> @@ -26,7 +36,16 @@
>>>  
>>>  traceback <- function(x = NULL, max.lines = getOption("deparse.max.lines"))
>>>  {
>>> -    n <- length(x <- .traceback(x))
>>> +    valid.max.lines <- is.numeric(max.lines) && !is.na(max.lines) &&
>>> +        as.integer(max.lines) > 0L
>>> +
>>> +    if(valid.max.lines) {
>>> +        # max.lines + 1L so we can know that output was truncated by .traceback
>>> +        max.lines.2 <- as.integer(max.lines) + 1L
>>> +    } else {
>>> +        max.lines.2 <- -1L
>>> +    }
>>> +    n <- length(x <- .traceback(x, max.lines=max.lines.2))
>>>      if(n == 0L)
>>>          cat(gettext("No traceback available"), "\n")
>>>      else {
>>> @@ -40,7 +59,7 @@
>>>                  paste0(" at ", basename(srcfile$filename), "#", srcref[1L])
>>>              }
>>>              ## Truncate deparsed code (destroys attributes of xi)
>>> -            if(is.numeric(max.lines) && max.lines > 0L && max.lines < m) {
>>> +            if(valid.max.lines &&  max.lines < m) {
>>>                  xi <- c(xi[seq_len(max.lines)], " ...")
>>>                  m <- length(xi)
>>>              }
>>> Index: src/library/base/man/traceback.Rd
>>> ===================================================================
>>> --- src/library/base/man/traceback.Rd    (revision 76827)
>>> +++ src/library/base/man/traceback.Rd    (working copy)
>>> @@ -21,7 +21,7 @@
>>>  }
>>>  \usage{
>>>  traceback(x = NULL, max.lines = getOption("deparse.max.lines"))
>>> -.traceback(x = NULL)
>>> +.traceback(x = NULL, max.lines = getOption("deparse.max.lines"))
>>>  }
>>>  \arguments{
>>>    \item{x}{\code{NULL} (default, meaning \code{.Traceback}), or an
>>> Index: src/main/errors.c
>>> ===================================================================
>>> --- src/main/errors.c    (revision 76827)
>>> +++ src/main/errors.c    (working copy)
>>> @@ -1008,7 +1008,7 @@
>>>        (which should not happen) */
>>>      if (traceback && inError < 2 && inError == oldInError) {
>>>          inError = 2;
>>> -        PROTECT(s = R_GetTraceback(0));
>>> +        PROTECT(s = R_GetTracebackParsed(0));
>>>          SET_SYMVALUE(install(".Traceback"), s);
>>>          /* should have been defineVar
>>>            setVar(install(".Traceback"), s, R_GlobalEnv); */
>>> @@ -1440,9 +1440,11 @@
>>>      PrintWarnings();
>>>      }
>>>  }
>>> -
>>> +/*
>>> + * Return the traceback without deparsing the calls
>>> + */
>>>  attribute_hidden
>>> -SEXP R_GetTraceback(int skip)
>>> +SEXP R_GetTracebackParsed(int skip)
>>>  {
>>>      int nback = 0, ns;
>>>      RCNTXT *c;
>>> @@ -1467,7 +1469,9 @@
>>>          if (skip > 0)
>>>          skip--;
>>>          else {
>>> -        SETCAR(t, deparse1m(c->call, 0, DEFAULTDEPARSE));
>>> +                // Extra paranoid PROTECTS
>>> +        SETCAR(t, PROTECT(duplicate(c->call)));
>>> +                UNPROTECT(1);
>>>          if (c->srcref && !isNull(c->srcref)) {
>>>              SEXP sref;
>>>              if (c->srcref == R_InBCInterpreter)
>>> @@ -1482,7 +1486,26 @@
>>>      UNPROTECT(1);
>>>      return s;
>>>  }
>>> +/*
>>> + * Return the traceback with calls deparsed
>>> + */
>>> +attribute_hidden
>>> +SEXP R_GetTraceback(int skip)
>>> +{
>>> +    int nback = 0;
>>> +    SEXP s, t, u, v;
>>> +    s = PROTECT(R_GetTracebackParsed(skip));
>>> +    for(t = s; t != R_NilValue; t = CDR(t)) nback++;
>>> +    u = v = PROTECT(allocList(nback));
>>>  
>>> +    for(t = s; t != R_NilValue; t = CDR(t), v=CDR(v)) {
>>> +        SETCAR(v, PROTECT(deparse1m(CAR(t), 0, DEFAULTDEPARSE)));
>>> +        UNPROTECT(1);
>>> +    }
>>> +    UNPROTECT(1);
>>> +    return u;
>>> +}
>>> +
>>>  SEXP attribute_hidden do_traceback(SEXP call, SEXP op, SEXP args, SEXP rho)
>>>  {
>>>      int skip;
>> 
>>> 
>>> ______________________________________________
>>> R-devel using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>> 
>> 
>> 
> 
> -- 
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone:            319-335-3386
> Department of Statistics and        Fax:              319-335-3017
>    Actuarial Science
> 241 Schaeffer Hall                  email:  luke-tierney using uiowa.edu
> Iowa City, IA 52242                WWW:  http://www.stat.uiowa.edu

> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



More information about the R-devel mailing list