[Rd] [External] Mitigating Stalls Caused by Call Deparse on Error

brodie gaslam brod|e@g@@|@m @end|ng |rom y@hoo@com
Sun Jul 14 16:32:59 CEST 2019


Luke, thanks for considering the issue.  I would like to
try to separate the problem into two parts, as I _think_ 
your comments address primarily part 2 below:

1. How can we avoid significant and possibly crippling
   stalls on error with these non-standard calls.
2. What is the best way to view these non-standard calls.

I agree that issue 2. requires further thought and 
discussion under a wishlist issue ([on bugzilla now][1]).  
While I did raise issue 2., the patch itself makes no
attempt to resolve it.

The proposed patch resolves issue 1., which is a big
 usability problem.  Right now if you have the misfortune of
 using `do.call` with a big object and trigger an error, you 
have the choice of waiting a possibly long time for 
the deparse to complete, or killing your entire R session
 externally.

It seems a shame to allow a big usability issue for `do.call` 
to remain when there is a simple solution at hand, especially
since the complete deparse of large objects likely serves no
purpose in this case. Obviously, if storing the actual calls
 instead of their deparsed equivalents in .Traceback causes 
problems I'm not anticipating, then that's different.  
Is that the case?

Best,

Brodie.

[1]: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17580

On Sunday, July 14, 2019, 8:52:45 AM EDT, Tierney, Luke <luke-tierney using uiowa.edu> wrote: 





This is probably best viewed in the context of other issue with
displaying calls, such as issues arising from calls constructed in
non-standard evaluation contexts. Might be good to move to a wishlist
item in bugzilla.

Best,

luke

On Sat, 13 Jul 2019, brodie gaslam via R-devel wrote:

> When large calls cause errors R may stall for extended periods.  This
> is particularly likely to happen with `do.call`, as in this example
> with a 24 second stall:
>
>     x <- runif(1e7)
>     system.time(do.call(paste0, list(abs, x)))  # intentional error
>     ## Error in (function (..., collapse = NULL)  :
>     ##   cannot coerce type 'builtin' to vector of type 'character'
>     ## Calls: system.time -> do.call -> <Anonymous>
>     ## Timing stopped at: 23.81 0.149 24.04
>
>     str(.Traceback)
>     ## Dotted pair list of 3
>     ##  $ : chr [1:2500488] "(function (..., collapse = NULL) " ".Internal(paste0(list(...), collapse)))(.Primitive(\"abs\"), c(0.718117154669017, " "0.494785501621664, 0.1453434410505, 0.635028422810137, 0.0353180423844606, " "0.688418723642826, 0.889682895969599, 0.728154224809259, 0.292572240810841, " ...
>     ##  $ : chr "do.call(paste0, list(abs, x))"
>     ##  $ : chr "system.time(do.call(paste0, list(abs, x)))"
>
> The first time I noticed this I thought my session had frozen/crashed
> as the standard interrupt ^C does not work during the deparse.  The
> stall happens when on error the call stack is deparsed prior to being
> saved to `.Traceback`.  The deparsing is done by `deparse1m` in native
> code, with the value of `getOption('deparse.max.lines')` which
> defaults to all lines.
>
> Since there is little value to seeing millions of lines of deparsed
> objects in `traceback()`, a simple work-around is to change the
> `deparse.max.lines` value:
>
>     options(deparse.max.lines=1)
>     system.time(do.call(paste0, list(abs, x)))
>     ## Error in (function (..., collapse = NULL)  :
>     ##   cannot coerce type 'builtin' to vector of type 'character'
>     ## Calls: system.time -> do.call -> <Anonymous>
>     ## Timing stopped at: 0 0 0
>
> Unfortunately this will affect all `deparse` calls, and it seems
> undesirable to pre-emptively enable it just for calls that might cause
> large deparses on error.
>
> An alternative is to store the actual calls instead of their deparsed
> character equivalents in `.Traceback`.  This defers the deparsing to
> when `traceback()` is used.  As per `?traceback`, it should be
> relatively safe to modify `.Traceback` in this way:
>
>> It is undocumented where .Traceback is stored nor that it is
>> visible, and this is subject to change.
>
> Deferring the deparsing to `traceback()` will give us the 
> opportunity to use a different `max.lines` setting as we do here
> with the patch applied:
>
>     system.time(do.call(paste0, list(abs, x)))
>     ## Error in (function (..., collapse = NULL)  :
>     ##   cannot coerce type 'builtin' to vector of type 'character'
>     ## Timing stopped at: 0.028 0 0.029
>
>     system.time(traceback(max.lines=3))
>     ## 3: (function (..., collapse = NULL)
>     ##    .Internal(paste0(list(...), collapse)))(.Primitive("abs"), c(0.535468587651849,
>     ##    0.0540027911774814, 0.732930393889546, 0.565360915614292, 0.544816034380347,
>     ##     ...
>     ## 2: do.call(paste0, list(abs, x))
>     ## 1: system.time(do.call(paste0, list(abs, x)))
>     ##    user  system elapsed
>     ##   0.000   0.000   0.003
>
>
> More generally, it might be better to have a different smaller default
> value for the lines to deparse when calls  are _displayed_ as parts of
> lists, as is the case with `traceback()`, or in `print(sys.calls())` and
> similar.
>
> I attach a patch that does this.  I have run some basic tests
> and `make check-devel` passes. I can file an issue on bugzilla
> if that is a better place to have this conversation (assuming there
> is interest in it).
>
> Best,
>
> Brodie
>
> PS: for some reason my mail client is refusing to attach the patch so I paste it
> starting on the next line.
> Index: src/gnuwin32/Rdll.hide
> ===================================================================
> --- src/gnuwin32/Rdll.hide    (revision 76827)
> +++ src/gnuwin32/Rdll.hide    (working copy)
> @@ -94,6 +94,7 @@
>   R_GetMaxNSize
>   R_GetMaxVSize
>   R_GetTraceback
> + R_GetTracebackParsed
>   R_GetVarLocSymbol
>   R_GetVarLocValue
>   R_HandlerStack
> Index: src/include/Defn.h
> ===================================================================
> --- src/include/Defn.h    (revision 76827)
> +++ src/include/Defn.h    (working copy)
> @@ -1296,6 +1296,7 @@
>  void NORET ErrorMessage(SEXP, int, ...);
>  void WarningMessage(SEXP, R_WARNING, ...);
>  SEXP R_GetTraceback(int);
> +SEXP R_GetTracebackParsed(int);
>  
>  R_size_t R_GetMaxVSize(void);
>  void R_SetMaxVSize(R_size_t);
> Index: src/library/base/R/traceback.R
> ===================================================================
> --- src/library/base/R/traceback.R    (revision 76827)
> +++ src/library/base/R/traceback.R    (working copy)
> @@ -16,9 +16,19 @@
>  #  A copy of the GNU General Public License is available at
>  #  https://www.R-project.org/Licenses/
>  
> -.traceback <- function(x = NULL) {
> -    if(is.null(x) && !is.null(x <- get0(".Traceback", envir = baseenv())))
> -    {}
> +.traceback <- function(x = NULL, max.lines=getOption("deparse.max.lines")) {
> +    if(!(is.numeric(max.lines) && !is.na(max.lines) &&
> +         as.integer(max.lines) > 0L)
> +    ) {
> +        max.lines <- -1L
> +    }
> +    if(is.null(x) && !is.null(x <- get0(".Traceback", envir = baseenv()))) {
> +        for(i in seq_along(x)) {
> +            srcref <- attr(x[[i]], 'srcref')
> +            x[[i]] <- deparse(x[[i]], nlines=max.lines)
> +            attr(x[[i]], 'srcref') <- srcref
> +        }
> +    }
>      else if (is.numeric(x))
>          x <- .Internal(traceback(x))
>      x
> @@ -26,7 +36,16 @@
>  
>  traceback <- function(x = NULL, max.lines = getOption("deparse.max.lines"))
>  {
> -    n <- length(x <- .traceback(x))
> +    valid.max.lines <- is.numeric(max.lines) && !is.na(max.lines) &&
> +         as.integer(max.lines) > 0L
> +
> +    if(valid.max.lines) {
> +        # max.lines + 1L so we can know that output was truncated by .traceback
> +        max.lines.2 <- as.integer(max.lines) + 1L
> +    } else {
> +        max.lines.2 <- -1L
> +    }
> +    n <- length(x <- .traceback(x, max.lines=max.lines.2))
>      if(n == 0L)
>          cat(gettext("No traceback available"), "\n")
>      else {
> @@ -40,7 +59,7 @@
>                  paste0(" at ", basename(srcfile$filename), "#", srcref[1L])
>              }
>              ## Truncate deparsed code (destroys attributes of xi)
> -            if(is.numeric(max.lines) && max.lines > 0L && max.lines < m) {
> +            if(valid.max.lines &&  max.lines < m) {
>                  xi <- c(xi[seq_len(max.lines)], " ...")
>                  m <- length(xi)
>              }
> Index: src/library/base/man/traceback.Rd
> ===================================================================
> --- src/library/base/man/traceback.Rd    (revision 76827)
> +++ src/library/base/man/traceback.Rd    (working copy)
> @@ -21,7 +21,7 @@
>  }
>  \usage{
>  traceback(x = NULL, max.lines = getOption("deparse.max.lines"))
> -.traceback(x = NULL)
> +.traceback(x = NULL, max.lines = getOption("deparse.max.lines"))
>  }
>  \arguments{
>    \item{x}{\code{NULL} (default, meaning \code{.Traceback}), or an
> Index: src/main/errors.c
> ===================================================================
> --- src/main/errors.c    (revision 76827)
> +++ src/main/errors.c    (working copy)
> @@ -1008,7 +1008,7 @@
>         (which should not happen) */
>      if (traceback && inError < 2 && inError == oldInError) {
>          inError = 2;
> -        PROTECT(s = R_GetTraceback(0));
> +        PROTECT(s = R_GetTracebackParsed(0));
>          SET_SYMVALUE(install(".Traceback"), s);
>          /* should have been defineVar
>             setVar(install(".Traceback"), s, R_GlobalEnv); */
> @@ -1440,9 +1440,11 @@
>      PrintWarnings();
>      }
>  }
> -
> +/*
> + * Return the traceback without deparsing the calls
> + */
>  attribute_hidden
> -SEXP R_GetTraceback(int skip)
> +SEXP R_GetTracebackParsed(int skip)
>  {
>      int nback = 0, ns;
>      RCNTXT *c;
> @@ -1467,7 +1469,9 @@
>          if (skip > 0)
>          skip--;
>          else {
> -        SETCAR(t, deparse1m(c->call, 0, DEFAULTDEPARSE));
> +                // Extra paranoid PROTECTS
> +        SETCAR(t, PROTECT(duplicate(c->call)));
> +                UNPROTECT(1);
>          if (c->srcref && !isNull(c->srcref)) {
>              SEXP sref;
>              if (c->srcref == R_InBCInterpreter)
> @@ -1482,7 +1486,26 @@
>      UNPROTECT(1);
>      return s;
>  }
> +/*
> + * Return the traceback with calls deparsed
> + */
> +attribute_hidden
> +SEXP R_GetTraceback(int skip)
> +{
> +    int nback = 0;
> +    SEXP s, t, u, v;
> +    s = PROTECT(R_GetTracebackParsed(skip));
> +    for(t = s; t != R_NilValue; t = CDR(t)) nback++;
> +    u = v = PROTECT(allocList(nback));
>  
> +    for(t = s; t != R_NilValue; t = CDR(t), v=CDR(v)) {
> +        SETCAR(v, PROTECT(deparse1m(CAR(t), 0, DEFAULTDEPARSE)));
> +        UNPROTECT(1);
> +    }
> +    UNPROTECT(1);
> +    return u;
> +}
> +
>  SEXP attribute_hidden do_traceback(SEXP call, SEXP op, SEXP args, SEXP rho)
>  {
>      int skip;

>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:            319-335-3386
Department of Statistics and        Fax:              319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:  luke-tierney using uiowa.edu
Iowa City, IA 52242                WWW:  http://www.stat.uiowa.edu



More information about the R-devel mailing list