[Rd] Missing objects using dump.frames for post-mortem debugging of crashed batch jobs. Bug or gap in documentation?

Martin Maechler maechler at stat.math.ethz.ch
Tue Nov 15 09:24:15 CET 2016


>>>>> nospam at altfeld-im de <nospam at altfeld-im.de>
>>>>>     on Tue, 15 Nov 2016 01:15:46 +0100 writes:

> Martin, thanks for the good news and sorry for wasting your (and others
> time) by not doing my homework and query bugzilla first (lesson learned!
> ).
> 
> I have tested the new implementation from R-devel and observe a semantic
> difference when playing with the parameters:
> 
>   # Test script 1
>   g <- "global"
>   f <- function(p) {
>     l <- "local"
>     dump.frames()
>   }
>   f("parameter")
> 
> results in
>   # > debugger()
>   # Message:  object 'server' not foundAvailable environments had calls:
>   # 1: source("~/.active-rstudio-document", echo = TRUE)
>   # 2: withVisible(eval(ei, envir))
>   # 3: eval(ei, envir)
>   # 4: eval(expr, envir, enclos)
>   # 5: .active-rstudio-document#9: f("parameter")
>   # 
>   # Enter an environment number, or 0 to exit  
>   # Selection: 5
>   # Browsing in the environment with call:
>   #   .active-rstudio-document#9: f("parameter")
>   # Called from: debugger.look(ind)
>   # Browse[1]> g
>   # [1] "global"
>   # Browse[1]> 
> 
> while dumping to a file
> 
>   # Test script 2
>   g <- "global"
>   f <- function(p) {
>     l <- "local"
>     dump.frames(to.file = TRUE, include.GlobalEnv = TRUE)
>   }
>   f("parameter")
> 
> results in
>   # > load("last.dump.rda")
>   # > debugger()

>   # Message:  object 'server' not foundAvailable environments had calls:
>   # 1: .GlobalEnv
>   # 2: source("~/.active-rstudio-document", echo = TRUE)
>   # 3: withVisible(eval(ei, envir))
>   # 4: eval(ei, envir)
>   # 5: eval(expr, envir, enclos)
>   # 6: .active-rstudio-document#11: f("parameter")
>   # 
>   # Enter an environment number, or 0 to exit  
>   # Selection: 6
>   # Browsing in the environment with call:
>   #   .active-rstudio-document#11: f("parameter")
>   # Called from: debugger.look(ind)
>   # Browse[1]> g
>   # Error: object 'g' not found
>   # Browse[1]> 

Your call to f() and the corresponding dump is heavily
obfuscated by all the wrap paper that Rstudio seems to wrap around a
simple function call (or just around using debugger() ?).

All this was to get the correct environments when things are run
in a batch job... and there's no Rstudio gift wrapping in that case.

In my simple use of the above, "g" is clearly available in the .GlobalEnv
component of last.dump :

> exists("g", last.dump$.GlobalEnv)
[1] TRUE
> get("g", last.dump$.GlobalEnv)
[1] "global"
> 

and that's all what's promised, right?
In such a post mortem debugging, notably from a batch job (!),
you don't want your .GlobalEnv to be *replaced* by the
.GlobalEnv from 'last.dump', do you?

I think in the end, I think you are indirectly asking for new features to be
added to  debugger(), namely that it works more seemlessly with
a last.dump object that has been created via 'include.GlobalEnv = TRUE'.

This wish for a new feature may be a very sensible wish.
I think it's fine if you add it as wish (for a new feature to
debugger()) to the R bugzilla site
( https://bugs.r-project.org/ -- after asking one of R core to
  add you to the list of "registered ones" there, see the
  boldface note in https://www.r-project.org/bugs.html )

Personally, I would only look into this issue if we also get a patch
proposal (see also https://www.r-project.org/bugs.html), because
already now you can easily get to "g" in your example.

Martin

> The semantic difference is that the global variable "g" is visible
> within the function "f" in the first version, but not in the second
> version.
> 
> If I dump to a file and load and debug it then the search path through
> the
> frames is not the same during run time vs. debug time.
> 
> An implementation with the same semantics could be achieved
> by applying this workaround currently:
> 
>   dump.frames()
>   save.image(file = "last.dump.rda")
> 
> Does it possibly make sense to unify the semantics?
> 
> THX!
> 
> 
> On Mon, 2016-11-14 at 11:34 +0100, Martin Maechler wrote:
> > >>>>> nospam at altfeld-im de <nospam at altfeld-im.de>
> > >>>>>     on Sun, 13 Nov 2016 13:11:38 +0100 writes:
> > 
> >     > Dear R friends, to allow post-mortem debugging In my
> >     > Rscript based batch jobs I use
> > 
> >     >    tryCatch( <R expression>, error = function(e) {
> >     > dump.frames(to.file = TRUE) })
> > 
> >     > to write the called frames into a dump file.
> > 
> >     > This is similar to the method recommended in the "Writing
> >     > R extensions" manual in section 4.2 Debugging R code (page
> >     > 96):
> > 
> >     > https://cran.r-project.org/doc/manuals/R-exts.pdf
> > 
> >     >> options(error = quote({dump.frames(to.file=TRUE); q()}))
> > 
> > 
> > 
> >     > When I load the dump later in a new R session to examine
> >     > the error I use
> > 
> >     >     load(file = "last.dump.rda") debugger(last.dump)
> > 
> >     > My problem is that the global objects in the workspace are
> >     > NOT contained in the dump since "dump.frames" does not
> >     > save the workspace.
> > 
> >     > This makes debugging difficult.
> > 
> > 
> > 
> >     > For more details see the stackoverflow question + answer
> >     > in:
> >     > https://stackoverflow.com/questions/40421552/r-how-make-dump-frames-include-all-variables-for-later-post-mortem-debugging/40431711#40431711
> > 
> > 
> > 
> >     > I think the reason of the problem is:
> >     > ------------------------------------
> > 
> >     > If you use dump.files(to.file = FALSE) in an interactive
> >     > session debugging works as expected because it creates a
> >     > global variable called "last.dump" and the workspace is
> >     > still loaded.
> > 
> >     > In the batch job scenario however the workspace is NOT
> >     > saved in the dump and therefore lost if you debug the dump
> >     > in a new session.
> > 
> > 
> >     > Options to solve the issue:
> >     > --------------------------
> > 
> >     > 1. Improve the documentation of the R help for
> >     > "dump.frames" and the R_exts manual to propose another
> >     > code snippet for batch job scenarios:
> > 
> >     >       dump.frames() save.image(file = "last.dump.rda")
> > 
> >     > 2. Change the semantics of "dump.frames(to.file = TRUE)"
> >     > to include the workspace in the dump.  This would change
> >     > the semantics implied by the function name but makes the
> >     > semantics consistent for both "to.file" param values.
> > 
> > There is a third option, already in place for three months now:
> > Andreas Kersting did propose it (nicely, as a wish),
> > 	https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17116
> > and I had added it to the development version of R back then :
> > 
> > ------------------------------------------------------------------------
> > r71102 | maechler | 2016-08-16 17:36:10 +0200 (Tue, 16 Aug 2016) | 1 line
> > 
> > dump.frames(*, include.GlobalEnv)
> > ------------------------------------------------------------------------
> > 
> > So, if you (or others) want to use this before next spring,
> > you should install a version of R-devel
> > and you use that, with
> > 
> >   tryCatch( <R expression>,
> >            error = function(e)
> > 	           dump.frames(to.file = TRUE, include.GlobalEnv = TRUE))
> > 
> > Using R-devel is nice and helpful for the R community, as you
> > will help finding bugs/problems in the new features (and
> > possibly changed features) we've introduced there. 
> > 
> > 
> > Best regards,
> > Martin



More information about the R-devel mailing list