[Rd] Implicit vs explicit printing and the call stack

Sun May 13 13:13:48 CEST 2007

On Sun, 13 May 2007, hadley wickham wrote:

>> First, it was not clear that you are talking about the output of
>> traceback(), which is _a representation of_ the call stack and depends on
>> the details of deparsing.
>
> Given that there is a substantial delay in one case, and not in the
> other, I had assumed (perhaps falsely) that there was something more
> fundamental going on than simply the representation of the call stack
> by traceback. Perhaps I should have been more clear that the output of
> traceback was a symptom of the problem, not what I was really
> interested in.

traceback() does very little apart from show you .Traceback, which is 
constructed (as a representation) at the time of the error.  My guess is 
that the concern is the time it takes to create .Traceback, but see
below.

>> Second, the difference is I believe not implicit vs explicit printing, but
>> of printing an evaluated object vs printing a call.  What your subject
>> line would correspond to is
>> 
>> x <- ggplot(mtcars, aes(x = cyl, y = -mpg)) + scale_y_log10() + 
>> geom_point()
>> x vs print(x)
>> 
>> I can't demonstrate (what is aes?), but I expect that would show minimal
>> differences.
>
> I don't think that is the case.  Using Gabor's reproducible example:
>
> library(lattice)
>
> x <- xyplot(conc ~ uptake, CO2, xlim = Inf)
> x
> traceback()
> print(x)
> traceback()
>
> shows the same differences that I reported.

It is not the same as regards my comments: your example showed the details 
of mtcars, and that appeared to be your concern.  That example does not 
show the details of CO2.

Part of my point is that if your returned object does not contain the 
evaluated dataset (but, say, a promise to it), this will not happen.
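A minimal sketch of that point, assuming two hypothetical constructors (eager_spec and lazy_spec are illustrative names, not the ggplot or lattice code). One stores the evaluated data set in the returned object; the other stores only the unevaluated expression plus the environment in which to evaluate it, so deparsing the object does not spell out the data:

```r
# Hypothetical constructors for illustration only.
eager_spec <- function(data) {
  # Stores the evaluated data set inside the object.
  structure(list(data = data), class = "plot_spec")
}

lazy_spec <- function(data) {
  # Stores the unevaluated argument expression and the caller's environment.
  structure(list(data_expr = substitute(data), env = parent.frame()),
            class = "plot_spec")
}

# Recover the data from a lazy object only when it is needed.
lazy_data <- function(spec) eval(spec$data_expr, spec$env)

# Total number of characters produced by deparsing an object.
deparsed_size <- function(obj) sum(nchar(deparse(obj)))

n_eager <- deparsed_size(eager_spec(mtcars))
n_lazy  <- deparsed_size(lazy_spec(mtcars))

n_lazy < n_eager                          # the lazy object deparses to far less text
identical(lazy_data(lazy_spec(mtcars)), mtcars)
```

The trade-off is that the lazy form depends on the data still being reachable (and unchanged) in that environment when it is finally evaluated.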

>> The issue seems to be whether arguments have been evaluated, and if so
>> whether they were promises.  Promises can be deparsed back to the symbolic
>> representation rather than to their value.  When you call print explicitly
>> _on the ggplot call_, you have promises to arguments (to support lazy
>> evaluation).  As far as I know (this is a detail of your code), the object
>> ggplot returns has evaluated 'mtcars' and contains the value and not the
>> promise, hence the difference in the deparsing of the call stack.
>
> I'm not sure I completely follow, and I don't think your reasoning
> holds given the example above. Please correct me if I am wrong.

It does.  In one traceback() you have print.trellis(x), in the other 
print.trellis(<the result of get("x")>).  This is because in print(x), 
print.trellis is passed a promise, and in 'x', 'x' has been evaluated and 
print and print.trellis are passed the value (a list).

[ ... ]

Let me try to spell out the mechanisms in so far as I understand them (R 
lacks technical documentation):

A) print(x).

The R evaluator sets up an argument list containing a promise to evaluate 
'x' in the calling environment. It then calls UseMethod("print"), which 
needs to evaluate the first argument to find the class, finds the method, 
and manipulates the call to be to print.ggplot.  At that point the details 
get complicated, but when evaluation gets to the body of print.ggplot, the 
intention is that it is just as if print.ggplot(x) had been called and so 
the first argument is a promise to evaluate 'x' in the calling environment 
of print().  (Because the details vary across the internals, I am not sure 
without tracing through the code if the promise is a new one or the 
evaluated one: in either case it will deparse to 'x'.)

B) 'x' at top-level.

This creates an anonymous object whose value is the result of eval(x) (at 
C level).  The R evaluator then notices that the R_Visible flag is set and 
calls PrintValueEnv on the anonymous object.  The latter notices that the 
object has the OBJECT bit set and so calls print() on the anonymous 
object. At that point the name 'x' is not available and the anonymous 
object deparses to its value.
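The difference between A) and B) can be imitated at R level. This is only a sketch: show_call is a hypothetical helper, and do.call() is used here as a stand-in for a call constructed from a value rather than from a symbol; it is not what the evaluator does at C level.

```r
# Hypothetical helper: returns the deparsed call it was invoked with.
show_call <- function(x) deparse(sys.call())

x <- c(1, 2, 3)

# Like A): the argument in the call is the symbol 'x'.
by_symbol <- show_call(x)

# Like B): the call is built from the value, so the name 'x' is not in it.
by_value <- do.call("show_call", list(x))

by_symbol   # "show_call(x)"
by_value    # "show_call(c(1, 2, 3))"

# With a data frame, the call built from the value deparses to much more text.
big <- mtcars
n_symbol <- sum(nchar(show_call(big)))
n_value  <- sum(nchar(do.call("show_call", list(big))))
n_value > n_symbol
```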

Suppose that printing integers had a bug.  Then my understanding is that 
print(2L+3L) would have '2L+3L' in the deparsed call, whereas '2L+3L' typed 
at top level would have '5' (or '5L').  That is intended to illustrate that 
deparsing to the call may or may not be more compact than to the value.
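The two deparsed forms in that illustration can be checked directly. In this sketch arg_text is a hypothetical helper that deparses its argument as written, without evaluating it:

```r
# Hypothetical helper: deparse the argument expression, not its value.
arg_text <- function(x) deparse(substitute(x))

as_call  <- arg_text(2L + 3L)   # the expression as written
as_value <- deparse(2L + 3L)    # the evaluated result

as_call    # "2L + 3L"
as_value   # "5L"
```

Here the value is the shorter of the two; for a symbol bound to a large data frame it is the other way round.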

>> You chose to call traceback() without 'max.lines', and the answer would
>> seem to be
>> 
>> 1) don't use a buggy print method,
>> 2) if you have to (to fix it), use 'max.lines' or do this in two steps.
>> 
>> If you want your end users to see a nicer representation, reconsider what
>> you actually return in your object.
>
> I don't really care about the representation in the call stack
> (although it is a pain when debugging, even though max.lines helps),
> the main problem is the 4-5 second delay after an error before control
> returns to the user.  This does not occur with an explicit print.

I can't reproduce that of course.  It might well be that deparsing the 
calls to dump the call stack to .Traceback is the problem, in which case 
my '1)' applies.  (See the comments about try in ?traceback: they will help 
you distinguish.)
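One way to make that distinction, sketched with a hypothetical class 'buggy' whose print method always fails. As ?traceback notes, errors caught via try do not generate a traceback, so if the delay disappears under try() then constructing .Traceback is the likely cost. The timings themselves will depend on the object and the machine.

```r
# Hypothetical print method that always fails, standing in for a buggy one.
print.buggy <- function(x, ...) stop("deliberate failure in print.buggy")

obj <- structure(list(data = mtcars), class = "buggy")

# The error is caught by try(), so no traceback is recorded for it.
timing <- system.time(res <- try(print(obj), silent = TRUE))
inherits(res, "try-error")
timing[["elapsed"]]

# For comparison, in an interactive session:
#   obj                        # implicit printing, uncaught error
#   traceback(max.lines = 3)   # show at most 3 lines per deparsed call
```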

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595