[Rd] Implicit vs explicit printing and the call stack
Prof Brian Ripley
ripley at stats.ox.ac.uk
Sun May 13 13:13:48 CEST 2007
On Sun, 13 May 2007, hadley wickham wrote:
>> First, it was not clear that you are talking about the output of
>> traceback(), which is _a representation of_ the call stack and depends on
>> the details of deparsing.
>
> Given that there is a substantial delay in one case, and not in the
> other, I had assumed (perhaps falsely) that there was something more
> fundamental going on that simply the representation of the call stack
> by traceback. Perhaps I should have been more clear that the output of
> traceback was a symptom of the problem, not what I was really
> interested in.
traceback() does very little apart from show you .Traceback, which is
constructed (as a representation) at the time of the error. My guess is
that the concern is the time it takes to create .Traceback, but see
below.
>> Second, the difference is I believe not implicit vs explicit printing, but
>> of printing an evaluated object vs printing a call. What your subject
>> line would correspond to is
>>
>> x <- ggplot(mtcars, aes(x = cyl, y = -mpg)) + scale_y_log10() +
>> geom_point()
>> x vs print(x)
>>
>> I can't demonstrate (what is aes?), but I expect that would show minimal
>> differences.
>
> I don't think that is the case. Using Gabor's reproducible example:
>
> library(lattice)
>
> x <- xyplot(conc ~ uptake, CO2, xlim = Inf)
> x
> traceback()
> print(x)
> traceback()
>
> shows the same differences that I reported.
It is not the same as regards my comments: your example showed the details
of mtcars, and that appeared to be your concern. That example does not
show the details of CO2.
Part of my point is that if your returned object does not contain the
evaluated dataset (but, say, a promise to it), this will not happen.
>> The issue seems to be whether arguments have been evaluated, and if so
>> whether they were promises. Promises can be deparsed back to the symbolic
>> representation rather than to their value. When you call print explicitly
>> _on the ggplot call_, you have promises to arguments (to support lazy
>> evaluation). As far as I know (this is a detail of your code), the object
>> ggplot returns has evaluated 'mtcars' and contains the value and not the
>> promise, hence the difference in the deparsing of the call stack.
>
> I'm not sure I completely follow, and I don't think your reasoning
> holds given the example above. Please correct me if I am wrong.
It does. In one traceback() you have print.trellis(x), in the other
print.trellis(<the result of get("x")>). This is because in print(x),
print.trellis is passed a promise, and in 'x', 'x' has been evaluated and
print and print.trellis are passed the value (a list).
[ ... ]
Let me try to spell out the mechanisms in so far as I understand them (R
lacks technical documentation):
A) print(x).
The R evaluator sets up an argument list containing a promise to evaluate
'x' in the calling environemnt. It then calls UseMethod("print"), which
needs to evaluate the first argument to find the class, finds the method,
and manipulates the call to be to print.ggplot. At that point the details
get complicated, but when evaluation gets to the body of print.ggplot, the
intention is that it is just as if print.ggplot(x) had been called and so
the first argument is a promise to evaluate 'x' in the calling environment
of print(). (Because the details vary across the internals, I am not sure
without tracing through the code if the promise is a new one or the
evaluated one: in either case it will deparse to 'x'.)
B) 'x' at top-level.
This creates an anonymous object whose value is the result of eval(x) (at
C level). The R evaluator then notices that the R_Visible flag is set and
calls PrintValueEnv on the anonymous object. The latter notices that the
object has the OBJECT bit set and so calls print() on the anonymous
object. At that point the name 'x' is not available and the anonymous
object deparses to its value.
Suppose that printing integers had a bug. Then my understanding is that
print(2L+3L) would have '2L+3L' in the deparsed call, and '2L+3L' would
have '5' (or '5L'). That is intended to illustrate that deparsing to the
call may or may not be more compact than to the value.
>> You chose to call traceback() without 'max.lines', and the answer would
>> seem to be
>>
>> 1) don't use a buggy print method,
>> 2) if you have to (to fix it), use 'max.lines' or do this in two steps.
>>
>> If you want your end users to see a nicer representation, reconsider what
>> you actually return in your object.
>
> I don't really care about the representation in the call stack
> (although it is a pain when debugging, even though max.lines helps),
> the main problem is the 4-5 second delay after an error before control
> returns to the user. This does not occur with an explicit print.
I can't reproduce that of course. It might well be that deparsing the
calls to dump the call stack to .Traceback is the problem. in which case
my '1)' applies. (See the comments about try in ?traceback that will help
you distinguish.)
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
More information about the R-devel
mailing list