[Rd] digits in summary.default

Martin Maechler
Fri Sep 15 09:52:14 CEST 2006

Simone Giannerini
on Thu, 14 Sep 2006 11:14:51 +0200 writes:

    Simone> Dear all, the number of significant digits in
    Simone> summary default is

    Simone> digits = max(3, getOption("digits") - 3)

    Simone> on my platform this evaluates to 4. The point is
    Simone> that if you have, say, integer data of magnitude
    Simone> greater than 10^3 the command summary will produce
    Simone> heavily rounded results.

    Simone>   A simple example follows:

    >> x <- c(123456,234567,345678)

    >> x
    Simone> [1] 123456 234567 345678

    >> summary(x)
    Simone>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    Simone>  123500  179000  234600  234600  290100  345700

    Simone> # quite different from

    >> quantile(x)
    Simone>       0%        25%     50%      75%     100%
    Simone>  123456.0 179011.5 234567.0 290122.5 345678.0

Yes, this is a very, very old topic, and it has come up frequently
on the R lists.
The reason for this default has been compatibility with S
and in particular Splus-3.4 (1996) which used to be a partial
role model for R in its infancy.

However, I now see that Insightful also must have decided that
the old S setting was not satisfactory and that one can and
should do better.
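
For reference, the rounding in the quoted example appears consistent
with applying signif() at the default number of digits. The sketch
below is an illustration only, not the source of summary.default; it
assumes the factory default getOption("digits") of 7, which is
hard-coded so the result does not depend on session options, and the
name default_digits is introduced here purely for the example.

```r
# Sketch: reproduce the rounding seen in the quoted summary(x) output.
x <- c(123456, 234567, 345678)

# Assumption: getOption("digits") is at its factory default of 7.
default_digits <- max(3, 7 - 3)   # evaluates to 4

q <- quantile(x)                      # unrounded quartiles
rounded <- signif(q, default_digits)  # keep 4 significant digits

print(q)
print(rounded)
```

The rounded values agree with the Min., 1st Qu., Median, 3rd Qu. and
Max. entries in the summary(x) output quoted above.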

    Simone> Is it possible to adapt the number of significant
    Simone> digits to the magnitude of the data?  The first
    Simone> thing that comes into my mind is 

    Simone>      digits = nchar(trunc(max(x))) #

that's a first step of one thing to consider, yes,
but does need quite a bit of fixup before it's usable.

Since I've now seen the code of summary.default in S-plus 6.2,
I'm not in a good position to propose a code change here ---
unless Insightful ``donates'' their 3 lines of implementation to
R  {which I think would be quite fair given the flurry of
    things they've recently ported into S-plus 8.x}

    Simone> If it is not possible then I think it would be nice
    Simone> to mention the issue in the documentation.

The issue is mentioned, but perhaps too tersely.

I agree that I'd also want to change this behavior.
It's definitely too late for R 2.4.0, since although this may
seem like a small thing to do,
it can have quite a large effect in many outputs of R scripts.
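
In the meantime, a script that needs more precision can pass the
digits argument of summary() explicitly instead of relying on the
default. A short sketch, using the data from the example above; the
value 7 is just an illustrative choice:

```r
x <- c(123456, 234567, 345678)

# Request 7 significant digits instead of the default
# max(3, getOption("digits") - 3).
s <- summary(x, digits = 7)
print(s)
```

Setting digits explicitly also keeps a script's output stable should
the default ever change.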

    Simone> Thanks for the attention,

    Simone> Simone

    >> R.version
    (does not really matter - here for once)

Martin Maechler, ETH Zurich

