[Rd] digits in summary.default
Martin Maechler
maechler at stat.math.ethz.ch
Fri Sep 15 09:52:14 CEST 2006
>>>>> "Simone" == Simone Giannerini <sgiannerini at gmail.com>
>>>>> on Thu, 14 Sep 2006 11:14:51 +0200 writes:
Simone> Dear all, the default number of significant digits in
Simone> summary.default is
Simone> digits = max(3, getOption("digits") - 3)
Simone> On my platform this evaluates to 4. The point is
Simone> that if you have, say, integer data of magnitude
Simone> greater than 10^3, summary() will produce
Simone> heavily rounded results.
Simone> A simple example follows:
>> x <- c(123456,234567,345678)
>> x
Simone> [1] 123456 234567 345678
>> summary(x)
Simone>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
Simone>  123500  179000  234600  234600  290100  345700
Simone> # quite different from
>> quantile(x)
Simone>       0%      25%      50%      75%     100%
Simone> 123456.0 179011.5 234567.0 290122.5 345678.0
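
The rounding in the example above is what one gets by applying signif() with 4 digits to the quantiles and the mean. A minimal sketch, assuming that this is essentially how the summary values were rounded at the time; the helper name rounded_summary is hypothetical and not part of R:

```r
# Hypothetical helper (not part of R): reproduce the rounding seen in the
# summary() output above by applying signif() to the quantiles and the mean.
rounded_summary <- function(x, digits = max(3, getOption("digits") - 3)) {
  qq <- stats::quantile(x, names = FALSE)        # min, quartiles, max
  out <- signif(c(qq[1:3], mean(x), qq[4:5]), digits)
  names(out) <- c("Min.", "1st Qu.", "Median", "Mean", "3rd Qu.", "Max.")
  out
}

x <- c(123456, 234567, 345678)
rounded_summary(x, digits = 4)   # four significant digits, as in the post
rounded_summary(x, digits = 6)   # six significant digits: far less rounding
```

Passing a larger value through the digits argument, for example summary(x, digits = 6), is the existing way to ask for more precision without changing the default.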
Yes, this is a very, very old topic that has come up frequently on the
R lists.
The reason for this default has been compatibility with S
and in particular Splus-3.4 (1996) which used to be a partial
role model for R in its infancy.
However, I now see that Insightful also must have decided that
the old S setting was not satisfactory and that one can and
should do better.
Simone> Is it possible to adapt the number of significant
Simone> digits to the magnitude of the data? The first
Simone> thing that comes into my mind is
Simone> digits = nchar(trunc(max(x))) #
That's a first step in one direction worth considering, yes,
but it needs quite a bit of fixing up before it's usable.
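
To illustrate the kind of fixup involved: the one-liner counts characters, so a minus sign or a value whose character form is in scientific notation would distort the result, and it drops the existing lower bound of 3 digits. The sketch below uses log10() on the finite absolute values instead. The name magnitude_digits and its min_digits parameter are hypothetical, and this is neither the R nor the S-plus implementation:

```r
# Hypothetical helper (not part of R or S-plus): choose enough significant
# digits to show the integer part of the largest finite value in full,
# but never fewer than the current default.
magnitude_digits <- function(x, min_digits = max(3, getOption("digits") - 3)) {
  ax <- abs(x[is.finite(x)])       # ignore NA, NaN and +/-Inf; drop the sign
  ax <- ax[ax >= 1]                # values below 1 have no integer digits
  if (length(ax) == 0L) return(min_digits)
  int_digits <- floor(log10(max(ax))) + 1   # digits before the decimal point
  max(min_digits, int_digits)
}

x <- c(123456, 234567, 345678)
summary(x, digits = magnitude_digits(x))
```

Whether such a data-dependent default is desirable is a separate question, since it changes printed output for existing scripts.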
Since I've now seen the code of summary.default in S-plus 6.2,
I'm not in a good position to propose a code change here ---
unless Insightful ``donates'' their 3 lines of implementation to
R {which I think would be quite fair given the flurry of
things they've recently ported into S-plus 8.x}
Simone> If it is not possible then I think it would be nice
Simone> to mention the issue in the documentation.
The issue is mentioned, but perhaps too tersely.
I agree that I'd also want to change this behavior.
It's definitely too late for R 2.4.0: although this may
seem like a small change,
it can have quite a large effect on the output of many R scripts.
Simone> Thanks for the attention,
Simone> Simone
>> R.version
..............
(does not really matter - here for once)
Martin Maechler, ETH Zurich