[Rd] summary.default rounding on numeric seems inconsistent with other R behaviors

Martin Maechler maechler at stat.math.ethz.ch
Tue Aug 23 14:33:58 CEST 2016


>>>>> Dirk Eddelbuettel <edd at debian.org>
>>>>>     on Fri, 19 Aug 2016 11:40:05 -0500 writes:

    > It is the old story of defined behaviour and expected outcomes. Hard to
    > change now.

yes...  not impossible though... see below

    > So I would suggest you do something like this in your ~/.Rprofile:

    R> smry <- function(...) summary(..., digits=6)
    R> smry(155555L)
    > Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    > 155555  155555  155555  155555  155555  155555
    R> 

    > Maybe call it Summary() instead.

yes, do use a different name.   There are other such functions, e.g. 'summarize()'.

Simone wrote

> I had raised the matter ten years ago, and I was told that the topic was
> already very^3 old
> 
> https://stat.ethz.ch/pipermail/r-devel/2006-September/042684.html
> 
> there is some discussion on its origin and also a declaration of intents to
> change the default behaviour, which, unfortunately, remained a declaration.
> I agree that R could do better here, let's hope in less than ten years
> though. ;-)

and the 2006 thread he mentions is basically a similar question,
plus a reply by me in which I agreed to some extent that a change was
desirable ... originally we had adhered to the S "standard"
which became the S+ one and at that time I did still have access
to a running instance of S-PLUS 6.2 where I had seen that
Insightful (the company curating and selling S-PLUS)
also had decided to change the ~15 year old S "standard"... and
indeed I was implicitly *asking* for proposals of such a change,
but I think I never saw a (careful) proposal.

In the spirit of probably 99% of other "base R" code, a change
should really *not* round __at all__ in the summary() methods,
but *only* in the print() methods of such summary() results.

OTOH, for back compatibility, if a user does use  summary(.., digits=.)
explicitly, these digits should be 'obeyed' of course.

I think summary(<1-variable>)  could easily, and relatively "back-compatibly"
be changed in the above vein.
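
A minimal sketch of what such a change could look like, purely for
illustration: smry2() and the class "smry2" are hypothetical names, not
base R code, and rounding with signif() when 'digits' is given explicitly
is only one possible reading of "obeying" the digits.

```r
## Hypothetical sketch: smry2() and class "smry2" are NOT part of base R.
smry2 <- function(x, digits = NULL) {
    qq <- stats::quantile(x, names = FALSE)
    val <- c(qq[1:3], mean(x), qq[4:5])
    names(val) <- c("Min.", "1st Qu.", "Median", "Mean", "3rd Qu.", "Max.")
    ## Back compatibility (assumed reading): round only if the user
    ## asked for it explicitly.
    if (!is.null(digits)) val <- signif(val, digits)
    ## Otherwise keep full precision in the returned object.
    structure(val, digits = digits, class = "smry2")
}

## All default "rounding" happens here, at print time only.
print.smry2 <- function(x, digits = attr(x, "digits"), ...) {
    if (is.null(digits)) digits <- max(3L, getOption("digits") - 3L)
    v <- unclass(x)
    attr(v, "digits") <- NULL
    print(v, digits = digits, ...)
    invisible(x)
}

s <- smry2(c(155555, 155556.25))
s            # printed with few digits
unclass(s)   # the stored values are still unrounded
```

With this split, computations on the result see the exact values, and
only the display is affected by the number of digits.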

One "real problem" is the wrong decision (also from S and S-PLUS
times IIRC) to return a "character" matrix for
   summary(<data.frame>, ..)
or summary(<matrix>, ..)
(For a data frame, I think it should return a list() of
 single-variable summary()es, or else a numeric matrix, in
 both cases with a good print() method)

because when you return a character matrix, all the numbers are
already rounded, ... and if we follow the above approach they 
would have to be rounded further... ``the horror''
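
To make the point concrete, a small sketch: the data frame 'df' and the
names 'per_var' and 'num_mat' are made up for illustration, and the two
alternatives shown are only one way of reading the suggestion above.

```r
## Example data (made up for illustration)
df <- data.frame(a = c(155555, 155556.25), b = c(1.23456789, 2.5))

s <- summary(df)
is.character(s)   # TRUE: the cells are already formatted strings

## Alternative 1: a list of single-variable summaries
per_var <- lapply(df, summary)

## Alternative 2: a numeric matrix, one column per variable
## (computed directly here, so nothing is rounded)
num_mat <- sapply(df, function(x) c(quantile(x), Mean = mean(x)))
is.numeric(num_mat)   # TRUE: values remain usable for computation
```

In both alternatives the numbers stay numeric, so any rounding can be
left to a print() method.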

I wonder how much code out there is relying on the internal
structure of  summary(<data.frame>).. because that is the one
part I'd definitely want to change, too.


Martin


