[Rd] summary.default rounding on numeric seems inconsistent with other R behaviors

Simone Giannerini sgiannerini at gmail.com
Fri Aug 19 18:24:07 CEST 2016


John,

I raised this matter ten years ago and was told that the topic was
already very^3 old:

https://stat.ethz.ch/pipermail/r-devel/2006-September/042684.html

That thread has some discussion of its origin and also a declaration of
intent to change the default behaviour, which, unfortunately, remained a
declaration. I agree that R could do better here; let's hope it takes less
than ten years this time. ;-)
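
For anyone following along, here is a minimal sketch of the effect and of a
workaround. It reproduces the rounding with signif() directly instead of
calling summary(), so it should not depend on the R version installed. It
assumes four significant digits (getOption("digits") - 3L with "digits" at
its default of 7), and raw_summary is a made-up helper name, not part of R.

```r
# Illustration only: reproduce the rounding with signif() rather than calling
# summary(), so the result does not depend on the installed R version.
x <- 15555

# Assumption: four significant digits, i.e. getOption("digits") - 3L with
# the "digits" option at its default of 7.
rounded <- signif(x, 4)
print(rounded)

# Hypothetical helper (not part of R): the same six statistics, unrounded.
raw_summary <- function(v) {
  q <- stats::quantile(v, names = FALSE)
  c("Min." = q[1], "1st Qu." = q[2], "Median" = q[3],
    "Mean" = mean(v), "3rd Qu." = q[4], "Max." = q[5])
}

s <- raw_summary(x)
print(s[["Min."]] == min(x))
```

Rounding only at print time, for example with format(s, digits = 4), would
keep s usable for further calculation.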

Kind regards,

Simone

On Fri, Aug 19, 2016 at 5:04 PM, John Mount <jmount at win-vector.com> wrote:

> I was wondering if it would make sense to change the default behavior of
> the following:
>
> summary(15555L)
> ##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
> ##   15560   15560   15560   15560   15560   15560
>
> summary.default on numeric values rounds the values themselves (not just
> their presentation) to getOption("digits")-3L digits (four by default),
> making those values surprising and less suitable for further calculation.
> The summary methods for matrix and data.frame do not do so.
>
> It seems it would be nice to have x=15555L; summary(x)[['Min.']] == min(x)
> evaluate to TRUE.  I know one can alter behavior by changing the global
> “digits” option, but I don’t know what other impacts that might have.
> Ideally I would think summary.default would not round its values at all,
> but use digits to control presentation (by overriding print and such).
> Even in presentation the rounding without switching to scientific notation
> (such as 1.556e+4) is a bit surprising (I understand rounding and
> scientific notation are two different presentation issues, but new users
> are very confused that something that appears to be an integer has been
> rounded).
>
> Example:
>
> summary(data.frame(x=15555))
> ##        x
> ##  Min.   :15555
> ##  1st Qu.:15555
> ##  Median :15555
> ##  Mean   :15555
> ##  3rd Qu.:15555
> ##  Max.   :15555
> summary(15555)
> ##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
> ##   15560   15560   15560   15560   15560   15560
>
> I have a (bit whiny) polemic trying to explain the pain point here
> http://www.win-vector.com/blog/2016/08/my-criticism-of-r-numeric-summary/
> (I am not trying to be rude, more I am trying to emphasize why this can be
> confusing to new users).
>
>
>
> ---------------
> John Mount
> http://www.win-vector.com/
> Our book: Practical Data Science with R http://www.manning.com/zumel/
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel




-- 
___________________________________________________

Simone Giannerini
Dipartimento di Scienze Statistiche "Paolo Fortunati"
Universita' di Bologna
Via delle belle arti 41 - 40126  Bologna,  ITALY
Tel: +39 051 2098262  Fax: +39 051 232153
http://www2.stat.unibo.it/giannerini/
___________________________________________________
