[Rd] summary.default rounding on numeric seems inconsistent with other R behaviors
Simone Giannerini
sgiannerini at gmail.com
Fri Aug 19 18:24:07 CEST 2016
John,
I raised this matter ten years ago and was told that the topic was
already very^3 old:
https://stat.ethz.ch/pipermail/r-devel/2006-September/042684.html
That thread has some discussion of its origin, and also a declaration of
intent to change the default behaviour, which unfortunately remained just a
declaration.
I agree that R could do better here; let's hope it takes less than ten
years this time. ;-)
Kind regards,
Simone
On Fri, Aug 19, 2016 at 5:04 PM, John Mount <jmount at win-vector.com> wrote:
> I was wondering if it would make sense to change the default behavior of
> the following:
>
> summary(15555L)
> ##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
> ##   15560   15560   15560   15560   15560   15560
>
> summary.default on numeric values rounds the values themselves (not just
> their presentation) to getOption("digits") - 3L significant digits (four,
> by default), making those values surprising and less suitable for further
> calculation. The summary methods for matrix and data.frame do not do so.
>
> It seems it would be nice to have x=15555L; summary(x)[['Min.']] == min(x)
> evaluate to TRUE. I know one can alter behavior by changing the global
> “digits” option, but I don’t know what other impacts that might have.
> Ideally I would think summary.default would not round its values at all,
> but use digits to control presentation (by overriding print and such).
> Even in presentation the rounding without switching to scientific notation
> (such as 1.556e+4) is a bit surprising (I understand rounding and
> scientific notation are two different presentation issues, but new users
> are very confused that something that appears to be an integer has been
> rounded).
>
> Example:
>
> summary(data.frame(x=15555))
> ##        x
> ##  Min.   :15555
> ##  1st Qu.:15555
> ##  Median :15555
> ##  Mean   :15555
> ##  3rd Qu.:15555
> ##  Max.   :15555
> summary(15555)
> ##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
> ##   15560   15560   15560   15560   15560   15560
>
> I have a (bit whiny) polemic trying to explain the pain point here:
> http://www.win-vector.com/blog/2016/08/my-criticism-of-r-numeric-summary/
> (I am not trying to be rude; rather, I am trying to emphasize why this can
> be confusing to new users).
>
>
>
> ---------------
> John Mount
> http://www.win-vector.com/
> Our book: Practical Data Science with R http://www.manning.com/zumel/
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
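
For reference, the rounding John describes can be reproduced directly with
signif(), without relying on what summary() itself does in any particular R
version. The sketch below assumes the default getOption("digits") of 7, and
summary_exact() is a hypothetical helper (not part of R) that returns the
same six statistics without any rounding. It is only an illustration of the
behaviour requested in the quoted message, not a proposed patch.

```r
x <- 15555

# Rounding as described in the thread: getOption("digits") - 3L significant
# digits. The value 7L is assumed here to be the default "digits" setting.
digits <- 7L - 3L
rounded <- signif(x, digits)
rounded == x                           # FALSE: 15555 has become 15560

# Hypothetical helper: the same six statistics, left unrounded so that they
# remain usable in further calculation.
summary_exact <- function(x) {
  qq <- stats::quantile(x, names = FALSE)
  c("Min."    = qq[1],
    "1st Qu." = qq[2],
    "Median"  = qq[3],
    "Mean"    = mean(x),
    "3rd Qu." = qq[4],
    "Max."    = qq[5])
}

summary_exact(x)[["Min."]] == min(x)   # TRUE
```

Keeping the stored values exact and leaving any rounding to the print step
is the separation John asks for above.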
--
___________________________________________________
Simone Giannerini
Dipartimento di Scienze Statistiche "Paolo Fortunati"
Universita' di Bologna
Via delle belle arti 41 - 40126 Bologna, ITALY
Tel: +39 051 2098262 Fax: +39 051 232153
http://www2.stat.unibo.it/giannerini/
___________________________________________________