[R] odd behavior of "summary" function
Peter Ehlers
ehlers at ucalgary.ca
Tue Aug 24 19:47:55 CEST 2010
On 2010-08-24 11:06, Mike Williamson wrote:
> Hello All,
>
> Using the standard "summary" function in 'R', I ran across some odd
> behavior that I cannot understand. Easy to reproduce:
>
> Typing:
>
> summary(c(6,207936))
>
> Yields:
>
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>       6   51990  104000  104000  156000  207900
>
>
> None of these values are correct except for the minimum. If I perform
> "quantile(c(6, 207936))", it gives the correct values. I originally
> presumed that summary was merely calling "quantile" if it saw a numeric, but
> this doesn't seem to be the case.
> Anyone know what's going on here? On a related note, what is the
> statistically correct answer for calculating the 1st quartile & 3rd quartile
> when only 2 values are present? I presume one takes the mid-point between
> the median (also calculated) and the min or max. So in this case, 51988.5
> for 1st & 155953.5 for 3rd (which is what quantile calculates). But taking
> 25% & 75% of the sum of the 2 also seems "reasonable". Either way,
> "summary" is calculating the wrong number, and most disturbing is that it
> mis-calculates the max.
>
> Regards,
> Mike
This is one of those (many) situations where reading the help pages
really helps nicely:
help(summary) points you to the 'digits' argument (as David has said)
and that probably defaults to 'digits=4' for you, so the values you
see are rounded to 4 significant digits. So, no, R is not
miscalculating anything.
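
A minimal sketch of the rounding effect, assuming the 4 significant
digits mentioned above (the variable names 'full' and 'rounded' are
only for illustration, and newer R versions may print more digits by
default):

```r
x <- c(6, 207936)

# Full-precision quantiles plus the mean, arranged in summary()'s order.
q <- quantile(x)                      # default type = 7
full <- c(q[1:3], mean(x), q[4:5])
names(full) <- c("Min.", "1st Qu.", "Median", "Mean", "3rd Qu.", "Max.")
print(full, digits = 10)

# Rounding to 4 significant digits reproduces the values in the question.
rounded <- signif(full, 4)
print(rounded)

# Asking summary() for more digits shows the unrounded values.
print(summary(x, digits = 10))
```

The rounded vector matches the reported output (51990, 104000, 104000,
156000, 207900), so the difference is in the display and not in the
calculation.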
help(quantile) shows that there are quite a few ways to define
quantiles and that R defaults to 'type=7'.
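
To see how much the choice of definition matters with only two values,
something like the following can be run. It is only a sketch: it loops
over the nine 'type' values documented in help(quantile) and prints the
quartiles each one gives, without claiming that any one of them is the
"correct" answer.

```r
x <- c(6, 207936)

# Quartiles under each of the nine definitions in help(quantile).
for (type in 1:9) {
  q <- quantile(x, probs = c(0.25, 0.75), type = type, names = FALSE)
  cat(sprintf("type %d: 1st Qu. = %10.2f  3rd Qu. = %10.2f\n",
              type, q[1], q[2]))
}

# The default, type 7, interpolates linearly between the two values.
q7 <- quantile(x, probs = c(0.25, 0.75), type = 7, names = FALSE)
print(q7)
```

With the default type 7 this gives 51988.5 and 155953.5, the values
quoted in the question.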
-Peter Ehlers