[R] odd behavior of "summary" function

Peter Ehlers ehlers at ucalgary.ca
Tue Aug 24 19:47:55 CEST 2010


On 2010-08-24 11:06, Mike Williamson wrote:
> Hello All,
>
>      Using the standard "summary" function in 'R', I ran across some odd
> behavior that I cannot understand.  Easy to reproduce:
>
> Typing:
>
>     summary(c(6,207936))
>
> Yields:
>
>     Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>        6   51990  104000  104000  156000  207900
>
>
>      None of these values are correct except for the minimum.  If I perform
> "quantile(c(6, 207936))", it gives the correct values.  I originally
> presumed that summary was merely calling "quantile" if it saw a numeric, but
> this doesn't seem to be the case.
>      Anyone know what's going on here?  On a related note, what is the
> statistically correct answer for calculating the 1st quartile & 3rd quartile
> when only 2 values are present?  I presume one takes the mid-point between
> the median (also calculated) and the min or max.  So in this case, 51988.5
> for 1st & 155953.5 for 3rd (which is what quantile calculates).  But taking
> 25% & 75% of the sum of the 2 also seems "reasonable".  Either way,
> "summary" is calculating the wrong number, and most disturbing is that it
> mis-calculates the max.
>
>                                              Regards,
>                                                      Mike

This is one of those (many) situations where reading the help pages
really helps:

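help(summary) points you to the 'digits' argument (as David has said),
whose default is max(3, getOption("digits") - 3), i.e. 'digits=4' unless
you have changed options(digits). So, no, R is not miscalculating
anything; the values are only rounded to 4 significant digits.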
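
A quick way to convince yourself is to round the exact values by hand.
The sketch below assumes that the rounding amounts to signif() with 4
significant digits (how and where the rounding is applied depends on
your R version); the names 'exact' and 'rounded' are mine, chosen only
for illustration.

```r
x <- c(6, 207936)

# Exact values: the five quantiles (default type = 7) plus the mean
exact <- c(quantile(x), mean = mean(x))
print(exact)

# Rounding to 4 significant digits reproduces the numbers in the question,
# including the "wrong" maximum of 207900
rounded <- signif(exact, 4)
print(rounded)

# Asking summary() for more significant digits shows the unrounded values
print(summary(x, digits = 10))
```

Nothing is lost in the computation; only the display is shortened.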

help(quantile) shows that there are quite a few ways to define
quantiles and that R defaults to 'type=7'.
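
To see what 'type=7' does with only two values, here is a minimal sketch
that computes it by hand (linear interpolation between order statistics)
and compares it with quantile(). The helper 'type7_by_hand' is a
hypothetical name of mine, not part of R, and it covers only the simple
case without missing values.

```r
x <- sort(c(6, 207936))
p <- c(0.25, 0.75)

# Type 7: position h = (n - 1) * p + 1, then interpolate linearly
# between the neighbouring order statistics
type7_by_hand <- function(x, p) {
  h  <- (length(x) - 1) * p + 1
  lo <- floor(h)
  hi <- ceiling(h)
  x[lo] + (h - lo) * (x[hi] - x[lo])
}

by_hand <- type7_by_hand(x, p)
builtin <- unname(quantile(x, probs = p, type = 7))
print(by_hand)
print(builtin)

# The other definitions can give quite different answers for tiny samples
all_types <- sapply(1:9, function(t) quantile(x, probs = p, type = t))
colnames(all_types) <- paste0("type", 1:9)
print(all_types)
```

With two values, type 7 gives the mid-points between the median and the
min or max (51988.5 and 155953.5), which is the first of the two options
suggested in the question.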

   -Peter Ehlers


