[R] odd behavior of "summary" function

Erik Iverson eriki at ccbr.umn.edu
Tue Aug 24 19:24:20 CEST 2010


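summary.default uses the signif function to round for display purposes,
so the values you see are rounded, not miscalculated.

In ?summary, we can see that the digits argument controls the value
passed to signif. Its default is max(3, getOption("digits") - 3), which
is 4 under the default options; that is why your output matches
element [[4]] below.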

 > lapply(1:6, function(x) summary(c(6, 207936), digits = x))

[[1]]
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   6e+00   5e+04   1e+05   1e+05   2e+05   2e+05

[[2]]
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
       6   52000  100000  100000  160000  210000

[[3]]
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
       6   52000  104000  104000  156000  208000

[[4]]
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
       6   51990  104000  104000  156000  207900

[[5]]
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
       6   51988  103970  103970  155950  207940

[[6]]
     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
      6.0  51988.5 103971.0 103971.0 155954.0 207936.0
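
To make the connection explicit, here is a minimal sketch that rebuilds
the default output by hand. It assumes summary.default rounds with
signif at 4 significant digits, as described above; the variable names
(full, rounded) are mine and purely illustrative, and other R releases
may round at a different stage.

```r
x <- c(6, 207936)

# Full-precision quantiles plus the mean, in the order summary() reports them.
q <- quantile(x, names = FALSE)
full <- c(q[1:3], mean(x), q[4:5])
names(full) <- c("Min.", "1st Qu.", "Median", "Mean", "3rd Qu.", "Max.")

# Rounding to 4 significant digits gives the figures from the original post.
rounded <- signif(full, 4)
print(rounded)

# Asking summary() for more digits shows the values without visible rounding.
print(summary(x, digits = 10))
```

So the maximum is not being miscalculated: 207936 rounded to 4
significant digits is 207900.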

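On the quartile question quoted below: quantile defaults to type = 7,
which interpolates linearly between order statistics. A rough sketch of
that rule for this two-value case follows; q7 is a hypothetical helper
written only for illustration, not a function that ships with R.

```r
x <- sort(c(6, 207936))
n <- length(x)

# Hypothetical helper (not part of R): type 7 quantile by linear interpolation.
q7 <- function(p) {
  h  <- (n - 1) * p + 1   # fractional position in the sorted data
  lo <- floor(h)
  hi <- min(lo + 1, n)    # clamp so that p = 1 stays in range
  x[lo] + (h - lo) * (x[hi] - x[lo])
}

by_hand  <- sapply(c(0.25, 0.75), q7)
built_in <- quantile(x, c(0.25, 0.75), names = FALSE)
print(by_hand)    # 51988.5 155953.5
print(built_in)
```

With only two values this places the quartiles one quarter and three
quarters of the way from the minimum to the maximum. The other types
documented in ?quantile can give different answers, so there is no
single "correct" value here.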

Mike Williamson wrote:
> Hello All,
> 
>     Using the standard "summary" function in 'R', I ran across some odd
> behavior that I cannot understand.  Easy to reproduce:
> 
> Typing:
> 
>    summary(c(6,207936))
> 
> Yields:
> 
>     Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>        6   51990  104000  104000  156000  207900
> 
> 
>     None of these values are correct except for the minimum.  If I perform
> "quantile(c(6, 207936))", it gives the correct values.  I originally
> presumed that summary was merely calling "quantile" if it saw a numeric, but
> this doesn't seem to be the case.
>     Anyone know what's going on here?  On a related note, what is the
> statistically correct answer for calculating the 1st quartile & 3rd quartile
> when only 2 values are present?  I presume one takes the mid-point between
> the median (also calculated) and the min or max.  So in this case, 51988.5
> for 1st & 155953.5 for 3rd (which is what quantile calculates).  But taking
> 25% & 75% of the sum of the 2 also seems "reasonable".  Either way,
> "summary" is calculating the wrong number, and most disturbing is that it
> mis-calculates the max.
> 
>                                             Regards,
>                                                     Mike
> 
> 
> "Telescopes and bathyscaphes and sonar probes of Scottish lakes,
> Tacoma Narrows bridge collapse explained with abstract phase-space maps,
> Some x-ray slides, a music score, Minard's Napoleonic war:
> The most exciting frontier is charting what's already here."
>   -- xkcd
> 


