[R] Max vs summary inconsistency

Adam D. I. Kramer adik at ilovebacon.org
Mon Aug 27 20:54:59 CEST 2007


On Mon, 27 Aug 2007, François Pinard wrote:

>>> summary(m)
>>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>>       1   13000   26280   25890   38550   50910 
>>> max(m)
>> [1] 50912
>
>> ...it seems to me like max() and summary(m)[6] ought to return the same
>> number.  Am I doing something wrong?
>
> Some may say that you did not scrutinize the documentation enough, as
> "summary" artificially limits the number of significant digits.

Indeed, several have said so in private email as well as email to the list.
Thanks to all, apologies for my lack of scrutiny.

> However, this question reoccurs often and regularly in these mailing
> lists, so at last, maybe something should be done about it, beyond
> documenting how it works.  Overall, too many users got misled, that one
> may not so bluntly assert they are all wrong.

I would agree, and not only because I was misled: several people are
scrutinizing summary()'s output and noticing that it is incorrect.

However, it is very VERY likely that many more are NOT scrutinizing it, and
as such are forming false beliefs about their data sets, which may be
subsequently published or used in further analyses.

Taking a small step in the implementation of summary() to potentially
prevent the publication of incorrect data seems worthwhile. Certainly, any
researcher should check their output in many ways, but it makes no sense to
me that summary() would round its output to 4 significant digits by default.
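
To make the rounding concrete, here is a minimal sketch. The vector m is
made up for illustration (only its maximum, 50912, comes from the output
quoted above), and signif() is used as a stand-in for the rounding that
summary() applies; exactly where that rounding happens inside summary() is
an assumption on my part and may differ between R versions.

```r
# Hypothetical data; only the maximum (50912) is taken from the thread.
m <- c(1, 13000, 26280, 38550, 50912)

exact   <- max(m)             # full-precision maximum
rounded <- signif(exact, 4)   # same value kept to 4 significant digits

# Documented default for summary()'s digits argument:
# 4 under the default options(digits = 7).
default_digits <- max(3, getOption("digits") - 3)

# One way to sidestep the rounding: compute the quantities directly.
exact_summary <- c(quantile(m), Mean = mean(m))

print(exact)
print(rounded)
print(exact_summary)
```

Asking for max(), quantile() and mean() directly, or passing a larger
digits value to summary(), avoids relying on the rounded display.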

> For example, resorting to scientific notation whenever non significant
> zero digits would have otherwise been printed.  This should clarify a bit
> that the printing precision got artificially limited.

I think this is a great solution, though I'm not sure whether scripts that
use summary() would break if passed a number in scientific notation.

That said, scripts that use summary() are probably assuming that the number
reported is maximally precise, and thus are making the same mistake I
did...and thus should indeed break!
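
As a rough check on the "would scripts break" worry, the sketch below
formats the rounded maximum in scientific notation and reads it back the
way a script might. It only covers scripts that parse the text with
as.numeric(), which is an assumption; scripts that match digits with a
pattern could behave differently.

```r
# The rounded maximum from the thread, kept to 4 significant digits.
rounded <- signif(50912, 4)

# Render it in scientific notation, as the proposal suggests.
as_text <- format(rounded, scientific = TRUE)

# What a script using as.numeric() would read back.
parsed <- as.numeric(as_text)

print(as_text)
print(isTRUE(all.equal(parsed, rounded)))  # the rounded value survives
print(parsed == 50912)                     # the dropped digit does not come back
```

So numeric parsing copes with the notation, but the lost precision stays
lost; the notation only makes the loss visible.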

--
Adam Kramer
Ph.D. Student, Social Psychology
University of Oregon
adik at uoregon.edu

