[R] Summary shows wrong maximum

Bert Gunter gunter.berton at gene.com
Wed Dec 6 21:04:45 CET 2006


Mike:

I offered no opinion -- and really didn't have any -- about the worthiness
of any of the comments that were made. I just liked Brian's little quotable
aside.

But since you bait me a bit ...

In general, I believe that showing the 2-3 most "important" -- **not
significant** -- digits **and no more** is desirable. By "most important" I
mean the leftmost digits which are changing in the data (there are some
caveats in the presence of extreme outliers). Printing more digits merely
obfuscates the ability of the eye/brain to perceive the patterns of change
in the data, the presumed intent of displaying it (not of storing it, of
course). Displaying excessive digits to demonstrate (usually falsely) one's
precision is evil. Clarity of communications is the standard we should
aspire to.

These views have been more eloquently expressed by A.S.C. Ehrenberg and
Howard Wainer among others...
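
[Editorial sketch, not part of the original message.] One possible reading
of "the leftmost digits which are changing in the data" can be illustrated
in R. The helper name show_important() and the sample values are
hypothetical, and the rule used here (locate the decimal place of the data's
range, then keep n digits from there) is only an assumption about what is
meant. It also assumes the data are not all identical.

```r
# Hypothetical helper: keep the n leftmost digits that actually vary.
# Assumes x is numeric with a non-zero range.
show_important <- function(x, n = 2) {
  spread <- diff(range(x))          # how much the data vary
  place  <- floor(log10(spread))    # decimal place of the leading varying digit
  round(x, digits = (n - 1) - place)
}

# Made-up values that differ only from the units digit onward.
x <- c(1001.234, 1003.789, 1008.456)

show_important(x)   # 1001.2 1003.8 1008.5
signif(x, 3)        # 1000 1000 1010: significant digits hide the change
```

In this made-up case, three significant digits erase most of the variation,
while two "important" digits keep it visible.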

-- Bert


Bert Gunter
Nonclinical Statistics
7-7374

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Mike Prager
Sent: Wednesday, December 06, 2006 11:46 AM
To: r-help at stat.math.ethz.ch
Subject: Re: [R] Summary shows wrong maximum

I don't know about candidacy, and I'm not going to argue about
"correctness," but it seems to me that the only valid reasons to
limit precision of printing in a statistics program are (1) to
save space and (2) to allow for machine limitations. This is
neither. To chop off information and replace it with zeroes is
just plain nasty.


Bert Gunter <gunter.berton at gene.com> wrote:

>  
> Folks:
> 
> Is 
> 
> "So this is at best a matter of opinion, 
> and credentials do matter for opinions."
> 
> -- Brian Ripley
> 
> an R fortunes candidate?
> 
> -- Bert Gunter
> 
> 
> On Tue, 5 Dec 2006, Oliver Czoske wrote:
> 
> > On Mon, 4 Dec 2006, Uwe Ligges wrote:
> >> Sebastian Spaeth wrote:
> >>> Hi all,
> >>> I have a list with a numerical column "cum_hardreuses". By coincidence I
> >>> discovered this:
> >>>
> >>>> max(libs[,"cum_hardreuses"])
> >>> [1] 1793
> >>>
> >>>> summary(libs[,"cum_hardreuses"])
> >>>     Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
> >>>        1       2       4      36      14    1790
> >>>
> >>> (note the max value of 1790) Ouch this is bad! Anything I can do to
> >>> remedy this? Known bug?
> >>
> >> No, it's a feature! See ?summary: printing is done up to 3 significant
> >> digits by default.
> >
> > Unfortunately, '1790' is printed with *four* significant digits, not
> > three. The correct representation with three significant digits would
> > have to employ scientific notation, 1.79e3.
> >
> >
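
[Editorial sketch, not part of the original thread.] The rounding discussed
above can be reproduced with base R. The vector x is hypothetical stand-in
data; only its maximum, 1793, comes from the thread. The default output of
summary() may differ between R versions, so none is shown for it here.

```r
# Hypothetical data standing in for libs[, "cum_hardreuses"];
# only the maximum (1793) is taken from the thread.
x <- c(1, 2, 4, 14, 1793)

max(x)                                      # 1793, the exact maximum

# Rounding to three significant digits replaces the last digit with a zero.
signif(max(x), 3)                           # 1790

# A strict three-significant-digit rendering uses scientific notation.
formatC(max(x), digits = 3, format = "g")   # "1.79e+03"

# Requesting more digits from summary() should leave the maximum intact.
print(summary(x, digits = 7))
```

This matches both points made in the quoted messages: "1790" is what signif()
yields for three significant digits, and a strict three-digit rendering would
be written 1.79e+03.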

-- 
Mike Prager, NOAA, Beaufort, NC
* Opinions expressed are personal and not represented otherwise.
* Any use of tradenames does not constitute a NOAA endorsement.
