[Rd] Error in R-Intro document (PR#13079)
Martin Maechler
maechler at stat.math.ethz.ch
Sat Sep 27 18:25:26 CEST 2008
>>>>> "d" == davidhedin <davidhedin at mac.com>
>>>>> on Sat, 27 Sep 2008 07:10:06 +0200 (CEST) writes:
d> Full_Name: David Hedin
d> Version: R 2.6.0 GUI 1.21
d> OS: Mac 10.4.11
d> Submission from: (NULL) (24.205.60.123)
d> page 64 of the R introduction document makes the claim
d> "If the probability=TRUE argument is given, the bars
d> represent relative frequencies instead of counts"
d> This is wrong: the densities (relative frequency/class
d> width) are given, not the relative frequencies. It's only
d> true when the class width is 1.
Thank you; I have added "divided by the bin width".
Note that one could argue that whether the paragraph you cite
is really wrong depends on the definition of "relative".
I could define "relative" to mean
"relative WRT the total and the bin width" :-)
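To make the distinction concrete, here is a small sketch comparing the scales for one histogram. It uses the built-in women data; the variable names (h, width, rel_freq, dens) are only illustrative, not anything from the R-Intro document.

```r
## Sketch: counts, relative frequencies and densities of the same histogram
h <- hist(women$weight, nclass = 7, plot = FALSE)

width    <- diff(h$breaks)             # bin widths
rel_freq <- h$counts / sum(h$counts)   # relative frequencies (sum to 1)
dens     <- h$density                  # heights drawn when freq = FALSE

## density is relative frequency divided by bin width
all.equal(dens, rel_freq / width)

## hence the bar *areas*, not the bar heights, sum to 1
sum(dens * width)
```

Only when every bin has width 1 do the two scales coincide, which is the case the original report singles out.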
d> What IS the code which will produce a relative frequency
d> histogram?
Well, do you really want a y-axis scale which is neither
counts nor has the usual density scale?
I'd recommend against that.
Here's an example derived from help(plot.histogram):
wwt <- hist(women$weight, nclass = 7, plot = FALSE)
## modify the result to show "relative frequencies"
## (diff(wwt$breaks)[1] is the common bin width; the bins here are equally wide)
wt. <- wwt
wt.$density <- wwt$density * diff(wwt$breaks)[1]
plot(wt., freq = FALSE, ylab = "Relative Frequency")
## or probably rather, in percent
wtP <- wwt
wtP$density <- wwt$density * 100 * diff(wwt$breaks)[1]
plot(wtP, freq = FALSE, ylab = "Relative Frequency [ % ]")
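The example above multiplies by diff(wwt$breaks)[1], which relies on all bins having the same width. As a hedged sketch of what happens otherwise, the following uses unequal breaks that I picked arbitrarily for illustration (they are not from help(plot.histogram)), and multiplies by the full vector of widths instead:

```r
## Sketch: relative frequencies when the bins are NOT equally wide
## (the breaks below are an arbitrary choice for illustration)
h <- hist(women$weight, breaks = c(110, 120, 130, 150, 170), plot = FALSE)

widths      <- diff(h$breaks)           # one width per bin
rel_general <- h$density * widths       # correct for any set of breaks
rel_first   <- h$density * widths[1]    # only correct for equal widths

sum(rel_general)   # relative frequencies must add up to 1
sum(rel_first)     # falls short of 1 here, since later bins are wider

## the simplest route does not need the density at all
all.equal(rel_general, h$counts / sum(h$counts))
```

With unequal widths I would be even more wary of drawing relative frequencies as bar heights, since the eye reads bar area.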
But note that I would strongly advocate using the default of
counts instead of the above, since from counts one intuitively
gets a notion of *precision* (most people have a crude
approximation of the Poisson built into their brains :-),
which is completely lost when switching to percents.
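A rough sketch of that precision argument, with made-up numbers: two bins that both show 20 percent, one from a small sample and one from a large one. It assumes the counts are approximately Poisson, so that the standard deviation of a count is about its square root.

```r
## Sketch with hypothetical numbers: same percentage, very different precision
counts <- c(small = 2,  large = 200)    # observations in the bin
totals <- c(small = 10, large = 1000)   # sample sizes

pct <- 100 * counts / totals            # both bins show the same percentage

## Poisson approximation: sd(count) ~ sqrt(count),
## so the relative uncertainty of a count is ~ 1 / sqrt(count)
rel_sd <- 1 / sqrt(counts)

data.frame(counts, totals, pct, rel_sd)
```

On the percent scale the two bins look identical; on the count scale the reader can see at once that 2 is far shakier than 200.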
Martin Maechler, ETH Zurich