[R] relative frequency plot
Erik Iverson
iverson at biostat.wisc.edu
Thu Apr 27 22:13:08 CEST 2006
Martin -
Of course you are right. The documentation for truehist (and hist)
explains that fact nicely, which is why I thought to send him there.
Sorry for any confusion.
Thanks,
Erik
Martin Maechler wrote:
>>>>>>"Erik" == Erik Iverson <iverson at biostat.wisc.edu>
>>>>>> on Thu, 27 Apr 2006 13:44:16 -0500 writes:
>
>
> Erik> See ?truehist in the MASS package.
>
> Not in this case!
> truehist() also computes a density,
> and its values on the "y axis" are not probabilities, either!
> hist(*, freq = FALSE)
> is fully sufficient here -- the problem of the original poster
> was to understand that a density can have values larger than 1.
> It may be interesting and is somewhat disappointing for us
> "teachers of statistics" to see how many people have posted in
> the past on this exact topic, sometimes even more or less
> assuming that R was doing some things wrongly because it showed
> densities (or density estimates as here) with values larger than
> one... oh dear
> "Against stupidity the gods themselves contend in vain."
> - Friedrich Schiller, "Die Jungfrau von Orleans"
>
> Martin
>
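
To make the point above concrete, here is a minimal sketch of a density
estimate exceeding 1. The simulated data and the variable names are
assumptions for illustration only, not anything from the original post:

```r
# Hypothetical data: 1000 draws concentrated in the narrow range (0, 0.2)
set.seed(1)
x <- runif(1000, min = 0, max = 0.2)

# Compute the histogram without drawing it
h <- hist(x, breaks = 10, plot = FALSE)

# Bar heights are density estimates; with data this concentrated
# they lie well above 1
max_density <- max(h$density)

# Bar areas (height * bin width) are the relative frequencies,
# and it is these areas that sum to 1
bin_width <- diff(h$breaks)
total_area <- sum(h$density * bin_width)
```

The heights are only bounded by 1 when every bin has width 1, which is
the special case the ?hist excerpt further down mentions.
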
> >> Philipp Pagel wrote:
> >> On Thu, Apr 27, 2006 at 10:48:39AM -0700, nlei at sfu.ca
> >> wrote:
> >>
> >>> Hi All,
> >>>
> >>> I want to use "hist" to get the relative frequency
> >>> plot. But the range of the y axis is greater than 1, which I
> >>> think should be less than 1 since it stands for the
> >>> probability.
> >>>
> >>> I'm confused. Could you please help me with it?
> >>
> >>
> >> I was pretty confused by that, too, at first. The solution
> >> is that freq=FALSE causes hist to plot the DENSITY rather
> >> than the frequency. And density is not necessarily the same
> >> as relative frequency. Excerpt from ?hist:
> >>
> >> density: values f^(x[i]), as estimated density values. If
> >> 'all(diff(breaks) == 1)', they are the relative
> >> frequencies 'counts/n' and in general satisfy sum[i;
> >> f^(x[i]) (b[i+1]-b[i])] = 1, where b[i] = 'breaks[i]'.
> >>
> >> If you want relative frequency, try something like this:
> >>
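> >> # compute the histogram without plotting it
> >> myhist = hist(x, breaks=52, plot=FALSE)
> >> # rescale the counts to relative frequencies
> >> myhist$counts = myhist$counts / sum(myhist$counts)
> >> plot(myhist, main=NULL, border=TRUE, xlab="days", xlim=c(0,6), lty=2)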
> >>
> >> Not exactly clean, though -- we are messing with the
> >> myhist object...
> >>
> >>
> >> cu Philipp
> >>
>
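
A slightly tidier variant of the approach quoted above, as a rough sketch
only: it rescales a copy of the histogram object instead of the original,
and checks that count/n agrees with density times bin width. The
simulated x, the choice of 20 breaks, and the names h_rel and rel_freq
are my own assumptions:

```r
# Hypothetical data standing in for the poster's 'x'
set.seed(42)
x <- rexp(500, rate = 1)

h <- hist(x, breaks = 20, plot = FALSE)

# Relative frequency of each bin: count divided by total count
rel_freq <- h$counts / sum(h$counts)

# The same numbers follow from the densities: height times bin width
rel_freq_from_density <- h$density * diff(h$breaks)

# Work on a copy so the original histogram object stays intact
h_rel <- h
h_rel$counts <- rel_freq

# Draw only in an interactive session
if (interactive()) {
  plot(h_rel, freq = TRUE, ylab = "Relative frequency", main = NULL)
}
```

With equal-width bins the shape of the plot is the same as with
freq = FALSE; only the scale of the y axis changes, and every bar is
now at most 1.
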