[R] hist() and density
Peter Dalgaard BSA
p.dalgaard at biostat.ku.dk
Fri Nov 17 13:54:58 CET 2000
Bill Simpson <wsi at gcal.ac.uk> writes:
> There were some questions about hist() a couple of days ago which
> triggered this post. My question/suggestion is about the y-axis in hist.
> There are reasons to prefer making the y-axis density=relative
> frequency/bin width. One reason is that the height of the plot does not
> depend on the bin width; another is that if your histogram is in density
> then you can easily superimpose a smooth theoretical pdf on top--they will
> be on the same scale. (BTW the best intro stats book--Freedman et al--only
> shows students how to make density histograms)
>
> It doesn't seem easy to make density on the y-axis with the current
> hist(). The freq argument only lets you choose counts (TRUE) or relative
> frequency (FALSE). I would like to suggest that freq (or some renamed
> version of the argument) take on the values
> counts
> freqs
> densities
> So that you can easily do a density histogram.
Look closer: freq=FALSE *does* give densities. Try for instance
x <- rnorm(500, sd=100)
hist(x, freq=FALSE)
curve(dnorm(x, sd=100), add=TRUE)
freq=TRUE gives absolute frequencies, i.e. counts.
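To check numerically that these really are densities, here is a minimal sketch. It assumes a current R version, where the component returned by hist() is named `density` (in the version discussed in this thread it is `intensities`); the names `widths` and `dens` are only illustrative.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
x <- rnorm(500, sd = 100)

# Compute the histogram without drawing it
h <- hist(x, plot = FALSE)

n      <- length(x)
widths <- diff(h$breaks)          # bin widths b[i+1] - b[i]

# Density = relative frequency / bin width
dens <- h$counts / n / widths

# Should agree with what hist() stores (assumed component name: density)
all.equal(dens, h$density)

# Bar areas (height * width) should sum to 1
sum(dens * widths)
```

Since the bar areas sum to 1, a theoretical pdf such as dnorm() is on the same scale as the bars.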
> The strange thing to me is that if you do
> h<-hist(...)
> you get h$intensities, which is the densities. So hist calculates the
> densities, but doesn't let you plot them?
Well, it does...
> May I suggest you call it $densities? I don't know why you call it
> intensities--I thought in stats intensity meant the instantaneous rate of
> a point process.
I tend to agree here. Maybe singular $density is better, though.
> Also I find the help quite obscure:
>
> intensities
> values f^(x[i]), as estimated density values. If
> all(diff(breaks) == 1), they are the relative frequencies counts/n and in
> general satisfy sum[i; f^(x[i]) (b[i+1]-b[i])] = 1, where b[i] =
> breaks[i].
>
> May I suggest:
>
> densities estimated densities calculated by relative frequency/bin
> width, where relative frequency is count/n
> (I can't figure out why you've got powering ^ in there!)
I think that should read as "f hat". No particular reason to single
out the case of unit bin width, I agree.
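A small sketch of why unit bin width is nothing special: with width 1 the densities happen to equal the relative frequencies, and with any other width they are the relative frequencies divided by the width. The data and breaks are made up for illustration, and the `density` component name assumes a current R version.

```r
# Made-up data, chosen so that no value falls on a break
x <- c(0.5, 1.5, 1.7, 2.2, 2.9, 3.1, 3.6, 3.8)
n <- length(x)

h1 <- hist(x, breaks = 0:4,        plot = FALSE)  # bin width 1
h2 <- hist(x, breaks = c(0, 2, 4), plot = FALSE)  # bin width 2

rel1 <- h1$counts / n             # relative frequencies, width-1 bins
rel2 <- h2$counts / n             # relative frequencies, width-2 bins

# Width 1: density equals relative frequency
all.equal(h1$density, rel1)

# Width 2: density is relative frequency divided by the width
all.equal(h2$density, rel2 / 2)

# In both cases sum of f-hat(x[i]) * (b[i+1] - b[i]) is 1
sum(h1$density * diff(h1$breaks))
sum(h2$density * diff(h2$breaks))
```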
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907