[R] hist() and density

Peter Dalgaard BSA p.dalgaard at biostat.ku.dk
Fri Nov 17 13:54:58 CET 2000


Bill Simpson <wsi at gcal.ac.uk> writes:

> There were some questions about hist() a couple of days ago which
> triggered this post. My question/suggestion is about the y-axis in hist.
> There are reasons to prefer making the y-axis density=relative
> frequency/bin width. One reason is that the height of the plot does not
> depend on the bin width; another is that if your histogram is in density
> then you can easily superimpose a smooth theoretical pdf on top--they will
> be on the same scale. (BTW the best intro stats book--Freedman et al--only
> shows students how to make density histograms)
> 
> It doesn't seem easy to make density on the y-axis with the current
> hist().  The freq argument only lets you choose counts (TRUE) or relative
> frequency (FALSE). I would like to suggest that freq (or some renamed
> version of the argument) take on the values
> counts
> freqs
> densities
> So that you can easily do a density histogram.

Look closer: freq=FALSE *does* give densities. Try for instance

x <- rnorm(500, sd=100)
hist(x, freq=FALSE)
curve(dnorm(x, sd=100), add=TRUE)

freq=TRUE gives absolute frequencies, i.e. counts.
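
To check this numerically rather than by eye, here is a minimal sketch. It assumes the returned component is called h$density in your version of R (the version discussed in this thread calls it h$intensities); the seed and the variable names are arbitrary.

```r
# Sketch: the heights drawn with freq=FALSE are relative frequency / bin width.
set.seed(1)                      # arbitrary seed, only for reproducibility
x <- rnorm(500, sd = 100)

h <- hist(x, plot = FALSE)       # compute the bins without drawing anything
width <- diff(h$breaks)          # width of each bin
dens  <- h$counts / (length(x) * width)   # (count / n) / bin width

# Assumption: the component is named "density" in your R version;
# in the version discussed here it is h$intensities.
same <- isTRUE(all.equal(dens, h$density))
print(same)
```

Because the heights are already divided by the bin width, changing the breaks changes the shape of the steps but not the overall vertical scale, which is why curve(dnorm(...), add=TRUE) lines up.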

> The strange thing to me is that if you do
> h<-hist(...)
> you get h$intensities, which is the densities. So hist calculates the
> densities, but doesn't let you plot them?

Well, it does...

> May I suggest you call it $densities? I don't know why you call it
> intensities--I thought in stats intensity meant the instantaneous rate of
> a point process. 

I tend to agree here. Maybe singular $density is better, though.

> Also I find the help quite obscure:
> 
> intensities
>              values f^(x[i]), as estimated density values. If
> all(diff(breaks) == 1), they are the relative frequencies counts/n and in
> general satisfy sum[i; f^(x[i]) (b[i+1]-b[i])] = 1, where b[i] =
> breaks[i].
> 
> May I suggest:
> 
> densities	estimated densities calculated by relative frequency/bin
> width, where relative frequency is count/n
> (I can't figure out why you've got powering ^ in there!)

I think that should read as "f hat". No particular reason to single
out the case of unit bin width, I agree.
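
For what it is worth, the property in the help text (the areas of the bars sum to 1) can be checked for unequal bin widths as well. A small sketch, again assuming the component is named h$density in your R version; the breaks below are made up for illustration.

```r
# Sketch: with unequal bin widths the densities still integrate to 1,
# but they no longer coincide with the relative frequencies.
set.seed(2)                          # arbitrary seed
x <- runif(200, min = 0, max = 10)
breaks <- c(0, 1, 2, 5, 10)          # deliberately unequal widths (illustrative)

h <- hist(x, breaks = breaks, plot = FALSE)
rel_freq <- h$counts / length(x)             # relative frequency per bin
area <- sum(h$density * diff(h$breaks))      # sum of f-hat * bin width

print(area)                                  # total area under the histogram
print(cbind(rel_freq, density = h$density))  # side-by-side comparison
```

Only when all(diff(breaks) == 1) do the two columns agree, so the unit-width case is just a special instance of the general rule.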

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907