[R] Histogram questions

M. Edward (Ed) Borasky znmeb at aracnet.com
Tue Feb 6 18:24:59 CET 2001

On 6 Feb 2001, Thomas Vogels wrote:

> > Eg
> >  data(faithful)
> >  hist(faithful$eruptions,prob=TRUE)
> >  lines(density(faithful$eruptions))
> what appears to be working better for me in practice is to take care
> of ylim:
> > x <- rnorm(100)
> > hh <- hist (x, prob=TRUE, plot=FALSE)
> > dd <- density (x)
> > hist (x, prob=TRUE, ylim=c(0, max(hh$intensities, dd$y)))
> > lines (dd)
> the reason I bring this up is that:
> > hist (x, prob=TRUE)
> is not the same as:
> > plot (hh <- hist (x, prob=TRUE))

Actually, there are two issues:

1. When you assign the output of "hist" to an object, it computes both the
frequencies and the densities. When you later plot the object, it will take
the default, which is frequencies, unless you specify "freq=FALSE" when you do
the plot ("prob=TRUE" is an argument to "hist" itself; the plot method for
histogram objects takes "freq").

2. There is nothing to guarantee that the kernel density and the histogram will
have the same scale.
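
Both points can be checked without drawing anything. The following is only a
minimal sketch on simulated data (the vector "x" and the seed are my
assumptions, not a dataset from this thread); it inspects the components of
the "hist" object and compares the two peak heights.

```r
# Illustration only: 'x' is simulated, not a dataset from this thread.
set.seed(1)
x <- rnorm(100)

h <- hist(x, plot = FALSE)   # compute the bins without drawing
d <- density(x)              # kernel density estimate

# Issue 1: the object carries both scales.
h$counts                     # bar heights on the frequency scale
h$density                    # bar heights on the density scale
area <- sum(h$density * diff(h$breaks))  # total bar area on the density scale

# Issue 2: the two peaks generally differ, so the default ylim of either
# plot is not guaranteed to hold the other curve.
peaks <- c(histogram = max(h$density), kernel = max(d$y))
peaks
```

On the density scale the bar areas sum to 1, which is what makes the
histogram comparable to the kernel estimate in the first place.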

The code I use is

	h <- hist(dataset,plot=FALSE)
	d <- density(dataset)

	# scale plot for both histogram and density
	plot (d, ylim=c(0, max(h$density,d$y)))
	# add the histogram on the density scale; the plot method takes
	# "freq", not "prob"
	lines (h, freq=FALSE)

In the "faithful" example, it just happened that the scale on the histogram was
higher than on the density. In my datasets, I was getting pretty much the
opposite, which is why my code plotted the density first, until I figured out
how to compute the correct scale for the two overlaid plots.
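
Once the common scale is computed, the histogram can just as well go down
first. Here is a rough sketch of that ordering, wrapped in a helper; the name
"hist_with_density" is hypothetical (it is not part of R), and the choice of
the "faithful" data is only for illustration.

```r
# Hypothetical helper, not part of R: draw a histogram on the density
# scale and overlay a kernel density estimate, using one shared ylim.
hist_with_density <- function(x, ...) {
  h <- hist(x, plot = FALSE)           # bins only, nothing drawn yet
  d <- density(x)                      # kernel density estimate
  top <- max(h$density, d$y)           # upper limit that fits both layers
  plot(h, freq = FALSE, ylim = c(0, top), ...)
  lines(d)
  invisible(top)                       # return the limit for inspection
}

top <- hist_with_density(faithful$eruptions, main = "Old Faithful eruptions")
```

Because the limit is taken over both "h$density" and "d$y", it should not
matter which of the two happens to be taller for a given dataset.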

znmeb at aracnet.com (M. Edward Borasky) http://www.aracnet.com/~znmeb

There are three kinds of people: those who are good with mathematics and those
who aren't.

