[R] yaxis of density plot as counts/bandwidth

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Nov 20 09:58:12 CET 2006


On Mon, 20 Nov 2006, Ido M. Tamir wrote:

> Dear all,
>
> I would like to change the scale of a density plot
> to something that represents counts/bandwidth.
>
> What I am actually plotting is density of genomic
> data along the chromosome, and I need a simple
> representation of e.g. genes/50000 bases.
>
> pos <- c(123423,124313,124790,,....)
> den <- density(pos,bw=50000)
> plot(den, yaxt="n", ylab="features/50kb")
> etc...
>
> Is the following correct?
>
> library("MASS")
> par(mfrow=c(2,2))
> hist(geyser$duration)
> den <- density(geyser$duration)
> plot(den)
> hist(geyser$duration,breaks=20)
> plot(den,yaxt="n", ylab="")
> fm <- den$bw * max(den$y) * den$n
> at <- pretty(c(0,max(den$y)), n=5)
> lab <- pretty(c(0,fm), n=5)
> axis(2,at=at[1:6],lab=lab[1:6])

No.  Your second histogram plot has bins of width 0.2, and so the bin 
heights need to be divided by 299*0.2 to get frequencies.  If you want to 
plot what is effectively the expected counts, do something like

> library("MASS")
> hist(geyser$duration,breaks=20)
> den <- density(geyser$duration)
> lines(den$x, 299*0.2*den$y)

which is similar to figure 5.9 in MASS apart from the y scale.

(If you have read MASS you will be aware that this is a very poor example, 
not even from a continuous distribution.)

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



More information about the R-help mailing list