[R] Overlying a Normal Dist in a Barplot

Sat Jul 9 06:51:08 CEST 2005

Offline, Marc pointed out to me that boxplot has an at= argument.
This suggests that we could substitute a boxplot command for the
rect command since a boxplot of c(0,a) looks like a bar from 0 to a if 
we use medlty=0 (which omits the median line) and boxwex=1
(which eliminates the space between the boxes).  This has the 
advantage that one does not have to compute the corners of the rectangles,
as my prior solution had to.
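
To see why a two-point boxplot looks like a bar, one can inspect the
statistics that boxplot computes for c(0, a).  This is only a small check;
the value a = 0.2 is an arbitrary height chosen for illustration:

```r
# a is an arbitrary bar height, chosen only for illustration
a <- 0.2

# boxplot.stats returns: lower whisker, lower hinge, median,
# upper hinge, upper whisker
s <- boxplot.stats(c(0, a))$stats
print(s)

# the hinges (box edges) sit at 0 and a and coincide with the whisker
# ends, so with the median line suppressed the box is a bar from 0 to a
```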

(I also simplified the yrange calculation based on the fact that the height
of the density curve is less than the maximum testdata point, so we can just 
take the range from 0 to that maximum.  Also I corrected x, which should be 0:8 
rather than the 0:9 I wrote in the previous post.)
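
The claim about the density height can be checked directly, since a normal
density peaks at its mean.  A quick sketch using the same data and
parameters as the code in this post (the name peak is just for illustration):

```r
testdata <- c(0.196454948, 0.063515510, 0.149187592, 0.237813885, 0.282127031,
              0.066469719, 0.001477105, 0.001477105, 0.001477105)

# height of the normal curve at its mean, which is its maximum
peak <- dnorm(2.84, mean = 2.84, sd = 1.57)

# TRUE here, so c(0, max(testdata)) covers both the bars and the curve
peak < max(testdata)
```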

# data
testdata <- c(0.196454948, 0.063515510, 0.149187592, 0.237813885, 0.282127031, 
0.066469719, 0.001477105, 0.001477105, 0.001477105)
x <- 0:8

# setup plot ranges and axes
xrange <- range(x) + c(-0.5, +0.5)
yrange <- c(0, max(testdata)) 
plot(xrange, yrange, type = "n", xlab = "X", ylab = "Probability", xaxt = "n")

# draw bars using boxplot and density using curve
# each column of rbind(0, testdata) holds c(0, height), so its box spans 0 to height
boxplot(as.data.frame(rbind(0,testdata)), at = x, names = x,
	boxwex = 1, medlty = 0, add = TRUE, col = "lightgrey")
curve(dnorm(x, 2.84, 1.57), min(xrange), max(xrange), add = TRUE)

On 7/8/05, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
> On 7/8/05, Bret Collier <bret at tamu.edu> wrote:
> > R-Users,
> > Hopefully someone can shed some light on these questions as I had
> > little luck searching the archives (although I probably missed something
> > in my search due to the search phrase).  I estimated multinomial
> > probabilities for some count data (number successful offspring) ranging
> > from 0 to 8 (9 possible response categories).  I constructed a barplot
> > (using barplot2) and I want to "overlay" a normal distribution on the
> > figure (using rnorm (1000, mean, sd)).  My intent is to show that using
> > a mean (and associated sd) estimated from discrete count data may not be
> > a valid representation of the distribution of successful offspring.
> >
> > Obviously the x and y axes (as structured in barplot2) will not be
> > equivalent for these 2 sets of information and this shows up in my
> > example below.
> >
> > 1)  Is it possible to somehow reconcile the underlying x-axis to the
> > same scale as would be needed to overlay the normal distribution (e.g.
> > where 2.5 would fall on the normal density, I could relate it to 2.5 on
> > the barplot)?  Then, using axis (side=4) I assume I could insert a
> > y-axis for the normal distribution.
> >
> > 2)  Is lines(density(x)) the appropriate way to insert a normal
> > distribution into this type of figure?  Should I use 'curve'?
> >
> > If someone could point me in the right direction, I would appreciate
> > it.
> >
> > TIA, Bret
> >
> > Example:
> >
> > testdata
> > 0    0.196454948
> > 1    0.063515510
> > 2    0.149187592
> > 3    0.237813885
> > 4    0.282127031
> > 5    0.066469719
> > 6    0.001477105
> > 7    0.001477105
> > 8    0.001477105
> >
> >
> > x<-rnorm(1000, 2.84, 1.57)
> > barplot2(testdata, xlab="Fledgling Number",
> >             ylab="Probability", ylim=c(0, 1), col="black",
> >             border="black", axis.lty=1)
> > lines(density(x))
> >
> 
> Maybe something like this using rect and curve:
> 
> # data from your post
> testdata <- c(0.196454948, 0.06351551, 0.149187592, 0.237813885,
>  0.282127031, 0.066469719, 0.001477105, 0.001477105, 0.001477105)
> x <- 0:9
> 
> # setup plot ranges noting max of normal density is at mean
> xrange <- range(x) + c(-0.5,+0.5)
> yrange <- range(c(testdata, dnorm(2.84, 2.84, 1.57), 0))
> plot(xrange, yrange, type = "n", xlab = "X", ylab = "Probability", xaxt = "n")
> axis(1, x)
> 
> # draw bars using rect and density using curve
> rect(x - 0.5, 0, x + 0.5, testdata, col = "lightgrey")
> curve(dnorm(x, 2.84, 1.57), min(xrange), max(xrange), add = TRUE)
>