[R] Histograms, density, and relative frequencies
Bret Collier
bacolli at uark.edu
Wed Jul 7 19:29:40 CEST 2004
R-users,
I have been using R for about 1 year, and I have run across a
couple of graphics problems that I am not quite sure how to address. I have
read the email threads on the R list regarding the differences between
density and relative frequencies (count/sum(count)), and I am hoping that
someone could provide me with some advice/comments concerning my
approach. I will admit that some of the underlying mathematics of the
density discussion is beyond my current understanding, but I am looking
into it.
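For reference, the distinction between density and relative frequency can be sketched in a few lines. This is only a minimal illustration, assuming simulated data (the vector x below is hypothetical and is not the data set discussed in this post). It relies on hist() returning counts, breaks, and density components.

```r
# Hypothetical data: 1000 simulated values in (0, 1); NOT the data from this post.
set.seed(1)
x  <- rbeta(1000, 2, 15)
bk <- c(0, .05, .1, .15, .2, .25, .3, .35, 1)
h  <- hist(x, breaks = bk, plot = FALSE)

rel_freq <- h$counts / sum(h$counts)  # proportion of observations in each bin
width    <- diff(h$breaks)            # bin widths (the last bin is much wider)

# Density is relative frequency divided by bin width,
# so the bar areas (density * width) sum to 1.
all.equal(h$density, rel_freq / width)  # TRUE
sum(h$density * width)                  # 1, up to rounding
```

With equal bin widths the two scales differ only by a constant factor. With unequal widths, as with the 0.35-1 bin here, the shapes differ.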
I have a data set (600,000 obs) used to parameterize a probabilistic causal
model where each obs is a population response for one of 2 classes (either
regs1 or regs2). I have been attempting to create 1 marginal probability
plot with 2 lines (one for each class). Using my rather rough code, I
created a plot that seems to adhere to the commonly used (although, from
what I understand, incorrect) relative frequency histogram approach.
My rough code looks like this:
bk <- c(0, .05, .1, .15, .2, .25, .3, .35, 1)   # bin breaks; last bin is 0.35-1
par(mfrow = c(1, 1))
# bin each class without drawing the histograms
fawn1 <- hist(MFAWNRESID[regs1], plot = FALSE, breaks = bk)
fawn2 <- hist(MFAWNRESID[regs2], plot = FALSE, breaks = bk)
# relative frequencies: bin counts divided by the total count
count1 <- fawn1$counts / sum(fawn1$counts)
count2 <- fawn2$counts / sum(fawn2$counts)
b <- c(0, .05, .1, .15, .2, .25, .3, .35)   # left edge of each bin
plot(count1 ~ b, xaxt = "n", xlim = c(0, .5), ylim = c(0, .40), pch = ".", bty = "l")
lines(spline(count1 ~ b), lty = 1, lwd = 2, col = "black")
lines(spline(count2 ~ b), lty = 2, lwd = 2, col = "black")
axis(side = 1, at = b)
Using the above, I get frequency values for regs1 that look like this
(which is the same as output for my probabilistic model):
> count1
[1] 1.213378e-01 3.454324e-01 3.365343e-01 1.580839e-01 3.342101e-02
[6] 4.698426e-03 4.488942e-04 4.322685e-05
First, the first value of count1 is the frequency of occurrence within the
range 0-0.05, but it is plotted at b=0, so the point does not really
represent the range. Are there any suggestions on a technique to approach
this?
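One possible approach, offered only as a sketch, is to plot each relative frequency at the midpoint of its bin rather than at the left edge. The mids component returned by hist() holds those midpoints. The data below are simulated and hypothetical, not the MFAWNRESID values.

```r
# Hypothetical data: simulated values in (0, 1); NOT the data from this post.
set.seed(1)
x  <- rbeta(1000, 2, 15)
bk <- c(0, .05, .1, .15, .2, .25, .3, .35, 1)
h  <- hist(x, breaks = bk, plot = FALSE)

rel_freq <- h$counts / sum(h$counts)
mids     <- h$mids   # bin midpoints: 0.025, 0.075, ..., 0.325, 0.675

# Each point now sits in the centre of the range it summarises.
plot(mids, rel_freq, type = "b", bty = "l",
     xlab = "x (bin midpoint)", ylab = "Relative frequency")
```

Note that the last bin (0.35-1) has its midpoint at 0.675 and is far wider than the others, so its relative frequency is not directly comparable with the 0.05-wide bins.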
Next: Using the above code, the x-axis tick labels end at 0.35, but the
axis itself continues (because bk ends at 1?). While there is a chance of
occurrence beyond .35, it is low, and I want to extend the lines to about
.35 and clip the x-axis there. However, I have been unable to figure out
how to clip it. Could someone point me in the correct direction?
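A hedged sketch of one way to clip the axis: in the code above, the plotted range is set by xlim=c(0, .5) in the plot() call rather than by bk, so narrowing xlim to c(0, .35) should end the axis there. Adding xaxs="i" is an assumption about the desired look; it removes the default 4% padding that R adds beyond xlim. The count1 values are copied from the output printed above.

```r
b <- c(0, .05, .1, .15, .2, .25, .3, .35)
# count1 values as printed earlier in this post
count1 <- c(1.213378e-01, 3.454324e-01, 3.365343e-01, 1.580839e-01,
            3.342101e-02, 4.698426e-03, 4.488942e-04, 4.322685e-05)

# xlim sets the plotted range; xaxs = "i" makes the axis stop exactly at xlim.
plot(b, count1, xaxt = "n", xlim = c(0, .35), xaxs = "i",
     ylim = c(0, .40), pch = ".", bty = "l",
     xlab = "MFAWNRESID", ylab = "Relative frequency")
lines(spline(b, count1), lty = 1, lwd = 2)
axis(side = 1, at = b)   # ticks at the bin edges, 0 to 0.35
```

Anything drawn beyond x = 0.35 is clipped to the plot region, so the lines stop at the end of the axis.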
TIA,
Bret A. Collier
Arkansas Cooperative Fish and Wildlife Research Unit
Department of Biological Sciences University of Arkansas