[R] comparative density estimates
friendly at yorku.ca
Fri Mar 24 15:04:50 CET 2006
The cdplot is quite interesting too, though it answers a slightly
different question and seems to finesse the bandwidth question
(maybe not a bad thing).
Here's a similar plot, fleshed out as my others:
Where <- factor(c(rep("North America", length(sub1)),
                  rep("Europe", length(sub2))))
Year <- c(sub1, sub2)
# cdplot(where ~ year, bw = "sj")
cdplot(Where ~ Year, bw = "sj", col=gray.colors(2,start=.7),
       main="Milestones: Place of development")
abline(v= ref, lty=3, col="blue")
laby<- 0.6 + 0.05 * c(0, 1, 2, 3, 5, 3, 5, 2)
text(labx, laby, labels=txt1, cex=1.2, xpd=TRUE)
rug(sub1, quiet=TRUE, col="red", side=3)
This also solves the little problem I had with offsetting
the two rug plots (so as not to rely on color).
But I wonder why my main= title does not appear.
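
For readers without the original data, the layout can be tried on made-up numbers. This is only a sketch: na_years and eu_years are simulated stand-ins for sub1 and sub2 (not the milestones data), and drawing the second rug on side 1 is an assumption about how the two rugs are kept apart.

```r
set.seed(1)
# Simulated stand-ins for sub1 (North America) and sub2 (Europe); NOT the real data
na_years <- runif(60, 1800, 1990)
eu_years <- runif(150, 1500, 1990)

Where <- factor(c(rep("North America", length(na_years)),
                  rep("Europe", length(eu_years))))
Year <- c(na_years, eu_years)

# Conditional density plot of place given year, Sheather-Jones bandwidth
cdplot(Where ~ Year, bw = "sj", col = gray.colors(2, start = 0.7))

# One rug per side, so the two series are told apart by position, not color
rug(na_years, side = 3, quiet = TRUE)   # top axis
rug(eu_years, side = 1, quiet = TRUE)   # bottom axis
```

Putting each rug on its own axis keeps the plot readable in grayscale.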
Achim Zeileis wrote:
> very nice and interesting plots!
> One alternative idea to compare the proportion of milestone items
> (that does not really answer the bandwidth question) in Europe and North
> America might be a conditional density plot. After running your R
> source code, you could do:
> where <- factor(c(rep("North America", length(sub1)),
> rep("Europe", length(sub2))))
> year <- c(sub1, sub2)
> cdplot(where ~ year, bw = "sj")
> showing the decrease in the European proportion.
> Internally, this first computes the unconditional density as in
> plot(density(year, bw = "sj"))
> and then the density for Europe with the same bandwidth.
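
The two steps described above can be reproduced by hand. The following is a minimal sketch on simulated data: sub1_sim and sub2_sim are made-up stand-ins for sub1 and sub2, and the final weighting by the European share of items is the usual way to turn the two densities into a conditional proportion, stated here as an assumption about what is plotted.

```r
set.seed(1)
# Simulated stand-ins for sub1 and sub2; NOT the milestones data
sub1_sim <- runif(60, 1800, 1990)    # "North America"
sub2_sim <- runif(150, 1500, 1990)   # "Europe"
year <- c(sub1_sim, sub2_sim)

# Step 1: unconditional density of all years; this fixes bandwidth and grid
d_all <- density(year, bw = "sj")

# Step 2: density of the European years alone, same bandwidth and same grid
d_eu <- density(sub2_sim, bw = d_all$bw, n = length(d_all$x),
                from = min(d_all$x), to = max(d_all$x))

# Step 3: weight by the share of European items and divide
p_eu <- length(sub2_sim) / length(year)
cond_eu <- p_eu * d_eu$y / d_all$y

plot(d_all$x, cond_eu, type = "l", ylim = c(0, 1),
     xlab = "year", ylab = "estimated P(Europe | year)")
```

Because both densities share one bandwidth, the ratio stays between 0 and 1 and no second bandwidth has to be chosen.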
> Best wishes,
> On Thu, 23 Mar 2006 14:25:53 -0500 Michael Friendly wrote:
>>I have two series of events over time and I want to construct a graph
>>of the relative frequency/density of these events that allows them
>>to be sensibly compared. The events are the milestone items in my
>>project on milestones in the history of data visualization, and I
>>want to compare trends
>>in Europe vs. North America.
>>I decided to use a graph of two overlaid density estimates with rug
>>plots, but then
>>the question arises of how to choose the bandwidth (BW) for the two
>>series to allow them
>>to be sensibly compared, because the range of time and total frequency
>>differ for the two series. To avoid clutter on this list, I've placed the
>>data and R code
>>I have two versions of this graph, one selecting an optimal BW for
>>each series separately,
>>and the other using the adjust= argument of density() to set
>>the BW to the value determined for the whole series combined. The
>>graphs (done with SAS) are shown at
>>The densities in the first are roughly equivalent to the R code
>>d1 <- density(sub1, from=1500, to=1990, bw="sj", adjust=1)
>>d2 <- density(sub2, from=1500, to=1990, bw="sj", adjust=1)
>>the second to
>>d1 <- density(sub1, from=1500, to=1990, bw="sj", adjust=2.5)
>>d2 <- density(sub2, from=1500, to=1990, bw="sj", adjust=0.75)
>>The second graph seems to me to undersmooth the more extensive data
>>from Europe and oversmooth the data from North America.
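
The relationship between the two versions can be checked numerically. In this sketch, on simulated stand-ins for sub1 and sub2 (not the real data), the adjust= factors are computed so that each series ends up with the Sheather-Jones bandwidth of the pooled series; the names adj1, adj2 and bw_all are illustrative only.

```r
set.seed(1)
# Simulated stand-ins for sub1 and sub2; NOT the milestones data
sub1_sim <- runif(60, 1800, 1990)
sub2_sim <- runif(150, 1500, 1990)

# Sheather-Jones bandwidth for each series and for the pooled series
bw1    <- bw.SJ(sub1_sim)
bw2    <- bw.SJ(sub2_sim)
bw_all <- bw.SJ(c(sub1_sim, sub2_sim))

# adjust= factors that rescale each per-series bandwidth to the pooled value
adj1 <- bw_all / bw1
adj2 <- bw_all / bw2

d1 <- density(sub1_sim, from = 1500, to = 1990, bw = "sj", adjust = adj1)
d2 <- density(sub2_sim, from = 1500, to = 1990, bw = "sj", adjust = adj2)

plot(d2, main = "Common bandwidth (simulated data)")
lines(d1, lty = 2)
```

A factor above 1 smooths that series more than its own optimum would, and a factor below 1 smooths it less. Passing bw = bw_all directly to density() gives the same bandwidth without the adjust= arithmetic.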
>>- any comments or suggestions?
>>- are there other methods I should consider?
>>I did find overlap.Density() in the DAAG package, but perversely, it
>>uses a bw=
>>argument to select a B&W/grayscale plot.
>>Michael Friendly Email: friendly at yorku.ca
>>Professor, Psychology Dept.
>>York University Voice: 416 736-5115 x66249 Fax: 416 736-5814
>>4700 Keele Street http://www.math.yorku.ca/SCS/friendly.html
>>Toronto, ONT M3J 1P3 CANADA