[R] combined plot of observed and expected fractions

Gerrit Draisma gdraisma at xs4all.nl
Tue Nov 30 17:42:34 CET 2010


Dear R-users,
I have a dataset of numbers of cases with a certain age
and stage.

I want to plot the observed stage distribution by age
and compare it to an expected or predicted one.
Because I want to make these plots for several populations,
I would like to use lattice.

Below is example code I made that produces a plot
that satisfies me. I prefer this to using stacked bars.
But I would like to improve on the code.

Specifically:
Question 1: How to get cumulative fractions from
observed numbers?

Question 2: How to divide each number by the
total number in each age group?

Question 3: How to construct an indicator
for combined categories (here:
observed/expected x Stage)?

Question 4: Lattice: could I use data values
to position the text labels?

I would appreciate any comments.
Thank you,
Gerrit.

=====
# stagexage.r
# compare observed and predicted stage by age.

library(lattice)
# simple data set
df<-data.frame(Age=rep(1:5,each=3),Stage=1:3)
# expectation
df$mu<-5+(df$Age-1)*df$Stage

# simulated data
df$N<-rpois(15,df$mu)

# for lattice xyplot: stack N (OE=1, observed) and mu (OE=2, expected)
df<-reshape(df,direction="long",varying=c("N","mu"),timevar="OE",v.names="N")

# cumulate numbers over Stage within each Age group and OE (Observed, Expected)
# Q1: how to apply cumsum within each age group and OE?
i<-df$Age+5*(df$OE-1)
P<-unlist(tapply(df$N,i,cumsum))

# compute probabilities
# Q2: How to divide by the total number in each age group?
j<-3*(0:29%/%3) + 3
P<-P/P[j]

# combining OE and Stage in a single index
# for superposing graphs in one plot
# Q3: How to create an index from Stage and number (obs or exp)?
ix<-(df$Stage-1)*2+df$OE

# plot observed and predicted fractions in an "area" plot
j<-df$Stage!=3
xyplot(P[j]~Age,groups=ix[j],data=df[j,],type="o",
panel=function(x,y,...){
    panel.xyplot(x,y,...)
# Q4: would it be possible to position labels based
#     on data values?
    panel.text(x=3,y=c(0.15,0.5,0.85),labels=paste("Stage", 1:3))
    },
scales=list(y=list(at=0:4/4,axs="i")),
ylim=c(0,1),
xlab="Age", ylab="Fraction",
par.settings=list(superpose.line=list(col=c("blue","red"),lty=1:2),
superpose.symbol=list(col=c("blue","red"),pch=1)),
auto.key=list(text=c("Obs","Pred"),
    points=FALSE,lines=TRUE,type="o",divide=1,columns=2)
)

=====
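
A possible starting point for Questions 1 to 3, offered as a sketch rather than a definitive answer: base R's ave() applies a function within groups and returns the result in the original row order, and interaction() builds a combined factor. The sketch assumes the long layout produced by reshape() in the script (columns Age, Stage, OE, N). The names long, cumN, P and grp are illustrative only, and the counts are taken from the expectation formula instead of being simulated.

```r
# Long-format data in the same layout as after reshape() in the script
# (illustrative values: both OE levels use the expectation formula)
long <- expand.grid(Stage = 1:3, Age = 1:5, OE = 1:2)
long$N <- 5 + (long$Age - 1) * long$Stage

# Make sure stages are in increasing order within each Age x OE group
long <- long[order(long$OE, long$Age, long$Stage), ]

# Q1: cumulative counts over Stage within each Age x OE group
long$cumN <- ave(long$N, long$Age, long$OE, FUN = cumsum)

# Q2: divide by the total count of the same Age x OE group
long$P <- long$cumN / ave(long$N, long$Age, long$OE, FUN = sum)

# Q3: a single grouping factor for the OE x Stage combinations
long$grp <- interaction(long$OE, long$Stage)
```

With this layout, P and grp could take the place of the hand-built P and ix in the xyplot call.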

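For Question 4, one option might be to compute the label heights from the cumulative fractions themselves, placing each label halfway between the lower and upper boundary of its stage band at a chosen age. This is only a sketch: the names cumfrac, lower and label.y are hypothetical, and the values come from the expectation formula in the script at Age = 3.

```r
# Expected counts per stage at Age = 3, from mu = 5 + (Age - 1) * Stage
mu <- 5 + (3 - 1) * (1:3)

# Upper boundary of each stage band (cumulative fraction)
cumfrac <- cumsum(mu) / sum(mu)

# Lower boundary: 0 for the first stage, previous upper boundary otherwise
lower <- c(0, head(cumfrac, -1))

# Label height: midpoint of each band
label.y <- (lower + cumfrac) / 2

# label.y could then replace the fixed heights in the panel function, e.g.
# panel.text(x = 3, y = label.y, labels = paste("Stage", 1:3))
```
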
-- 
Gerrit Draisma
Department of Public Health
Erasmus MC, University Medical Center Rotterdam
Room AE-235
P.O. Box 2040 3000 CA  Rotterdam The Netherlands
Phone: +31 10 7043787 Fax: +31 10 7038474
http://mgzlx4.erasmusmc.nl/pwp/?gdraisma


