[R] Post Stratification
Mark Hempelmann
e.rehak at t-online.de
Sun Jun 18 22:06:20 CEST 2006
Dear WizaRds,
having met some of you in person in Vienna, I think even more fondly
of this community and hope to continue on this route. It was great
talking with you and learning from you. Thank you.
I am trying to work through an artificial example in post-stratification.
This is my dataset:
library(survey)
age <- data.frame(id=1:8, stratum=rep( c("S1","S2"),c(5,3)),
weight=rep(c(3,4),c(5,3)), nh=rep(c(5,3),c(5,3)),
Nh=rep(c(15,12),c(5,3)), y=c(23,25,27,21,22, 77,72,74) )
pop.types <- table(stratum=age$stratum)
age.post <- svydesign(ids=~1, strata=NULL, data=age, fpc=~Nh) ## no clusters, no strata
post <- postStratify(design=age.post, strata=~stratum, population=pop.types)
svymean (~y, post)
svytotal (~y, post)
gives
    mean     SE
y 42.625 0.5467
  total     SE
y   341 4.3737
So, is it correct to define pop.types as the number of elements sampled
per stratum (nh), as I did above, or rather as the total number of elements
per stratum (Nh)? If the latter, I would use:
pop.types <- data.frame(stratum = c("S1","S2"), Freq = c(15, 12))
The help says: The 'population' totals can be specified as a table with
the strata variables in the margins, or as a data frame where one
column lists frequencies and the other columns list the unique
combinations of strata variables. ??
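To make the quoted help text concrete, here is a minimal sketch of the two accepted forms of 'population', assuming the population counts per stratum (Nh = 15 and 12) are what is meant. The names pop.df, pop.tab, N.h, n.h, w and ps.mean are only illustrative, and the unweighted svydesign() call is a placeholder design rather than the design used above. The survey part runs only if the package is installed.

```r
# Population (not sample) counts per stratum, as a data frame with a Freq column.
pop.df <- data.frame(stratum = c("S1", "S2"), Freq = c(15, 12))

# The same totals as a one-way table with the stratum variable in the margin.
pop.tab <- xtabs(Freq ~ stratum, data = pop.df)

# Post-stratified weights by hand: N_h / n_h for every sampled unit.
stratum <- rep(c("S1", "S2"), c(5, 3))
y <- c(23, 25, 27, 21, 22, 77, 72, 74)
N.h <- setNames(pop.df$Freq, pop.df$stratum)   # named vector of population counts
n.h <- c(table(stratum))                       # named vector of sample counts
w <- unname(N.h[stratum] / n.h[stratum])       # one weight per observation
ps.mean <- sum(w * y) / sum(w)                 # weighted (post-stratified) mean
print(ps.mean)

# Optional cross-check with the survey package (assumed installed).
if (requireNamespace("survey", quietly = TRUE)) {
  age2 <- data.frame(stratum = stratum, y = y)
  # Placeholder design: no weights given, so equal probabilities are assumed.
  des <- suppressWarnings(survey::svydesign(ids = ~1, data = age2))
  ps <- survey::postStratify(des, strata = ~stratum, population = pop.df)
  print(survey::svymean(~y, ps))
}
```

Under these assumptions the hand-computed weights are Nh/nh within each stratum, and the point estimate from postStratify() should agree with ps.mean. The standard error depends on the design and is not checked here.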
However, I compute:
Nh=c(15,12); nh=c(5,3); N=sum(Nh)
sh=by(age$y, age$stratum, var) ## within-stratum variances: 5.8; 6.33
# Mean estimator
y.bar=by(age$y, age$stratum, mean) ## 23.6; 74.33
estimator=1/N*sum(Nh*y.bar) ## 46.14815
# Variance estimator
vari=1/N^2*sum(Nh*(Nh-nh)*sh/nh)
sqrt(vari) ## .7425903
and with Taylor expansion I get .7750118
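Written out, the hand computation above appears to be the usual stratified estimator of the mean and its variance with finite population correction. Here $s_h^2$ is the within-stratum sample variance (the quantity stored in sh); the symbol $\bar{y}_{st}$ is notation introduced here for illustration only.

```latex
\bar{y}_{st} = \frac{1}{N}\sum_{h} N_h\,\bar{y}_h
             = \frac{15 \cdot 23.6 + 12 \cdot 74.33}{27} \approx 46.148

\widehat{V}(\bar{y}_{st}) = \frac{1}{N^2}\sum_{h} N_h\,(N_h - n_h)\,\frac{s_h^2}{n_h}
             = \frac{174 + 228}{27^2} \approx 0.5514,
\qquad \sqrt{0.5514} \approx 0.7426
```

The two terms in the numerator are $15 \cdot 10 \cdot 5.8 / 5 = 174$ and $12 \cdot 9 \cdot 6.33 / 3 = 228$, which reproduces the value .7425903 computed above.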
Please help me correct my mistakes. Thank you so much.
Yours
mark