# [R] Survey and Stratification

Mark Hempelmann neo27 at t-online.de
Thu May 26 16:10:41 CEST 2005

```
Dear WizaRds,

Working through sampling theory, I tried to understand the concept of
stratification and apply it to a small example with the survey package.
My question is more theoretical in nature, so I apologize if it does not
fully fit this board's purpose, but I have come to a complete stop in my
efforts.

library(survey)

## toy data: two strata, eight sampled units, filled column by column
age<-matrix(c(rep(1,5), rep(2,3), 1:8, rep(3,5), rep(4,3), rep(5,5),
rep(3,3), rep(15,5), rep(12,3), 23,25,27,21,22, 33,27,29), ncol=6, byrow=FALSE)
colnames(age)<-c("stratum", "id", "weight", "nh", "Nh", "y")
age<-as.data.frame(age)

## create survey design objects; only the weights differ between them
## design 1: weights = stratum size Nh
age.des1<-svydesign(ids=~id, strata=~stratum, weights=~Nh, data=age)
svymean(~y, age.des1)
## gives mean 25.568, SE 0.9257

## design 2: weights = sampling fraction nh/Nh
age.des2<-svydesign(ids=~id, strata=~stratum, weights=~I(nh/Nh), data=age)
svymean(~y, age.des2)
## gives mean 25.483, SE 0.9227

## design 3: weights = the precomputed "weight" column (3 and 4)
age.des3<-svydesign(ids=~id, strata=~stratum, weights=~weight, data=age)
svymean(~y, age.des3)
## gives mean 26.296, SE 0.9862

## design 4: no weights supplied
age.des4<-svydesign(ids=~id, strata=~stratum, data=age)
svymean(~y, age.des4)
## gives mean 25.875, SE 0.9437

age.des3 is the only estimate I am able to reproduce correctly by hand.
It is stratified random sampling with inverse probability weighting
with weight = Nh/nh ## stratum size / sample size.

Basically, I thought that supplying ~Nh or ~I(nh/Nh) as the weights
would give the same result, but it does not. I am reading
Thompson (2002), Cochran (1977) and of course Lumley on his survey
package, but I can't find my mistake.

I thought the Hansen-Hurwitz estimator per stratum would give the right
numbers: p1=5/15, p2=3/12, so y1.total=1/5*(3*118), y2.total=1/3*(4*89),
and the stratified estimator for this design should be
1/27*(y1.total+y2.total), which is obviously wrong. How on earth do I get
the numbers that survey is calculating?

I am very sorry to bother you with this problem; however, I have not
found anybody who was willing to help me.

Thank you so much
Mark

```
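
The four point estimates quoted in the post can be checked by hand. The following is a minimal base-R sketch, assuming that `svymean()` reports the weighted mean `sum(w*y)/sum(w)` for whatever weights `w` the design is given; it does not use the survey package and does not reproduce the standard errors. The data frame is rebuilt from the values in the post, and the names `m1` to `m4`, `ybar_h` and `N_h` are illustrative only.

```r
# Data from the post: stratum 1 has nh = 5 of Nh = 15, stratum 2 has nh = 3 of Nh = 12.
age <- data.frame(
  stratum = c(rep(1, 5), rep(2, 3)),
  nh      = c(rep(5, 5), rep(3, 3)),
  Nh      = c(rep(15, 5), rep(12, 3)),
  y       = c(23, 25, 27, 21, 22, 33, 27, 29)
)

# Assumption: the point estimate is sum(w * y) / sum(w) for the supplied weights w.
m1 <- weighted.mean(age$y, age$Nh)           # weights = Nh       (age.des1)
m2 <- weighted.mean(age$y, age$nh / age$Nh)  # weights = nh/Nh    (age.des2)
m3 <- weighted.mean(age$y, age$Nh / age$nh)  # weights = Nh/nh    (age.des3)
m4 <- mean(age$y)                            # equal weights      (age.des4)
print(round(c(m1, m2, m3, m4), 3))

# Textbook stratified mean: stratum totals Nh * ybar_h, divided by N = 27.
ybar_h <- tapply(age$y, age$stratum, mean)   # per-stratum sample means
N_h    <- c(15, 12)                          # stratum sizes
print(sum(N_h * ybar_h) / sum(N_h))          # same value as m3
```

Under that assumption, only the weights `Nh/nh` (3 and 4) reproduce the usual stratified estimator. The stratum totals are then `3*118` and `4*89`, which already contain the division by `nh`, so the extra factors `1/5` and `1/3` in the hand calculation divide by the sample size a second time.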