[R] Two Phase Sampling

Tue Jul 11 15:40:53 CEST 2006

Dear WizaRds,

	I tried to construct a two-phase sampling design in the survey package 
just the way I hoped I had understood it in Vienna - I was wrong. I think 
I am too stupid to create the correct subset for phase 2. Phase 1: a 
sample of 1000 parts, 80 of them defective. Phase 2: a subsample of 100 
parts out of these 1000, 15 of them defective. Coding: 0 = ok, 
1 = defective. The table below cross-tabulates the phase-2 result 
against the phase-1 result for the 100 subsampled parts.

Please help me:

library(survey)
# ph1.x: 80 phase-1 defectives, 10 inside the phase-2 subsample, 70 outside
ss1 <- data.frame(id=1:1000, ph1.x=rep(c(1,0,1,0),c(10,90,70,830)),
subset=rep(c(1,0),c(100,900)), ph2.y=rep(c(1,0,NA),c(15,85,900)),
n1=rep(1000,1000), n2=rep(100,1000) )
table(Phase2.y=ss1$ph2.y, Phase1.x=ss1$ph1.x)

 >        Phase1.x
 >Phase2.y  0  1
 >       0 85  0
 >       1  5 10

p2 <- twophase(id=list(~id,~id), strata=list(NULL,NULL),
data=ss1, subset=~subset, fpc=list(~n1,~n2))
svymean(~ph2.y, design=p2)

 >      mean SE
 >ph2.y 0.15  0

However, taking both phases into consideration, the estimator should be:

ph1.x.bar (phase 1) = 80/1000 = 0.08 and ph2.y.bar (phase 2) = 15/100 = 
0.15 defective parts. Within the subsample, the ratio of phase-2 
defectives to phase-1 defectives is 15/10 = 1.5, so the RATIO ESTIMATOR 
gives y.est = 1.5*0.08 = 0.12 defective parts.
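
To make the arithmetic explicit, here is a minimal base-R sketch of the 
hand calculation. It uses only the counts quoted above, does not touch 
the survey package, and the variable names are made up for illustration:

```r
# Counts taken from the post; all variable names are illustrative only.
n1     <- 1000  # phase-1 sample size
x1_def <- 80    # parts flagged defective in phase 1
n2     <- 100   # phase-2 subsample size
x2_def <- 10    # subsampled parts that were flagged defective in phase 1
y2_def <- 15    # subsampled parts found defective in phase 2

x1_bar <- x1_def / n1      # phase-1 mean of x
x2_bar <- x2_def / n2      # mean of x within the phase-2 subsample
y2_bar <- y2_def / n2      # plain phase-2 mean of y (the 0.15 from svymean)

r_hat   <- y2_bar / x2_bar # ratio of phase-2 to phase-1 defectives
y_ratio <- r_hat * x1_bar  # ratio estimate of the proportion defective

print(c(naive = y2_bar, ratio = r_hat, ratio_estimate = y_ratio))
```

The plain phase-2 mean ignores ph1.x entirely, which is why it stays at 
0.15; the ratio estimate rescales it by the phase-1 information.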

What did I do wrong this time? I am positive that the estimate is 12 
defective parts per 100 on average, so how do I construct the twophase 
design correctly?

PS: I hope this is not something undergraduates master with ease...

Thank you so much for your help. I invite you to all the BBQ and beer 
there is in Europe!

Yours always
mark