[R] Design-consistent variance estimate

Stas Kolenikov skolenik at gmail.com
Mon Aug 18 16:39:33 CEST 2008


On 8/16/08, Doran, Harold <HDoran at air.org> wrote:
> In terms of the "design" (which is a term used loosely) the schools were
> not randomly selected. They volunteered to participate in a pilot study.

Oh, that's the next level of disaster, then! You may have to work with
treatment effect methods, of which there are many: propensity score
matching, nearest neighbor matching, instrumental variables, etc.
Those methods rely on asymptotics in the number of treatment units,
which here would be schools -- and I would imagine those number in the
dozens rather than the thousands in your study, so a straightforward
application of those methods might be problematic...

At the very least, I would augment the analysis with propensity score
weights: estimate the (school level) probability of participating in
the study (I imagine you have the school characteristics at hand for
your complete universe of schools -- principal's education level, # of
computers per student, fraction free/reduced price lunch, whatever...
you probably know those better than I do :) ), and use the inverse of
that probability as the probability weight. If the selection was
informative, you might see quite different results in the weighted and
unweighted analyses.
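
A minimal sketch of the weighting step, written in Python only to keep
it self-contained. The school records, the `p_hat` values and the
function names are all made up for illustration; in practice `p_hat`
would come from a participation model fitted on the full universe of
schools, and the variance would still need to account for clustering.

```python
# Hypothetical participating schools: an outcome (say, a mean test score)
# and an estimated probability of volunteering, p_hat. All numbers are
# invented for illustration only.
schools = [
    {"name": "A", "score": 70.0, "p_hat": 0.50},
    {"name": "B", "score": 60.0, "p_hat": 0.25},
    {"name": "C", "score": 80.0, "p_hat": 0.50},
]


def unweighted_mean(rows):
    """Plain average over the participating schools."""
    return sum(r["score"] for r in rows) / len(rows)


def ipw_mean(rows):
    """Weighted mean with weights 1 / p_hat (ratio form)."""
    weights = [1.0 / r["p_hat"] for r in rows]
    total = sum(w * r["score"] for w, r in zip(weights, rows))
    return total / sum(weights)


if __name__ == "__main__":
    print("unweighted:", unweighted_mean(schools))
    print("weighted:  ", ipw_mean(schools))
```

In this toy example school B, the least likely to volunteer, gets the
largest weight, so the weighted mean moves toward its score. Very small
values of `p_hat` make the weights explode, so it is worth looking at
their distribution before trusting the weighted result.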

> In Wolter (1985) he shows the variance of a cluster sample with a single stratum
> and then extends that to the more general example. It turns out though in
> many educational assessment studies, the single stage cluster sample is a
> norm and not so rare.

I can see why. Thanks, I'll keep educational statistics examples in
mind for those kinds of designs!

-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.


