[R] large survey data set
Andrew Perrin
clists at perrin.socsci.unc.edu
Fri Jun 28 17:42:51 CEST 2002
This is interesting and a bit disturbing. I've been using the weights=
argument of lm() to apply case weights in a survey dataset as well. Can
you point me to documentation on the differences?
Thanks.
----------------------------------------------------------------------
Andrew J Perrin - http://www.unc.edu/~aperrin
Assistant Professor of Sociology, U of North Carolina, Chapel Hill
clists at perrin.socsci.unc.edu * andrew_perrin (at) unc.edu
On Thu, 27 Jun 2002, Thomas Lumley wrote:
> On Thu, 27 Jun 2002, Andrew Perrin wrote:
>
> > The lm function (for linear modelling aka linear regression) includes
> > case weights with a simple syntax:
> >
> > foo<-lm(dependent ~ indep + indep + ... ,
> > data = <data object>,
> > weights = <weight variable>)
>
> Yes, but that isn't what he means by weights...
>
> The standard regression weights are variance weights: a weight of 2
> denotes an observation with half the variance of a weight of 1.
>
> In survey sampling (and in related missing data and causal inference
> models) you need probability weights: a weight of 2 means an observation
> had half the chance of being sampled. You get the same regression
> coefficients (more or less) but quite different standard errors.
>
> The `model-robust' sandwich variance estimators give about the right
> standard errors (as long as the sampling fraction is small). These are
> built in to the survival models, but not in most other software. They are
> pretty easy to calculate but with a 20% sample they probably aren't going
> to work well.
>
>
> -thomas
>
>
>
>
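A minimal sketch of the distinction described in the quoted reply, using base R only. The data are simulated and every name here (x, y, w, fit, bread, meat) is hypothetical; nothing below comes from the original poster's survey. It fits lm() with weights=, then compares the model-based standard errors (which treat w as variance weights) with a hand-computed sandwich estimate of the HC0 type. This assumes independent observations and a small sampling fraction, and it ignores strata and clusters, so treat it as an illustration rather than a full survey analysis.

```r
# Hypothetical simulated data; not taken from the thread.
set.seed(1)
n <- 500
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
w <- runif(n, 1, 5)   # stand-in for sampling weights (1 / selection probability)

# Same point estimates under either interpretation of the weights.
fit <- lm(y ~ x, weights = w)

# Model-based standard errors: lm() treats w as variance (precision) weights.
se_model <- sqrt(diag(vcov(fit)))

# Model-robust sandwich standard errors, computed by hand:
#   (X'WX)^{-1} [ sum_i w_i^2 e_i^2 x_i x_i' ] (X'WX)^{-1}
X <- model.matrix(fit)
e <- residuals(fit)                    # raw (unweighted) residuals
bread <- solve(crossprod(X, w * X))    # (X'WX)^{-1}
meat  <- crossprod(X * (w * e))        # sum of (w_i e_i)^2 x_i x_i'
se_sandwich <- sqrt(diag(bread %*% meat %*% bread))

# Coefficients are shared; only the standard errors differ.
print(cbind(estimate = coef(fit), se_model, se_sandwich))
```

The coefficient column is identical whichever way the weights are read; only the two standard-error columns differ, which is the point made in the quoted text.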