[R] Using Sampling Weights in R

Thomas Lumley tlumley at u.washington.edu
Wed Apr 11 17:26:23 CEST 2007


On Tue, 10 Apr 2007, Thomas W. Volscho wrote:

> Dear List,
> I have a dataset that provides sampling weights (National Survey of
> Family Growth 2002).  I want to produce a cross-tabulation and use the
> provided sampling weights to obtain representative population estimates.
> (I believe they are simply frequency weights but codebook is
> uninformative).

They are almost certainly not simply frequency weights -- the NCHS web 
page on this survey describes a multistage sampling scheme and gives code 
examples for survey software using the design features.  If you only want 
point estimates and no intervals or p-values then it doesn't matter what 
type of weights they are.
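
As a minimal sketch of the point-estimate case, using made-up data (the data frame 'toy' and its columns agegrp, pill and wt are hypothetical, not NSFG variables), base R can produce a weighted cross-tabulation by summing the weights in each cell. It gives no standard errors or tests.

```r
# Hypothetical toy data; 'wt' plays the role of the sampling weight.
toy <- data.frame(
  agegrp = c("15-24", "15-24", "25-44", "25-44", "25-44"),
  pill   = c("yes", "no", "yes", "no", "no"),
  wt     = c(1000, 1500, 2000, 500, 1000)
)

# Weighted cross-tabulation: each cell is the sum of the weights
# of the observations falling in that cell.
wtab <- xtabs(wt ~ agegrp + pill, data = toy)
wtab

# Estimated population shares for each cell.
prop.table(wtab)
```

These totals do not depend on how the weights are interpreted; only the intervals and p-values do.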

> I can reproduce results (using this data) that were reported in a recent
> journal article, if I use SPSS or STATA--specifying the weight.

The "survey" package does analysis of surveys of this sort.

Judging from the Stata example at 
http://www.cdc.gov/nchs/data/nsfg/Ser2_Example1_FINAL.pdf
  if the data are in a data frame called 'nsfg' you would create a survey 
design object with

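   # SECU_R = cluster (PSU), SEST = stratum, FINALWGT = sampling weight;
   # add nest=TRUE if the SECU_R codes are reused across strata
   dnsfg <- svydesign(ids=~SECU_R, weights=~FINALWGT, strata=~SEST,
     data=nsfg)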

and then you could get crosstabs with, e.g.,

   svytable(~agerx+pill, design=dnsfg)
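
For anyone without the NSFG file to hand, the same two steps can be tried on made-up data. This is only a sketch: the data frame 'toy' and its columns (stratum, psu, wt, agegrp, pill) are hypothetical, and nest=TRUE is used because the toy PSU codes repeat across strata.

```r
library(survey)

# Hypothetical toy data: 2 strata with 2 PSUs each; PSU codes repeat
# across strata, hence nest = TRUE below.
toy <- data.frame(
  stratum = c(1, 1, 1, 1, 2, 2, 2, 2),
  psu     = c(1, 1, 2, 2, 1, 1, 2, 2),
  wt      = c(100, 100, 150, 150, 200, 200, 50, 50),
  agegrp  = factor(c("15-24", "25-44", "15-24", "25-44",
                     "15-24", "25-44", "25-44", "25-44")),
  pill    = factor(c("yes", "no", "no", "yes",
                     "yes", "no", "no", "yes"))
)

# Survey design object carrying clusters, strata and weights.
dtoy <- svydesign(ids = ~psu, strata = ~stratum, weights = ~wt,
                  data = toy, nest = TRUE)

# Weighted cross-tabulation (estimated population counts per cell).
tab <- svytable(~agegrp + pill, design = dtoy)
tab

# Estimated proportions with design-based standard errors.
svymean(~pill, design = dtoy)
```

The cell counts from svytable() are sums of weights; the standard errors from svymean() are where the design information (clusters and strata) comes in.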



 	-thomas


