[R] Advice on approach to weighting survey

Thomas Lumley tlumley at uw.edu
Sun Oct 2 21:15:55 CEST 2011


On Sat, Oct 1, 2011 at 4:59 AM, Farley, Robert <FarleyR at metro.net> wrote:
> I'm about to add weights to a bus on-board survey dataset with ~150 variables and ~28,000 records.  My intention is to weight (for each bus "run") by boarding stop and alighting stop.  I've seen the Rake function of the Survey package, but it seems that converting to a "svydesign" might be excessive for my purpose.
>
> My dataset has a huge number of unique "Run-Boarding" and "Run-Alighting" groups, each with a small number of records to expand.  Would it be easier to manually implement Iterative-Proportional-Fitting/Raking/Fratar/Furness on the data?  Or are there benefits to converting the data to a svydesign that would make it valuable?  This "traditional" weighting expands what we call unlinked (based on each boarding) trips.  I'm thinking of also using IPF/Raking to estimate linked (based on each individual) trips.  Would this change the consideration of using the svydesign process?
>

If you're planning to do any analysis afterwards, or if you end up
needing weight trimming or bounding or other slightly more complicated
weight adjustments, it would be useful to have the data in a svydesign
object.  Otherwise it might well be easier just to write your own IPF
algorithm.
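
For what it's worth, a hand-rolled IPF can be quite short in base R.
The following is only a rough sketch for a single run: the data frame,
the column names (board, alight), the control totals and the function
name rake_weights are all made up for illustration, and it assumes the
boarding and alighting totals sum to the same number and that every
stop with a control total has at least one surveyed record.

```r
# Toy data: six surveyed riders on one hypothetical bus run.
survey_df <- data.frame(
  board  = c("S1", "S1", "S2", "S2", "S2", "S3"),
  alight = c("S3", "S4", "S3", "S4", "S4", "S4"),
  stringsAsFactors = FALSE
)

# Hypothetical control totals (e.g. from ride counts); both sum to 100.
board_totals  <- c(S1 = 40, S2 = 45, S3 = 15)
alight_totals <- c(S3 = 30, S4 = 70)

rake_weights <- function(df, board_totals, alight_totals,
                         tol = 1e-8, max_iter = 1000) {
  # Per-record targets: the control total of each record's own stop.
  board_target  <- unname(board_totals[df$board])
  alight_target <- unname(alight_totals[df$alight])
  w <- rep(1, nrow(df))  # start from equal weights
  for (iter in seq_len(max_iter)) {
    # Scale so weighted boardings match the boarding totals.
    w <- w * board_target / ave(w, df$board, FUN = sum)
    # Scale so weighted alightings match the alighting totals.
    w <- w * alight_target / ave(w, df$alight, FUN = sum)
    # The alighting margin is now exact; stop once boardings agree too.
    err <- max(abs(ave(w, df$board, FUN = sum) - board_target))
    if (err < tol) break
  }
  w
}

w <- rake_weights(survey_df, board_totals, alight_totals)

# Check the weighted margins against the control totals.
tapply(w, survey_df$board, sum)
tapply(w, survey_df$alight, sum)
```

With many runs, the same function could be applied to each run's
records separately (for example via split() on a run identifier).  If
the weights later need trimming or bounding, that is where a svydesign
object with the survey package's rake() and trimWeights() is likely to
be less work than extending a loop like this.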

   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland


