[R-sig-DCM] Weighting in DCMs

Thu Feb 24 19:22:49 CET 2011

Good question.  I think weighting post-hoc is perhaps better than nothing 
when a difference is expected -- but is almost certainly problematic since 
the assumption in the (prior) model fitting stage would have been that the 
overall sample was a "random" sample of the population.  So I'd only 
consider that if there is no other option, e.g., if all I had was summary 
statistics.
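
To make the post-hoc idea concrete, here is a minimal Python sketch of post-stratification weighting applied to respondent-level estimates. The data, the 50/50 population split, and the function names are all made up for illustration; nothing here comes from an actual DCM package. Note that it only reweights estimates that were already fit under the pooled-sample assumption, which is exactly the concern described above.

```python
from collections import Counter

def poststrat_weights(groups, population_share):
    """Weight for each respondent = population share / sample share of their group."""
    n = len(groups)
    counts = Counter(groups)
    return [population_share[g] / (counts[g] / n) for g in groups]

def weighted_mean(values, weights):
    """Weighted average of respondent-level values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical sample: 6 women and 2 men, while the population is assumed 50/50.
groups = ["F"] * 6 + ["M"] * 2
# Hypothetical respondent-level utilities for one attribute level.
utilities = [1.0] * 6 + [3.0] * 2

weights = poststrat_weights(groups, {"F": 0.5, "M": 0.5})
unweighted = sum(utilities) / len(utilities)
weighted = weighted_mean(utilities, weights)
print(unweighted, weighted)
```

The weighted figure moves toward the under-represented group, but the individual estimates themselves are unchanged.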

Instead, I'd suggest one (or both) of these options: (1) if you believe
a priori that gender is important, then fit it as a covariate or (if it is
really important or very differently distributed) fit each gender as a
separate sample, in which case you'll have estimates that could be used
separately or (arguably) combined according to general mixture models;
(2) resample the respondents prior to fitting to match the population
prevalence (and ideally, bootstrap that sampling and model the "empirical"
credible distribution).
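
Option (2) could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: `fit` is a hypothetical stand-in for the real model-fitting step (here just a mean, not an HB estimation), the respondent data and the 50/50 target are invented, and the percentile interval is one simple way to summarize the bootstrap draws.

```python
import random

def resample_to_population(respondents, group_of, population_share, n_out, rng,
                           replace=False):
    """Draw respondents so that group shares match population_share.

    Group sizes are rounded, so the total may differ slightly from n_out.
    """
    by_group = {}
    for r in respondents:
        by_group.setdefault(group_of(r), []).append(r)
    sample = []
    for g, share in population_share.items():
        k = round(n_out * share)
        if replace:
            sample.extend(rng.choices(by_group[g], k=k))
        else:
            # Without replacement: requires k <= number of respondents in group g.
            sample.extend(rng.sample(by_group[g], k))
    return sample

def bootstrap_estimates(respondents, group_of, population_share, n_out, fit,
                        n_boot=200, seed=0):
    """Repeat the matched resampling and refit each time."""
    rng = random.Random(seed)
    return [fit(resample_to_population(respondents, group_of,
                                       population_share, n_out, rng))
            for _ in range(n_boot)]

# Hypothetical data: (gender, value) pairs, 60 women and 20 men.
respondents = [("F", 1.0 + 0.01 * i) for i in range(60)] + \
              [("M", 3.0 + 0.01 * i) for i in range(20)]
target = {"F": 0.5, "M": 0.5}           # assumed population prevalence

def fit(sample):
    # Placeholder for the real model fit; returns the sample mean.
    return sum(v for _, v in sample) / len(sample)

estimates = sorted(bootstrap_estimates(respondents, lambda r: r[0], target,
                                       n_out=40, fit=fit))
lower = estimates[int(0.025 * len(estimates))]
upper = estimates[int(0.975 * len(estimates)) - 1]
print(lower, upper)
```

Each draw uses only part of the over-represented group, but across the bootstrap replicates every respondent gets a chance to contribute.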

Personally, barring other information and with the open issues around 
covariates, I'd probably assume that the goal is to get a good population 
estimate and not so much to focus on gender differences or covariates, in 
which case #2 would be my default, and which is similar to handling it with 
sample quotas but perhaps even better.

As for explaining to the client, with #1, I'd probably say "we specifically
modeled the contribution of gender using all the information, so weighting
is not necessary."  For #2, the client might complain about "overpaying" or
"throwing away" sample, so I'd likely say something like "resampling to
ensure population match is part of good data cleaning, and -- if I
bootstrapped -- no one was actually lost because we resampled [or
bootstrapped] to see the effect across the whole sample".  However, I'm
usually my own client and analyst, so that might be a harder sell to
someone else :-)

If the client can handle it (or you're curious) it would be interesting to 
compare this to the weighting approach.  I'd be interested to hear other 
thoughts,

-- chris

--------------------------------------------------
From: "Data Analytics Corp." <walt at dataanalyticscorp.com>
Sent: Thursday, February 24, 2011 5:46 PM
To: <r-sig-dcm at r-project.org>
Subject: Re: [R-sig-DCM] Weighting in DCMs

> Hi,
>
> I don't have an answer to this, but I would like to hear arguments pro and 
> con on the weighting.
>
> Thanks,
>
> Walt
>
> ________________________
>
> Walter R. Paczkowski, Ph.D.
> Data Analytics Corp.
> 44 Hamilton Lane
> Plainsboro, NJ 08536
> ________________________
> (V) 609-936-8999
> (F) 609-936-3733
> walt at dataanalyticscorp.com
> www.dataanalyticscorp.com
> _____________________________________________________
>
> On 2/24/2011 12:31 PM, Dimitri Liakhovitski wrote:
>> Everyone, hi!
>>
>> it's not an R-related question - but it'd be great to hear your 
>> arguments.
>> Sometimes we run a DCM (HB) for a sample and then the clients say: Would
>> you please weight the results - our sample has too many women.
>>
>> I know there are a number of arguments against doing this. Which ones did
>> you find working best with your clients?
>>
>> Thank you!
>> Dimitri
>>
>> _______________________________________________
>> R-SIG-DCM mailing list
>> R-SIG-DCM at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-dcm
>>
>