[R-sig-ME] Predicted probabilites with CIs for multilevel logistic regression with prior weights

Sam Crawley @@m_cr@w|ey @end|ng |rom w@rpm@||@net
Mon Jun 10 22:49:11 CEST 2019


Yes, I am looking to apply post-stratification weights (the dataset I am using is from Eurobarometer, which provides the weights).

Thanks very much all for your help so far. It looks like I have a bit of reading to do, so I will continue to work on this and let you know if I have further questions.

Cheers,
Sam Crawley.

On Tue, 11 Jun 2019, at 06:15, d.luedecke using uke.de wrote:
> > Feel free to follow up on issue 285 if you have more insight. 

> 

> At least from the technical side, I don’t have more insights, I guess. I already noticed the discussion in #285 some time ago, so I’m lurking, but not actively following 😉

> 

> Terminology used in different papers or from method reports of different surveys also doesn’t seem always consistent to me. I think, “post-stratification weights” were requested by Sam, which are weights based on group (or stratum) characteristics (like the distribution of age or gender proportions). Ben Bolker also mentioned sample weights in #285 (“in addition to these two cases, there's also the case of sampling weights, which is difficult/a mess for complex regression models but worth discussing at least ...”).

> 

> The difference between the weights-argument in typical regression model functions and the survey-package is “The survey package not only allows for adjusting the composition of a sample to the characteristics of the general population. Most base packages would allow you to do that by specifying a weights argument. The survey package goes further by correcting the design effect introduced by the application of post-stratification weights.” (https://tophcito.blogspot.com/2014/04/social-science-goes-r-weighted-survey.html).

> 

> This of course only applies if you actually have survey-data.

> 


> *Von:* Mollie Brooks <mollieebrooks using gmail.com> 
> *Gesendet:* Montag, 10. Juni 2019 19:46
> *An:* d.luedecke using uke.de
> *Cc:* Sam Crawley <sam_crawley using warpmail.net>; Help Mixed Models <r-sig-mixed-models using r-project.org>
> *Betreff:* Re: [R-sig-ME] Predicted probabilites with CIs for multilevel logistic regression with prior weights

> 

> 

>> On 10Jun 2019, at 19:40, <d.luedecke using uke.de> <d.luedecke using uke.de> wrote:

>> 

>> I think that Sam is talking about “sampling” or “survey” weights (as compared to analytical or frequency weights, used by “normal” regression models).

>> 

>> The issue you’re referring to is referenced by another issue (https://github.com/glmmTMB/glmmTMB/issues/440),

> 

> Yes, I (mebrooks) am the one who referenced it and the user (mmeierer) said it fit their needs for "sample weights".


> 

>> which in turn shows an example from Cross Validated:

>> https://stats.stackexchange.com/questions/57107/use-of-weights-in-svyglm-vs-glm

>> 

>> If I use that example, and add a third model fitted with glmmTMB, I get following result when comparing the weights from the fitted objects:

>> 

>> library(glmmTMB)

>> glm2 <- glmmTMB(re78 ~ treat, weights = w , data = lalonde)

>> cbind(glm1$weights, glm11$weights, glm2$frame$`(weights)`)

>> #> [,1] [,2] [,3]

>> #> 1 1.4682453 2.108394 2.108394

>> #> 2 0.9593877 1.377677 1.377677

>> #> 3 0.7489954 1.075554 1.075554

>> #> 4 0.7319955 1.051143 1.051143

>> #> 5 0.7283328 1.045883 1.045883

>> #> 6 0.7244569 1.040317 1.040317

>> 

>> As you can see, “glm” and “glmmTMB” produce the same weights, while the survey-package has different weights… I’m not sure that the weights implemented in glmmTMB are actually “sampling” weights (for surveys, as implemented in the survey package),

> 

> Ok. I don’t know the survey package and don’t have time to look into it now. Feel free to follow up on issue 285 if you have more insight. 

> 

> cheers,

> Mollie


> 

>> or how to reproduce such weights using glmmTMB.

>> 


>> *Von:* Mollie Brooks <mollieebrooks using gmail.com> 
>> *Gesendet:* Montag, 10. Juni 2019 19:04
>> *An:* Sam Crawley <sam_crawley using warpmail.net>; Help Mixed Models <r-sig-mixed-models using r-project.org>
>> *Cc:* d.luedecke using uke.de
>> *Betreff:* Re: [R-sig-ME] Predicted probabilites with CIs for multilevel logistic regression with prior weights

>> 

>> 


>> 
>> 

>>> On 10Jun 2019, at 17:33, <d.luedecke using uke.de> <d.luedecke using uke.de> wrote:

>>> 

>>>> mixed models in R do correctly not account for sampling weights


>>> 
>>> Should be: mixed models in R do *currently* not account for sampling weights

>> 

>> I’m still trying to get a handle of the different definitions of "weights" but I believe we implemented sampling weights in glmmTMB. We do this by weighting the log-likelihood contribution of each observation. I think this is different from prior weights if you mean Bayesian priors. There has been some discussion of the different implementations of "weights" in different R functions (link below) and we still need to update the documentation for glmmTMB 

>> https://github.com/glmmTMB/glmmTMB/issues/285

>> 

>> Here’s a binomial example:

>> 

>> library(glmmTMB)

>> set.seed(123)

>> n=100

>> dat=data.frame(trials=rpois(n, lambda=50), rownum=1:n)

>> dat$success=rbinom(n, dat$trials, prob=.3)

>> dat$rep=sample(1:5, size=n, replace=TRUE) #each observation is repeated 1 to 5 times

>> rows=rep(dat$rownum, each=1, times=dat$rep)

>> dat_disaggregated=dat[rows, ]

>> 
>> 

>> summary(glmmTMB(cbind(success, trials-success)~1, weights=rep, dat, family=binomial))

>> summary(glmmTMB(cbind(success, trials-success)~1, dat_disaggregated, family=binomial))

>> 
>> 

>> and it works with non-integer weights

>> 
>> 

>> summary(glmmTMB(cbind(success, trials-success)~1, weights=rep/5, dat, family=binomial))

>> 
>> 

>> cheers,

>> Mollie

>> 
>> 


>>> 
>>> -----Ursprüngliche Nachricht-----
>>> Von: R-sig-mixed-models <r-sig-mixed-models-bounces using r-project.org> Im
>>> Auftrag von d.luedecke using uke.de
>>> Gesendet: Montag, 10. Juni 2019 17:31
>>> An: 'Sam Crawley' <sam_crawley using warpmail.net>;
>>> r-sig-mixed-models using r-project.org
>>> Betreff: Re: [R-sig-ME] Predicted probabilites with CIs for multilevel
>>> logistic regression with prior weights
>>> 
>>> Hi Sam,
>>> 
>>> you could the "ggeffects" package
>>> (https://strengejacke.github.io/ggeffects/), and there is also an example
>>> for a logistic mixed effects model
>>> (https://strengejacke.github.io/ggeffects/articles/practical_logisticmixedmo
>>> del.html), which might help you.
>>> 
>>> For binomial models, using weights often results in the following warning:
>>> #> non-integer #successes in a binomial glm!
>>> 
>>> However, CIs for the predicted probabilities can be calculated nevertheless
>>> (at least in my quick example). Note that afaik, mixed models in R do
>>> correctly not account for sampling weights. However, Thomas Lumley, author
>>> of the survey-package, works on a survey-function for mixed models
>>> (https://github.com/tslumley/svylme), probably the GitHub version is quite
>>> stable (haven't tested yet).
>>> 
>>> An alternative would be the "scale_weights()" function from the
>>> sjstats-package
>>> (https://strengejacke.github.io/sjstats/articles/mixedmodels-statistics.html
>>> #rescale-model-weights-for-complex-samples ), which rescales sampling
>>> weights so they can be used as "weights" for the mixed models function you
>>> have in R (lme4, lme, ...).
>>> 
>>> Based on that function, I have a small example that demonstrates how to
>>> compute predicted probabilities for mixed models with (sampling) weights
>>> (ignore the warnings, this is just for demonstration purposes):
>>> 
>>> library(lme4)
>>> library(sjstats) # for scale_weights() and sample data
>>> library(ggeffects) # for ggpredict()
>>> 
>>> data(nhanes_sample)
>>> set.seed(123)
>>> nhanes_sample$bin <- rbinom(nrow(nhanes_sample), 1, prob = .3)
>>> nhanes_sample <- scale_weights(nhanes_sample, SDMVSTRA, WTINT2YR)
>>> 
>>> m <- glmer(
>>>  bin ~ factor(RIAGENDR) * age + factor(RIDRETH1) + (1 | SDMVPSU),
>>>  family = binomial(),
>>>  data = nhanes_sample,
>>>  weights = svywght_a
>>> )
>>> 
>>> ggpredict(m, c("age", "RIAGENDR")) %>% plot()
>>> 
>>> 
>>> Best
>>> Daniel
>>> 
>>> -----Ursprüngliche Nachricht-----
>>> Von: R-sig-mixed-models <r-sig-mixed-models-bounces using r-project.org> Im
>>> Auftrag von Sam Crawley
>>> Gesendet: Montag, 10. Juni 2019 10:36
>>> An: r-sig-mixed-models using r-project.org
>>> Betreff: [R-sig-ME] Predicted probabilites with CIs for multilevel logistic
>>> regression with prior weights
>>> 
>>> Hello all,
>>> 
>>> I am doing a multilevel binomial logistic regression using lme4, and the
>>> survey data I am using requires weights to be used. I would like to
>>> calculate various predicted probabilities with confidence intervals based on
>>> the estimated model. The predict function obviously doesn't give me standard
>>> errors, and the recommended method to get these is to use the bootMer
>>> function.
>>> 
>>> However, in my case, the weights provided are not integers, and the bootMer
>>> function exits with an error if the weights are not integers (I raised a
>>> GitHub issue about this, and was pointed to this list:
>>> https://github.com/lme4/lme4/issues/524 ).
>>> 
>>> So my question is, what is the best way to calculate the predicted
>>> probabilities (with confidence intervals) in my case?
>>> 
>>> I would appreciate any help you can give me, and I'm happy to provide more
>>> details if required.
>>> 
>>> Thanks,
>>> Sam Crawley.
>>> 
>>> _______________________________________________
>>> R-sig-mixed-models using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>> 
>>> --
>>> 
>>> _____________________________________________________________________
>>> 
>>> Universitätsklinikum Hamburg-Eppendorf; Körperschaft des öffentlichen
>>> Rechts; Gerichtsstand: Hamburg | www.uke.de
>>> Vorstandsmitglieder: Prof. Dr. Burkhard Göke (Vorsitzender), Prof. Dr. Dr.
>>> Uwe Koch-Gromus, Joachim Prölß, Marya Verdel
>>> _____________________________________________________________________
>>> 
>>> SAVE PAPER - THINK BEFORE PRINTING
>>> 
>>> _______________________________________________
>>> R-sig-mixed-models using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>> 
>>> --
>>> 
>>> _____________________________________________________________________
>>> 
>>> Universitätsklinikum Hamburg-Eppendorf; Körperschaft des öffentlichen Rechts; Gerichtsstand: Hamburg | www.uke.de
>>> Vorstandsmitglieder: Prof. Dr. Burkhard Göke (Vorsitzender), Prof. Dr. Dr. Uwe Koch-Gromus, Joachim Prölß, Marya Verdel
>>> _____________________________________________________________________
>>> 
>>> SAVE PAPER - THINK BEFORE PRINTING
>>> 
>>> _______________________________________________
>>> R-sig-mixed-models using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

>> 

>> 

>> 
>> Universitätsklinikum Hamburg-Eppendorf; Körperschaft des öffentlichen Rechts; Gerichtsstand: Hamburg | www.uke.de
>> Vorstandsmitglieder: Prof. Dr. Burkhard Göke (Vorsitzender), Prof. Dr. Dr. Uwe Koch-Gromus, Joachim Prölß, Marya Verdel

>> 
>> SAVE PAPER - THINK BEFORE PRINTING

> 

> 
> 
> Universitätsklinikum Hamburg-Eppendorf; Körperschaft des öffentlichen Rechts; Gerichtsstand: Hamburg | www.uke.de
> Vorstandsmitglieder: Prof. Dr. Burkhard Göke (Vorsitzender), Prof. Dr. Dr. Uwe Koch-Gromus, Joachim Prölß, Marya Verdel

> 
> SAVE PAPER - THINK BEFORE PRINTING


	[[alternative HTML version deleted]]



More information about the R-sig-mixed-models mailing list