[R-sig-ME] estimation of intercept in binomial glmer

Gebregziabher, Mulugeta gebregz at musc.edu
Tue Dec 3 22:35:07 CET 2013


A helpful answer to this question is on page 363 of Fitzmaurice et al. (2004), Applied Longitudinal Analysis.
The approximate relationship between Beta_glm and Beta_glmm is:
Beta_glm ≈ Beta_glmm / sqrt(1 + 0.346 * var(b_i)),
where b_i is the random intercept in the GLMM.
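For concreteness, a rough sketch of applying this conversion to a fitted glmer model (object and column names here are illustrative, not taken from the thread):

library(lme4)

## Assumed names: intercept-only binomial glmer as in the original post
fit <- glmer(RespondA ~ 1 + (1 | Subject), family = binomial, data = data)

beta_glmm <- fixef(fit)["(Intercept)"]    # conditional (subject-specific) intercept
var_b     <- VarCorr(fit)$Subject[1, 1]   # variance of the random intercept b_i

beta_glm_approx <- beta_glmm / sqrt(1 + 0.346 * var_b)  # approximate marginal coefficient
plogis(beta_glm_approx)                                  # implied population-averaged P(A)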
Hope this helps!
--------------------------------
Mulugeta Gebregziabher, PhD
Associate Professor of Biostatistics
Department of Public Health Sciences
Medical University of South Carolina

-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Ben Bolker
Sent: Tuesday, December 03, 2013 4:16 PM
To: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] estimation of intercept in binomial glmer

Björn Lindström <Bjorn.Lindstrom at ...> writes:

> 
> Dear all,
 
> I have a data set with 25 subjects, all with 20 binary responses
> (psychological learning task). Many subjects gave the 1 response (let's
> call this response A and the 0 response B) throughout the task.
 
> My goal is to estimate the probability of A, P(A), and whether it is above
> chance (the latter is trivial in this data set, but I have several
> other similar sets where it's more of an issue).
 
> If I calculate the proportion of A responses for each subject (mean,
> na.rm=T), the sample mean is 0.757 (the sample distribution of
> proportion A is very skewed toward 1, with a few all-0 respondents).
 
> If I instead use glmer:
> glmer(RespondA~1+(1|Subject),family=binomial,data=data),

> Fixed effects:
>       Estimate Std. Error z value Pr(>|z|)
> (Intercept)    3.660      1.143   3.201  0.00137 **
> 
> , which gives an estimate far above 0.757: plogis(3.66) = 0.974.
> This estimate is close to the sample median (median = 1), but does it make
> sense?
> 
> Ordinary glm, ignoring the Subject factor, gives an intercept closer
> to the sample mean:
> 
> Coefficients:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)   1.1995     0.1073   11.18   <2e-16 ***
> 
> (plogis(1.1995) = 0.768)
 
> Can someone please illuminate what's happening here? Is it shrinkage in
> the GLMM? That seems a bit much for just the intercept, right?
> Overdispersion (I don't know much about that...)?

  This is an interesting question; I find it hard to answer precisely without seeing the original data, but it doesn't surprise me very much that in this kind of extreme situation (with complete or near-complete separation for some of the respondents) the results from naive averaging, GLM estimation (which should correspond to averaging on the logit scale), and GLMM estimation would differ considerably.  The GLMM intercept represents (roughly) the population average across individuals of the log-odds response, while the GLM intercept represents the population average across observations.
You might get some enlightenment out of the relevant section of Agresti's _Categorical Data Analysis_ (sorry, don't have it with me) on marginal vs. conditional estimates ...
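For illustration only (simulated data; the numbers and names below are mine, not from the original study), a small script along these lines reproduces the pattern: with a large between-subject variance and many all-1 responders, the conditional GLMM intercept sits far above both the marginal GLM intercept and the raw proportion.

library(lme4)
set.seed(1)

## Simulate 25 subjects x 20 binary trials with large between-subject variability,
## so several subjects respond all-1 (near-complete separation)
n_subj   <- 25
n_trials <- 20
b        <- rnorm(n_subj, mean = 1.2, sd = 2.5)   # subject-specific log-odds of A
sim <- data.frame(
  Subject  = factor(rep(seq_len(n_subj), each = n_trials)),
  RespondA = rbinom(n_subj * n_trials, size = 1,
                    prob = plogis(rep(b, each = n_trials)))
)

mean(sim$RespondA)                       # raw proportion of A responses
fit_glm  <- glm(RespondA ~ 1, family = binomial, data = sim)
fit_glmm <- glmer(RespondA ~ 1 + (1 | Subject), family = binomial, data = sim)
plogis(coef(fit_glm)[1])    # marginal (population-averaged) estimate; matches the raw proportion
plogis(fixef(fit_glmm)[1])  # conditional (subject-specific) estimate; much larger here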

 Ben Bolker
