[R-sig-ME] correspondence between intercept in a logit model and mean y response/probability

Wed Jun 29 11:31:25 CEST 2011

Dear list,

I'm fitting a mixed logit model with lme4, and finding something that seems weird to me, but probably has a simple explanation. I suspect someone on this list will be able to clarify what's going on. In brief, the issue is the correspondence between the intercept term in a mixed logit model and the mean response/probability of an outcome across all units.

The mean of my binary response variable is about 0.35:

> mean(longdata$contact)
[1] 0.3503684

But when I fit mod1 below, the intercept is estimated to be -1.28111, which back-transforms to a probability of about 0.22 and so does NOT correspond to this mean response:

> mod1 <- lmer(contact ~ 1 + (1 | group) + (1 | id), longdata, family=binomial)
> plogis(fixef(mod1))
(Intercept) 
 0.2173616 
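
For comparison, here is the conversion in the other direction. It is a small sketch using only base R and the two numbers quoted above (the variable names are mine, chosen for illustration); it shows the intercept I would naively have expected, roughly -0.62 on the log-odds scale:

```r
# Values copied from the output above
p_obs  <- 0.3503684   # empirical mean of contact
b0_hat <- -1.28111    # estimated intercept from mod1

# Log-odds implied by the empirical mean (what I expected the intercept to be near)
expected_intercept <- qlogis(p_obs)

# Probability implied by the fitted intercept (same as plogis(fixef(mod1)))
implied_p <- plogis(b0_hat)

print(c(expected_intercept = expected_intercept, implied_p = implied_p))
```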

Huh? Why is this happening? Does it have something to do with the shrinkage that occurs because of the clustering in higher-level units? I would have expected an intercept close to the log-odds equivalent of a probability of 0.35. I presume the difference between the empirical and the modelled mean probability doesn't indicate any big problem, and might even be a useful result, but I'd like to know how I should interpret it.
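
One thing I could check is whether this is simply the difference between a unit-specific intercept (the probability for a unit whose random effects are zero) and a probability averaged over all units. Below is a minimal sketch of that check in base R. It is only a guess at the mechanism: sd_group and sd_id are hypothetical values, not estimates from mod1, and the simulation ignores unbalanced cluster sizes.

```r
set.seed(1)

b0       <- -1.28111  # intercept reported above
sd_group <- 1.5       # HYPOTHETICAL random-intercept SD for group (not from mod1)
sd_id    <- 1.5       # HYPOTHETICAL random-intercept SD for id (not from mod1)

# Draw combined random effects for many simulated units
n <- 200000
u <- rnorm(n, mean = 0, sd = sd_group) + rnorm(n, mean = 0, sd = sd_id)

# Probability for a unit whose random effects are exactly zero
conditional_p <- plogis(b0)

# Probability averaged over the simulated random effects
marginal_p <- mean(plogis(b0 + u))

print(c(conditional = conditional_p, marginal = marginal_p))
```

If the averaged probability moves from plogis(b0) toward 0.5 as the assumed standard deviations grow, that would at least be consistent with what I am seeing, though it would not rule out other explanations.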

Any help would be much appreciated (and apologies for posting a lot to this list recently).

- Malcolm