[R-sig-ME] correspondence between intercept in a logit model and mean y response/probability

Jarrod Hadfield j.hadfield at ed.ac.uk
Wed Jun 29 16:52:40 CEST 2011


Hi,

It's perhaps easier to understand in terms of the odds.

The expected odds for the observations are:

  E[exp(Xb+Zu)] = exp(Xb)E[exp(Zu)]

since exp(Xb) is a constant and so E[exp(Xb)] = exp(Xb) and
COV(exp(Xb), exp(Zu))=0.

In your case you also treated exp(Zu) as fixed (u=0, and so
E[exp(Zu)]=exp(Zu)=1), hence:

E[exp(Xb+Zu)] = exp(Xb)

conditional on u=0.
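
To make the u=0 case concrete, here is a minimal sketch in R using the intercept reported in the original question (-1.28111). The names b0, odds_u0 and prob_u0 are only illustrative and do not come from the fitted model object:

```r
# Sketch: odds and probability conditional on u = 0 (intercept-only model, X = 1).
b0 <- -1.28111           # intercept on the logit scale, as reported in the thread

odds_u0 <- exp(b0)       # exp(Xb): the odds when all random effects are zero
prob_u0 <- plogis(b0)    # inverse logit, identical to odds_u0 / (1 + odds_u0)

print(c(odds = odds_u0, prob = prob_u0))
```

The probability printed here should agree with plogis(fixef(mod1)) in the original post (about 0.217), which is why it sits below the raw mean of the response.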

However, averaging over the population - treating Zu as a random
variable - is different. Imagine Z=I, then

E[exp(Zu)] = E[exp(u)] = exp(VAR(u)/2)

since exp(u) is log-normal if u is normal on the link scale (with mean
zero and variance VAR(u)).

In this case E[exp(Xb+Zu)] depends on the variance of the random effects:

E[exp(Xb+Zu)] = exp(Xb)*exp(VAR(u)/2) = exp(Xb+VAR(u)/2)
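
A quick numerical check of this, as a sketch only. The intercept is the one reported in the thread, but v = 1.5 is an assumed, made-up variance (the fitted variance components are not given here), so the numbers are illustrative rather than a reproduction of mod1. The last part compares the probability-scale approximation quoted below, plogis(eta/sqrt(1+c2*v)), with direct numerical integration:

```r
# Sketch: conditional (u = 0) versus population-averaged quantities.
eta <- -1.28111   # intercept reported in the thread (link scale)
v   <- 1.5        # ASSUMED variance of u; illustrative, not from the fitted model

# Odds scale: E[exp(u)] for u ~ N(0, v), by numerical integration
e_exp_u <- integrate(function(u) exp(u) * dnorm(u, mean = 0, sd = sqrt(v)),
                     lower = -Inf, upper = Inf)$value

odds_conditional <- exp(eta)             # odds at u = 0
odds_marginal    <- exp(eta) * e_exp_u   # odds averaged over u
odds_closed_form <- exp(eta + v / 2)     # log-normal mean formula

# Probability scale: E[plogis(eta + u)] has no closed form, so integrate it
prob_marginal <- integrate(function(u) plogis(eta + u) *
                             dnorm(u, mean = 0, sd = sqrt(v)),
                           lower = -Inf, upper = Inf)$value

# Approximation quoted later in this thread
c2 <- ((16 * sqrt(3)) / (15 * pi))^2
prob_approx <- plogis(eta / sqrt(1 + c2 * v))

print(c(numerical = e_exp_u, closed_form = exp(v / 2)))
print(c(conditional = odds_conditional, marginal = odds_marginal,
        closed_form = odds_closed_form))
print(c(at_u0 = plogis(eta), marginal = prob_marginal, approx = prob_approx))
```

On the odds scale the integration and the closed form should agree up to numerical error. On the probability scale the averaged value lies between plogis(eta) and 0.5, and the approximation should track it closely without matching it exactly.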

Jarrod


Quoting Malcolm Fairbrother <m.fairbrother at bristol.ac.uk> on Wed, 29  
Jun 2011 15:23:09 +0100:

> Thanks Jarrod, I figured it would be something elementary like that.  
> Also, I see that:
>
> mean(plogis(mod1@eta))
> and
> mean(mod1@mu)
>
> both yield 0.3473491--very close to the mean of longdata$contact (0.3503684).
>
> However, I don't really know what to make of the "predicted mode".  
> The usual explanation of logit models says something like: (a) we're  
> interested in probabilities; (b) we model the log-odds by necessity;  
> and (c) having fitted a logit model it's useful to reconvert the  
> expected values for different combinations of covariates back to  
> probabilities, using prob = exp(XB)/(1+exp(XB)). If "prob" in this
> case is estimated to be 0.2173616, whereas we know that the overall  
> average probability in the dataset is 0.3503684... what gives?  
> What's the relationship between the "predicted mode" and the  
> "expected probability"?
>
> Much appreciated,
> Malcolm
>
>
> On 29 Jun 2011, at 13:48, Jarrod Hadfield wrote:
>
>> Hi,
>>
>> 0.2173616 is the predicted mode. The inverse-logit transform is  
>> non-linear so f(E[x]) does not equal E[f(x)].
>>
>> E[f(x)] can be approximated (well) as:
>>
>> c2<-((16*sqrt(3))/(15*pi))^2
>> plogis(eta/sqrt(1+c2*v))
>>
>> where eta is the linear predictor on the link scale (the intercept  
>> in your case), and v is the variation around the linear predictor  
>> on the link scale (probably the sum of your variance components).
>>
>> Jarrod
>>
>>
>>
>>
>> Quoting Malcolm Fairbrother <m.fairbrother at bristol.ac.uk> on Wed,  
>> 29 Jun 2011 10:31:25 +0100:
>>
>>> Dear list,
>>>
>>> I'm fitting a mixed logit model with lme4, and finding something  
>>> that seems weird to me, but probably has a simple explanation. I  
>>> suspect someone on this list will be able to clarify what's going  
>>> on. In brief, the issue is the correspondence between the  
>>> intercept term in a mixed logit model and the mean  
>>> response/probability of an outcome across all units.
>>>
>>> The mean of my binary response variable is about 0.35:
>>>
>>>> mean(longdata$contact)
>>> [1] 0.3503684
>>>
>>> But when I fit mod1 below, the Intercept is estimated to be  
>>> -1.28111, which does NOT correspond to this mean response:
>>>
>>>> mod1 <- lmer(contact ~ 1 + (1 | group) + (1 | id), longdata,  
>>>> family=binomial)
>>>> plogis(fixef(mod1))
>>> (Intercept)
>>> 0.2173616
>>>
>>> Huh? Why is this happening? Is it something to do with the  
>>> shrinkage that occurs because of the clustering in higher-level  
>>> units? I would have expected an intercept term close to the  
>>> log-odds equivalent of a probability of 0.35. I presume the  
>>> difference between empirical and modelled mean probability isn't  
>>> indicative of any big problems, and indeed might be a useful  
>>> result, but I'd like to know what I should understand by it.
>>>
>>> Any help would be much appreciated (and apologies for posting a  
>>> lot to this list recently).
>>>
>>> - Malcolm
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>>
>>
>>
>>
>> --
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
>>
>>
>
>
>


