[R-sig-ME] understanding log-likelihood/model fit

Daniel Ezra Johnson danielezrajohnson at gmail.com
Wed Aug 20 02:19:51 CEST 2008


Thanks for your reply.

> What you observe has everything to do with
> 'the extreme non-normality of the random effects in "null" as opposed to
> "fixed"'.

I thought so too, but it's not correct. The log-likelihood of the
mixed-effects model does not appear to depend on how nearly normal, or
not, the random effect BLUPs are.

That was what led to my original question. When we have a random
effect (subject) and a between-group fixed effect (an "outer" effect),
logically they are competing to account for the same variance.

The mixed model fitting appears to "prefer" the model with the fixed
effect, even though the model with no fixed effect appears to fit the
data equally well (judging by residuals and the fact that the BLUPs
are exactly what they 'should be', no shrinkage here).

I am comfortable with this result, indeed my work depends on it, but I
want to understand better why it comes out this way. This has nothing
to do with practical research design questions.

D
>
> The distribution at the subject level is nowhere near normal.
> The fixed effects account for a rather severe departure from normality.
> For this reason, the between subjects components of variance
> estimate is radically different between the two models.  Also the
> subjects random effects are radically different.  Try
>
> znull <- ranef(null, drop=TRUE)
> zfix <- ranef(fixed, drop=TRUE)
> plot(zfix$subject ~ znull$subject)
>
> The residuals for the two models are simiiar, but not at all the same.
> The residuals from the null model are surprisingly close to normal.
> [Actually, a criticism of the Pinheiro and Bates book is that it relies
> far too heavily on plots of residuals for its diagnostic checking.]
>
> Think about how the results might be used.  The analyses that you
> give seem to imply that the existence of the two groups is hidden
> from you.  You want to generalize to other subjects (the usual
> situation), so that fitting subject as a fixed effect would defeat the
> whole aim of the enterprise.  So, unless you learn of the existence
> of the two groups from initial exploratory analysis of the data (you
> darn well should), you have to do the null analysis.
>
> At this late stage you might examine the random subject effects
> and belatedly conclude that you have been laboring under a gross
> misapprehension.  Or you might charge blindly ahead and use the
> model for predictive purposes.  In the latter case, the accuracy will
> be terrible (high SE), but, hey, you can make predictions on what to
> expect from another subject or group of subjects.
>
> There are strong reasons for trying to ensure that models (1) are
> somewhat near correct, and (2) that they model aspects of the data
> that reflect the intended use of the results.
>
> Now try
> test2 <- data.frame(subject=rep(1:200,each=500),
>        response=500+rep(rnorm(200), each=500)+rnorm(100000,0,10),
>        fixed=(rep(c("A","B"),each=50000)))
>
> lmer(response~ (1|subject), data=test2)
> lmer(response~ fixed+(1|subject), data=test2)
>
> Wrestling with such simulated data is a good way to gain
> understanding and tune one's intuitive processes.
>
> By the way, I believe you should be fitting by maximum likelihood
> (ML), for comparing models with different fixed effects.
>
> John Maindonald             email: john.maindonald at anu.edu.au
> phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
> Centre for Mathematics & Its Applications, Room 1194,
> John Dedman Mathematical Sciences Building (Building 27)
> Australian National University, Canberra ACT 0200.
>
>
> On 20 Aug 2008, at 6:47 AM, Daniel Ezra Johnson wrote:
>
>> Dear All,
>>
>> I'm sure this is a simple question, but I haven't been able to find an
>> answer to it that I can understand. I'd like an answer that is pitched
>> somewhere between the full mathematics on the one hand, and an
>> oversimplified overview on the other.
>>
>> One way of putting the question is, what is the difference, really,
>> between a fixed and a random effect (as they are fit by lmer)?
>> Another way of putting it involves the following example.
>>
>> Suppose we have observations of a response from many subjects.
>> The overall average response is 500.
>> The subjects fall into two groups.
>> Half have an effect of +100 and half an effect of -100.
>>
>> test1 <- data.frame(subject=rep(1:200,each=500),
>>        response=500+c(rep(-100,50000),rep(100,50000))+rnorm(100000,0,10),
>>        fixed=(rep(c("A","B"),each=50000)))
>>
>> The following model treats subject as a random effect:
>>>
>>> null <- lmer(response~(1|subject),test1)
>>
>> The following model keeps the subject effect and adds the fixed effect.
>>>
>>> fixed <- lmer(response~fixed+(1|subject),test1)
>>
>>> null
>>
>> Linear mixed model fit by REML
>> Formula: response ~ (1 | subject)
>>  Data: test1
>>  AIC    BIC  logLik deviance REMLdev
>> 746923 746951 -373458   746923  746917
>> Random effects:
>> Groups   Name        Variance Std.Dev.
>> subject  (Intercept) 10041.81 100.209
>> Residual               100.46  10.023
>> Number of obs: 100000, groups: subject, 200
>> Fixed effects:
>>          Estimate Std. Error t value
>> (Intercept)  500.000      7.086   70.56
>>
>>> fixed
>>
>> Linear mixed model fit by REML
>> Formula: response ~ fixed + (1 | subject)
>>  Data: test1
>>  AIC    BIC  logLik deviance REMLdev
>> 743977 744015 -371984   743960  743969
>> Random effects:
>> Groups   Name        Variance  Std.Dev.
>> subject  (Intercept)  0.016654 0.12905
>> Residual             99.642120 9.98209
>> Number of obs: 100000, groups: subject, 200
>> Fixed effects:
>>           Estimate Std. Error t value
>> (Intercept) 400.11806    0.04647    8610
>> fixedB      199.87485    0.06572    3041
>>
>> The result is what one would expect, intuitively.
>> In the model "null" there is a large subject variance.
>> In the model "fixed" there is virtually no subject variance.
>> In both models the residuals are the same.
>> The logLik of the model with the fixed effect is closer to zero (by about
>> 1500).
>> Therefore, we say the model with the fixed effect fits better.
>>
>> This makes sense. Instead of 100 subject effects near +100 and 100
>> near -100, we have virtually no subject effects and the fixed effect
>> accounts for all the between-subject variance.
>>
>> The question: why? Why does the model with the fixed effect fit better?
>> Why does the smaller (zero) random effect plus the fixed effect
>> translate into an improvement in log-likelihood?
>>
>> It's not anything to do with the residuals. The models make the same
>> predictions:
>>
>>> fitted(null)[c(1:5,50001:50005)]
>>
>> [1] 400.2807 400.2807 400.2807 400.2807 400.2807 600.2013 600.2013
>> 600.2013
>> [9] 600.2013 600.2013
>>
>>> fitted(fixed)[c(1:5,50001:50005)]
>>
>> [1] 400.1282 400.1282 400.1282 400.1282 400.1282 599.9839 599.9839
>> 599.9839
>> [9] 599.9839 599.9839
>>
>> And I don't think it has anything to do with the extreme non-normality
>> of the random effects in "null" as opposed to "fixed".
>>
>> So what's the difference?
>>
>> What, in terms of model fitting, makes it preferable to account for
>> the between-subject variation with a fixed effect (as in "fixed")
>> rather than with a random effect (as in "null")?
>>
>> Thanks for your help,
>> Daniel
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>




More information about the R-sig-mixed-models mailing list