# [R-sig-ME] chi-square mixtures for random effects LRTs

Daniel Ezra Johnson danielezrajohnson at gmail.com
Sat Aug 9 12:54:18 CEST 2008

```
Pinheiro & Bates (2000:86-87) discuss, but in the end do not
recommend, using mixed chi-squared distributions for random effects
likelihood ratio tests.
Their recommendation is to use the conservative 'naive' df, equal to
the difference in the number of non-redundant parameters between the
two models.
They discuss two examples (updated to lmer notation below):

Example 1:
fm1Machine <- lmer(score~Machine+(1|Worker),data=Machines)
fm2Machine <- lmer(score~Machine+(1|Worker/Machine),data=Machines)

Example 2:
fm1OrthF <- lmer(distance~age+(1|Subject),data=Orthodont[Orthodont$Sex=="Female",])
fm2OrthF <- lmer(distance~age+(age|Subject),data=Orthodont[Orthodont$Sex=="Female",])

In Example 1, the difference between the models is the addition of a
(nested) random intercept, therefore the number of parameters
increases by one.
In Example 2, the difference is a random slope, which also generates a
correlation parameter, so the number of parameters increases by two.

Therefore, P&B recommend using a conservative df of 1 for the test in
Example 1, and df of 2 in Example 2.
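
# A minimal sketch (not taken from P&B) of the naive test for Example 1.
# Assumptions: the lme4 package is installed and the Machines data come
# from the nlme package; both models use lmer's default REML fit.
library(lme4)
data(Machines, package = "nlme")
fm1Machine <- lmer(score ~ Machine + (1 | Worker), data = Machines)
fm2Machine <- lmer(score ~ Machine + (1 | Worker/Machine), data = Machines)
# Likelihood ratio statistic: twice the difference in log-likelihoods.
# The fixed effects are identical, so the REML criteria are comparable.
LRTS <- as.numeric(2 * (logLik(fm2Machine) - logLik(fm1Machine)))
# Naive p-value with df = 1 (one additional variance parameter).
p_naive <- pchisq(LRTS, df = 1, lower.tail = FALSE)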

My first question is, if the difference in your model was the addition
of a crossed random effect intercept, not discussed in P&B:

Example 3:
imaginary1 <- lmer(score~sex+(1|subject))
imaginary2 <- lmer(score~sex+(1|subject)+(1|item))

No correlation term would be generated here, so would this pattern
just like Example 1? That is, would df=1 be the naive and conservative
choice?

A second question: if one did wish to employ Stram and Lee's
correction using a mixed chi-squared distribution -- between the df
given above and one less degree of freedom, e.g. Mix(0,1) in Examples
1 (and 3?), and Mix(1,2) in Example 2 -- how would this be done?

P&B implement it somewhere in plot(simulate.lme()) but I cannot find
the code for it. Is it as simple as:

mean(pchisq(LRTS,df=c(0,1),lower.tail=FALSE))   # Example 1 (and 3?); equivalent to
                                                # pchisq(LRTS,df=1,lower.tail=FALSE)/2 in this special case
mean(pchisq(LRTS,df=c(1,2),lower.tail=FALSE))   # Example 2
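
# A minimal sketch of the same 50:50 mixture p-value as a helper function.
# This is an illustration only, not the code behind plot(simulate.lme()).
# The name pchisq_mix and its arguments are hypothetical. It relies on R's
# pchisq() treating df = 0 as a point mass at zero, so the Mix(0,1) case
# reduces to half the df = 1 tail probability.
pchisq_mix <- function(lrts, df) {
  # df is the naive df; weight the df - 1 and df tail probabilities equally
  0.5 * pchisq(lrts, df = df - 1, lower.tail = FALSE) +
    0.5 * pchisq(lrts, df = df, lower.tail = FALSE)
}
pchisq_mix(3.84, df = 1)   # Mix(0,1), as in Example 1 (and possibly 3)
pchisq_mix(3.84, df = 2)   # Mix(1,2), as in Example 2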

If this is correct, it would seem reasonably easy to use Stram & Lee's
df correction as long as the models being compared differ minimally,
as in these examples.
Figure 2.4 in P&B shows that the correction still leaves a
conservative result in the ML case for Example 1, but it still looks
better than the 'naive' df.
So I'm a bit puzzled why P&B don't in the end recommend the