[R-sig-ME] Model comparison between a full model and a random effect reduced model

Tue Sep 22 17:10:02 CEST 2015

 <lei.he at ...> writes:

> 
> Dear all,

> I’m studying speaker idiosyncratic intensity variability in the
>  speech signal. My dataset looks like this:

> -nPVIm: numeric variable, quantifying intensity variability

> -tempo: factor with 5 levels, indicating five levels of speech rates
> (normal, slow, even slower, fast, fastest possible)

> -sentence: factor with 7 levels, i.e. seven different sentences
> 
> -speaker: factor with 12 levels, i.e. twelve speakers

> I first fitted speaker as a random effect, with the rationale that
>  we cannot exhaust all the possible speakers:

> Full1 = lmer(nPVIm ~ tempo + (1|speaker) + (1|sentence), data=dat, REML=F)

> Then I fitted speaker as a fixed effect. The rationale is that if we
> apply “nPVIm” in a closed-set speaker identification or verification
> system, the speakers are fixed:

> Full2 = lmer(nPVIm~tempo + speaker + (1|sentence), data=dat, REML=F)
> 
> Next, I fitted a reduced model without speaker effect:
> 
> Reduced = lmer(nPVIm~tempo + (1|sentence), data=dat, REML=F)

> Finally, I used the anova() function to test whether Full1 and
>  Full2 are significantly different from Reduced. 

> Results showed that speaker is significant both as a random and as a
> fixed effect, and the AICs of both Full1 and Full2 are lower than
> that of Reduced.

> Now we have received the reviewer’s comments. The reviewer wasn’t
> certain whether the models can be compared like this, especially
> anova(Full1, Reduced). So I would like to ask if our way of model
> fitting and comparisons is free from problems.

 [snip]

 A few things:

* the test of the model with and without a random effect of speaker
is testing whether there is any variation among speakers in their
baseline (normal-tempo) value of nPVIm.  Assuming that's what you want,
the test is *almost* OK but is actually conservative.

From Fox et al. <http://ukcatalogue.oup.com/product/9780199672554.do>,
chapter 13:

Boundary effects: statistical tests for linear models, including
GLMMs, typically assume that estimated parameters could be either
above or below their null value (e.g., slopes and intercepts can be
either positive or negative). This is not true for the random effect
variances in a (G)LMM—they must be positive—which causes problems with
standard hypothesis tests and confidence interval calculations
(Pinheiro and Bates 2000). In the simplest case of testing whether a
single random-effect variance is zero, the p-value derived from
standard theory is twice as large as it should be, leading to a
conservative test (you’re more likely to conclude that you can’t
reject the null hypothesis). To test the null hypothesis that the sole
random-effect variance in a model is equal to zero you can just divide
the p-value by 2. If you want to test hypotheses about random effects
in a model with more than one random effect you will need to simulate
the null hypothesis (section 13.6.2).
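
For the simplest case described in the passage above, the halving can be
sketched in a few lines of base R. This is only an illustration: the
function name boundary_lrt and the log-likelihood values are made up, and
in practice the two log-likelihoods would come from logLik() on the ML
fits with and without the random effect.

```r
# Likelihood-ratio test for a single random-effect variance, with the
# p-value halved to account for the boundary (variance cannot be < 0).
# `boundary_lrt` is a hypothetical helper name, not part of lme4.
boundary_lrt <- function(ll_full, ll_reduced) {
  # LRT statistic; clamp tiny negative values caused by numerical noise
  lrt <- max(0, 2 * (as.numeric(ll_full) - as.numeric(ll_reduced)))
  # naive p-value from the usual chi-square(1) reference distribution
  p_naive <- pchisq(lrt, df = 1, lower.tail = FALSE)
  # boundary-corrected p-value: half the naive one (1 if the statistic is 0)
  p_halved <- if (lrt > 0) p_naive / 2 else 1
  c(lrt = lrt, p_naive = p_naive, p_halved = p_halved)
}

# Hypothetical log-likelihoods, for illustration only
res <- boundary_lrt(ll_full = -1230.2, ll_reduced = -1234.5)
print(res)
```

With fitted lme4 models one would call something like
boundary_lrt(logLik(Full1), logLik(Reduced)); both models must be fitted
to the same data by ML (or both by REML with identical fixed effects).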

This is also discussed in Bolker 2008 (chapter 7, I think) and in
http://glmm.wikidot.com/faq
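
The "simulate the null hypothesis" approach mentioned in the quoted
passage can be sketched as a parametric bootstrap with lme4's simulate()
and refit(). This is a rough sketch under stated assumptions, not a
definitive recipe: the data are simulated stand-ins for the poster's
design (12 speakers, 7 sentences, 5 tempi), all effect sizes are made up,
and the number of simulations is kept small so that it runs quickly.

```r
library(lme4)

set.seed(1)
# Simulated stand-in for the poster's data; all numbers are invented
dat <- expand.grid(
  speaker  = factor(1:12),
  sentence = factor(1:7),
  tempo    = factor(c("normal", "slow", "slower", "fast", "fastest"))
)
dat$nPVIm <- 50 +
  rnorm(12, sd = 2)[as.integer(dat$speaker)] +   # speaker effects
  rnorm(7,  sd = 1)[as.integer(dat$sentence)] +  # sentence effects
  rnorm(nrow(dat), sd = 3)                       # residual noise

# Models with and without the speaker random effect, both fitted by ML
full    <- lmer(nPVIm ~ tempo + (1 | speaker) + (1 | sentence),
                data = dat, REML = FALSE)
reduced <- lmer(nPVIm ~ tempo + (1 | sentence),
                data = dat, REML = FALSE)

# Observed likelihood-ratio statistic
obs_lrt <- as.numeric(2 * (logLik(full) - logLik(reduced)))

# Simulate responses under the null (reduced) model, refit both models
# to each simulated response, and collect the null LRT statistics
nsim  <- 100
sim_y <- simulate(reduced, nsim = nsim, seed = 101)
null_lrt <- vapply(sim_y, function(y) {
  as.numeric(2 * (logLik(refit(full, y)) - logLik(refit(reduced, y))))
}, numeric(1))

# Bootstrap p-value: share of null statistics at least as large as observed
p_boot <- (sum(null_lrt >= obs_lrt) + 1) / (nsim + 1)
p_boot
```

Many of the refits under the null will estimate the speaker variance as
zero and may report singular fits; that is expected here. For a real
analysis a larger nsim would be sensible, and PBmodcomp() in the pbkrtest
package wraps the same idea.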

As for testing between fixed- and random-effects specifications:
I don't normally do this (I don't think it's a question that often
makes sense), but econometricians have considered it; see
"Hausman tests":
https://en.wikipedia.org/wiki/Durbin%E2%80%93Wu%E2%80%93Hausman_test

I don't know if anyone has coded it for lme4 models.