[R-sig-ME] model specification in lmer

Fri Dec 1 13:09:57 CET 2017

Thank you so much Paul for your detailed and helpful response.

Yes, sorry, there was a typo in the specification of model3. It should be lm instead of lme. I'm also aware of the conservatism of the LRT when comparing lmer and lm objects. 

Could you please provide clarification for this sentence? "The null hypothesis being tested is that both the variance of the Trt effect and its covariance with the random intercept are zero."

Do you mean the variance of the random slope of treatment? The main effect of Treatment is a fixed factor.

Thank you again.

Silvia 

-----Mensaje original-----
De: Paul Johnson [mailto:paul.johnson at glasgow.ac.uk] 
Enviado el: viernes, 1 de diciembre de 2017 2:58
Para: silvia.matesanzgarcia at gmail.com
CC: Voeten, C.C. <c.c.voeten at hum.leidenuniv.nl>; r-sig-mixed-models at r-project.org
Asunto: Re: [R-sig-ME] model specification in lmer

Hi Silvia & CC,

> model3=lme(Trait~Pop*Trt , data=data)

should be 

> model3=lm(Trait~Pop*Trt , data=data)

I didn’t know you could test a random effect with anova(lmer(…, REML = FALSE), lm(…)). Is that valid when fitted with ML? 

I recommend reading http://bbolker.github.io/mixedmodels-misc/glmmFAQ.html :

"With recent versions of lme4, goodness-of-fit (deviance) can be compared between (g)lmer and (g)lm models, although anova() must be called with the mixed ((g)lmer) model listed first. Keep in mind that LRT-based null hypothesis tests are conservative when the null value (such as σ2=0) is on the boundary of the feasible space; in the simplest case (single random effect variance), the p-value is approximately twice as large as it should be (Pinheiro and Bates 2000).
	• Consider not testing the significance of random effects. If the random effect is part of the experimental design, this procedure may be considered ‘sacrificial pseudoreplication’ (Hurlbert 1984). Using stepwise approaches to eliminate non-significant terms in order to squeeze more significance out of the remaining terms is dangerous in any case.
	• consider using the RLRsim package, which has a fast implementation of simulation-based tests of null hypotheses about zero variances, for simple tests. (However, it only applies to lmer models, and is a bit tricky to use for more complex models.)"

> model1=lmer(Trait~Pop*Trt + (Trt|Family), data=data) 
> model2=lmer(Trait~Pop*Trt + (1|Family), data=data) 
> model3=lme(Trait~Pop*Trt , data=data)
> 
> anova(model1,model2) would provide significance for the interaction 
> between Family and Treatment.

To avoid confusion between random and fixed effects, it’s better not to call “(Trt|Family)" an interaction (even though it has a lot in common with an interaction). Pop:Trt is an interaction. The difference between (1|Family) and (Trt|Family) is that in the former only the intercept varies randomly between families, while in the latter the main effect of Trt also varies randomly between families. The null hypothesis being tested is that both the variance of the Trt effect and its covariance with the random intercept are zero.

Another simple way to test for a random effect in a lmer fit (if you still want to despite the advice quoted above) would be to use confint(fit) and see if the 95% CI for the family variance includes zero. I did a very quick-and-dirty permutation-based simulation to compare confint(…, method = "profile") with the anova(lmer, lm) test:

library(lme4)
library(nlme)
sim.res <-
  replicate(1000, {
    Orthodont$Subject <- sample(Orthodont$Subject)
    fm1 <- lmer(distance ~ age + (1 | Subject), data = Orthodont, REML = FALSE)
    fm1.lm <- lm(distance ~ age, data = Orthodont)
    c(LRT = anova(fm1, fm1.lm)[2, "Pr(>Chisq)"] < 0.05, 
      CI = confint(fm1, 1)[1] > 0)
  })
table(sim.res["LRT", ], sim.res["CI", ])

The results were conservative (false positive rate of ~2%), which I expected for the LRT, but not for the CI, and completely concordant, presumably because both use the likelihood:

        FALSE TRUE
  FALSE   982    0
  TRUE      0   18

I ran it a few more times and it was very consistent, ~2% false positives and 100% concordance.

Best wishes,
Paul

> anova(model2, model3) would provide significance for the Family term.
> 
> Is this correct?
> 
> 
> Thank you so much again!!
> 
> 
> 
> 
> 
> -----Mensaje original-----
> De: Voeten, C.C. [mailto:c.c.voeten at hum.leidenuniv.nl]
> Enviado el: jueves, 30 de noviembre de 2017 21:27
> Para: silvia.matesanzgarcia at gmail.com; 
> r-sig-mixed-models at r-project.org
> Asunto: RE: [R-sig-ME] model specification in lmer
> 
> There are various ways to do this. The most important thing to make 
> sure of is that the models you're comparing are both fitted either 
> with or without REML; you can't mix ML and REML fits (that will 
> heavily bias your LRT in favor of the ML model). You can fit the 
> no-random-intercept model using lm(), and use anova() to compare it 
> with a random-intercept lmer model fitted with REML=F. A (perhaps 
> slightly better) alternative is to use gls() from nlme to fit the 
> no-random-intercept model using REML. You can then manually calculate chi-square values like so:
> 
>> library(nlme)
>> library(lme4)
> Loading required package: Matrix
> 
> Attaching package: 'lme4'
> 
> The following object is masked from 'package:nlme':
> 
>    lmList
> 
>> model1 <- gls(Reaction~Days,sleepstudy)
>> model2 <- lmer(Reaction~Days + (1|Subject),sleepstudy) ( neg2ll1 <-
>> -2*logLik(model1) )
> 'log Lik.' 1893.664 (df=3)
>> ( neg2ll2 <- -2*logLik(model2) )
> 'log Lik.' 1786.465 (df=4)
>> ( difference <- as.numeric(neg2ll2-neg2ll1) )
> [1] -107.1986
>> pchisq(difference,1)
> [1] 0
> 
> Good luck!
> ________________________________________
> Van: R-sig-mixed-models [r-sig-mixed-models-bounces at r-project.org] 
> namens silvia.matesanzgarcia at gmail.com 
> [silvia.matesanzgarcia at gmail.com]
> Verzonden: donderdag 30 november 2017 19:55
> Aan: r-sig-mixed-models at r-project.org
> Onderwerp: [R-sig-ME] model specification in lmer
> 
> Hello all
> 
> I am attempting to fit a mixed model in lmer. I have an experimental 
> design with families nested within populations in two different 
> treatments, and I want to test the effects of Pop, Trt, and Pop:Trt as 
> fixed factors, and Family and Family*Trt as random factors.
> 
> 
> 
> My full model would be:
> 
> model1=lmer(Trait~Pop*Trt + (Trt|Family)
> 
> I can then use:
> 
> model2=lmer(Trait~Pop*Trt + (1|Family)
> 
> and
> 
> anova(model1, model2) would provide signification for the interaction term.
> 
> 
> 
> I would need a third model with only the interaction term, so that I 
> can compare it to the full model and obtain significance for the family term.
> But I can't figure out how to do this last part.
> 
> 
> 
> Can I just compare model2 to a model with no random part?
> 
> 
> 
> Thank you so much for your help in advance.
> 
> 
> 
> 
>        [[alternative HTML version deleted]]
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list 
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list 
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models