[R-sig-ME] P-values from interaction terms using lme4

Sun Jun 12 10:31:20 CEST 2016

Hi Sam,

If you're getting p-values from lmer outside of a likelihood-ratio test,
then you're using lmerTest, not lme4. lmerTest is designed to be a
drop-in replacement for lme4, but it does bring some extra
features/'complications' in the form of the denominator degrees of
freedom.

There are two easy options for getting ANOVA-style p-values:

1. Using lmerTest, you can just do
anova(mod1,ddf="Satterthwaite",type=2) 

If you fitted mod1 with vanilla lme4, then you need to do this first:

library(lmerTest)
mod1 <- as(mod1,"merModLmerTest")

which will cast your model to an lmerTest model.

2. Using the car package:

library(car)
Anova(mod1,test.statistic="F",type=2)

If you fitted your model with maximum likelihood, i.e. with REML=FALSE,
then you need to refit your model using REML first:

mod1 <- update(mod1,REML=TRUE)
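
A rough sketch of the two options side by side. This is not Sam's data: it
assumes the `cake` dataset that ships with lme4 as a stand-in (recipe and
temperature play the roles of Playback and UR, and recipe:replicate the role
of bird ID). The model is fitted through lmerTest's own lmer(), so no cast is
needed, and the F test from car assumes the pbkrtest package is installed.

```r
library(lmerTest)  # attaches lme4 as well; its lmer() supports ddf methods
library(car)       # Anova(); the F test additionally relies on pbkrtest

# Stand-in data (assumption): 'cake' from lme4, two crossed factors plus a
# grouping factor, fitted with REML as required for the car F test.
mod1 <- lmer(angle ~ recipe * temperature + (1 | recipe:replicate),
             data = cake, REML = TRUE)

# Option 1: lmerTest, Type 2 F tests with Satterthwaite denominator df
tab_lmertest <- anova(mod1, ddf = "Satterthwaite", type = 2)
print(tab_lmertest)

# Option 2: car, Type 2 F tests (Kenward-Roger denominator df)
tab_car <- Anova(mod1, test.statistic = "F", type = 2)
print(tab_car)
```

Both tables should have one row per main effect and one for the interaction;
the interaction row is the one that matters for the question here.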

For both options, you need to specify the type of test you're doing. I
highly recommend Type 2, but there is a lot of material on that debate:
search for e.g. Venables' "Exegeses on linear models", or look at the
documentation for car::Anova(). Please note that the distinction between
Type 2 and Type 3 is *very* relevant for your question since you're
concerned about interaction terms.

For the lmerTest route, you can specify the approximation to use for the
denominator degrees of freedom: Satterthwaite is much faster, but
Kenward-Roger is more accurate. For car::Anova(), the F-statistic is
always computed with Kenward-Roger (which only works for REML-fitted
models, for reasons that I can't explain quickly), but you have the
option of using a Chisq test statistic, which is equivalent to assuming
that the denominator degrees of freedom are infinite or, equivalently,
that the t-values for your coefficients are z-values.
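
To make the choice of approximation concrete, here is a minimal sketch that
puts the denominator degrees of freedom from the two lmerTest methods next to
each other and also requests the Chisq version from car. It again assumes the
`cake` data from lme4 as a stand-in for the real data, and that pbkrtest is
installed for the Kenward-Roger computation.

```r
library(lmerTest)
library(car)

# Stand-in model (assumption): 'cake' from lme4, REML fit (the lmer default)
mod1 <- lmer(angle ~ recipe * temperature + (1 | recipe:replicate),
             data = cake)

satt <- anova(mod1, ddf = "Satterthwaite", type = 2)
kr   <- anova(mod1, ddf = "Kenward-Roger", type = 2)  # slower, uses pbkrtest

# Denominator df per term under each approximation
ddf_table <- data.frame(term          = rownames(satt),
                        Satterthwaite = satt$DenDF,
                        KenwardRoger  = kr$DenDF)
print(ddf_table)

# Wald chi-square tests: no denominator df at all (treated as infinite)
chi <- Anova(mod1, test.statistic = "Chisq", type = 2)
print(chi)
```

With a balanced design like this one the two approximations will often
agree closely; differences tend to show up with unbalanced or small data.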

These ANOVA-style tests are Wald tests and are asymptotically equivalent
to the LR-tests, but are less conservative for finite samples. 
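
For comparison, the LR-test for an interaction term compares a model with the
interaction against one without it. A minimal sketch, again assuming the
`cake` data from lme4 as a stand-in; both models are fitted with ML
(REML=FALSE) because they differ in their fixed effects:

```r
library(lme4)

# Full model with the interaction, fitted by maximum likelihood
full <- lmer(angle ~ recipe * temperature + (1 | recipe:replicate),
             data = cake, REML = FALSE)

# Reduced model: same random effects, main effects only
reduced <- lmer(angle ~ recipe + temperature + (1 | recipe:replicate),
                data = cake, REML = FALSE)

# Likelihood-ratio test of the interaction term
lrt <- anova(reduced, full)
print(lrt)
```

In the question's notation this corresponds to comparing
response ~ Playback*UR against response ~ Playback + UR with the same random
effects. Note that anova() on two lmer fits refits REML models with ML by
default, so the explicit REML=FALSE mainly documents the intent.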

Now, since you care about particular contrasts within the model, you may
also just want to look at your model coefficients. Depending on which
coding scheme you're using, the contrasts represented by your model
coefficients might not be the ones you want, but packages like lsmeans
(a regular mention on the list here and very well documented) can
compute all types of contrasts post-hoc. Rereading your question, this
may be the best way to go for your target conclusion/result statements.
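
A sketch of what that could look like, using emmeans, the successor package
to lsmeans by the same author. Everything about the data is an assumption:
the data frame `birds` is simulated to mimic the design in the question
(40 birds, half urban and half rural, three playbacks each), with no real
effect built in, and the variable names simply follow the question.

```r
library(lme4)
library(emmeans)  # successor to lsmeans

set.seed(2016)
# Hypothetical data: one row per bird and playback type
birds <- expand.grid(
  Playback = c("undegraded", "degraded", "very_degraded"),
  ID = 1:40
)
birds$Playback <- factor(birds$Playback,
                         levels = c("undegraded", "degraded", "very_degraded"))
birds$UR <- factor(ifelse(birds$ID <= 20, "urban", "rural"))
bird_offset <- rnorm(40, sd = 2)               # bird-level random intercepts
birds$approach <- 10 + bird_offset[birds$ID] + rnorm(nrow(birds), sd = 3)
birds$ID <- factor(birds$ID)

mod1 <- lmer(approach ~ Playback * UR + (1 | ID), data = birds)

# Estimated marginal means of habitat within each playback type ...
emm <- emmeans(mod1, ~ UR | Playback)
# ... and the urban vs rural contrast for each playback type
urban_vs_rural <- pairs(emm)
print(urban_vs_rural)
```

Each row of the result is a statement of the form "urban vs rural for
undegraded playbacks", which is the kind of conclusion asked for. Depending
on which helper packages are installed, emmeans picks the degrees-of-freedom
method itself, so it is worth checking the note printed below the table.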

Best,
Phillip

On Sat, 2016-06-11 at 16:41 +0000, Sam Hardman [sah74] wrote:
> Dear all,
> 
> 
> I have some data which I would like to analyse using lme4 and I would really appreciate some help deciding what the best method is.
> 
> 
> My experiment is as follows:
> 
> 
> I tested the responses of urban and rural great tits to playbacks of great tit song from a loud speaker within their territories.
> 
> 
> I created three playback song types:
> 
> -undegraded
> 
> -degraded
> 
> -very degraded
> 
> 
> I played these to birds in 20 different cities. In each city I tested one bird in the city centre and one bird in a rural location outside of the city (i.e. paired samples). Each bird received all three playback types.
> 
> 
> I measured five different responses to these playbacks:
> 
> -Time to sing back to playback (in seconds)
> 
> -Time to sing back to playback (in seconds)
> 
> -Time the bird spent within five metres of the speaker (in seconds)
> 
> -Number of times the bird flew over the speaker (count)
> 
> -The closest approach the bird made to the speaker (in metres)
> 
> 
> For each of these five responses I would like to know if there is an interaction between habitat and playback type.
> 
> 
> So, I have a model which looks like this:
> 
> 
> mod1 <- lmer(response ~ Playback*UR + (1|ID))
> 
> 
> Where response is one of the five response behaviours, Playback is the playback type, UR is habitat type (urban or rural) and ID is the ID of the bird.
> 
> 
> This gives me results and P-values, but I am not sure these P-values are valid, and I think I should compare this model to a null model to get valid P-values.
> 
> 
> So I can use a likelihood ratio test to test for differences in response by habitat type alone:
> 
> 
> mod1<-lmer(approach ~ UR + (1|Location))
> mod2<-lmer(approach ~ 1 + (1|Location))
> anova(mod1, mod2)
> 
> or for differences in response according to playback type alone:
> 
> mod1<-lmer(approach ~ Playback + (1|Location))
> mod2<-lmer(approach ~ 1 + (1|Location))
> anova(mod1, mod2)
> 
> But how should I do this when there is an interaction term? Ideally I would like P-values for each playback type in interaction with habitat, e.g.
> 
> Undegraded playback * Habitat (urban/rural)
> Degraded playback * Habitat (urban/rural)
> Very degraded playback * Habitat (urban/rural)
> 
> This would allow me to say, for example, something like "urban birds approached the speaker more closely than rural birds in response to undegraded playbacks". I would like to do this with each of the five response behaviours.
> 
> I would really appreciate any suggestions for the best way forward with this and apologies if this question is too simple for this group.
> 
> Best wishes,
> Sam
> 
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models