[R-sig-ME] P-values from interaction terms using lme4
Phillip Alday
phillip.alday at mpi.nl
Wed Aug 9 16:58:23 CEST 2017
Re'CCing the list after being bounced due to PGP signature.
On 08/09/2017 04:46 PM, Douglas Bates wrote:
> I think you mean "conditional modes", i.e. the point where the
> conditional density of the random effects is maximized, not "conditional
> models".
Yes, that was a typo.
> Technically it is true that the number of parameters does not depend on
> the number of random effects, only on the number of unique values in the
> covariance matrices for the random effects. However, I think that leads
> to an undercount of the effective number of parameters when, say,
> performing a likelihood ratio test of models that differ in their random
> effects specification.
This is related to the more general issue of degrees of freedom in mixed
models, right?
But in terms of estimation (especially as implemented in
lme4/MixedModels.jl), increasing the levels of a random effect will
generally provide better estimates, right?
Phillip
>
> On Wed, Aug 9, 2017 at 9:08 AM Phillip Alday <phillip.alday at mpi.nl
> <mailto:phillip.alday at mpi.nl>> wrote:
>
> Today is apparently the day for me to follow up on zombie threads:
>
> While Alex is right that mixed-effects models are difficult with small
> sample sizes, there seems to be an implicit assumption that a large
> number of grouping levels and thus a large number of random intercepts
> increases the number of model parameters. This isn't correct. The
> variance of the intercepts is a model parameter, not the intercepts
> themselves -- the individual intercepts are BLUPs / predictions /
> conditional models and not estimates / model parameters. This is the
> difference to including subject-id as a covariate in a classical
> fixed-effects regression. Indeed, increasing the number of groups
> generally helps in mixed-effects regression because variance estimates
> are quite sensitive to small sample sizes. (@Ben Bolker and others if
> I've messed this up, please correct me!)
>
> Your sample size is still nonetheless quite small if you want to model
> e.g. the variance for playback type (which I think could be important in
> your model). Using rmANOVA isn't really better than a mixed-effects
> model here, because rmANOVA *is* an intercepts-only mixed-effects model
> with additional symmetry assumptions (sphericity)!
>
> Compare the output of
> > summary(aov(Reaction ~ Days + Error(Subject),sleepstudy))
>
> Error: Subject
> Df Sum Sq Mean Sq F value Pr(>F)
> Residuals 17 250618 14742
>
> Error: Within
> Df Sum Sq Mean Sq F value Pr(>F)
> Days 1 162703 162703 169.4 <2e-16 ***
> Residuals 161 154634 960
>
> and
>
> > anova(lmer(Reaction ~ Days + (1|Subject),sleepstudy))
> Analysis of Variance Table
> Df Sum Sq Mean Sq F value
> Days 1 162703 162703 169.4
>
> You can get the residual Mean Sq F in the mixed model by looking at the
> variance column of the Residual grouping under Random effects in
> summary()-output.
>
> The advantage to mixed-effects models in this case is that you're doing
> the regression explicitly.
>
> Best,
> Phillip
>
>
>
> On 06/13/2016 01:43 AM, Sam Hardman [sah74] wrote:
> > Hi Alex,
> >
> >
> > The reason I was looking at using mixed models is because I have
> > repeated measures for indivuals (i.e. each bird was tested three times
> > with three different playback treatments). As far as I know ANOVA
> can't
> > control for this. I am wrong about this?
> >
> >
> > You are correct that I have 40 animals in the experiment, with three
> > treatments each that makes 120 measurements per response behaviour. I
> > agree with you that the sample size is small which could cause
> problems
> > with reliability of results. If ANOVA could work then I will certainly
> > look at that. How would I control for repeated measures?
> >
> >
> > Thanks for your help, I appreciate it.
> >
> > Sam
> >
> >
> ------------------------------------------------------------------------
> > *Von:* Alex Fine <abfine at gmail.com <mailto:abfine at gmail.com>>
> > *Gesendet:* Sonntag, 12. Juni 2016 23:10:59
> > *An:* Phillip Alday
> > *Cc:* Sam Hardman [sah74]; r-sig-mixed-models at r-project.org
> <mailto:r-sig-mixed-models at r-project.org>
> > *Betreff:* Re: [R-sig-ME] P-values from interaction terms using lme4
> >
> > Hey Sam,
> >
> > I actually think mixed effects regression might be inappropriate
> in this
> > case. Do I have it right that you have 40 total animals in the
> > experiment? You said you had 1 bird in the city center and 1 outside
> > the city in 20 cities.
> >
> > If that's right, then the model you described (to take one of your
> > examples) would be:
> >
> > Approach ~ PlayBack * UrbanRural + (1|ID)
> >
> > That's 3 * 2 - 1 = 5 coefficients in your model, plus a random
> intercept
> > for 20 different cities. I don't think you have enough data to
> fit such
> > a model and trust the results.
> >
> > If you have a balanced design and you have clear a priori predictions
> > about each of the contrasts in your design, I'd recommend just using
> > ANOVA. I don't think you gain very much from using MEMs in this case.
> > You could use vanilla linear regression, but setting up the coding
> > schemes for your two predictors is going to be so complicated that
> it'll
> > actually be much simpler to just do ANOVA + t-tests.
> >
> > Alex
> >
> > On Sun, Jun 12, 2016 at 4:31 AM, Phillip Alday
> > <Phillip.Alday at unisa.edu.au <mailto:Phillip.Alday at unisa.edu.au>
> <mailto:Phillip.Alday at unisa.edu.au
> <mailto:Phillip.Alday at unisa.edu.au>>> wrote:
> >
> > Hi Sam,
> >
> > if you're getting p-values from lmer outside of a
> likelihood-ratio test,
> > then you're using lmerTest, not lme4. lmerTest is designed to be a
> > drop-in replacement for lme4, but it does bring some extra
> > features/'complications' in the form of the denominator degrees of
> > freedom.
> >
> > There are two easy options for getting ANOVA-style p-values:
> >
> > 1. Using lmerTest, you can just do
> > anova(mod1,ddf="Satterthwaite",type=2)
> >
> > If you're using vanilla lme4, then you need to do this first:
> >
> > library(lmerTest)
> > mod1 <- as(mod1,"merModLmerTest")
> >
> > which will cast your model to an lmerTest model.
> >
> > 2. Using the car package:
> >
> > library(car)
> > Anova(mod1,test.statistic="F",type=2)
> >
> > If you fitted your model with maximum likelihood, i.e. with
> REML=FALSE,
> > then you need to refit your model using REML first:
> >
> > mod1 <- update(mod1,REML=TRUE)
> >
> > For both options, you need to specify the types of test you're
> doing. I
> > highly recommend Type 2, but there is a lot of material on
> that debate,
> > search for e.g. Venables' "Exegeses on linear models", or look
> at the
> > documentation for car::Anova(). Please note that the
> distinction for
> > Type 2 vs Type 3 is *very* relevant for your question since you're
> > concerned about interaction terms.
> >
> > For the lmerTest route, you can specify the approximation to
> use for the
> > denominator degrees of freedom: Satterthwaite is much faster, but
> > Kenward-Roger is more accurate. For car::Anova(), the
> F-statistic is
> > always computed with Kenward-Roger (and only works for
> REML-fitted
> > models for reasons that I can't explain quickly), but you have the
> > option of using a Chisq test statistic, which is equivalent to
> assuming
> > that the denominator degrees of freedom are infinite, or
> equivalently,
> > that your t-values for the coefficients are z-values.
> >
> > These ANOVA-style tests are Wald tests and are asymptotically
> equivalent
> > to the LR-tests, but are less conservative for finite samples.
> >
> > Now, since you care about particular contrasts within the
> model, you may
> > also just want to look at your model coefficients. Depending
> on which
> > coding scheme you're using, the contrasts represented by your
> model
> > coefficients might not be the ones you want, but packages like
> lsmeans
> > (a regular mention on the list here and very well documented) can
> > compute all types of contrasts post-hoc. Rereading your
> question, this
> > may be the best way to go for your target conclusion/result
> statements.
> >
> > Best,
> > Phillip
> >
> >
> > On Sat, 2016-06-11 at 16:41 +0000, Sam Hardman [sah74] wrote:
> > > Dear all,
> > >
> > >
> > > I have some data which I would like to analyse using lme4 and I
> > would really appreciate some help deciding what the best
> method is.
> > >
> > >
> > > My experiment is as follows:
> > >
> > >
> > > I tested the responses of urban and rural great tits to
> playbacks
> > of great tit song from a loud speaker within their territories.
> > >
> > >
> > > I created three playback song types:
> > >
> > > -undegraded
> > >
> > > -degraded
> > >
> > > -very degraded
> > >
> > >
> > > I played these to birds in 20 different cities. In each city I
> > tested one bird in the city centre and one bird in a rural
> location
> > outside of the city (i.e. paired samples). Each bird received all
> > three playback types.
> > >
> > >
> > > I measured five different responses to these playbacks:
> > >
> > > -Time to sing back to playback (in seconds)
> > >
> > > -Time to sing back to playback (in seconds)
> > >
> > > -Time the bird spent within five metres of the speaker (in
> seconds)
> > >
> > > -Number of times the bird flew over the speaker (count)
> > >
> > > -The closest approach the bird made to the speaker (in metres)
> > >
> > >
> > > For each of these five repsonses I would like to know if
> there is
> > an interaction between habitat and playback type.
> > >
> > >
> > > So, I have a model which looks like this:
> > >
> > >
> > > mod1<-lmer(repsonse ~ Playback*UR + (1|ID))
> > >
> > >
> > > Where response is one of the five repsonse behaviours,
> playback is
> > the playback type, UR is habitat type (urban or rural) and ID
> is the
> > ID of the bird.
> > >
> > >
> > > This gives me results and P-Values but am not sure these
> P-values
> > are valid and I think I should compare this model to a null
> model to
> > get a valid P-values.
> > >
> > >
> > > So I can use a likelihood ratio tests to test for differences in
> > response by habitat type alone:
> > >
> > >
> > > mod1<-lmer(approach ~ UR + (1|Location))
> > > mod2<-lmer(approach ~ 1 + (1|Location))
> > > anova(mod1, mod2)
> > >
> > > or for differences in response according tom playback type
> alone:
> > >
> > > mod1<-lmer(approach ~ Playback + (1|Location))
> > > mod2<-lmer(approach ~ 1 + (1|Location))
> > > anova(mod1, mod2)
> > >
> > > But how should I do this when there is an interaction term? I
> > deally I would like P-values for each playback type in interaction
> > with habitat. e.g.
> > >
> > > Undregraded playback * Habitat (urban/rural
> > > Degraded playback * Habitat (urban/rural)
> > > Very degraded playback * Habitat (urban/rural)
> > >
> > > This would allow me to say, for example, something like "urban
> > birds approached the speaker more closely than rural birds in
> > response to undegraded playbacks". I would like to do this
> with each
> > of the five response behaviours.
> > >
> > > I would really appreciate any suggestions for the best way
> forward
> > with this and apologies if this question is too simple for
> this group.
> > >
> > > Best wishes,
> > > Sam
> > >
> > >
> > >
> --------------------------------------------------------------------
> > > Aberystwyth - Prifysgol Gyntaf Cymru https://www.aber.ac.uk/cy/
> > >
> > > Aberystwyth - Wales' First University https://www.aber.ac.uk/en/
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > _______________________________________________
> > > R-sig-mixed-models at r-project.org
> <mailto:R-sig-mixed-models at r-project.org>
> > <mailto:R-sig-mixed-models at r-project.org
> <mailto:R-sig-mixed-models at r-project.org>> mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
> > _______________________________________________
> > R-sig-mixed-models at r-project.org
> <mailto:R-sig-mixed-models at r-project.org>
> > <mailto:R-sig-mixed-models at r-project.org
> <mailto:R-sig-mixed-models at r-project.org>> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
> >
> >
> >
> > --
> > Alex Fine
> > Ph. (336) 302-3251 <tel:(336)%20302-3251>
> > web: http://internal.psychology.illinois.edu/~abfine/
> > <http://internal.psychology.illinois.edu/~abfine/AlexFineHome.html>
> >
> >
> > --------------------------------------------------------------------
> > Aberystwyth - Prifysgol Gyntaf Cymru https://www.aber.ac.uk/cy/
> >
> > Aberystwyth – Wales’ First University https://www.aber.ac.uk/en/
>
> _______________________________________________
> R-sig-mixed-models at r-project.org
> <mailto:R-sig-mixed-models at r-project.org> mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
More information about the R-sig-mixed-models
mailing list