# [R-sig-ME] lmertest F-test anova(fullm) and anova(fullm, reducedm)

Lionel hughes.dupond at gmx.de
Fri Nov 28 18:29:51 CET 2014

```Dear Marie,

OK, that makes things easier. I would then go for:
Scores~Condition*Specie+Order+(1+Order|Subject)
This gives you an estimate of how much the intercept varies between
subjects, plus how much the slope of Scores vs Order varies between
subjects. In this context, having one value per subject and Order will
not be a problem. I guess the difference between this model and the one
you wrote is similar to having an interaction term without the
corresponding main effect. I am not sure whether that is also an issue
in mixed models, but to be safe I would include Order as a main effect.
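
A minimal sketch of fitting this model, on simulated data. Everything
about the data frame d below (35 subjects, 5 per species, 12 trials per
subject, the effect sizes) is made up for illustration and is not the
real data set; only the model formula comes from the text above.

    library(lme4)

    set.seed(1)
    # Hypothetical design: 35 subjects, each doing 3 conditions x 4 repetitions
    d <- expand.grid(Rep = 1:4,
                     Condition = factor(c("A", "B", "C")),
                     Subject = factor(1:35))
    s <- as.integer(d$Subject)
    # Assumed grouping: 7 species with 5 subjects each
    d$Specie <- factor((s - 1) %/% 5 + 1)
    # Order = running count of trials within subject (1..12),
    # assigned in random sequence, irrespective of Condition
    d$Order <- ave(seq_len(nrow(d)), d$Subject,
                   FUN = function(i) sample(length(i)))
    # Simulated scores: subject-specific intercept (u0) and learning slope (u1)
    u0 <- rnorm(35, sd = 1)
    u1 <- rnorm(35, sd = 0.1)
    d$Scores <- 10 + u0[s] + (0.3 + u1[s]) * d$Order + rnorm(nrow(d))

    # Order as a fixed main effect plus a random slope per Subject
    m <- lmer(Scores ~ Condition * Specie + Order + (1 + Order | Subject),
              data = d)
    VarCorr(m)  # between-subject SD of the intercept and of the Order slope

With so few subjects per species, a fit like this may produce singular-fit
or convergence warnings, so the variance estimates should be read with
some caution.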

In my example, the model with (1|subject) estimates the variation of
the intercept between subjects. You could also extract each subject's
estimated deviation from the average, but this is usually not of much
interest. So if you do ranef(model) you get one column, i.e. one
'coefficient' per subject (actually these are deviations from the
overall coefficient; they are not coefficients per se, as the model did
not estimate them individually).
However, if your model is (weight|subject), which is equivalent to
(1+weight|subject), you again get the variation of the intercept PLUS
the variation of the slope of response vs weight. If you do
ranef(model) you then get two columns, so two 'coefficients' per
subject.
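
To make the one-column vs two-column point concrete, here is a small
sketch using the sleepstudy data shipped with lme4 (an assumption made
for illustration: Days plays the role of weight and Reaction the role
of the response; it is not the primate data).

    library(lme4)

    # Random intercept only
    m_int   <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)
    # Random intercept plus random slope; same as (1 + Days | Subject)
    m_slope <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

    head(ranef(m_int)$Subject)    # one column: (Intercept)
    head(ranef(m_slope)$Subject)  # two columns: (Intercept) and Days

    # coef() adds the fixed effects to these deviations, giving
    # subject-specific intercepts and slopes
    head(coef(m_slope)$Subject)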

Cordialement,
Lionel

On 28/11/2014 10:51, marie devaine wrote:
> Dear Lionel,
>
> Thanks a lot for your input.
>
> 1) I am still not sure how to write things down, and I am sorry
> that my description of the data and model was not clear enough.
> I am in the first of the two cases that you describe, i.e. the
> Order effect is a parametric effect, Subject-specific but
> independent of levels. In fact, the Order variable is just a count of
> the number of times the task has been performed, irrespective of
> which Condition has been performed. It is not a categorical variable
> and is just supposed to capture how well the primate is learning
> general features of the task (independent of Condition).
> As it is, Scores~Condition*Specie+(1|Subject/Order) gives me an error
> since Order values are interpreted as levels, but there are as many
> levels as observations per subject.
>
> In fact, in your example, I don't really see the difference between
> (weight|subject) and (1|subject), since in both cases the model
> estimates one coefficient per subject.
>
>
> 2)3) This is very clear, thank you again.
>
> Marie
>
>
> 2014-11-27 18:52 GMT+01:00 Lionel <hughes.dupond at gmx.de>:
>
>     Dear Marie Devaine,
>
>     1) The way you account for the Order effect is not the way I
>     would go. I can see various options:
>     - The effect of Order on Scores does not change the relationship
>     between your fixed effects and the Scores, and each individual
>     is "learning" the task differently. I would then use a nested
>     random part: Scores~Condition*Specie+(1|Subject/Order). You
>     would then get an estimate of how much variation there is in the
>     Scores between subjects and also how much variation there is
>     within subject between Order levels.
>     - Order changes the relationship between your fixed effects
>     and the Scores, i.e. the Condition effect on the Scores
>     differs depending on whether a primate is in its first trial or
>     its fourth one. You would then need random slopes, and one way to
>     go would be: Scores~Condition*Specie+(1+Condition|Subject/Order).
>     You would then get the same estimates as in the previous option,
>     plus how much the Condition slope varies between Subjects and,
>     within Subject, between Order levels. Given your number of
>     levels, I guess that the estimation will be rather tricky ...
>     You can see the wiki for more information on this:
>     http://glmm.wikidot.com/faq#modelspec
>     I guess that you are misinterpreting the random slope part. You
>     can see it as an interaction term between one fixed-effect term
>     and one random term. For example, if you were to measure the
>     weights of your primates and hypothesized that weight
>     affects the scores, but that this effect (direction + strength,
>     i.e. slope) might vary between subjects, then you would have a
>     random slope of weight depending on the subject: (weight|subject).
>
>     2-3) The first method assesses whether the interaction term
>     explains a big enough portion of the total sum of squares; it is a
>     measure of how important this term is in explaining the variation
>     in your data. The second method compares the likelihood (i.e. the
>     probability of observing this dataset given this particular set of
>     parameters) between the model with and the model without the
>     interaction term. If removing the interaction term leads to a big
>     decline in the likelihood of the model, then the p-value should be
>     significant and you should keep the full model; otherwise the
>     parsimony principle would lead you to choose the reduced model.
>     So the difference comes from the fact that the two methods
>     compute different things. As to which one is better, this is a
>     tricky question. The way I would go would be to compute confidence
>     intervals around the main effects plus interaction term, using
>     bootMer for example, and then interpret them. You may have a
>     look at ?pvalues for more options/suggestions.
>
>     As I am not familiar with the lmerTest package, I will not
>     comment on your last question.
>
>     Hoping that I clarified some points,
>     Lionel
>
>
>
>     On 27/11/2014 16:03, marie devaine wrote:
>
>         Dear mixed-model list,
>
>         I am sorry if my questions sound trivial: I am all new to R
>         and mixed models.
>
>         My data set is the following: I am trying to model the scores
>         of primates from different species in different conditions of
>         a task. Each individual repeats each condition a certain number
>         of times (most of the time 4 times, but with some exceptions).
>         I have only a few individuals per species (from 4 to 7), 3
>         conditions and 7 species.
>
>         As predictors, I am mostly interested in the Condition and
>         the Specie, but I want to correct for a learning effect at the
>         individual level (parametric effect of repetition, 'Order').
>
>         I wrote the following model (letting Subject be a random
>         effect and 'Order' a random slope):
>         fullm = lmer(Scores ~ Condition*Specie+(1+Order|Subject))
>         1) Is it a sensible way to model my data?
>
>         Then, I want to test for the interaction between Specie and
>         Condition. I found two ways to do so with lmerTest:
>         * computing the p-value of the F-test corresponding to
>         Specie:Condition as given by anova(fullm).
>         * constructing the reduced model without the interaction
>         reducedm= lmer(Scores ~ Condition+Specie+(1+Order|Subject))
>         and performing the likelihood ratio test: anova(reducedm,fullm).
>
>         2) What is the conceptual difference between the two methods?
>
>         3) The numerical results are different in my case (p-values
>         around .05: below with the reduced-model approach, above with
>         the F-test approach). Why is this the case? Is one better than
>         the other?
>
>         4) This point is not directly related to my title, but concerns
>         the same data and the lmerTest package: the Species are
>         categorical for now, but I could instead use a numerical value
>         such as the encephalization quotient for each species. In this
>         case, how could I evaluate the significance of the
>         parametric effect? lsmeans seems to care only about
>         categorical factors.
>
>         It is very likely that I am missing very simple points here,
>         and I would be very thankful if you could help me with them.
>
>         Thank you in advance,
>
>         Marie Devaine
>
>
>         _______________________________________________
>         R-sig-mixed-models at r-project.org
>         <mailto:R-sig-mixed-models at r-project.org> mailing list
>         https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>
>
>
>
>
> --
> Marie Devaine
> PhD Student at the Brain and Spine Institute (France)