[R-sig-ME] lmertest F-test anova(fullm) and anova(fullm, reducedm)

Thu Nov 27 18:52:14 CET 2014

Dear Marie Devaine,

1) I would not account for the order effects this way. I can see 
various options:
  - The effect of Order on Scores does not change the relationship 
between your fixed-effects part and the Scores, but each individual is 
"learning" the task differently. I would then use a nested random part: 
Scores~Condition*Specie+(1|Subject/Order). You would then get an 
estimate of how much variation there is in the Scores between subjects, 
and also how much variation there is within subjects between Order levels.
  - Order changes the relationship between your fixed-effects part and 
the Scores, i.e. the Condition effect on the Scores differs depending on 
whether a primate is in its first trial or in its fourth one. You would 
then need random slopes, and one way to go would be: 
Scores~Condition*Specie+(1+Condition|Subject/Order). You would then get 
the same estimates as in the previous option, plus how much the Condition 
slope varies between Subjects and, within Subjects, between Order 
levels. Given your number of levels, I guess that the estimation will be 
rather tricky ...
You can see the wiki for more information on this: 
http://glmm.wikidot.com/faq#modelspec
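
The first option above can be sketched in R roughly as follows. This is 
only an illustration on simulated data: the design (7 species, 5 subjects 
per species), the effect sizes and the helper column Ind are all made up, 
and only the model formula comes from the text.

```r
library(lme4)

set.seed(1)
# Hypothetical balanced design: 7 species x 5 subjects x 3 conditions x 4 repetitions
d <- expand.grid(Order = 1:4,
                 Condition = factor(c("A", "B", "C")),
                 Ind = 1:5,                                # made-up subject index within species
                 Specie = factor(paste0("sp", 1:7)))
d$Subject <- factor(paste(d$Specie, d$Ind, sep = "_"))     # unique subject labels
d$Order <- factor(d$Order)                                 # used as a grouping factor here
cell <- interaction(d$Subject, d$Order, drop = TRUE)       # Subject-by-Order cells

# Made-up scores: subject effect + Subject:Order effect + residual noise
subj_eff <- rnorm(nlevels(d$Subject), sd = 1.0)
cell_eff <- rnorm(nlevels(cell), sd = 0.7)
d$Scores <- 10 + subj_eff[as.integer(d$Subject)] +
  cell_eff[as.integer(cell)] + rnorm(nrow(d), sd = 0.5)

# Option 1: random intercepts for Subject and for Order nested within Subject
m1 <- lmer(Scores ~ Condition * Specie + (1 | Subject/Order), data = d)

# Variance between subjects, between Order levels within subjects, and residual
vc <- as.data.frame(VarCorr(m1))
vc[, c("grp", "vcov")]

# Option 2 would use (1 + Condition | Subject/Order). In this simulated design
# each Subject:Order cell holds one observation per Condition, so that term has
# as many random effects as observations and I would expect lmer to refuse it.
```
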
I guess that you are misinterpreting the random slope part. You can see 
it as an interaction term between one fixed-effect term and one random 
term. For example, if you were to measure the weights of your primates 
and hypothesized that weight affects the scores, but that this effect 
(direction + strength, i.e. the slope) might vary between subjects, 
then you would have a random slope of weight depending on the subject: 
(weight|subject).
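
A minimal sketch of such a random slope, again on simulated data. The 
numbers are invented, and I assume here that weight is measured several 
times per subject, so that it varies within subjects.

```r
library(lme4)

set.seed(2)
n_subj <- 30   # hypothetical number of subjects
n_obs  <- 8    # hypothetical number of measurements per subject
d <- data.frame(subject = factor(rep(seq_len(n_subj), each = n_obs)),
                weight  = rnorm(n_subj * n_obs))
s <- as.integer(d$subject)

b0 <- rnorm(n_subj, sd = 1.0)   # subject-specific deviations of the intercept
b1 <- rnorm(n_subj, sd = 0.5)   # subject-specific deviations of the weight slope
d$scores <- (5 + b0[s]) + (1 + b1[s]) * d$weight + rnorm(nrow(d), sd = 0.5)

# Fixed effect of weight plus a random intercept and random weight slope per subject
m <- lmer(scores ~ weight + (weight | subject), data = d)

fixef(m)                      # average intercept and average weight slope
slopes <- coef(m)$subject     # per-subject intercepts and slopes
head(slopes)
```

The per-subject values in coef(m)$subject are the average slope plus 
each subject's estimated deviation, which is the "interaction" between 
the fixed weight effect and the random subject term.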

2-3) The first method tests whether the interaction term explains a 
large enough portion of the total sum of squares; it measures how 
important this term is in explaining the variation in your data. The 
second method compares the likelihood (i.e. the probability of observing 
this dataset given a particular set of parameters) between the model 
with and the model without the interaction term. If removing the 
interaction term leads to a big drop in the likelihood of the model, 
then the p-value should be significant and you should keep the full 
model; otherwise parsimony would lead you to choose the reduced model. 
So the difference comes from the fact that the two methods compute 
different things. As to which one is better, this is a tricky question. 
The way I would go would be to compute confidence intervals around the 
main effects and interaction terms, using bootMer for example, and then 
interpret them. You may have a look at ?pvalues for more 
options/suggestions.
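
As a rough sketch, the three approaches could look like this in R. The 
data are simulated (design and effect sizes are assumptions, Ind is a 
made-up helper column), the models are the ones from your message, and 
nsim is kept small only so that the example runs quickly.

```r
library(lme4)

set.seed(3)
# Hypothetical balanced design: 7 species x 5 subjects x 3 conditions x 4 repetitions
d <- expand.grid(Order = 1:4,
                 Condition = factor(c("A", "B", "C")),
                 Ind = 1:5,
                 Specie = factor(paste0("sp", 1:7)))
d$Subject <- factor(paste(d$Specie, d$Ind, sep = "_"))
s <- as.integer(d$Subject)
b0 <- rnorm(nlevels(d$Subject), sd = 1.0)   # subject intercept deviations
b1 <- rnorm(nlevels(d$Subject), sd = 0.3)   # subject Order-slope deviations
d$Scores <- 10 + b0[s] + b1[s] * d$Order +
  0.5 * (d$Condition == "B") + rnorm(nrow(d))

fullm    <- lmer(Scores ~ Condition * Specie + (1 + Order | Subject), data = d)
reducedm <- lmer(Scores ~ Condition + Specie + (1 + Order | Subject), data = d)

# Method 1: F statistics per term (plain lme4 prints no p-values here;
# with lmerTest loaded the same call also reports p-values)
anova(fullm)

# Method 2: likelihood ratio test; anova() refits both models with ML first
lrt <- anova(reducedm, fullm)
lrt

# Parametric bootstrap percentile intervals for the fixed effects
boot <- bootMer(fullm, FUN = fixef, nsim = 100)   # use more simulations in practice
ci <- t(apply(boot$t, 2, quantile, probs = c(0.025, 0.975), na.rm = TRUE))
rownames(ci) <- names(fixef(fullm))
ci[grep(":", rownames(ci)), ]   # intervals for the interaction terms
```
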

As I am not familiar with the lmerTest package, I will not comment on 
your last question.

Hoping that I clarified some points,
Lionel

On 27/11/2014 16:03, marie devaine wrote:
> Dear mixed-model list,
>
> I am sorry if my questions sound trivial: I am all new to R and mixed model.
>
> My data set is the following : I try to model scores  of primates from
> different species in different conditions of a task. Each individual
> repeats each condition a certain number of time ( most of the time 4 times
> but with some exceptions).
> I have only few individuals by specie (from 4 to 7), 3 conditions and 7
> species
>
> As independent variables, I am mostly interested in the condition and the
> Specie, but I want to correct for learning effect at the individual level
> (parametric effect on repetition -'Order').
>
> I wrote the following model (letting Subject be a random effect and 'Order'
> a random slope) :
> fullm = lmer(Scores ~ Condition*Specie+(1+Order|Subject))
> 1) Is it a sensible way to model my data?
>
> Then, I want to test for the interaction between Species and condition. I
> found two ways to do so with the lmerTest :
> *computing the p-value of the F-test corresponding to Specie:Condition as
> given by anova(fullm).
> *constructing the reduced model without the interaction
> reducedm= lmer(Scores ~ Condition+Specie+(1+Order|Subject))
> and performing the Likelihood ratio test : anova(reducedm,fullm).
>
> 2) What is the conceptual difference between the two methods?
>
> 3) The numerical results are different in my case (pvalues around .05,
> below in the reduced model manner, above in the F-test manner), why is it
> the case? Is one better than the other one?
>
> 4) This point is not directly related to my title, but on the same data and
> still on the lmerTest package : the Species for now are categorical, but I
> could instead take a numerical value such as the encephalization quotient
> for each specie. In this case how could I evaluate the significance of the
> parametric effect? lsmeans seems to care only about categorical factors.
>
> It is very likely that I miss here very simple points, and would be very
> thankful if you could help me with it.
>
> Thank you in advance,
>
> Marie Devaine