[R-sig-ME] Using anova vs. Anova for linear mixed model

Fri Sep 13 18:20:02 CEST 2019

Dear Kevin,

My brief advice is to use "type-II" tests (and to say that "type-I", i.e., sequential, tests are rarely sensible). The different "types" of tests address different hypotheses (unless the data are balanced), and it really isn't a good idea to do all of them in the same analysis.
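
A minimal sketch of why the "types" diverge once the data are unbalanced, using simulated data and a fixed-effects-only lm() fit rather than a mixed model (the data, the factor names A and B, and the use of lm() are all assumptions for illustration, not the data discussed in this thread):

```r
# Simulated, deliberately unbalanced two-factor data (hypothetical)
set.seed(1)
d <- data.frame(
  A = factor(rep(c("a1", "a2"), times = c(12, 8))),
  B = factor(c(rep(c("b1", "b2"), times = c(9, 3)),
               rep(c("b1", "b2"), times = c(2, 6))))
)
d$y <- rnorm(nrow(d)) + (d$A == "a2") + 0.5 * (d$B == "b2")

# Sequential ("type-I") tests: each term is adjusted only for the terms
# entered before it, so the order of the terms matters
ss_A_first <- anova(lm(y ~ A + B, data = d))["A", "Sum Sq"]  # A ignoring B
ss_A_last  <- anova(lm(y ~ B + A, data = d))["A", "Sum Sq"]  # A after B

# With unbalanced data the two sums of squares for A generally differ
c(A_first = ss_A_first, A_last = ss_A_last)
```

In this additive model, the sum of squares for A entered last is what a "type-II" test of A would use. With equal cell counts the two values would coincide.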

The distinctions among the "types" of tests are sufficiently intricate that I'd rather not address them in an email, and the presence of empty cells complicates the matter. Depending on the configuration of empty cells, for example, some interactions might not be estimable and in any event will not be estimable in their entirety. You could do some reading (for example, these issues are addressed in my Applied Regression Analysis and Generalized Linear Models text), but I suggest that you seek competent statistical help, which is surely available locally at Duke. I suspect that there are substantive statistical issues concerning sparsity of data that need to be addressed on a non-mechanical level.
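
As a rough check for the empty-cell problem, one could cross-tabulate the fixed-effects factors and compare the number of columns in the full-factorial model matrix with its rank. The sketch below borrows the factor names from this thread, but the levels and the single dropped cell are made up for illustration:

```r
# Hypothetical design: factor names from the thread, invented levels
design <- expand.grid(
  CONDITION = factor(c("c1", "c2", "c3", "c4")),
  CHANNEL   = factor(c("0", "1")),
  STRATEGY  = factor(c("0", "1"))
)
# Drop one combination to create an empty cell (an assumption for illustration)
design <- subset(design, !(CONDITION == "c4" & CHANNEL == "1" & STRATEGY == "1"))

# Cross-tabulate the fixed-effects factors and count the empty cells
cell_counts <- xtabs(~ CONDITION + CHANNEL + STRATEGY, data = design)
n_empty <- sum(cell_counts == 0)

# The full-factorial model matrix has more columns than its rank,
# so at least one coefficient is aliased (not estimable)
X <- model.matrix(~ CONDITION * CHANNEL * STRATEGY, data = design)
c(empty_cells = n_empty, columns = ncol(X), rank = qr(X)$rank)
```

Running the same xtabs() call on the real data frame would show which combinations of the factors are empty.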

Best,
 John
  -----------------------------
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Sep 13, 2019, at 11:53 AM, Kevin Chu <kevin.m.chu using duke.edu> wrote:
> 
> Hello Dr. Alday and Dr. Fox,
> 
> Thank you for your replies. I am indeed using the anova method from lmerTest with the default Satterthwaite method for estimating ddf. 
> 
> I am not a statistics expert (I am a graduate student in electrical and computer engineering), so I do not entirely understand the differences between ANOVA types. I ran the anova method using the three ANOVA types, but I obtained very similar p-values. The part I am suspicious about is that the sum of squares for the STRATEGY factor is exactly equal to 0, which I suspect may be due to the missing cells.
> 
> My question: Do I need to specify any arguments in the anova method so that it can handle missing cells?
> 
> Thank you,
> Kevin
> 
>> On Sep 13, 2019, at 11:08 AM, Alday, Phillip <Phillip.Alday using mpi.nl> wrote:
>> 
>> Dear John, dear Kevin,
>> 
>> I suspect Kevin is using lmerTest and not lme4 directly. lmerTest does have a type argument for anova() and defaults to the Satterthwaite ddf approximation.
>> 
>> Phillip 
>> 
>> Sent from my mobile, please excuse the brevity.
>> From: "Fox, John" <jfox using mcmaster.ca>
>> Sent: Friday, 13 September 2019 17:06
>> To: Kevin Chu
>> Cc: r-sig-mixed-models using r-project.org
>> Subject: Re: [R-sig-ME] Using anova vs. Anova for linear mixed model
>> 
>> Dear Kevin, 
>> 
>> It's not entirely clear to me what you did, because as far as I know, the anova() method for merMod objects supplied by the lme4 package doesn't have a type argument and computes sequential ("type-I") tests. (You say that you're using anova() in the stats package, but while stats provides the anova() generic function, the method is coming from someplace else.) 
>> 
>> That said, I suspect that the discrepancy is due to the empty cells in the table of the fixed-effects factors. Normally, Anova() will detect the resulting aliased coefficients in the model and report an error, but I believe that lmer() suppresses the aliased coefficients by removing redundant columns of the model matrix. Whatever anova() method you used apparently detected the empty cells directly and printed a warning.
>> 
>> Finally, and particularly in light of the empty cells, I wonder why you want to compute type-III tests. 
>> 
>> I hope that this is of some help, 
>> John 
>> 
>>   ----------------------------- 
>>   John Fox, Professor Emeritus 
>>   McMaster University 
>>   Hamilton, Ontario, Canada 
>>   Web: http://socserv.mcmaster.ca/jfox
>> 
>> > On Sep 12, 2019, at 2:14 PM, Kevin Chu <kevin.m.chu using duke.edu> wrote: 
>> > 
>> > Hello, 
>> > 
>> > I built a linear mixed effects model with three fixed factors and one random factor. I want to test for statistical significance of the fixed effects using F-tests from a type III ANOVA table. Since I am using a type III ANOVA, I understand that I need to set the contrasts to contr.sum so that the sums of squares are calculated correctly. 
>> > 
>> > These are the data types. 
>> > 
>> >> str(mydata) 
>> > 'data.frame': 280 obs. of  5 variables: 
>> > $ SUBJECT  : Factor w/ 20 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ... 
>> > $ CONDITION: Factor w/ 4 levels "anechoic","aula",..: 1 1 1 1 2 2 2 2 3 3 ... 
>> > $ CHANNEL  : Factor w/ 2 levels "0","1": 1 1 2 2 1 1 2 2 1 1 ... 
>> > $ STRATEGY : Factor w/ 2 levels "0","1": 1 2 1 2 1 2 1 2 1 2 ... 
>> > $ SCORE    : num  107.4 57 90.1 96.1 -16.4 ... 
>> > 
>> > Below is the code I used to generate the model.
>> > 
>> > lmm <- lmer(SCORE ~ CONDITION * CHANNEL * STRATEGY + (1 | SUBJECT), data=mydata, contrasts=list(CONDITION=contr.sum, CHANNEL=contr.sum, STRATEGY=contr.sum)) 
>> > 
>> > I tried passing lmm through anova from the stats package and Anova from the car package, but I obtained different results (screenshots are attached). 
>> > 
>> > My questions: 
>> > 1) Why do anova and Anova give different results even though I specified type III ANOVA? 
>> > 2) Why is the Sum Sq equal to 0 in the table produced by anova? 
>> > 
>> > I would prefer not to release the data as I plan to publish a paper based on my results, but if it helps I can create dummy data. 
>> > 
>> > Thank you, 
>> > Kevin Chu 
>> > <Anova_car.png><anova_stats.png>
>> > _______________________________________________
>> > R-sig-mixed-models using r-project.org mailing list 
>> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models 
>> 