[R-sig-ME] Question on mixed models, pseudoreplication and inflated degrees of freedom

Thu Aug 1 02:10:54 CEST 2013

Dear all,
I would appreciate any feedback on the following experimental setup and especially regarding a referee comment that suggested we were using inflated degrees of freedom and that our experiment suffered from pseudoreplication. So my question is whether the mixed model we use adequately takes into account different sources of dependency in our data, and hence resolves potential pseudoreplication.

The setup involves a clonal aphid species, of which different genetic lines (variable CLONES, n=8) were used. These lines fall into two groups according to the amount of sugar produced in their honeydew (SUGAR: Low or High). The dependent variable (FITNESS, the number of aphids present 1 week after inoculating a plant) was measured 4 times per clone, each time on a different plant (PLANT). Finally, there was one other factor with 2 levels (ANT_TENDED: Yes or No, describing whether the aphid colonies were tended by ants), and a numerical covariate we would like to correct for (C).

We then used the following GLMM with Poisson error structure:
glmer(FITNESS ~ SUGAR + ANT_TENDED + ANT_TENDED*SUGAR + C + (1|PLANT) + (1|SUGAR/CLONES), family=poisson, data=data)

Significance of the fixed effects was then tested using Wald-type tests and likelihood ratio tests.
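
To make the testing step concrete, here is a minimal sketch of how such a likelihood ratio test can be set up with lme4. It is not our actual script: the data frame `dat` is simulated, and the layout (4 plants per clone within each ant treatment, 64 plants in total) and all effect sizes are assumptions made purely for illustration.

```r
library(lme4)

set.seed(42)
# Simulated stand-in data (NOT the real measurements). Assumed layout:
# 8 clones (4 Low, 4 High sugar) x 2 ant treatments x 4 plants = 64 plants.
dat <- expand.grid(REP = 1:4,
                   ANT_TENDED = c("No", "Yes"),
                   CLONES = paste0("clone", 1:8))
dat$SUGAR <- factor(ifelse(dat$CLONES %in% paste0("clone", 1:4), "Low", "High"))
dat$PLANT <- factor(seq_len(nrow(dat)))  # one count per plant
dat$C     <- rnorm(nrow(dat))            # hypothetical covariate values

# Made-up effect sizes on the log scale, plus a random deviation per clone
clone_dev <- rnorm(8, sd = 0.3)
eta <- 3 + 0.3 * (dat$SUGAR == "High") + 0.2 * (dat$ANT_TENDED == "Yes") +
  0.1 * dat$C + clone_dev[as.integer(dat$CLONES)]
dat$FITNESS <- rpois(nrow(dat), lambda = exp(eta))

# Full model as written above. Note that (1|SUGAR/CLONES) expands to
# (1|SUGAR) + (1|SUGAR:CLONES), so a singular-fit message is possible here.
m_full <- glmer(FITNESS ~ SUGAR + ANT_TENDED + ANT_TENDED*SUGAR + C +
                  (1|PLANT) + (1|SUGAR/CLONES),
                family = poisson, data = dat)

# Reduced model without the SUGAR x ANT_TENDED interaction
m_noint <- glmer(FITNESS ~ SUGAR + ANT_TENDED + C +
                   (1|PLANT) + (1|SUGAR/CLONES),
                 family = poisson, data = dat)

# Likelihood ratio test for the interaction (the models differ by 1 parameter)
lrt <- anova(m_noint, m_full)
print(lrt)

# Wald z tests for the fixed effects of the full model
print(coef(summary(m_full)))
```

The likelihood ratio test compares the two nested fits with one parameter of difference, while the Wald tests come from the coefficient table of the full model.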

My question is whether this model adequately captures our error structure? 
Furthermore, are inflated degrees of freedom really an issue with mixed models, and how should this be reported?
I suppose this is only relevant in an ANOVA-type analysis, but not in a mixed modelling context, since the likelihood ratio tests have 1 df anyway. Is that right?
Or would it indeed apply if one assessed significance using an ANOVA-type methodology, e.g. using lmerTest with Satterthwaite or Kenward-Roger df approximations?

Specific comments we received were:
"From the presented results we have no idea whether effects of melezitose type have indeed been tested over the right model (in a GLMM type analysis) or error term (clone within melezitose type), which actually should have quite low df in these experiments (in an ANOVA type analysis)."

"All scores used in the statistical tests should be averaged for four high-melezitose clones and four low-melezitose ones."
(This doesn't make sense, right, as that would represent a huge loss of information?)

"Because measurements on different aphids on the same plant are not independent, how did you account for this in the analysis?"
(This was covered through our inclusion of an observation (plant) level random factor, right?)
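
To illustrate what I understand the referee's df concern to be, here is a small base R sketch. The data are simulated, and the layout (4 plants per clone, ignoring ANT_TENDED and C) and the use of a log(x+1) transform with a simple t-test are simplifying assumptions, not our actual analysis. It contrasts a test that treats all plants as independent with a test on clone means.

```r
set.seed(1)
# Hypothetical layout: 8 clones (4 Low, 4 High sugar), 4 plants per clone
d <- data.frame(
  CLONES = factor(rep(1:8, each = 4)),
  SUGAR  = factor(rep(c("Low", "High"), each = 16))
)

# Made-up counts with a random deviation shared by all plants of a clone
clone_dev <- rnorm(8, sd = 0.5)
d$FITNESS <- rpois(nrow(d), lambda = exp(3 + clone_dev[as.integer(d$CLONES)]))
d$LOGFIT  <- log(d$FITNESS + 1)

# Naive test: all 32 plants treated as independent replicates of SUGAR
naive <- t.test(LOGFIT ~ SUGAR, data = d, var.equal = TRUE)

# Clone-level test: average within clone first, then compare 4 vs 4 clone means
clone_means <- aggregate(LOGFIT ~ CLONES + SUGAR, data = d, FUN = mean)
by_clone <- t.test(LOGFIT ~ SUGAR, data = clone_means, var.equal = TRUE)

naive$parameter     # df = 30 (32 plants - 2 groups)
by_clone$parameter  # df = 6  (8 clones - 2 groups)
```

The naive test has 30 df and the clone-level test has 6, which is presumably the "quite low df" the referee has in mind for the SUGAR effect.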

Would any of you be able to advise me on this?

Yours sincerely,
Tom Wenseleers

_______________________________________________________________________________________

Prof. Tom Wenseleers
           Lab. of Socioecology and Social Evolution
           Dept. of Biology
           Zoological Institute
           K.U.Leuven
           Naamsestraat 59, box 2466
           B-3000 Leuven
           Belgium 
  +32 (0)16 32 39 64 / +32 (0)472 40 45 96
tom.wenseleers at bio.kuleuven.be
http://bio.kuleuven.be/ento/wenseleers/twenseleers.htm