[R-sig-ME] AIC and LRT

Mon Feb 14 21:37:17 CET 2011

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02/14/2011 02:52 PM, George Wang wrote:
> Dear list,
> 
> I am examining the effect of oak leaftying caterpillars on herbivory
> intensity, where the response variable is % leaf skeletonization, log
> transformed (log.skel).
> 
> There are 5 fixed effects:
> tree species (species) - 8 levels
> leaftie treatment (treat) - 4 levels
> condensed tannin (condensed) - continuous
> hydrolyzable tannin (hydrolyzable) - continuous
> total phenolics (total) - continuous
> 
> The random effect is individual trees (tree) with unique ID's, N=10 per
> species, each tree with a unique value set of the three phenolics, and each
> tree contains all four levels of the leaftie treatment.
> 
> I constructed a set of models with tree as random effect and treat as
> uncorrelated random effect:
>> mix2 <- lmer(log.Skel ~ Treat*Condensed*Hydrolyzable + (1|Tree) +
> (0+Treat|Tree), REML=FALSE, data=esppskel)
>> mix3 <- lmer(log.Skel ~ Treat*Condensed*Total + (1|Tree) + (0+Treat|Tree),
> REML=FALSE, data=esppskel)
>> mix5 <- lmer(log.Skel ~ Treat*Species*Condensed + (1|Tree) +
> (0+Treat|Tree), REML=FALSE, data=esppskel)
>> mix6 <- lmer(log.Skel ~ Treat*Species*Condensed*Total + (1|Tree) +
> (0+Treat|Tree), REML=FALSE, data=esppskel)
>> mix7 <- lmer(log.Skel ~ Treat*Species*Condensed*Hydrolyzable + (1|Tree) +
> (0+Treat|Tree), REML=FALSE, data=esppskel)
> 
> and compared them using anova.
> 
>       Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
> mix2  28 1938.4 2072.0 -941.21
> mix3  28 1941.0 2074.6 -942.51  0.000      0    1.00000
> mix5  76 1977.8 2340.2 -912.87 59.274     48    0.12755
> mix6 140 2013.2 2680.9 -866.58 92.578     64    0.01124 *
> mix7 140 2041.7 2709.5 -880.86  0.000      0    1.00000
> 
> I understand that I cannot compare the models based on LRT because they are
> not nested, so I should use the AIC. It is also my understanding that models
> with ÄAIC > 10 are considered different. Based on the above output, for
> example mix6 and mix7 have a ÄAIC > 25, so should I ignore that p-value of
> 1.0 and consider them different? Also, is there any meaning in the ordering
> of the models in these outputs? In this case, the models are ordered in
> ascending AIC values, but in a related study, my models were ordered in
> descending AIC values.
> 

    I would be strongly tempted to make Species a random variable (you
would then be making inferences about the variance among species (in
their 'intrinsic' (=intercept) level of vulnerability, and in their
response to treatment etc.): the full model of (potential) interest
would be

lmer(log.Skel ~ Treat*Condensed*Hydrolyzable*Total + (Treat|Tree)+
   (Treat*Condensed*Hydrolyzable*Total|Species), data=esppskel)

 but this is almost certainly overkill.  So are the models you have
fitted above: the rule of thumb is that (# parameters)/(model df) should
be 10-20, very rarely less than 10, and you have 320/140 which is a lot
less than that ...
  (You might want to take a look at the electronic supplements to Bolker
et al 2009 _Trends in Ecology and Evolution_ for an example of fitting a
model with response to [simulated] herbivory that varies across groups
[genotypes in that case] ...)

You appear to have 320 total observations (4 treatments * 10
trees/species * 8 species), so the model I suggested has (way) too many
parameters (something like 32 fixed effect parameters plus 32 random
effect *variances* for the species case (plus 16*31 correlation
parameters, I think!) plus 3 parameters (2 variances plus 1 correlation
parameter) for the among-tree variability.
  I don't know your system at all, but I would suggest you try to make
some strategic choices about which interactions are most likely to be
relevant/interpretable, e.g. use something like

  Treat*(Condensed+Hydrolyzable+Total) [i.e. no interactions among the
different types of phenolics] which would get you down from 32 to 12,
and consider fitting a smaller subset of these in the random effects
(maybe just additive effects?).

 To move on to your actual question :-)  Are you just doing model
selection because you can't handle the non-nestedness? My general
advice, if you are interested in testing hypotheses, would be to fit the
full model (meaning one that you have restricted _a priori_ to a
reasonable size) and make inferences on that basis, *possibly* [slightly
controversially] removing non-significant/small interactions to aid in
interpretation.

  Models with AIC>10 are considered so much worse that they can be
excluded from further consideration (not just "different").  The
p-values in your case are nearly meaningless: in each case they are
tests against the previous model, which in the case of the "df=0"
contrasts are untestable.  In the anova() table with likelihood ratio
("Chisq") tests, the ordering matters (as mentioned in previous
sentence); in AIC tables ordering doesn't matter, by convention models
are often listed in order of increasing AIC (i.e. best model at the top).

  good luck,
   Ben Bolker
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk1Zkn0ACgkQc5UpGjwzenPkawCfbLFwjiVr6VnHu2BCUzbrLtlS
7+oAnRTy5ikam/ZK3B1/Vf4mBV1dlG/x
=lleN
-----END PGP SIGNATURE-----