[R] Comparing GLMMs and GLMs with quasi-binomial errors?

Sun Apr 23 13:53:54 CEST 2006

Dear All,

I am analysing a dataset on levels of herbivory in seedlings in an 
experimental setup in a rainforest.
I have seven classes/categories of seedling damage/herbivory that I want to 
analyse, modelling each separately.

There are twenty maternal trees, with eight groups of seedlings around each. 
Each tree has a TreeID, which I use as the random effect (blocking factor).

There are two fixed effects: DISTANCE - distance to maternal tree; two 
levels 'CLOSE' or 'AWAY' (four groups of seedlings each per tree), and 
PLATEAU - whether the maternal tree grows on the 'UPPER' plateau (bad soil) 
or 'LOWER' plateau (good soil).

In each group of seedlings, we randomly selected one seedling on which we 
scored herbivory. For each of the seven herbivory categories, the level of 
herbivory was scored as the proportion of leaves attacked. Obviously, I 
don't want to use a more complicated model than necessary - but I equally 
obviously want to take the random effect 'TreeID' into account.
Hence, for each herbivory category, I initially fitted a GLMM using the 
'glmmPQL' function from the MASS package (after using 'cbind()' to build the 
response from the two columns with total number of leaves per seedling and 
number of leaves attacked by that herbivory category) - and then compared 
these models to GLMs without the random effect.

## Model example 1: leaf mines, GLMM
library(MASS)  # provides glmmPQL
## response: successes (leaves attacked) and failures (leaves not attacked)
proportion.leafmines <- cbind(leaves.affected, total.leaves - leaves.affected)
leafminesGLMM <- glmmPQL(proportion.leafmines ~ PLATEAU * DISTANCE,
                         random = ~ 1 | TreeID,
                         family = binomial(link = logit))
## AIC(leafminesGLMM) = 474.773

## Model example 2: leaf mines, GLM (same fixed effects, no random effect)
leafminesGLM <- glm(proportion.leafmines ~ PLATEAU * DISTANCE,
                    family = binomial(link = logit))
## AIC(leafminesGLM) = 207.9465

...and so on, for all seven herbivory categories. In four of the cases, the 
AIC is much lower (as in the example above) for the GLMs than for the GLMMs - 
whereas in three other cases, clearly TreeID is an important random factor, 
as the AIC values of the GLMs are much higher than the ones for the GLMMs. 
There is not a big difference in significance levels - some marginally 
significant ones now become significant, while some significant ones now 
become marginal.
However, there is one complication to simply using the AIC scores to 
evaluate which model is the best; for almost all the cases where the GLM has 
the lower AIC, the data are overdispersed, and I need to fit the model with 
a quasibinomial, rather than with a binomial error structure. BUT - using a 
GLM with quasibinomial error structure, I of course no longer get an AIC 
score...
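
To make the overdispersion check concrete, here is a minimal sketch of one 
common way to do it: the Pearson chi-square statistic divided by the 
residual degrees of freedom. The data are simulated, not my real data; the 
data frame 'seedlings', the vector 'tree.effect', the equal split of trees 
between plateaus and all simulation settings are assumptions made purely 
for illustration. Only the column names follow my dataset.

```r
## Hypothetical, simulated stand-in for the seedling data (NOT the real data)
set.seed(1)
n.trees <- 20
seedlings <- data.frame(
  TreeID   = factor(rep(seq_len(n.trees), each = 8)),
  ## assumption: half of the trees on each plateau
  PLATEAU  = factor(rep(c("UPPER", "LOWER"), each = 8 * n.trees / 2)),
  DISTANCE = factor(rep(rep(c("CLOSE", "AWAY"), each = 4), times = n.trees))
)
seedlings$total.leaves <- sample(5:15, nrow(seedlings), replace = TRUE)

## Tree-level shifts on the logit scale create extra-binomial variation
tree.effect <- rnorm(n.trees, sd = 1)[as.integer(seedlings$TreeID)]
seedlings$leaves.affected <- rbinom(nrow(seedlings),
                                    size = seedlings$total.leaves,
                                    prob = plogis(-1 + tree.effect))

## Binomial GLM that ignores TreeID
fit <- glm(cbind(leaves.affected, total.leaves - leaves.affected) ~
             PLATEAU * DISTANCE,
           family = binomial(link = logit), data = seedlings)

## Dispersion estimate: Pearson chi-square / residual degrees of freedom.
## Values well above 1 suggest overdispersion.
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
print(dispersion)

## The quasibinomial fit reports the same quantity as its dispersion parameter
fit.quasi <- update(fit, family = quasibinomial(link = logit))
print(summary(fit.quasi)$dispersion)
```

The quasibinomial fit has the same coefficient estimates as the binomial 
one; only the standard errors are rescaled by the square root of the 
dispersion estimate.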

So, my main question is: can I simply use the GLM with quasibinomial error 
structure instead of the GLMM if the GLM with binomial error structure has a 
lower AIC score than the GLMM?

Any input on how I can compare such models would be greatly appreciated!

Dennis

-----------------------------------------------------------
Dennis Marinus Hansen
Institute of Environmental Sciences
University of Zurich
Winterthurerstrasse 190
8057 Zurich
Switzerland
tel: +41 (0) 44635 6122
fax: +41 (0) 44635 5711
www.uwinst.unizh.ch