[R-sig-ME] Could the random effect at the level of each observation be a trap?

Sat Dec 11 03:39:15 CET 2010

Hi all!

I'm a relatively new ecology student venturing into the world of
mixed models, and I'm having some trouble interpreting random
effects. I hope someone can help me.
Briefly, I'm building different models with glmer() to find out
which factors could influence females' reproductive decisions. I have
sampled several males and classified each as successful or
unsuccessful. Therefore, I'm fitting logistic regressions with more
than one fixed variable plus random variables.
I sampled individuals monthly and, sometimes, the same individual
(MaleID) was sampled more than once, with different statuses. For that
reason, I used "MaleID" as a random variable.
I first built a set of models with MaleID as the only random
variable:

m1 <- glmer(y ~ 1 + (1|MaleID), family=binomial)
m2 <- glmer(y ~ x + (1|MaleID), family=binomial)
m3 <- glmer(y ~ z + (1|MaleID), family=binomial)

Moreover, I've read in several posts here that count data can show
high overdispersion, even with family=binomial for the error. One
frequently suggested solution is to create a factor with one level per
observation:

resid <- as.factor(1:dim(data)[1])

Then, to try to understand this approach, I also built models with
this random variable, as follows:

m4 <- glmer(y ~ 1 + (1|resid), family=binomial)
m5 <- glmer(y ~ x + (1|resid), family=binomial)
m6 <- glmer(y ~ w + (1|resid), family=binomial)

and

m7 <- glmer(y ~ 1 + (1|MaleID:resid ), family=binomial)
m8 <- glmer(y ~ x + (1|MaleID:resid ), family=binomial)
m9 <- glmer(y ~ z + (1|MaleID:resid ), family=binomial)

Using a model selection approach and looking at the deviance and AIC
values, I observed that the corresponding models from the second
block (m4, m5 and m6 in the example above) and the third block (m7, m8
and m9) had identical values. Is that due to the "resid" random
effect?
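
One way to look at this question is to compare the grouping factors
themselves. The following is only a minimal sketch in base R with
made-up data: the data frame `d`, its values, and the column name
`male_resid` are hypothetical and not the real data set. It assumes
that (1|MaleID:resid) groups observations by the combinations of
MaleID and resid that actually occur, which is my understanding of how
the term is built. Since "resid" already has one level per
observation, crossing it with MaleID cannot split the data any
further, so both terms would define the same one-observation-per-level
grouping:

```r
# Hypothetical toy data: 3 males, some sampled more than once
# (illustration only, not the real data set)
d <- data.frame(
  MaleID = factor(c("a", "a", "b", "c", "c", "c")),
  y      = c(1, 0, 1, 0, 0, 1)
)

# Observation-level factor: one level per row
d$resid <- factor(seq_len(nrow(d)))

# Grouping implied by MaleID:resid, keeping only combinations that occur
d$male_resid <- interaction(d$MaleID, d$resid, drop = TRUE)

n_resid      <- nlevels(d$resid)
n_male_resid <- nlevels(d$male_resid)

# TRUE if every level of both factors holds exactly one observation
all_singletons <- all(table(d$resid) == 1) &&
  all(table(d$male_resid) == 1)

print(c(rows = nrow(d), resid = n_resid, MaleID_resid = n_male_resid))
print(all_singletons)
```

If the same check on the real data gives equal level counts, then
m4 to m6 and m7 to m9 are fitting the same random-effects structure,
and identical deviance and AIC values would be expected.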

In addition, those models showed a lower deviance (an improvement of
more than 300) than the models that included only "MaleID". With
"MaleID" as the only random variable, a model with the interaction
between x and w was the most plausible. With "resid" as the random
variable, on the other hand, the null model (no effect of x or w) was
selected.

Is it possible that, with this "resid" procedure, I am falling into
some statistical artifact?

Thank you all for any help
-- 
Gustavo Requena
PhD student - Laboratory of Arthropod Behavior and Evolution
Universidade de São Paulo
Correspondence address:
a/c Glauco Machado
Departamento de Ecologia - IBUSP
Rua do Matão - Travessa 14 no 321 Cidade Universitária, São Paulo - SP, Brasil
CEP 05508-900
Phone number: 55 11 3091-7488

http://ecologia.ib.usp.br/opilio/gustavo.html