[R-sig-ME] Could the random effect at the level of each observation be a trap?

Mon Dec 13 20:39:41 CET 2010

This question of overdispersion in binomial models is near and dear to
my heart - so I tried to make an example to see if I can replicate the
type of situation you're describing Billy (code below). I guess I
don't understand the data you have, because as long as each male is
either successful or unsuccessful (once), you will never observe
overdispersion, at least not the way I understand it?

setseed(32493)
x = rnorm(100)
z = rnorm(100)
maleID = sample(1:40,size=100,replace=TRUE)
n.male = length(unique(maleID))
sigmaMale = 1
sigmaResid = 0.2
Maleeffect = rnorm(40,sd=sigmaMale)
Resideffect = rnorm(100,sd=sigmaResid)
beta = c(0,2,1,0.54) # effect of x, z, and x*z

logodds.y = cbind(1,x,z,x*z) %*% beta + Maleeffect[maleID] + Resideffect

y = rbinom(100,1,1/(1+exp(-logodds.y)))

df = data.frame(x=x,z=z,maleID=maleID,y=y)

glm(y~x*z,data=df,family=binomial) # OK, seems to work

library(lme4)

models = list(y~x*z + (1|maleID),
	y~x+z + (1|maleID),
	y~x + (1|maleID),
	y~z + (1|maleID),
	y~1 + (1|maleID))

fits = lapply(models,function(ff)glmer(ff,family=binomial,data=df))

sapply(fits,function(fm)summary(fm)@AICtab) # model 2 is best,
parameters about right

# ok try this residual idea
resid <- as.factor(1:dim(df)[1])

models = list(y~x*z + (1|maleID)+(1|resid),
	y~x+z + (1|maleID)+(1|resid),
	y~x + (1|maleID)+(1|resid),
	y~z + (1|maleID)+(1|resid),
	y~1 + (1|maleID)+(1|resid))

fits = lapply(models,function(ff)glmer(ff,family=binomial,data=df))

# lots of warnings, otherwise results unaffected - estimated variance
of resid very small

On Mon, Dec 13, 2010 at 11:33 AM, Billy <billy.requena at gmail.com> wrote:
> Hi Ben,
>
> thanks for your reply.
> In fact, I've tried the additive random effect of MaleID and resid,
> but I got the same problem: quantitatively and qualitatively
> differences. Models considering Male ID + resid showed smaller
> deviances than models that only considered MaleID AND among models
> considering MaleID + resid, the best model was the one which consider
> no effect of any explanatory variable, while among models considering
> only MaleID as random variable, the best model was the one considering
> effect of the interaction between 2 explanatory variables.
> I still have doubts about the usage of this "resid" variable, mainly
> because the interpretation of the results is completely different. Is
> the decrease of the whole deviance enough justification to use that
> and interpret only the results obtained among models considering
> MaleID + resid?
>
> Thanks again for you all,
>
> Billy
>
> On Sat, Dec 11, 2010 at 12:27 PM, Ben Bolker <bbolker at gmail.com> wrote:
>> On 10-12-10 09:39 PM, Billy wrote:
>>> Hi all!
>>>
>>> I'm a relatively newbie ecologist student getting adventures at the
>>> mixed models world and facing some trouble to interpret random
>>> effects. I hope someone could help me.
>>> Quickly, I'm constructing different models using glmer() to discover
>>> which factors could influence females' reproductive decisions. I have
>>> sampled several males and classified them as successful or
>>> unsuccessful.  Therefore, I'm modelling logistic regressions with more
>>> than one fixed variable and random variables.
>>> I have sampled individuals monthly and, sometimes, the same individual
>>> (MaleID) was sampled more than once, in different status. Then, I used
>>> "MaleID" as a random variable.
>>> Well, I built a bunch of models considering only MaleID as the random
>>> variable as:
>>>
>>> m1 <- glmer(y ~ 1 + (1|MaleID), family=binomial)
>>> m2 <- glmer(y ~ x + (1|MaleID), family=binomial)
>>> m3 <- glmer(y ~ z + (1|MaleID), family=binomial)
>>>
>>>
>>> Moreover, in several posts here I've read about count data show high
>>> overdispersion, even using family=binomial for the error. One
>>> recurrent solution suggested is create a vector to each observation
>>> as:
>>>
>>> resid <- as.factor(1:dim(data)[1])
>>>
>>> Then, I built models considering this random variable too, trying to
>>> understand that, as following
>>>
>>> m4 <- glmer(y ~ 1 + (1|resid), family=binomial)
>>> m5 <- glmer(y ~ x + (1|resid), family=binomial)
>>> m6 <- glmer(y ~ w + (1|resid), family=binomial)
>>>
>>> and
>>>
>>> m7 <- glmer(y ~ 1 + (1|MaleID:resid ), family=binomial)
>>> m8 <- glmer(y ~ x + (1|MaleID:resid ), family=binomial)
>>> m9 <- glmer(y ~ z + (1|MaleID:resid ), family=binomial)
>>>
>>> Using a model selection approach and looking for the deviance and the
>>> AIC values, I observed that the correspondent models from the second
>>> block (m4, m5 and m6 in the example above) and the third block (m7, m8
>>> and m9) showed the same values. Is that due to the "resid" random
>>> effect?
>>>
>>
>>   I think you want
>>
>>  ~ [fixed predictor(s)] + (1|MaleID)+(1|resid)
>>
>> ?
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>
>
>
> --
> Gustavo Requena
> PhD student - Laboratory of Arthropod Behavior and Evolution
> Universidade de São Paulo
> Correspondence adress:
> a/c Glauco Machado
> Departamento de Ecologia - IBUSP
> Rua do Matão - Travessa 14 no 321 Cidade Universitária, São Paulo - SP, Brasil
> CEP 05508-900
> Phone number: 55 11 3091-7488
>
> http://ecologia.ib.usp.br/opilio/gustavo.html
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

-- 
Drew Tyre

School of Natural Resources
University of Nebraska-Lincoln
416 Hardin Hall, East Campus
3310 Holdrege Street
Lincoln, NE 68583-0974

phone: +1 402 472 4054
fax: +1 402 472 2946
email: atyre2 at unl.edu
http://snr.unl.edu/tyre
http://aminpractice.blogspot.com
http://www.flickr.com/photos/atiretoo