[R-sig-ME] Question on estimating intraclass correlation (and significance and confidence intervals) with binomial data using mixed model and/or GEE
Ben Bolker
bbolker at gmail.com
Fri Aug 30 21:32:34 CEST 2013
Tom Wenseleers <Tom.Wenseleers at ...> writes:
>
> Dear Ben,
> Many thanks for this - that's great!
> With the code you mention, though, I get an NA in the section that
> extracts the variance: bcolvar <- (fixef(.)["colony"])^2
>
> If I try
> gmm2<- glmer(trait~colony+(1|patriline),
> family = binomial(link="logit"), data = dataset )
> vcor=VarCorr(gmm2) # variance components
> bpatrvar=vcor[["patriline"]]
> bcolvar <- (fixef(gmm2)["colony"])^2
>
> I get an NA for bcolvar, since fixef(gmm2) returns
> > fixef(gmm2)
> (Intercept) colony2
> 0.013433529 0.004875263
>
> Does that mean that I would have to use
> bcolvar <- (fixef(gmm2)["colony2"])^2 ? Or did you use other types of
> contrasts or something like that?
Yes, that seems to be a typo/thinko on my part: I'm not sure why
my code worked in the first place -- probably because I failed to
convert 'colony' to a factor (which wouldn't matter to the estimate
since colony only has two levels).
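A quick sketch of what I mean (assuming 'colony' starts out numeric in
your data): convert it to a factor and pick out the contrast
coefficient(s) by name rather than hard-coding "colony2":

library(lme4)
dataset$colony <- factor(dataset$colony)
gmm2 <- glmer(trait ~ colony + (1 | patriline),
              family = binomial(link = "logit"), data = dataset)
## all colony contrast coefficients, whatever they happen to be called
fixef(gmm2)[grep("^colony", names(fixef(gmm2)))]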
> Also a couple of remaining questions:
> 1) Is there any way / any model / other approach
> (eg using a GEE?) that would allow for negative intraclass
> correlations? (right now ICCs are always positive,
> but in rare cases one might expect negative
> intraclass correlations)
I don't know. In principle, one way to do it is to allow a
compound-symmetric covariance structure rather than representing the
grouping variable through individual latent effects (i.e. if this were
an LMM fitted via nlme::lme you would use pdCompSymm to structure the
random effects). This seems as though it should eventually be possible
via lme4 -- we have a *very* bleeding-edge development 'flexLambda'
branch on GitHub -- but holding your breath would be a bad idea.
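For what it's worth, a rough sketch of the pdCompSymm idea for the LMM
case (this assumes a continuous response 'y' in place of your binomial
trait, so it is only the Gaussian analogue, not the model you actually
want):

library(nlme)
## compound-symmetric covariance among patriline effects within colony;
## the common correlation is allowed to be negative
lmm <- lme(y ~ colony,
           random = list(colony = pdCompSymm(~ 0 + factor(patriline))),
           data = dataset)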
>
> 2) If I wanted to allow patriline effects to be different for
> different colonies, would the appropriate
> model then be
> gmm<- glmer(trait~colony+(colony|patriline),
> family = binomial(link="logit"), data = dataset ) or
> would it be
> gmm<- glmer(trait~colony+(0+colony|patriline),
> family = binomial(link="logit"), data = dataset )?
Do you mean different among-patriline variances for each colony?
I'd have to think about it more carefully, but I don't think either
of these will work; I think you may need the finer level of control
that nlme offers (which you could use via MASS::glmmPQL, although there are
definitely concerns about the actual definitions of the models being
fitted in that way). There was a thread about this:
http://thread.gmane.org/gmane.comp.lang.r.lme4.devel/205/focus=207
I don't remember the details.
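To give the flavour of that finer control (just an illustration; I
haven't checked that this is identifiable for your design, and colony
is assumed to be a factor here): pdDiag gives separate, independent
among-patriline variances for each colony, since each patriline belongs
to a single colony.

library(MASS)   ## glmmPQL
library(nlme)   ## pdDiag
gpql <- glmmPQL(trait ~ colony,
                random = list(patriline = pdDiag(~ 0 + colony)),
                family = binomial(link = "logit"), data = dataset)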
> 3) If I want to treat all my factors as fixed, ie using a nested model as in
> gm <- glm(trait~colony+patriline, family = binomial(link="logit"),
> data = dataset ) # patriline is in
> fact colony:patriline since all patrilines have unique codes
> how would I extract the variance components then?
In principle this should work:
library(lme4)
## patrilines all have unique numbers
## I then fitted the hierarchically nested binomial mixed model
gmm  <- glmer(trait ~ (1 | colony/patriline),
              family = binomial(link = "logit"), data = dataset)
gmm2 <- glmer(trait ~ colony + (1 | patriline),
              family = binomial(link = "logit"), data = dataset)
gmm3 <- glm(trait ~ colony + patriline,
            family = binomial(link = "logit"), data = dataset,
            contrasts = list(patriline = contr.sum, colony = contr.sum))
## And estimated the intraclass correlation using:
ICCfun <- function(., type = 1, verbose = FALSE) {
    if (type < 3) vcor <- VarCorr(.)
    if (type == 1) {
        bpatrvar <- vcor[["patriline:colony"]]  ## among-patriline(colony) variance
        bcolvar  <- vcor[["colony"]]            ## among-colony variance
    } else if (type == 2) {
        bpatrvar <- vcor[["patriline"]]
        ## difference between colonies, squared, / (n-1) [n=2]
        bcolvar <- (fixef(.)["colony2"])^2
    } else if (type == 3) {
        f  <- coef(.)
        pp <- na.omit(f[grep("^patriline", names(f))])
        cp <- f[grep("^colony", names(f))]
        ## variance based on contrasts (n rather than n-1)
        vfun <- function(x) sum(x^2)/length(x)
        bpatrvar <- vfun(pp)
        bcolvar  <- vfun(cp)
    }
    ## error variance is (pi^2)/3 for a binomial logit model
    ## and 1 for a binomial probit model (Snijders & Bosker 1999)
    errvar <- if (family(.)$link == "probit") 1 else (pi^2)/3
    r <- c(bpatrvar/(bpatrvar + bcolvar + errvar))  ## intraclass correlation
    if (verbose) cat(bpatrvar, bcolvar, errvar, r, "\n")
    r
}
ICCfun2 <- function(.,...) ICCfun(.,type=2,...)
ICCfun3 <- function(.,...) ICCfun(.,type=3,...)
ICC1 <- ICCfun(gmm,verbose=TRUE)
ICC2 <- ICCfun2(gmm2,verbose=TRUE)
ICC3 <- ICCfun3(gmm3,verbose=TRUE)
... but at the moment I'm not getting sensible answers,
so I don't know if I'm just doing something wrong, or whether
the shrinkage due to treating patriline as a random effect is
really so strong ...
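For the mixed-model fits, something like the following should in
principle give a parametric-bootstrap confidence interval for the ICC
(nsim and the percentile interval are arbitrary choices on my part):

library(boot)   ## for boot.ci
bb <- bootMer(gmm, ICCfun, nsim = 500)
boot.ci(bb, type = "perc")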
> And is there any analogous function as bootMer for glm's?
> I guess it is not possible to analyse fixed effect
> models with lme4 by any chance?
No.
> Or would I have to do parametric bootstrapping using simulate() myself then?
Yes: that should be easy. Something like
sumFun(update(model, data = transform(dataset, trait = simulate(model)[[1]])))
should give you a PB replicate (simulate() only returns the simulated
response, so it has to be spliced back into the original data frame;
sumFun is whatever summary function you're interested in, e.g. ICCfun3
above).
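Fleshing that out into a rough sketch (again, nsim and the percentile
interval are arbitrary; gmm3 and ICCfun3 are the objects defined
above):

nsim <- 500
pbdist <- replicate(nsim,
    ICCfun3(update(gmm3,
                   data = transform(dataset, trait = simulate(gmm3)[[1]]))))
quantile(pbdist, c(0.025, 0.975))  ## rough percentile CI for the ICC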