[R-sig-ME] Question on estimating intraclass correlation (and significance and confidence intervals) with binomial data using mixed model and/or GEE

Fri Aug 30 21:32:34 CEST 2013

Tom Wenseleers <Tom.Wenseleers at ...> writes:

> 
> Dear Ben,
> Many thanks for this - that's great! 
> With the code you mention I get an NA 
> though in the section that extracts the variance 
> bcolvar <- (fixef(.)["colony"])^2  :
> 
> If I try
> gmm2<- glmer(trait~colony+(1|patriline), 
>     family = binomial(link="logit"), data = dataset )
> vcor=VarCorr(gmm2)  # variance components
> bpatrvar=vcor[["patriline"]]
> bcolvar <- (fixef(gmm2)["colony"])^2
> 
> I get an NA for the bcolvar, since fixef(gmm2) returns me
> > fixef(gmm2)
> (Intercept)     colony2 
> 0.013433529 0.004875263
> 
> Does that mean that I would have to use
>  bcolvar <- (fixef(gmm2)["colony2"])^2 ? Or did you use other types of
> contrasts or something like that?

  Yes, that seems to be a typo/thinko on my part: I'm not sure why
my code worked in the first place -- probably because I failed to
convert 'colony' to a factor (which wouldn't matter to the estimate
since colony only has two levels).

> Also a couple of remaining questions:
> 1) Is there any way / any model / other approach 
> (eg using a GEE?) that would allow for negative intraclass
> correlations? (right now ICCs are always positive, 
> but in rare cases one might expect negative
> intraclass correlations)

  Don't know.  In principle, one way to do it is to allow compound
symmetric structures rather than using grouping variables at a
level of individual latent effects (i.e. if this were a LMM fitted
via nlme::lme you would use pdCompSymm to structure the random effects).
This seems as though it should be possible via lme4 eventually -- we
have a *very* bleeding-edge development 'flexLambda' branch on 
github -- but holding your breath would be a bad idea.

> 
> 2) If I wanted to allow patriline effects to be different for 
> different colonies, would the appropriate
> model then be
> gmm<- glmer(trait~colony+(colony|patriline), 
> family = binomial(link="logit"), data = dataset ) or
> would it be
> gmm<- glmer(trait~colony+(0+colony|patriline), 
>   family = binomial(link="logit"), data = dataset )?

   Do you mean different among-patriline variances for each colony?
   I'd have to think about it more carefully, but I don't think either
of these will work; I think you may need the finer level of control
that nlme offers (which you could use via MASS::glmmPQL, although there are
definitely concerns about the actual definitions of the models being
fitted in that way).  There was a thread about this:
http://thread.gmane.org/gmane.comp.lang.r.lme4.devel/205/focus=207
I don't remember the details.

> 3) If I want to treat all my factors as fixed, ie using a nested model as in
> gm <- glm(trait~colony+patriline, family = binomial(link="logit"), 
>  data = dataset )                      # patriline is in
> fact colony:patriline since all patrilines have unique codes
> how would I extract the variance components then?

In principle this should work:

## patrilines all have unique numbers
## I then fitted the hierarchically nested binomial mixed
gmm <- glmer(trait~(1|colony/patriline), 
   family = binomial(link="logit"), data = dataset )
gmm2<- glmer(trait~colony+(1|patriline), 
   family = binomial(link="logit"), data = dataset )
gmm3<- glm(trait~colony+patriline, 
    family = binomial(link="logit"), data = dataset,
           contrasts=list(patriline=contr.sum,colony=contr.sum))

## And estimated the intraclass correlation using:
ICCfun <- function(.,type=1,verbose=FALSE) {
    if (type<3) vcor=VarCorr(.) 
    if (type==1) {
        bpatrvar=vcor[["patriline:colony"]]  #  among patriline(colony) variance
        bcolvar=vcor[["colony"]]  # among colony variance
    } else if (type==2) {
        bpatrvar=vcor[["patriline"]]
        ## difference between colonies, squared, / (n-1) [n=2]
        bcolvar <- (fixef(.)["colony2"])^2 
    } else if (type==3) {
        f <- coef(.)
        pp <- na.omit(f[grep("^patriline",names(f))])
        cp <- f[grep("^colony",names(f))]
        ## variance based on contrasts (n rather than n-1)
        vfun <- function(x) sum(x^2)/length(x)
        bpatrvar <- vfun(pp)
        bcolvar <- vfun(cp)
    }
    ## error variance is (pi^2)/3 for
    ## binomial logit model, and 1 for binomial probit model 
    ## (Snijders & Bosker 1999)
    if (family(.)$link=="probit") errvar=1 else errvar=(pi^2)/3 
    r <- c(bpatrvar/(bpatrvar+bcolvar+errvar)) # intraclass correlation
    if (verbose) cat(bpatrvar,bcolvar,errvar,r,"\n")
    r
}

ICCfun2 <- function(.,...) ICCfun(.,type=2,...)
ICCfun3 <- function(.,...) ICCfun(.,type=3,...)

ICC1 <- ICCfun(gmm,verbose=TRUE)
ICC2 <- ICCfun2(gmm2,verbose=TRUE)
ICC3 <- ICCfun3(gmm3,verbose=TRUE)

  ... but at the moment I'm not getting sensible answers,
so I don't know if I'm just doing something wrong, or whether
the shrinkage due to treating patriline as a random effect is
really so strong ...

> And is there any analogous function as bootMer for glm's? 
> I guess it is not possible to analyse fixed effect
> models with lme4 by any chance? 

  No.

> Or would I have to do parametric bootstrapping using simulate() myself then?

 Yes: that should be easy.

  sumFun(update(model,data=simulate(model)))

 should give you a PB replicate.