[R-sig-ME] Bootstrapping glmer random effects

Sun Jun 17 13:12:06 CEST 2012

Hi Joe,

I am not sure I can give you good advice on this, but I can try. What I
noticed about your outputs is that the number of observations (19318) was the
same for the analysis of the original data and for the last bootstrap output,
while the number of cities differed by only 1 (218 vs. 217). This suggests that
you used individual observations as the resampling unit. However, as another
responder has already mentioned, you should use cities as the resampling unit,
because observations within a city are not independent. When city is the
resampling unit, you need to include all observations from a city whenever that
city is picked. You also need to rename cities after picking them. For example,
if city1 is picked and has, say, 4 observations, you bring all 4 observations
into your bootstrap data and call the city city1_1 (for example). Then, when
city1 is picked again, you bring all 4 observations into your bootstrap data
again and rename city1 to city1_2 for these 4 observations. The reason for
doing this is that city is the grouping factor for the random effect, and you
want to end up with the same number of distinct cities in your bootstrap data
as in your original data (218, I believe). I hope this will work. If you have
questions about coding this up, I might be able to help you.
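
The resampling and renaming steps described above could be sketched roughly as
follows. This is only a sketch under assumptions, not tested on your data: the
function names resample_cities and boot_city_var and the column city_boot are
made up for illustration, the data frame is assumed to have the columns city and
wg as in your output, and the new labels use the position of the draw as the
suffix (so city1_3, city1_57, ...) rather than a per-city counter, which serves
the same purpose of keeping repeated draws apart.

```r
# Draw whole cities with replacement and give every draw its own label, so a
# city drawn twice enters the bootstrap data as two distinct groups.
# `resample_cities` and `city_boot` are hypothetical names for illustration.
resample_cities <- function(dt, group = "city") {
  grp   <- as.character(dt[[group]])
  ids   <- unique(grp)
  # sample positions rather than ids, so a single id is never read as 1:id
  drawn <- ids[sample.int(length(ids), replace = TRUE)]
  rows_by_id <- split(seq_len(nrow(dt)), grp)
  pieces <- lapply(seq_along(drawn), function(j) {
    # take all observations of the drawn city, not a subset of them
    block <- dt[rows_by_id[[drawn[j]]], , drop = FALSE]
    block$city_boot <- paste(drawn[j], j, sep = "_")
    block
  })
  out <- do.call(rbind, pieces)
  out$city_boot <- factor(out$city_boot)
  out
}

# Percentile interval for the random-intercept variance, refitting the model
# on each cluster-level bootstrap sample (requires the lme4 package).
# `boot_city_var` is a hypothetical name; k = 200 is an arbitrary default.
boot_city_var <- function(dt, k = 200) {
  bs <- numeric(k)
  for (i in seq_len(k)) {
    sam   <- resample_cities(dt, "city")
    m1    <- lme4::glmer(wg ~ (1 | city_boot), data = sam, family = binomial)
    bs[i] <- lme4::VarCorr(m1)$city_boot[1]
  }
  quantile(bs, c(0.025, 0.975))
}
```

With something like this, boot_city_var(dt) would take the place of your loop.
Each bootstrap data set then has as many distinct groups as the original data
has cities, while the total number of observations can differ from replicate to
replicate.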

Cheers, Cornelia

<>< <>< <>< <>< <>< <>< <><
Cornelia Oedekoven
CREEM
University of St Andrews
cornelia at mcs.st-and.ac.uk
www.creem.st-and.ac.uk
<>< <>< <>< <>< <>< <>< <><

Quoting Joe King <joeking1809 at yahoo.com>:

> Dear all
>
> I am attempting to obtain a bootstrap confidence interval for the random
> effect in a simple (random intercept) model using glmer.
>
> The problem I have is that the interval I obtain consistently does not
> contain the value I am trying to get an interval for ! For example I get the
> following output when I run glmer on the full data:
>
> Generalized linear mixed model fit by the Laplace approximation
> Formula: wg~ (1 | city)
>    Data: dt
>    AIC   BIC logLik deviance
>  10115 10131  -5056    10111
> Random effects:
>  Groups   Name        Variance Std.Dev.
>  city     (Intercept) 0.14155  0.37623
> Number of obs: 19318, groups: city, 218
>
> Fixed effects:
>             Estimate Std. Error z value Pr(>|z|)   
> (Intercept) -2.58566    0.04045  -63.93   <2e-16 ***
>
> So I am trying to obtain a confidence interval for the random effect
> variance, 0.14155. Yet the confidence interval I got was (0.2839343,
> 0.3534999). Moreover, the value in every one of the bootstrap replicates is
> greater than 0.14155. For example, the glmer output from the last bootstrap
> replicate was
>
> Generalized linear mixed model fit by the Laplace approximation
> Formula: wg~ (1 | city)
>    Data: sam
>    AIC   BIC logLik deviance
>  10480 10496  -5238    10476
> Random effects:
>  Groups   Name        Variance Std.Dev.
>  city     (Intercept) 0.32769  0.57245
> Number of obs: 19318, groups: city, 217
>
> Fixed effects:
>              Estimate Std. Error z value Pr(>|z|)   
> (Intercept) -2.58779    0.05142  -50.33   <2e-16 ***
>
> There are no missing data. This is the code I have used to obtain the
> interval:
>
> for (i in 1:k) {
>     sam <- dt[sample(nrow(dt), size = nrow(dt), replace = TRUE), ]
>     m1 <- glmer(wg ~ (1 | city), data = sam, family = binomial)
>     bs[i] <- VarCorr(m1)$city[1]
> }
> quantile(bs, c(0.025, 0.975))
>
> Could anyone suggest why this is happening, and what I might be able to do
> about it ?
>
> Thank you
> JK