[R-sig-ME] GLMM with lme4/ marginal predictions

Fri Feb 3 19:54:42 CET 2012

Hi everybody,

I have a (overdispersed) Poisson GLMM  and would like to calculate
marginal predictions while taking random effects into account. To do
that, I am following a simulation based approach which is briefly
described in the technical appendix of the following paper by David
Atkins and colleagues (p. 26):

http://depts.washington.edu/cshrb/newweb/stats%20documents/Draft.Longitudinal.Regression.pdf

The basic idea is to draw N number of samples from the posterior
distributions of the random effects and then averaging the predictions
over all N realizations of the data. So far so good. My problem is now
that I would like to get marginal counts from a model that, besides
including my main variable of interest, includes other covariates as
well. Now, if I would just doing predictions from fixed effects, I
would just average the predictions across my data. However, I cannot
think of a way to average across the random draws from the posterior
random effects distributions _and_ averaging predictions across my
data at the same time? That is, integrating out the random effects
_and_ integrating predictions over other covariates.

I was hoping that somebody here may have done something similar in the
past and could give me a hint?

To be a bit more concrete, I will post some code to illustrate the problem:

My data are yearly observations (5 years) from (~900) schools that are
nested within (~170) school districts.
So, a model with only my independent variable of interest (hfc2) could
look like this:

model.1=glmer(paidcount ~ hfc2  +
    (1|obs_effect) + (1+hfc2|school.id) + (1+hfc2|dist.id), family="poisson",
    offset=log(offspaid0), verbose=FALSE, data=mc1)

What follows is the code to draw from the posterior random effects
distributions and averaging predictions across draws:

(This code is mostly adopted from the above mentioned paper's
accompanying webpage:
http://depts.washington.edu/cshrb/newweb/stats%20documents/Rcode2011.R )

#-------------START-----------#

### Number of simulations
M <- 10000

### Fixed-effect estimates
beta <- fixef(model.1)

### Variance-covariance information
vc <- VarCorr(model.1)

### Pull-out over-dispersion term (as SD)
sdover <- as.numeric(attr(vc$obs_effect, "stddev"))

### Pull-out var-cov of random-effects as matrix (school level)
sdid.school <- as.matrix(vc$school.id)[1:2, 1:2]

### Pull-out var-cov of random-effects as matrix (district level)
sdid.dist <- as.matrix(vc$dist.id)[1:2, 1:2]

### Generate M random draws from variances
rover <- rnorm(M, sd = sdover)
rid1 <- mvrnorm(M, mu = rep(0,2), Sigma = sdid.school)
rid2 <- mvrnorm(M, mu = rep(0,2), Sigma = sdid.dist)

### Get predictions including fixed-effects as well as M
### simulations from random-effects (and averaging)

mean.offset=mean(mc1$offspaid0) #for multiplication with exponentiated baseline

marg.nohfc <- mean(exp(beta[1])*mean.offset * #baseline
              exp(rover + rid1 + rid2 ))

marg.hfc <- mean(exp(beta[1])*mean.offset * #baseline
              exp(beta[2] + #covariate
              rover + rid1 + rid2 ))

#Displaying marginal counts
round(cbind(marg.hfc.el, marg.nohfc.el), 2)

#-------------END-----------#

At this point, I get the marginal counts for my independent (binary)
variable. I am just not sure how to average predictions across my data
as well, if I have other covariates in the model?

Any help is much appreciated!

Joerg