[R-sig-ME] Simulate samples from glmmPQL

Tue Oct 23 23:35:52 CEST 2012

Andrew Digby <andrewdigby at ...> writes:

>  I want to use a parametric bootstrap on a GLMM to to determine
> parameter bias
> (http://www.bris.ac.uk/cmm/team/hg/bias-correction.pdf) and
> hopefully to estimate effect confidence intervals
> (http://glmm.wikidot.com/faq).

> I'm using glmmPQL{MASS} since I need to include an AR1 correlation
> structure, but there are no existing parametric bootstrap routines
> for this. I'm writing my own, but am coming unstuck when trying to
> simulate a response from the model:

> m1<-glmmPQL(y ~ trt , random = ~ 1 | ID, correlation=corAR1(form=~week), 
>  family = binomial, data = bacteria)
> simulate(m1)
> 
> 	Error in simulate.lme(m1) : 
>   		Models with corStruct and/or varFunc objects not allowed.

> I'm trying to write my own simulation code by modifying that from
> lme4 [getMethod('simulate',signature='mer')]. I'm not sure if or how
> I should be including the correlation structure though, and guess
> there's a very good reason why simulate.lme doesn't work with a
> corStruct.

 > If anyone can give me an idea of how to simulate samples from a
> glmmPQL object with a corStruct, or if it's possible, then I'd be
> very grateful. Or just a way of obtaining a rough estimate of the
> bias would be great.  Would a non-parametric bootstrap be completely
> out of the question here
> (e.g.
> http://cran.r-project.org/doc/contrib/Fox-Companion/
>   appendix-bootstrapping.pdf)?

  This seems tricky to me.

1. generating correlated _Gaussian_ variables is no big deal (you can use
the RandomFields package, which is a bit of a sledgehammer/nutcracker scenario
for an AR(1) variable, or more simply construct your variance-covariance matrix
and use MASS::mvrnorm), but ...

2. Integrating this in order to write your own version of simulate.lme() that
did corStruct models might be hairy -- the internals of simulate.lme() are
a little complicated (but it still might be the best solution).

3. As Doug Bates has said here, it's not entirely clear whether the
model glmmPQL is fitting (essentially, a marginal model of the
variance-covariance structure) actually corresponds to anything you
could actually simulate.  A well-formed model of this sort would look
like:

   y ~ Binomial(plogis(X %*% beta + Z %*% b + eps))
   b ~ Sigma_b (from G-side/block structure)
   eps ~ MVN(AR(1))

this is a little crude, but the idea is that you actually break the model into 
well-defined hierarchical bits.  It's not clear (to me at least) whether the
model glmmPQL fits in this case actually corresponds to this statistical model
or not.  (INLA appears like it might be the best way forward on this front ...)
(The advantage of hacking simulate.lme is that you'd be taking advantage
of whatever simulate.lme is doing ...)

4. It turns out that corAR1 is not the only problem: even a 'plain' glmmPQL
call (without any correlations) uses a varStruct internally, which breaks
simulate.lme (there is no simulate.glmmPQL, so simulate falls back on
simulate.lme).  It seems to me that if you could hack simulate.lme you
could get a result with the right _variance-covariance structure_, but it
wouldn't be Bernoulli -- it would be normally distributed.  And, just
to add insult to injury, simulate.lme doesn't work the way simulate
methods work in the rest of R -- simulate.lme was written earlier,
and does a bunch of refitting (essentially, parametric bootstrapping!)
stuff rather than returning the raw simulation values.