[R-sig-ME] predict.fixef? -- Help: making predictions for applied problems using glmm

Mon Mar 9 18:07:18 CET 2009

Perhaps I can tag on to Matt's question.  Back when I used lme() for my 
analyses, I would often use:

predict(lmeobj, newdata, level=0)

to get predictions from the model using only the fixed effects.  This 
is quite useful with interactions, nonlinear terms, and just generally 
for understanding the results of the model (as I'm sure many list members 
know...).

With lmer():

1. There have been a handful of emails over the years noting that it is 
particularly tricky to know what to do with *random-effects* in 
predictions.  Frankly, I'm not concerned about this for my own work, as 
I invariably was looking at predictions based on fixed-effects only.

2. There have also been emails showing how to "roll your own" 
predictions, using newdata, model.matrix, and multiplying through by the 
vector of fixed-effects coefficients.

Thus, #2 can do what I'd like (and I've used it), though I've also 
been bitten once or twice by not having the columns line up correctly 
between coefficients and newdata (okay, I concede operator error...). 
But, it sure would be handy to have a predict(..., level=0), or perhaps 
a predict.fixef() command, for use with lmer() objects.
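
For concreteness, here is a minimal sketch of the "roll your own" approach in #2, with a guard against the column mismatch mentioned above. The helper name predict_fixef and its arguments are hypothetical, not part of lme4 or nlme. The example uses lm() and coef() as a stand-in so the result can be checked against predict(); with an lmer() fit one would presumably pass fixef(fit) as beta and the fixed-effects part of the model formula as rhs.

```r
# Hypothetical helper: fixed-effects-only predictions from a named
# coefficient vector (e.g. fixef(fit) for an lmer fit).
predict_fixef <- function(beta, rhs, newdata) {
  # Design matrix for newdata, built from the fixed-effects formula
  X <- model.matrix(rhs, data = newdata)
  # Guard against misaligned columns: require identical column names
  if (!setequal(names(beta), colnames(X))) {
    stop("columns of the design matrix do not match the coefficient names")
  }
  # Reorder columns by name before multiplying, then return a plain vector
  drop(X[, names(beta), drop = FALSE] %*% beta)
}

# Stand-in example with lm(), so the result can be compared with predict()
fit <- lm(mpg ~ wt * hp, data = mtcars)
newdata <- data.frame(wt = c(2.5, 3.5), hp = c(100, 150))
pred <- predict_fixef(coef(fit), ~ wt * hp, newdata)
```

Matching by name rather than by position is the point of the sketch. Factors in newdata would still need the same levels as in the fitted data for model.matrix() to produce the same columns.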

I did actually look at the predict.lme code at one point, but I fear 
it's a bit beyond my limited coding talents...

[And, let me end by saying I'm reluctant to ever suggest that Doug (or 
someone else) "ought" to code this up, because I'm acutely aware of how 
much work Doug has done to benefit the lmer() users.  I thought I'd 
throw out this request b/c Matt had suggested something similar, and 
perhaps someone has even cooked up such code already...]

cheers, Dave

-- 
Dave Atkins, PhD
Research Associate Professor
Center for the Study of Health and Risk Behaviors
Department of Psychiatry and Behavioral Science
1100 NE 45th Street, Suite 300
Seattle, WA  98105
206-616-3879
datkins at u.washington.edu

Dear R users,

My query concerns how best to make inferences for applied problems from 
the outcomes of a mixed modelling approach. I am an ecologist and have 
been modelling the presence or absence of a species of vole in patches 
of suitable habitat in upland areas in relation to a number of 
biologically meaningful covariates. My sampling design includes 310 vole 
habitat patches within 9 river subcatchments and I have included 
subcatchment as a random effect in order to incorporate this structure 
in the data. I have used the information-theoretic approach with model 
weighting and averaging to make inferences about model selection and the 
consistency of parameter estimates. However, I am faced with one 
outstanding issue: as my question is an applied one, it is highly 
desirable to predict the probability of a vole habitat patch being 
occupied given a particular set of covariate values. This is easily done 
for a glm using the predict() function, but I am aware that no such 
function exists for a glmm.

However, would it be sound to take a similar approach with the 
coefficients and standard errors from a glmm, i.e. by computing 
predictions over a range of x and back-transforming them?
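
As an illustration of what this back-transformation could look like, the sketch below computes the linear predictor and its standard error on the logit scale and then back-transforms both the estimate and the interval limits. The helper predict_prob is hypothetical, and the data are simulated. A plain glm() stands in for the mixed model so the numbers can be checked against predict(); for a glmm one would presumably supply the fixed-effects estimates and their covariance matrix instead (assumed here to be in the same order).

```r
# Hypothetical helper: predicted probabilities with approximate 95% intervals,
# computed on the logit scale and then back-transformed.
# Assumes the rows/columns of V are in the same order as beta.
predict_prob <- function(beta, V, rhs, newdata, z = qnorm(0.975)) {
  V   <- as.matrix(V)
  X   <- model.matrix(rhs, data = newdata)[, names(beta), drop = FALSE]
  eta <- drop(X %*% beta)               # linear predictor (logit scale)
  se  <- sqrt(rowSums((X %*% V) * X))   # sqrt of diag(X V X')
  data.frame(newdata,
             prob  = plogis(eta),
             lower = plogis(eta - z * se),
             upper = plogis(eta + z * se))
}

# Stand-in example: simulated presence/absence data and an ordinary glm()
set.seed(1)
d <- data.frame(x = runif(200, 0, 10))
d$y <- rbinom(200, 1, plogis(-2 + 0.4 * d$x))
fit <- glm(y ~ x, family = binomial, data = d)
grid <- data.frame(x = seq(0, 10, by = 2.5))
out <- predict_prob(coef(fit), vcov(fit), ~ x, grid)
```

Applied to the fixed effects of a glmm, this gives the prediction for a patch whose random effect is zero, and the interval reflects uncertainty in the fixed effects only, not the variation among subcatchments.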

Or, would such an approach fail to take into account the model structure 
based on the random effect?

If so, would a reasonable approach be to estimate coefficients from 
re-samples of the data whilst maintaining the model structure? Predicted 
values could then be presented with confidence intervals that reflected 
variation in the structure of the data.
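
One possible reading of "re-samples of the data whilst maintaining the model structure" is to resample whole subcatchments with replacement, so that every patch stays with its cluster. The sketch below shows only that resampling step. The helper resample_clusters, the column names, and the toy data are all made up for illustration.

```r
# Hypothetical helper: resample whole clusters (e.g. subcatchments) with
# replacement, relabelling them so that repeated draws stay distinct.
resample_clusters <- function(data, cluster) {
  ids   <- unique(data[[cluster]])
  draws <- ids[sample.int(length(ids), replace = TRUE)]
  pieces <- lapply(seq_along(draws), function(i) {
    piece <- data[data[[cluster]] == draws[i], , drop = FALSE]
    piece[[cluster]] <- i   # new label for the i-th drawn cluster
    piece
  })
  do.call(rbind, pieces)
}

# Toy data: 310 patches in 9 clusters of unequal size (made-up values)
set.seed(42)
toy <- data.frame(
  subcatchment = rep(1:9, times = c(20, 30, 40, 25, 35, 45, 30, 40, 45)),
  x = rnorm(310)
)
boot <- resample_clusters(toy, "subcatchment")
```

In use, the model would be refitted to each resampled data set and the predictions recomputed, with the spread of those predictions across resamples summarised as an interval.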

If not, what (if there is such a thing) would be the appropriate way to 
make predictions from glmm for applied problems such as this?

I hope I have been reasonably clear and am not being too ignorant. Any 
advice or comments would be much appreciated.

Many thanks

Matt

Dr Matthew Oliver
Research Fellow
School of Biological Sciences
University of Aberdeen
Zoology building
Tillydrone Avenue
Aberdeen AB24 2TZ
UK
tel + 44(0)1224 272789