[R] gamm() and predict()

Mon Oct 13 18:18:35 CEST 2008

Dear All,
I have a query relating to the use of the ‘predict’ and ‘gamm’ functions.  I am dealing with large (approx. 5000) sets of presence/absence data, which I am trying to model as a function of different of environmental covariates.  Ideally my models should include individual and colony as random factors. I have been trying to fit binomial models using the gamm function to achieve this. For the sake of simplicity I have adapted a some of the example code from ?gamm to illustrate some problems I have been having predicting values using this approach.
### Begin example ###
library(mgcv)
## Generate some example data
set.seed(0)
n <- 400
sig <- 2
x0 <- runif(n, 0, 1)
x1 <- runif(n, 0, 1)
x2 <- runif(n, 0, 1)
x3 <- runif(n, 0, 1)
f <- 2 * sin(pi * x0)
f <- f + exp(2 * x1) - 3.75887
f <- f+0.2*x2^11*(10*(1-x2))^6+10*(10*x2)^3*(1-x2)^10-1.396
e <- rnorm(n, 0, sig)
y <- f + e
## Change the response to binary
y <-round(y/max(y))
## Add a factor to the linear predictor, to be modelled as random
fac <- rep(1:4,n/4)
f <- f + fac*3
fac<-as.factor(fac)
## Fit an additive model
mod<-gamm(y~s(x0)+s(x1)+s(x2)+s(x3),family=binomial,random=list(fac=~1))
## Generate some new example data
new.dat = data.frame(x0 = runif(n, 0, 1), x1 = runif(n, 0, 1), x2 = runif(n, 0, 1),x3 = runif(n, 0, 1), fac = fac)
## Predict response values using the original data and the gam part of the model…
predict(mod$gam, type = "response")
## Predict response values using the new data and the gam part of the model…
predict(mod$gam, type = "response", new.dat)
## Predict response values using the original data and the glmm part of the model…
predict(mod$lme, level = 0, type = "response")
## Predict response values using the new data and the gam part of the model…
predict(mod$lme, level = 0, type = "response", new.dat)
# This produces the error message ‘Error in eval(expr, envir, enclos) : object "fixed" not found’
## End example ###
My questions are as follows:
1. I presume predict(mod.$gam) produces population level predictions. Is this correct?
2. Is it possible to extract standard errors using predict(mod.$gam) or is there a more suitable approach to estimating confidence in prections made with gamms?
3. It seems that predict(mod.$lme) results in predictions at the level of random factors. Furthermore, these appear to be on the scale of the linear predictor regardless of how level is specified (see ?glmmPQL). Is this correct?
4. The code predict(mod$lme, new.dat) produces an error message, seemingly indicating the the fixed effects are missing from my new data frame (see example). Am I doing something wrong here?
5. Is it possible to predict both population and random factor level predictions using new data with gamm objects?
I have read all the relevant help files including those associated with glmmPQL and also Simon Woods book and I am still a bit confused so any help would be gratefully received. I am using R 2.6.1 with Windows XP pro, mgcv version 1.3-29.
Thanks,
Ewan Wakefield
British Antarctic Survey
High Cross
Madingley Road
Cambridge
UK