[R-sig-ME] confidence interval question

Tue Mar 22 19:57:09 CET 2011

Dear R-sig mixed users,
I am trying to plot the results of a linear mixed model, and would like 
to add confidence intervals to show how the response varies with time 
(fixed effect).  I am using only random intercepts (not slopes).  My 
model also includes a weight term to account for precision.

I came across several ways to do this:
1.the recommendations on the GLMMfaq page
2.Solution based on Gelman and Hill's "simulate" function
3. MCMC-based confidence intervals in the package "languageR"

I am not a statistician, but I thought these should all give me roughly 
the same answer.  For the non-weighted test case, they do.  When the 
model includes weights, the CIs using method (1) are MUCH narrower than 
using (2) or (3).  Why is this?  And can anyone advise on which method 
is preferable or how the interpretation should differ based on which 
method I use??

Reproducible example follows.  Hopefully it will illuminate where I am 
stuck or someone can point to where I've gone wrong.

Thanks so much for your help.
Sarah

library(lme4)
library(ggplot2) # Plotting
library(MEMSS) # for Orthodont

#modified example slightly to remove subject specific slopes, use only 
males and added weights
#the resulting example is structured more like my actual data

fm1 = lmer(
     formula = distance ~ age + (1|Subject)
     , data = Orthodont[Orthodont$Sex=="Male",], weights=c(1:64)
)

#unweighted version below gives similar answers for all 3 methods of 
making CIs
#fm1 = lmer(
#    formula = distance ~ age + (1|Subject)
#   , data = Orthodont[Orthodont$Sex=="Male",]
#)

newdat <- expand.grid(
     age=c(8,10,12,14)
     , distance = 0
)
mm = model.matrix(terms(fm1),newdat)
newdat$distance = mm %*% fixef(fm1)
pvar1 <- diag(mm %*% tcrossprod(vcov(fm1),mm))
tvar1 <- pvar1+VarCorr(fm1)$Subject[1]
newdat <- data.frame(
     newdat
     , plo = newdat$distance-2*sqrt(pvar1)
     , phi = newdat$distance+2*sqrt(pvar1)
     , tlo = newdat$distance-2*sqrt(tvar1)
     , thi = newdat$distance+2*sqrt(tvar1)
)

#using languageR

library (languageR)

mcmc = pvals.fnc(fm1, nsim=1000, withMCMC=TRUE)
mcmcvers<-plotLMER.fnc(fm1, mcmcMat=mcmc$mcmc, withList=TRUE)

detach (package:languageR)
detach (package:zipfR)

#using Gelman and Hill sim function
library ("arm")

sim.fm1<-sim(fm1, n.sims=1000)
newdat.arm <- expand.grid(
     age=c(8,10,12,14)
     , distance = 0
)

#estimate mean distance per age for each simulation
mm.dat= model.matrix(terms(fm1),newdat.arm)
outputfull<-NULL
for (i in (1:1000))    {
     #i<-1
     output<-NULL
     yi = mm.dat %*% coef(sim.fm1)$fixef[i,]
     xi<-newdat.arm$age
     sim<-rep(i, length (yi))
     output<-cbind(yi, xi, sim)
     outputfull<-rbind (outputfull, output)
     }

#take the 0.025 and 0.975 quantiles of predictions over all simulations
outputfull<-as.data.frame (outputfull)
names (outputfull)<-c("distance", "age", "sim")
newdat.arm$ci.lb<-rep(NA, nrow (newdat.arm))
newdat.arm$ci.ub<-rep(NA, nrow (newdat.arm))
newdat.arm$mean<-rep(NA, nrow (newdat.arm))
for (k in 1:nrow(newdat.arm))    {
     age<-newdat.arm$age[k]

newdat.arm$ci.lb[k]<-quantile(outputfull$distance[outputfull$age==age],0.025)
     newdat.arm$mean[k]<-mean (outputfull$distance[outputfull$age==age])
     newdat.arm$ci.ub[k]<-quantile 
(outputfull$distance[outputfull$age==age],0.975)
     }

detach (package:arm)
detach (package:abind)
detach (package:car)
detach (package:nnet)
detach (package:survival)
detach (package:splines)

#possible to plot them all out and see the differences in CIs, but 
easiest to just look at the intervals for age=8 as a point example where 
there are clear differences
mcmcvers$age$lower[mcmcvers$age$x==8]
mcmcvers$age$upper[mcmcvers$age$x==8]

newdat.arm$ci.lb[newdat.arm$age==8]
newdat.arm$ci.ub[newdat.arm$age==8]

newdat$plo[newdat$age==8]
newdat$phi[newdat$age==8]

-- 
Postdoctoral Fellow
Department of Geography
University of British Columbia
1984 West Mall
Vancouver, BC, Canada
V6T 1Z2
Phone 604.822.9105
Fax 604.822.6150