[R-sig-ME] confidence interval question
Sarah Elmendorf
scelmend at interchange.ubc.ca
Tue Mar 22 19:57:09 CET 2011
Dear R-sig mixed users,
I am trying to plot the results of a linear mixed model, and would like
to add confidence intervals to show how the response varies with time
(fixed effect). I am using only random intercepts (not slopes). My
model also includes a weight term to account for precision.
I came across several ways to do this:
1.the recommendations on the GLMMfaq page
2.Solution based on Gelman and Hill's "simulate" function
3. MCMC-based confidence intervals in the package "languageR"
I am not a statistician, but I thought these should all give me roughly
the same answer. For the non-weighted test case, they do. When the
model includes weights, the CIs using method (1) are MUCH narrower than
using (2) or (3). Why is this? And can anyone advise on which method
is preferable or how the interpretation should differ based on which
method I use??
Reproducible example follows. Hopefully it will illuminate where I am
stuck or someone can point to where I've gone wrong.
Thanks so much for your help.
Sarah
library(lme4)
library(ggplot2) # Plotting
library(MEMSS) # for Orthodont
#modified example slightly to remove subject specific slopes, use only
males and added weights
#the resulting example is structured more like my actual data
fm1 = lmer(
formula = distance ~ age + (1|Subject)
, data = Orthodont[Orthodont$Sex=="Male",], weights=c(1:64)
)
#unweighted version below gives similar answers for all 3 methods of
making CIs
#fm1 = lmer(
# formula = distance ~ age + (1|Subject)
# , data = Orthodont[Orthodont$Sex=="Male",]
#)
newdat <- expand.grid(
age=c(8,10,12,14)
, distance = 0
)
mm = model.matrix(terms(fm1),newdat)
newdat$distance = mm %*% fixef(fm1)
pvar1 <- diag(mm %*% tcrossprod(vcov(fm1),mm))
tvar1 <- pvar1+VarCorr(fm1)$Subject[1]
newdat <- data.frame(
newdat
, plo = newdat$distance-2*sqrt(pvar1)
, phi = newdat$distance+2*sqrt(pvar1)
, tlo = newdat$distance-2*sqrt(tvar1)
, thi = newdat$distance+2*sqrt(tvar1)
)
#using languageR
library (languageR)
mcmc = pvals.fnc(fm1, nsim=1000, withMCMC=TRUE)
mcmcvers<-plotLMER.fnc(fm1, mcmcMat=mcmc$mcmc, withList=TRUE)
detach (package:languageR)
detach (package:zipfR)
#using Gelman and Hill sim function
library ("arm")
sim.fm1<-sim(fm1, n.sims=1000)
newdat.arm <- expand.grid(
age=c(8,10,12,14)
, distance = 0
)
#estimate mean distance per age for each simulation
mm.dat= model.matrix(terms(fm1),newdat.arm)
outputfull<-NULL
for (i in (1:1000)) {
#i<-1
output<-NULL
yi = mm.dat %*% coef(sim.fm1)$fixef[i,]
xi<-newdat.arm$age
sim<-rep(i, length (yi))
output<-cbind(yi, xi, sim)
outputfull<-rbind (outputfull, output)
}
#take the 0.025 and 0.975 quantiles of predictions over all simulations
outputfull<-as.data.frame (outputfull)
names (outputfull)<-c("distance", "age", "sim")
newdat.arm$ci.lb<-rep(NA, nrow (newdat.arm))
newdat.arm$ci.ub<-rep(NA, nrow (newdat.arm))
newdat.arm$mean<-rep(NA, nrow (newdat.arm))
for (k in 1:nrow(newdat.arm)) {
age<-newdat.arm$age[k]
newdat.arm$ci.lb[k]<-quantile(outputfull$distance[outputfull$age==age],0.025)
newdat.arm$mean[k]<-mean (outputfull$distance[outputfull$age==age])
newdat.arm$ci.ub[k]<-quantile
(outputfull$distance[outputfull$age==age],0.975)
}
detach (package:arm)
detach (package:abind)
detach (package:car)
detach (package:nnet)
detach (package:survival)
detach (package:splines)
#possible to plot them all out and see the differences in CIs, but
easiest to just look at the intervals for age=8 as a point example where
there are clear differences
mcmcvers$age$lower[mcmcvers$age$x==8]
mcmcvers$age$upper[mcmcvers$age$x==8]
newdat.arm$ci.lb[newdat.arm$age==8]
newdat.arm$ci.ub[newdat.arm$age==8]
newdat$plo[newdat$age==8]
newdat$phi[newdat$age==8]
--
Postdoctoral Fellow
Department of Geography
University of British Columbia
1984 West Mall
Vancouver, BC, Canada
V6T 1Z2
Phone 604.822.9105
Fax 604.822.6150
More information about the R-sig-mixed-models
mailing list