[R-sig-ME] confidence interval question
Ben Bolker
bbolker at gmail.com
Tue Mar 22 22:10:45 CET 2011
Sarah Elmendorf <scelmend at ...> writes:
>
> Dear R-sig mixed users,
> I am trying to plot the results of a linear mixed model, and would like
> to add confidence intervals to show how the response varies with time
> (fixed effect). I am using only random intercepts (not slopes). My
> model also includes a weight term to account for precision.
>
> I came across several ways to do this:
> 1.the recommendations on the GLMMfaq page
> 2.Solution based on Gelman and Hill's "simulate" function
> 3. MCMC-based confidence intervals in the package "languageR"
>
> I am not a statistician, but I thought these should all give me roughly
> the same answer. For the non-weighted test case, they do. When the
> model includes weights, the CIs using method (1) are MUCH narrower than
> using (2) or (3). Why is this? And can anyone advise on which method
> is preferable or how the interpretation should differ based on which
> method I use??
>
> Reproducible example follows. Hopefully it will illuminate where I am
> stuck or someone can point to where I've gone wrong.
>
> Thanks so much for your help.
> Sarah
This is stumping me a bit, to be honest.
The languageR approach should be the gold standard here:
it is (a) using code built into lme4 (mcmcsamp) at its core,
which ought to know what to do with weights unless there is
a serious bug; (b) accounting for uncertainty in the random
effects parameters, which neither of the other methods does.
(In principle, you should be able to build a solution that
looks a lot like your sim- and vcov-based solutions that is
based on mcmcsamp() instead of going through all of the
languageR machinery ...)
I don't know exactly what the 'sim' function is doing,
except that it is obviously getting larger variances for the
parameters than simulating directly from vcov() does:
compare
var(coef(sim.fm1)$fixef)
and
vcov(fm1)
...
I don't see why setting 'weights' or not makes a difference, unless
there's something deep in the guts of the 'sim' method for mer objects
that takes them into account ...
[removing > so that Gmane doesn't complain]
library(lme4)
library(ggplot2) # Plotting
library(MEMSS) # for Orthodont
#modified example slightly to remove subject specific slopes, use only
males and added weights
#the resulting example is structured more like my actual data
fm1 = lmer(
formula = distance ~ age + (1|Subject)
, data = Orthodont[Orthodont$Sex=="Male",], weights=c(1:64)
)
#unweighted version below gives similar answers for all 3 methods of
making CIs
#fm1 = lmer(
# formula = distance ~ age + (1|Subject)
# , data = Orthodont[Orthodont$Sex=="Male",]
#)
newdat <- expand.grid(
age=c(8,10,12,14)
, distance = 0
)
mm = model.matrix(terms(fm1),newdat)
newdat$distance = mm %*% fixef(fm1)
pvar1 <- diag(mm %*% tcrossprod(vcov(fm1),mm))
tvar1 <- pvar1+VarCorr(fm1)$Subject[1]
newdat <- data.frame(
newdat
, plo = newdat$distance-2*sqrt(pvar1)
, phi = newdat$distance+2*sqrt(pvar1)
, tlo = newdat$distance-2*sqrt(tvar1)
, thi = newdat$distance+2*sqrt(tvar1)
)
#using languageR
library (languageR)
mcmc = pvals.fnc(fm1, nsim=1000, withMCMC=TRUE)
mcmcvers<-plotLMER.fnc(fm1, mcmcMat=mcmc$mcmc, withList=TRUE)
detach (package:languageR)
detach (package:zipfR)
#using Gelman and Hill sim function
library ("arm")
sim.fm1<-sim(fm1, n.sims=1000)
newdat.arm <- expand.grid(
age=c(8,10,12,14)
, distance = 0
)
#estimate mean distance per age for each simulation
mm.dat= model.matrix(terms(fm1),newdat.arm)
You can replace the loop here:
outputfull <- data.frame(y=c(apply(coef(sim.fm1)$fixef,
1,function(v) mm.dat%*%v)),
age=newdat.arm$age,sim=rep(1:1000,each=4))
#take the 0.025 and 0.975 quantiles of predictions over all simulations
outputfull<-as.data.frame (outputfull)
names (outputfull)<-c("distance", "age", "sim")
newdat.arm$ci.lb<-rep(NA, nrow (newdat.arm))
newdat.arm$ci.ub<-rep(NA, nrow (newdat.arm))
newdat.arm$mean<-rep(NA, nrow (newdat.arm))
for (k in 1:nrow(newdat.arm)) {
age<-newdat.arm$age[k]
newdat.arm$ci.lb[k]<-quantile(outputfull$distance[outputfull$age==age],0.025)
newdat.arm$mean[k]<-mean (outputfull$distance[outputfull$age==age])
newdat.arm$ci.ub[k]<-quantile
(outputfull$distance[outputfull$age==age],0.975)
}
detach (package:arm)
detach (package:abind)
detach (package:car)
detach (package:nnet)
detach (package:survival)
detach (package:splines)
#possible to plot them all out and see the differences in CIs, but
easiest to just look at the intervals for age=8 as a point example where
there are clear differences
mcmcvers$age$lower[mcmcvers$age$x==8]
mcmcvers$age$upper[mcmcvers$age$x==8]
newdat.arm$ci.lb[newdat.arm$age==8]
newdat.arm$ci.ub[newdat.arm$age==8]
newdat$plo[newdat$age==8]
newdat$phi[newdat$age==8]
More information about the R-sig-mixed-models
mailing list