[R-meta] Prediction Intervals for estimates of random slope rma.mv model with predict.rma()
James Pustejovsky
jepu@to @end|ng |rom gm@||@com
Thu Mar 21 17:59:49 CET 2024
It looks like you are trying to generate a prediction interval for a new
effect size given a specific set of predictors, based on a model with
random slopes terms. A model like look like the following:
Y_ij = b0 + b1 X_ij + u_0j + u_1j X_ij + v_ij + e_ij,
where (u_0j, u_1j) are the study-level random effects with
variance-covariance matrix T, v_ij is a within-study random effect with
variance omega^2, and e_ij is the sampling error with known variance V_ij.
Your goal might be to predict a new effect size for a study with X_ij = a,
or the distribution of
delta_ij = b0 + b1 * (a) + u_0j + u_1j * (a) + v_ij
The tricky bit with these models (and probably the reason that PIs are not
currently implemented in metafor) is that the study-level random effects
terms can be correlated and you need to take the variances and covariances
into account to get a prediction interval.
My understanding is that metafor calculates prediction intervals by taking
the predicted value given the fixed effects terms, plus or minus a critical
value (call it z_p) times a quantity S, where S^2 is the sum of the
variance of the predicted value and the variance of the random effects at
the predicted value. For the example above:
PI = [bhat0 + bhat1 * (a)] +/- z_p * S, where S = sqrt[ Var(bhat0 + bhat1 *
(a)) + Var(u_0j + u_1j * (a) + v_ij) ]
You can calculate the first term directly from the beta estimates that come
out of metafor. You can calculate the first component of S by using the
vcov matrix that comes out of metafor. And you can calculate the third term
from the estimated variance components.
To make the calculations simpler, it would be convenient to re-center the
predictor at the value for which you're trying to get a prediction
interval. In other words, if you set X'_ij = X_ij - a, then you're
predicting at the value X'_ij = 0, and the prediction interval simplifies to
PI = bhat0 +/- z_p * sqrt[ Var(bhat0) + Var(u_0j + v_ij)]
Example code using Konstantopoulos2011 below.
James
# load packages and data
library(metafor)
library(tidyverse)
data("dat.konstantopoulos2011")
df <-
dat.konstantopoulos2011 %>%
mutate(
factor=sample(c("A","B"),nrow(.),replace=TRUE)
)
# RANDOM INTERCEPT --------------------------------------------------------
# fit random intercept model
m1 <- rma.mv(
yi ~ year + factor, V = vi,
random = list(~1|study, ~1|district, ~1|school),
data = df, method = "REML"
)
# Prediction interval at year = 1989 (for Taylor) and factor = A
predict(m1, newmods = cbind(1989, 0), level = 80)
# Same thing, manual calcualtion
vec <- c(1, 1989, 0)
pred <- sum(m1$beta * vec)
Vb <- as.numeric(t(vec) %*% vcov(m1) %*% vec)
V_RE <- sum(m1$sigma2)
S <- sqrt(Vb + V_RE)
z_crit <- qnorm((1 + 0.8) / 2)
pred + c(-1, 1) * z_crit * S
# RANDOM SLOPE (x1) -------------------------------------------------------
# fit single random slope model
# Re-center the year predictor
df$year_c <- df$year - 1989
m2 <- rma.mv(
yi ~ year_c + factor, vi,
random = list(~ year_c | study, ~1|district, ~1|school),
data = df, method = "REML", struct = "GEN"
)
# Does not return prediction interval
predict(m2, newmods = cbind(0, 0), level = 80)
# Calculate prediction interval for year_c = 0, factor = A
vec <- c(1, 0, 0)
pred <- sum(m2$beta * vec)
Vb <- as.numeric(t(vec) %*% vcov(m2) %*% vec)
V_RE <- sum(m2$sigma2) + m2$tau2[1] # Using only the random intercept at
the study level
S <- sqrt(Vb + V_RE)
z_crit <- qnorm((1 + 0.8) / 2)
pred + c(-1, 1) * z_crit * S
On Thu, Mar 21, 2024 at 11:19 AM Zac Robinson via R-sig-meta-analysis <
r-sig-meta-analysis using r-project.org> wrote:
> Wolfgang,
>
> Apologies for the incorrect method to reply - hopefully got it right this
> time. If I'm understanding things correctly - the coding can (potentially)
> end up giving me what I'm after, but I'm just specifying the sigma / sd for
> the prediction interval incorrectly (i.e., the variances can't just be
> summed).
>
> Thus, I guess my follow up question is - how would you recommend going
> about this calculation? The closest thing I can find in the literature with
> a similar variance calculation is in the context of R2 from Johnson et al.,
> where they calculate the "mean random effects variance" which seems like it
> could be useful here:
>
> https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.12225
>
> Let me know what you think and thank you for your time and expertise,
>
> Zac
>
> On Thu, Mar 14, 2024 at 9:23 AM Viechtbauer, Wolfgang (NP) <
> wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>
> > Dear Zac,
> >
> > Please respond to the mailing list, not just to me personally.
> >
> > Just summing up all the variances isn't correct - the correct computation
> > of the prediction interval for such a random-effects structure is
> > unfortunately more complex.
> >
> > Best,
> > Wolfgang
> >
> > > -----Original Message-----
> > > From: Zac Robinson <zacrobinson2015 using gmail.com>
> > > Sent: Monday, March 11, 2024 16:07
> > > To: Viechtbauer, Wolfgang (NP) <
> > wolfgang.viechtbauer using maastrichtuniversity.nl>
> > > Subject: Re: [R-meta] Prediction Intervals for estimates of random
> slope
> > rma.mv
> > > model with predict.rma()
> > >
> > > Hey Wolfgang,
> > >
> > > Thank you very much for the reply. I actually have played around with
> > things a
> > > bit more and think I've found a solution. I may not be specifying the
> > sigma
> > > argument perfectly for a random slope model - but I think in concept
> > this works!
> > > Looks like emmeans / predict() have cross compatibility that allows for
> > > prediction intervals when sigma is specified
> > >
> > > Let me know what you think,
> > > Zac
> > >
> > > # SETUP
> > -------------------------------------------------------------------
> > > #load packages and data
> > > library(metafor)
> > > library(tidyverse)
> > > library(emmeans)
> > >
> > > data("dat.konstantopoulos2011")
> > > df=dat.konstantopoulos2011%>%
> > > as.data.frame()%>%
> > > mutate(factor=sample(c("A","B"),56,replace=TRUE))
> > >
> > > # RANDOM SLOPE (x1)
> > -------------------------------------------------------
> > > #fit single random slope model
> > > m2= rma.mv(yi ~ year+factor, vi,
> > > random = list(~year|study, ~1|district, ~1|school),
> > > data = df, method = "REML",struct = "GEN",
> > > control = list(optimizer="bobyqa"))
> > >
> > > #extract estimates with emmmprep + emmeans + predict
> > > est.ci=emmprep(m2,
> > > at=list(year=1976:2000))%>%
> > > emmeans(~year,
> > > weights = "prop",
> > > sigma=sqrt(sum(
> > > m2$sigma2,
> > > m2$tau2,
> > > m2$gamma2
> > > )))
> > >
> > > pi= est.ci%>%
> > > predict(interval="prediction")%>%
> > > select(lower.PL,upper.PL)
> > >
> > > preds=cbind(est.ci,pi)
> > > preds # prediction intervals are now added
> > >
> > > ggplot(data = preds,
> > > aes(x=year,
> > > y=emmean))+theme_classic()+
> > > geom_point(data=df,
> > > aes(y=yi,
> > > size=1/vi))+
> > > geom_ribbon(aes(ymin=lower.PL,
> > > ymax=upper.PL),
> > > linetype="dashed",
> > > alpha=0,
> > > color="black")+
> > > geom_ribbon(aes(ymin=asymp.LCL,
> > > ymax=asymp.UCL),
> > > alpha=0.25)+
> > > geom_line()
> > >
> > > On Mon, Mar 11, 2024 at 2:06 AM Viechtbauer, Wolfgang (NP)
> > > <mailto:wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
> > > Dear Zac,
> > >
> > > predict() currently does not provide predicition intervals for
> > http://rma.mv
> > > models with a GEN random effects structure. Not sure when I will get
> > around to
> > > implementing that.
> > >
> > > Best,
> > > Wolfgang
> > >
> > > > -----Original Message-----
> > > > From: R-sig-meta-analysis <mailto:
> > r-sig-meta-analysis-bounces using r-project.org>
> > > On Behalf
> > > > Of Zac Robinson via R-sig-meta-analysis
> > > > Sent: Sunday, March 10, 2024 16:26
> > > > To: mailto:r-sig-meta-analysis using r-project.org
> > > > Cc: Zac Robinson <mailto:zacrobinson2015 using gmail.com>
> > > > Subject: [R-meta] Prediction Intervals for estimates of random slope
> > > http://rma.mv
> > > > model with predict.rma()
> > > >
> > > > Dear All,
> > > >
> > > > I am working on extracting the adjusted effects from a dose-response
> > > > multilevel meta-regression. One issue I've run into is getting the
> > > > appropriate prediction intervals on my estimates when the model
> > > > includes random slopes (specifically when struct = "GEN" is
> included).
> > From
> > > > what I gather from the help files, this comes down to appropriately
> > > > specifying the `tau2.levels` and `gamma2.levels` arguments within
> > > > predict.rma() - however, I am unsure how to get this to work.
> > > >
> > > > In the code below I've tried to demonstrate my issue and explore the
> > other
> > > > solutions I'm aware of through the orchaRd package). My end goal
> > requires
> > > > adjusting for both continuous and categorical moderators so ideally
> the
> > > > solution stays within the emmprep() / emmeans() framework (which
> makes
> > > > proportional weights and setting non-focal continuous variables at
> the
> > mean
> > > > easy), but I realize this may not be possible.
> > > >
> > > > Thank you very much in advance!
> > > >
> > > > Zac
> > > >
> > > > # SETUP
> > -------------------------------------------------------------------
> > > > #load packages and data
> > > > library(metafor)
> > > > library(tidyverse)
> > > > library(orchaRd)
> > > > library(emmeans)
> > > >
> > > > data("dat.konstantopoulos2011")
> > > > df=dat.konstantopoulos2011%>%
> > > > as.data.frame()%>%
> > > > mutate(factor=sample(c("A","B"),nrow(df),replace=TRUE))
> > > >
> > > > # RANDOM INTERCEPT
> > --------------------------------------------------------
> > > > #fit random intercept model
> > > > m1=http://rma.mv(yi ~ year+factor, vi,
> > > > random = list(~1|study, ~1|district, ~1|school),
> > > > data = df, method = "REML")
> > > >
> > > > #extract estimates with predict.rma
> > > > predict(m1,
> > > > newmods =cbind(rep(c(1976:2000),2),
> > > > rep(c(0,1),each=25)),
> > > > addx = TRUE) # works fine
> > > >
> > > > #extract estimates with emmprep
> > > > emmprep(m1,
> > > > at=list(year=1976:2000))%>%
> > > > emmeans(~year,
> > > > weights = "prop")%>%
> > > > pred_interval_esmeans(model=m1,
> > > > mm=.,
> > > > mod="year") # works fine
> > > >
> > > > #extract estimates with orchaRd
> > > > mod_results(m1,
> > > > mod = "year",
> > > > group = "study") # works fine
> > > >
> > > > # RANDOM SLOPE (x1)
> > -------------------------------------------------------
> > > > #fit single random slope model
> > > > m2=http://rma.mv(yi ~ year+factor, vi,
> > > > random = list(~year|study, ~1|district, ~1|school),
> > > > data = df, method = "REML",struct = "GEN",
> > > > control = list(optimizer="bobyqa"))
> > > >
> > > > #extract estimates with predict.rma
> > > > predict(m2, newmods =cbind(rep(c(1976:2000),2),
> > > > rep(c(0,1),each=25)),
> > > > addx = TRUE) # no prediction intervals
> > > >
> > > > predict(m2, newmods =cbind(rep(c(1976:2000),2),
> > > > rep(c(0,1),each=25)),
> > > > addx = TRUE,
> > > > tau2.levels = levels(year), # try adding tau2.levels
> > > > ) # still no prediction intervals
> > > >
> > > > #extract estimates with emmprep
> > > > emmprep(m2,
> > > > at=list(year=1976:2000))%>%
> > > > emmeans(~year,
> > > > weights = "prop")%>%
> > > > pred_interval_esmeans(model=m2,
> > > > mm=.,
> > > > mod="year") # works but gives an error
> > > >
> > > > #extract estimates with orchaRd
> > > > mod_results(m2,
> > > > mod = "year",
> > > > group = "study") # works but gives an error
> > > >
> > > >
> > > > # RANDOM SLOPE (x2)
> > -------------------------------------------------------
> > > > #fit double random slope model
> > > > m3=http://rma.mv(yi ~ year+factor, vi,
> > > > random = list(~year|study, ~year|district, ~1|school),
> > > > data = df, method = "REML",struct = c("GEN","GEN"),
> > > > control = list(optimizer="bobyqa"))
> > > >
> > > > #extract estimates with predict.rma
> > > > predict(m3, newmods =cbind(rep(c(1976:2000),2),
> > > > rep(c(0,1),each=25)),
> > > > addx = TRUE) # no prediction intervals
> > > >
> > > > predict(m3, newmods =cbind(rep(c(1976:2000),2),
> > > > rep(c(0,1),each=25)),
> > > > addx = TRUE,
> > > > tau2.levels = levels(year),# try adding tau2.levels
> > > > gamma2.levels = levels(year)# try adding gamma2.levels
> > > > ) # still no prediction intervals
> > > >
> > > > #extract estimates with emmprep
> > > > emmprep(m3,
> > > > at=list(year=1976:2000))%>%
> > > > emmeans(~year,
> > > > weights = "prop")%>%
> > > > pred_interval_esmeans(model=m3,
> > > > mm=.,
> > > > mod="year") # works but gives an error
> > > >
> > > > #extract estimates with orchaRd
> > > > mod_results(m3,
> > > > mod = "year",
> > > > group = "study") # works but gives an error
> >
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-meta-analysis mailing list @ R-sig-meta-analysis using r-project.org
> To manage your subscription to this mailing list, go to:
> https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
>
[[alternative HTML version deleted]]
More information about the R-sig-meta-analysis
mailing list