[R-meta] Prediction Intervals for estimates of random slope rma.mv model with predict.rma()

Fri Mar 22 00:04:18 CET 2024

James,

Thank you very much for your reply - it was extremely helpful!

I have taken your code, modified it a bit, and attempted to create a
function that works alongside emmprep / emmeans. From my tests it seems to
get the same results as your manual calculations but let me know if I'm
missing something - only thing I wondered is if "rho" and "phi" need to be
incorporated directly, as well as the covariances you mentioned

Thanks again,
Zac

# load packages and data
library(metafor)
library(tidyverse)
library(emmeans)

data("dat.konstantopoulos2011")
df <-
  dat.konstantopoulos2011 %>%
  mutate(
    factor=sample(c("A","B"),nrow(.),replace=TRUE)
  )

# create function
get.pi.emmeans.rma.mv <- function(model, mm){

  tmp <- summary(mm)
  tmp <- tmp[ , ]
  test.stat <- stats::qt(mm using misc$level+(1-mm using misc$level)/2, tmp$df[[1]])

  n_g <- colnames(model$mf.g)
  n_g2 <- ifelse(is.null(n_g),0,n_g[n_g!="intrcpt"&n_g!="outer"])
  l_g <- ifelse(is.null(n_g),0,mm using linfct[,n_g2])
  v_g <- ifelse(is.null(n_g),0,diag(model$G)%>%as.data.frame()%>%.[n_g2,])

  n_h <- colnames(model$mf.h)
  n_h2 <- ifelse(is.null(n_h),0,n_h[n_h!="intrcpt"&n_h!="outer"])
  l_h <- ifelse(is.null(n_h),0,mm using linfct[,n_h2])
  v_h <- ifelse(is.null(n_h),0,diag(model$H)%>%as.data.frame()%>%.[n_h2,])

  V_b <- tmp$SE^2
  V_ri <- sum(model$sigma2 + model$tau2[1] + model$gamma2[1])
  V_g <- l_g*v_g
  V_h <- l_h*v_h

  PI <- test.stat * sqrt(V_b + V_ri + V_g + V_h)

  tmp$lower.PI <- tmp$emmean - PI
  tmp$upper.PI <- tmp$emmean + PI

  return(tmp)
}

# RANDOM INTERCEPT --------------------------------------------------------
# fit random intercept model
m1 <- rma.mv(
  yi ~ year + factor, V = vi,
  random = list(~1|district, ~1|interaction(district,school)),
  data = df, method = "REML"
)

# Prediction interval at year = 1989 (for Taylor) and factor = A
predict(m1, newmods = cbind(1989, 0), level = 95)

# Using function
emmprep(m1,
        at=list(year=1989,
                factor="A"))%>%
  emmeans(~year+factor,
          weights = "prop")%>%
  get.pi.emmeans.rma.mv(model=m1) #same

# RANDOM SLOPE (x1) -------------------------------------------------------
# fit single random slope model
m2 <- rma.mv(
  yi ~ year + factor, vi,
  random = list(~year|district, ~1|interaction(district,school)),
  data = df, method = "REML", struct = "GEN",control =
list(optimizer="optim")
)

# Calculate prediction interval for year = 1989, factor = A
vec <- c(1, 1989, 0)
pred <- sum(m2$beta * vec)
Vb <- as.numeric(t(vec) %*% vcov(m2) %*% vec)
V_RE <- sum(m2$sigma2) + m2$tau2[1] + m2$tau2[2]*1989
S <- sqrt(Vb + V_RE)
z_crit <- qnorm((1 + 0.95) / 2)
pred + c(-1, 1) * z_crit * S

# Using function
emmprep(m2,
        at=list(year=1989,
                factor="A"))%>%
  emmeans(~year+factor,
          weights = "prop")%>%
  get.pi.emmeans.rma.mv(model=m2) # same

On Thu, Mar 21, 2024 at 1:00 PM James Pustejovsky <jepusto using gmail.com> wrote:

> It looks like you are trying to generate a prediction interval for a new
> effect size given a specific set of predictors, based on a model with
> random slopes terms. A model like look like the following:
>
> Y_ij = b0 + b1 X_ij + u_0j + u_1j X_ij + v_ij + e_ij,
>
> where (u_0j, u_1j) are the study-level random effects with
> variance-covariance matrix T, v_ij is a within-study random effect with
> variance omega^2, and e_ij is the sampling error with known variance V_ij.
> Your goal might be to predict a new effect size for a study with X_ij = a,
> or the distribution of
>
> delta_ij = b0 + b1 * (a) + u_0j + u_1j * (a) + v_ij
>
> The tricky bit with these models (and probably the reason that PIs are not
> currently implemented in metafor) is that the study-level random effects
> terms can be correlated and you need to take the variances and covariances
> into account to get a prediction interval.
>
> My understanding is that metafor calculates prediction intervals by taking
> the predicted value given the fixed effects terms, plus or minus a critical
> value (call it z_p) times a quantity S, where S^2 is the sum of the
> variance of the predicted value and the variance of the random effects at
> the predicted value. For the example above:
>
> PI = [bhat0 + bhat1 * (a)] +/- z_p * S, where S = sqrt[ Var(bhat0 + bhat1
> * (a)) + Var(u_0j + u_1j * (a) + v_ij) ]
>
> You can calculate the first term directly from the beta estimates that
> come out of metafor. You can calculate the first component of S by using
> the vcov matrix that comes out of metafor. And you can calculate the third
> term from the estimated variance components.
>
> To make the calculations simpler, it would be convenient to re-center the
> predictor at the value for which you're trying to get a prediction
> interval. In other words, if you set X'_ij = X_ij - a, then you're
> predicting at the value X'_ij = 0, and the prediction interval simplifies to
>
> PI = bhat0 +/- z_p * sqrt[ Var(bhat0) + Var(u_0j + v_ij)]
>
> Example code using Konstantopoulos2011 below.
>
> James
>
>
> # load packages and data
> library(metafor)
> library(tidyverse)
>
> data("dat.konstantopoulos2011")
> df <-
>   dat.konstantopoulos2011 %>%
>   mutate(
>     factor=sample(c("A","B"),nrow(.),replace=TRUE)
>   )
>
> # RANDOM INTERCEPT --------------------------------------------------------
> # fit random intercept model
> m1 <- rma.mv(
>   yi ~ year + factor, V = vi,
>   random = list(~1|study, ~1|district, ~1|school),
>   data = df, method = "REML"
> )
>
> # Prediction interval at year = 1989 (for Taylor) and factor = A
> predict(m1, newmods = cbind(1989, 0), level = 80)
>
> # Same thing, manual calcualtion
> vec <- c(1, 1989, 0)
> pred <- sum(m1$beta * vec)
> Vb <- as.numeric(t(vec) %*% vcov(m1) %*% vec)
> V_RE <- sum(m1$sigma2)
> S <- sqrt(Vb + V_RE)
> z_crit <- qnorm((1 + 0.8) / 2)
> pred + c(-1, 1) * z_crit * S
>
> # RANDOM SLOPE (x1) -------------------------------------------------------
> # fit single random slope model
> # Re-center the year predictor
> df$year_c <- df$year - 1989
> m2 <- rma.mv(
>   yi ~ year_c + factor, vi,
>   random = list(~ year_c | study, ~1|district, ~1|school),
>   data = df, method = "REML", struct = "GEN"
> )
>
> # Does not return prediction interval
> predict(m2, newmods = cbind(0, 0), level = 80)
>
> # Calculate prediction interval for year_c = 0, factor = A
> vec <- c(1, 0, 0)
> pred <- sum(m2$beta * vec)
> Vb <- as.numeric(t(vec) %*% vcov(m2) %*% vec)
> V_RE <- sum(m2$sigma2) + m2$tau2[1] # Using only the random intercept at
> the study level
> S <- sqrt(Vb + V_RE)
> z_crit <- qnorm((1 + 0.8) / 2)
> pred + c(-1, 1) * z_crit * S
>
>
> On Thu, Mar 21, 2024 at 11:19 AM Zac Robinson via R-sig-meta-analysis <
> r-sig-meta-analysis using r-project.org> wrote:
>
>> Wolfgang,
>>
>> Apologies for the incorrect method to reply - hopefully got it right this
>> time. If I'm understanding things correctly - the coding can (potentially)
>> end up giving me what I'm after, but I'm just specifying the sigma / sd
>> for
>> the prediction interval incorrectly (i.e., the variances can't just be
>> summed).
>>
>> Thus, I guess my follow up question is - how would you recommend going
>> about this calculation? The closest thing I can find in the literature
>> with
>> a similar variance calculation is in the context of R2 from Johnson et
>> al.,
>> where they calculate the "mean random effects variance" which seems like
>> it
>> could be useful here:
>>
>> https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.12225
>>
>> Let me know what you think and thank you for your time and expertise,
>>
>> Zac
>>
>> On Thu, Mar 14, 2024 at 9:23 AM Viechtbauer, Wolfgang (NP) <
>> wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>>
>> > Dear Zac,
>> >
>> > Please respond to the mailing list, not just to me personally.
>> >
>> > Just summing up all the variances isn't correct - the correct
>> computation
>> > of the prediction interval for such a random-effects structure is
>> > unfortunately more complex.
>> >
>> > Best,
>> > Wolfgang
>> >
>> > > -----Original Message-----
>> > > From: Zac Robinson <zacrobinson2015 using gmail.com>
>> > > Sent: Monday, March 11, 2024 16:07
>> > > To: Viechtbauer, Wolfgang (NP) <
>> > wolfgang.viechtbauer using maastrichtuniversity.nl>
>> > > Subject: Re: [R-meta] Prediction Intervals for estimates of random
>> slope
>> > rma.mv
>> > > model with predict.rma()
>> > >
>> > > Hey Wolfgang,
>> > >
>> > > Thank you very much for the reply. I actually have played around with
>> > things a
>> > > bit more and think I've found a solution. I may not be specifying the
>> > sigma
>> > > argument perfectly for a random slope model - but I think in concept
>> > this works!
>> > > Looks like emmeans / predict() have cross compatibility that allows
>> for
>> > > prediction intervals when sigma is specified
>> > >
>> > > Let me know what you think,
>> > > Zac
>> > >
>> > > # SETUP
>> > -------------------------------------------------------------------
>> > > #load packages and data
>> > > library(metafor)
>> > > library(tidyverse)
>> > > library(emmeans)
>> > >
>> > > data("dat.konstantopoulos2011")
>> > > df=dat.konstantopoulos2011%>%
>> > >   as.data.frame()%>%
>> > >   mutate(factor=sample(c("A","B"),56,replace=TRUE))
>> > >
>> > > # RANDOM SLOPE (x1)
>> > -------------------------------------------------------
>> > > #fit single random slope model
>> > > m2= rma.mv(yi ~ year+factor, vi,
>> > >           random = list(~year|study, ~1|district, ~1|school),
>> > >           data = df, method = "REML",struct = "GEN",
>> > >           control = list(optimizer="bobyqa"))
>> > >
>> > > #extract estimates with emmmprep + emmeans + predict
>> > > est.ci=emmprep(m2,
>> > >         at=list(year=1976:2000))%>%
>> > >   emmeans(~year,
>> > >           weights = "prop",
>> > >           sigma=sqrt(sum(
>> > >             m2$sigma2,
>> > >             m2$tau2,
>> > >             m2$gamma2
>> > >           )))
>> > >
>> > > pi= est.ci%>%
>> > >   predict(interval="prediction")%>%
>> > >   select(lower.PL,upper.PL)
>> > >
>> > > preds=cbind(est.ci,pi)
>> > > preds # prediction intervals are now added
>> > >
>> > > ggplot(data = preds,
>> > >        aes(x=year,
>> > >            y=emmean))+theme_classic()+
>> > >   geom_point(data=df,
>> > >              aes(y=yi,
>> > >                  size=1/vi))+
>> > >   geom_ribbon(aes(ymin=lower.PL,
>> > >                   ymax=upper.PL),
>> > >               linetype="dashed",
>> > >               alpha=0,
>> > >               color="black")+
>> > >   geom_ribbon(aes(ymin=asymp.LCL,
>> > >                   ymax=asymp.UCL),
>> > >               alpha=0.25)+
>> > >   geom_line()
>> > >
>> > > On Mon, Mar 11, 2024 at 2:06 AM Viechtbauer, Wolfgang (NP)
>> > > <mailto:wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>> > > Dear Zac,
>> > >
>> > > predict() currently does not provide predicition intervals for
>> > http://rma.mv
>> > > models with a GEN random effects structure. Not sure when I will get
>> > around to
>> > > implementing that.
>> > >
>> > > Best,
>> > > Wolfgang
>> > >
>> > > > -----Original Message-----
>> > > > From: R-sig-meta-analysis <mailto:
>> > r-sig-meta-analysis-bounces using r-project.org>
>> > > On Behalf
>> > > > Of Zac Robinson via R-sig-meta-analysis
>> > > > Sent: Sunday, March 10, 2024 16:26
>> > > > To: mailto:r-sig-meta-analysis using r-project.org
>> > > > Cc: Zac Robinson <mailto:zacrobinson2015 using gmail.com>
>> > > > Subject: [R-meta] Prediction Intervals for estimates of random slope
>> > > http://rma.mv
>> > > > model with predict.rma()
>> > > >
>> > > > Dear All,
>> > > >
>> > > > I am working on extracting the adjusted effects from a dose-response
>> > > > multilevel meta-regression. One issue I've run into is getting the
>> > > > appropriate prediction intervals on my estimates when the model
>> > > > includes random slopes (specifically when struct = "GEN" is
>> included).
>> > From
>> > > > what I gather from the help files, this comes down to appropriately
>> > > > specifying the `tau2.levels` and `gamma2.levels` arguments within
>> > > > predict.rma() - however, I am unsure how to get this to work.
>> > > >
>> > > > In the code below I've tried to demonstrate my issue and explore the
>> > other
>> > > > solutions I'm aware of through the orchaRd package). My end goal
>> > requires
>> > > > adjusting for both continuous and categorical moderators so ideally
>> the
>> > > > solution stays within the emmprep() / emmeans() framework (which
>> makes
>> > > > proportional weights and setting non-focal continuous variables at
>> the
>> > mean
>> > > > easy), but I realize this may not be possible.
>> > > >
>> > > > Thank you very much in advance!
>> > > >
>> > > > Zac
>> > > >
>> > > > # SETUP
>> > -------------------------------------------------------------------
>> > > > #load packages and data
>> > > > library(metafor)
>> > > > library(tidyverse)
>> > > > library(orchaRd)
>> > > > library(emmeans)
>> > > >
>> > > > data("dat.konstantopoulos2011")
>> > > > df=dat.konstantopoulos2011%>%
>> > > >   as.data.frame()%>%
>> > > >   mutate(factor=sample(c("A","B"),nrow(df),replace=TRUE))
>> > > >
>> > > > # RANDOM INTERCEPT
>> > --------------------------------------------------------
>> > > > #fit random intercept model
>> > > > m1=http://rma.mv(yi ~ year+factor, vi,
>> > > >           random = list(~1|study, ~1|district, ~1|school),
>> > > >           data = df, method = "REML")
>> > > >
>> > > > #extract estimates with predict.rma
>> > > > predict(m1,
>> > > >         newmods =cbind(rep(c(1976:2000),2),
>> > > >                        rep(c(0,1),each=25)),
>> > > >         addx = TRUE) # works fine
>> > > >
>> > > > #extract estimates with emmprep
>> > > > emmprep(m1,
>> > > >         at=list(year=1976:2000))%>%
>> > > >   emmeans(~year,
>> > > >           weights = "prop")%>%
>> > > >   pred_interval_esmeans(model=m1,
>> > > >                         mm=.,
>> > > >                         mod="year") # works fine
>> > > >
>> > > > #extract estimates with orchaRd
>> > > > mod_results(m1,
>> > > >             mod = "year",
>> > > >             group = "study") # works fine
>> > > >
>> > > > # RANDOM SLOPE (x1)
>> > -------------------------------------------------------
>> > > > #fit single random slope model
>> > > > m2=http://rma.mv(yi ~ year+factor, vi,
>> > > >           random = list(~year|study, ~1|district, ~1|school),
>> > > >           data = df, method = "REML",struct = "GEN",
>> > > >           control = list(optimizer="bobyqa"))
>> > > >
>> > > > #extract estimates with predict.rma
>> > > > predict(m2, newmods =cbind(rep(c(1976:2000),2),
>> > > >                            rep(c(0,1),each=25)),
>> > > >         addx = TRUE) # no prediction intervals
>> > > >
>> > > > predict(m2, newmods =cbind(rep(c(1976:2000),2),
>> > > >                            rep(c(0,1),each=25)),
>> > > >         addx = TRUE,
>> > > >         tau2.levels = levels(year), # try adding tau2.levels
>> > > >         ) # still no prediction intervals
>> > > >
>> > > > #extract estimates with emmprep
>> > > > emmprep(m2,
>> > > >         at=list(year=1976:2000))%>%
>> > > >   emmeans(~year,
>> > > >           weights = "prop")%>%
>> > > >   pred_interval_esmeans(model=m2,
>> > > >                         mm=.,
>> > > >                         mod="year") # works but gives an error
>> > > >
>> > > > #extract estimates with orchaRd
>> > > > mod_results(m2,
>> > > >             mod = "year",
>> > > >             group = "study") # works but gives an error
>> > > >
>> > > >
>> > > > # RANDOM SLOPE (x2)
>> > -------------------------------------------------------
>> > > > #fit double random slope model
>> > > > m3=http://rma.mv(yi ~ year+factor, vi,
>> > > >           random = list(~year|study, ~year|district, ~1|school),
>> > > >           data = df, method = "REML",struct = c("GEN","GEN"),
>> > > >           control = list(optimizer="bobyqa"))
>> > > >
>> > > > #extract estimates with predict.rma
>> > > > predict(m3, newmods =cbind(rep(c(1976:2000),2),
>> > > >                            rep(c(0,1),each=25)),
>> > > >         addx = TRUE) # no prediction intervals
>> > > >
>> > > > predict(m3, newmods =cbind(rep(c(1976:2000),2),
>> > > >                             rep(c(0,1),each=25)),
>> > > >         addx = TRUE,
>> > > >         tau2.levels = levels(year),# try adding tau2.levels
>> > > >         gamma2.levels = levels(year)# try adding gamma2.levels
>> > > >         ) # still no prediction intervals
>> > > >
>> > > > #extract estimates with emmprep
>> > > > emmprep(m3,
>> > > >              at=list(year=1976:2000))%>%
>> > > >   emmeans(~year,
>> > > >           weights = "prop")%>%
>> > > >   pred_interval_esmeans(model=m3,
>> > > >                         mm=.,
>> > > >                         mod="year") # works but gives an error
>> > > >
>> > > > #extract estimates with orchaRd
>> > > > mod_results(m3,
>> > > >             mod = "year",
>> > > >             group = "study") # works but gives an error
>> >
>>
>>         [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> R-sig-meta-analysis mailing list @ R-sig-meta-analysis using r-project.org
>> To manage your subscription to this mailing list, go to:
>> https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
>>
>

	[[alternative HTML version deleted]]