[R-meta] Prediction Intervals for estimates of random slope rma.mv model with predict.rma()
James Pustejovsky
jepu@to @end|ng |rom gm@||@com
Fri Mar 22 15:03:05 CET 2024
This doesn't look correct to me. The step of centering the predictor is
essential for the code that I posted to work. If you want to be able to
predict at arbitrary points, then you'll need to take into account the
covariance between the random effects, so calculating
Var(u_0j + u_1j * (a) + v_ij) = Var(u_0j) + a^2 * Var(u_1j) + 2 * a *
Cov(u_0j, u_1j) + Var(v_ij)
James
On Thu, Mar 21, 2024 at 6:04 PM Zac Robinson <zacrobinson2015 using gmail.com>
wrote:
> James,
>
> Thank you very much for your reply - it was extremely helpful!
>
> I have taken your code, modified it a bit, and attempted to create a
> function that works alongside emmprep / emmeans. From my tests it seems to
> get the same results as your manual calculations but let me know if I'm
> missing something - only thing I wondered is if "rho" and "phi" need to be
> incorporated directly, as well as the covariances you mentioned
>
> Thanks again,
> Zac
>
> # load packages and data
> library(metafor)
> library(tidyverse)
> library(emmeans)
>
> data("dat.konstantopoulos2011")
> df <-
> dat.konstantopoulos2011 %>%
> mutate(
> factor=sample(c("A","B"),nrow(.),replace=TRUE)
> )
>
> # create function
> get.pi.emmeans.rma.mv <- function(model, mm){
>
> tmp <- summary(mm)
> tmp <- tmp[ , ]
> test.stat <- stats::qt(mm using misc$level+(1-mm using misc$level)/2, tmp$df[[1]])
>
> n_g <- colnames(model$mf.g)
> n_g2 <- ifelse(is.null(n_g),0,n_g[n_g!="intrcpt"&n_g!="outer"])
> l_g <- ifelse(is.null(n_g),0,mm using linfct[,n_g2])
> v_g <- ifelse(is.null(n_g),0,diag(model$G)%>%as.data.frame()%>%.[n_g2,])
>
> n_h <- colnames(model$mf.h)
> n_h2 <- ifelse(is.null(n_h),0,n_h[n_h!="intrcpt"&n_h!="outer"])
> l_h <- ifelse(is.null(n_h),0,mm using linfct[,n_h2])
> v_h <- ifelse(is.null(n_h),0,diag(model$H)%>%as.data.frame()%>%.[n_h2,])
>
> V_b <- tmp$SE^2
> V_ri <- sum(model$sigma2 + model$tau2[1] + model$gamma2[1])
> V_g <- l_g*v_g
> V_h <- l_h*v_h
>
> PI <- test.stat * sqrt(V_b + V_ri + V_g + V_h)
>
> tmp$lower.PI <- tmp$emmean - PI
> tmp$upper.PI <- tmp$emmean + PI
>
> return(tmp)
> }
>
>
> # RANDOM INTERCEPT --------------------------------------------------------
> # fit random intercept model
> m1 <- rma.mv(
> yi ~ year + factor, V = vi,
> random = list(~1|district, ~1|interaction(district,school)),
> data = df, method = "REML"
> )
>
> # Prediction interval at year = 1989 (for Taylor) and factor = A
> predict(m1, newmods = cbind(1989, 0), level = 95)
>
> # Using function
> emmprep(m1,
> at=list(year=1989,
> factor="A"))%>%
> emmeans(~year+factor,
> weights = "prop")%>%
> get.pi.emmeans.rma.mv(model=m1) #same
>
> # RANDOM SLOPE (x1) -------------------------------------------------------
> # fit single random slope model
> m2 <- rma.mv(
> yi ~ year + factor, vi,
> random = list(~year|district, ~1|interaction(district,school)),
> data = df, method = "REML", struct = "GEN",control =
> list(optimizer="optim")
> )
>
> # Calculate prediction interval for year = 1989, factor = A
> vec <- c(1, 1989, 0)
> pred <- sum(m2$beta * vec)
> Vb <- as.numeric(t(vec) %*% vcov(m2) %*% vec)
> V_RE <- sum(m2$sigma2) + m2$tau2[1] + m2$tau2[2]*1989
> S <- sqrt(Vb + V_RE)
> z_crit <- qnorm((1 + 0.95) / 2)
> pred + c(-1, 1) * z_crit * S
>
> # Using function
> emmprep(m2,
> at=list(year=1989,
> factor="A"))%>%
> emmeans(~year+factor,
> weights = "prop")%>%
> get.pi.emmeans.rma.mv(model=m2) # same
>
>
>
> On Thu, Mar 21, 2024 at 1:00 PM James Pustejovsky <jepusto using gmail.com>
> wrote:
>
>> It looks like you are trying to generate a prediction interval for a new
>> effect size given a specific set of predictors, based on a model with
>> random slopes terms. A model like look like the following:
>>
>> Y_ij = b0 + b1 X_ij + u_0j + u_1j X_ij + v_ij + e_ij,
>>
>> where (u_0j, u_1j) are the study-level random effects with
>> variance-covariance matrix T, v_ij is a within-study random effect with
>> variance omega^2, and e_ij is the sampling error with known variance V_ij.
>> Your goal might be to predict a new effect size for a study with X_ij = a,
>> or the distribution of
>>
>> delta_ij = b0 + b1 * (a) + u_0j + u_1j * (a) + v_ij
>>
>> The tricky bit with these models (and probably the reason that PIs are
>> not currently implemented in metafor) is that the study-level random
>> effects terms can be correlated and you need to take the variances and
>> covariances into account to get a prediction interval.
>>
>> My understanding is that metafor calculates prediction intervals by
>> taking the predicted value given the fixed effects terms, plus or minus a
>> critical value (call it z_p) times a quantity S, where S^2 is the sum of
>> the variance of the predicted value and the variance of the random effects
>> at the predicted value. For the example above:
>>
>> PI = [bhat0 + bhat1 * (a)] +/- z_p * S, where S = sqrt[ Var(bhat0 + bhat1
>> * (a)) + Var(u_0j + u_1j * (a) + v_ij) ]
>>
>> You can calculate the first term directly from the beta estimates that
>> come out of metafor. You can calculate the first component of S by using
>> the vcov matrix that comes out of metafor. And you can calculate the third
>> term from the estimated variance components.
>>
>> To make the calculations simpler, it would be convenient to re-center the
>> predictor at the value for which you're trying to get a prediction
>> interval. In other words, if you set X'_ij = X_ij - a, then you're
>> predicting at the value X'_ij = 0, and the prediction interval simplifies to
>>
>> PI = bhat0 +/- z_p * sqrt[ Var(bhat0) + Var(u_0j + v_ij)]
>>
>> Example code using Konstantopoulos2011 below.
>>
>> James
>>
>>
>> # load packages and data
>> library(metafor)
>> library(tidyverse)
>>
>> data("dat.konstantopoulos2011")
>> df <-
>> dat.konstantopoulos2011 %>%
>> mutate(
>> factor=sample(c("A","B"),nrow(.),replace=TRUE)
>> )
>>
>> # RANDOM INTERCEPT
>> --------------------------------------------------------
>> # fit random intercept model
>> m1 <- rma.mv(
>> yi ~ year + factor, V = vi,
>> random = list(~1|study, ~1|district, ~1|school),
>> data = df, method = "REML"
>> )
>>
>> # Prediction interval at year = 1989 (for Taylor) and factor = A
>> predict(m1, newmods = cbind(1989, 0), level = 80)
>>
>> # Same thing, manual calcualtion
>> vec <- c(1, 1989, 0)
>> pred <- sum(m1$beta * vec)
>> Vb <- as.numeric(t(vec) %*% vcov(m1) %*% vec)
>> V_RE <- sum(m1$sigma2)
>> S <- sqrt(Vb + V_RE)
>> z_crit <- qnorm((1 + 0.8) / 2)
>> pred + c(-1, 1) * z_crit * S
>>
>> # RANDOM SLOPE (x1)
>> -------------------------------------------------------
>> # fit single random slope model
>> # Re-center the year predictor
>> df$year_c <- df$year - 1989
>> m2 <- rma.mv(
>> yi ~ year_c + factor, vi,
>> random = list(~ year_c | study, ~1|district, ~1|school),
>> data = df, method = "REML", struct = "GEN"
>> )
>>
>> # Does not return prediction interval
>> predict(m2, newmods = cbind(0, 0), level = 80)
>>
>> # Calculate prediction interval for year_c = 0, factor = A
>> vec <- c(1, 0, 0)
>> pred <- sum(m2$beta * vec)
>> Vb <- as.numeric(t(vec) %*% vcov(m2) %*% vec)
>> V_RE <- sum(m2$sigma2) + m2$tau2[1] # Using only the random intercept at
>> the study level
>> S <- sqrt(Vb + V_RE)
>> z_crit <- qnorm((1 + 0.8) / 2)
>> pred + c(-1, 1) * z_crit * S
>>
>>
>> On Thu, Mar 21, 2024 at 11:19 AM Zac Robinson via R-sig-meta-analysis <
>> r-sig-meta-analysis using r-project.org> wrote:
>>
>>> Wolfgang,
>>>
>>> Apologies for the incorrect method to reply - hopefully got it right this
>>> time. If I'm understanding things correctly - the coding can
>>> (potentially)
>>> end up giving me what I'm after, but I'm just specifying the sigma / sd
>>> for
>>> the prediction interval incorrectly (i.e., the variances can't just be
>>> summed).
>>>
>>> Thus, I guess my follow up question is - how would you recommend going
>>> about this calculation? The closest thing I can find in the literature
>>> with
>>> a similar variance calculation is in the context of R2 from Johnson et
>>> al.,
>>> where they calculate the "mean random effects variance" which seems like
>>> it
>>> could be useful here:
>>>
>>> https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.12225
>>>
>>> Let me know what you think and thank you for your time and expertise,
>>>
>>> Zac
>>>
>>> On Thu, Mar 14, 2024 at 9:23 AM Viechtbauer, Wolfgang (NP) <
>>> wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>>>
>>> > Dear Zac,
>>> >
>>> > Please respond to the mailing list, not just to me personally.
>>> >
>>> > Just summing up all the variances isn't correct - the correct
>>> computation
>>> > of the prediction interval for such a random-effects structure is
>>> > unfortunately more complex.
>>> >
>>> > Best,
>>> > Wolfgang
>>> >
>>> > > -----Original Message-----
>>> > > From: Zac Robinson <zacrobinson2015 using gmail.com>
>>> > > Sent: Monday, March 11, 2024 16:07
>>> > > To: Viechtbauer, Wolfgang (NP) <
>>> > wolfgang.viechtbauer using maastrichtuniversity.nl>
>>> > > Subject: Re: [R-meta] Prediction Intervals for estimates of random
>>> slope
>>> > rma.mv
>>> > > model with predict.rma()
>>> > >
>>> > > Hey Wolfgang,
>>> > >
>>> > > Thank you very much for the reply. I actually have played around with
>>> > things a
>>> > > bit more and think I've found a solution. I may not be specifying the
>>> > sigma
>>> > > argument perfectly for a random slope model - but I think in concept
>>> > this works!
>>> > > Looks like emmeans / predict() have cross compatibility that allows
>>> for
>>> > > prediction intervals when sigma is specified
>>> > >
>>> > > Let me know what you think,
>>> > > Zac
>>> > >
>>> > > # SETUP
>>> > -------------------------------------------------------------------
>>> > > #load packages and data
>>> > > library(metafor)
>>> > > library(tidyverse)
>>> > > library(emmeans)
>>> > >
>>> > > data("dat.konstantopoulos2011")
>>> > > df=dat.konstantopoulos2011%>%
>>> > > as.data.frame()%>%
>>> > > mutate(factor=sample(c("A","B"),56,replace=TRUE))
>>> > >
>>> > > # RANDOM SLOPE (x1)
>>> > -------------------------------------------------------
>>> > > #fit single random slope model
>>> > > m2= rma.mv(yi ~ year+factor, vi,
>>> > > random = list(~year|study, ~1|district, ~1|school),
>>> > > data = df, method = "REML",struct = "GEN",
>>> > > control = list(optimizer="bobyqa"))
>>> > >
>>> > > #extract estimates with emmmprep + emmeans + predict
>>> > > est.ci=emmprep(m2,
>>> > > at=list(year=1976:2000))%>%
>>> > > emmeans(~year,
>>> > > weights = "prop",
>>> > > sigma=sqrt(sum(
>>> > > m2$sigma2,
>>> > > m2$tau2,
>>> > > m2$gamma2
>>> > > )))
>>> > >
>>> > > pi= est.ci%>%
>>> > > predict(interval="prediction")%>%
>>> > > select(lower.PL,upper.PL)
>>> > >
>>> > > preds=cbind(est.ci,pi)
>>> > > preds # prediction intervals are now added
>>> > >
>>> > > ggplot(data = preds,
>>> > > aes(x=year,
>>> > > y=emmean))+theme_classic()+
>>> > > geom_point(data=df,
>>> > > aes(y=yi,
>>> > > size=1/vi))+
>>> > > geom_ribbon(aes(ymin=lower.PL,
>>> > > ymax=upper.PL),
>>> > > linetype="dashed",
>>> > > alpha=0,
>>> > > color="black")+
>>> > > geom_ribbon(aes(ymin=asymp.LCL,
>>> > > ymax=asymp.UCL),
>>> > > alpha=0.25)+
>>> > > geom_line()
>>> > >
>>> > > On Mon, Mar 11, 2024 at 2:06 AM Viechtbauer, Wolfgang (NP)
>>> > > <mailto:wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>>> > > Dear Zac,
>>> > >
>>> > > predict() currently does not provide predicition intervals for
>>> > http://rma.mv
>>> > > models with a GEN random effects structure. Not sure when I will get
>>> > around to
>>> > > implementing that.
>>> > >
>>> > > Best,
>>> > > Wolfgang
>>> > >
>>> > > > -----Original Message-----
>>> > > > From: R-sig-meta-analysis <mailto:
>>> > r-sig-meta-analysis-bounces using r-project.org>
>>> > > On Behalf
>>> > > > Of Zac Robinson via R-sig-meta-analysis
>>> > > > Sent: Sunday, March 10, 2024 16:26
>>> > > > To: mailto:r-sig-meta-analysis using r-project.org
>>> > > > Cc: Zac Robinson <mailto:zacrobinson2015 using gmail.com>
>>> > > > Subject: [R-meta] Prediction Intervals for estimates of random
>>> slope
>>> > > http://rma.mv
>>> > > > model with predict.rma()
>>> > > >
>>> > > > Dear All,
>>> > > >
>>> > > > I am working on extracting the adjusted effects from a
>>> dose-response
>>> > > > multilevel meta-regression. One issue I've run into is getting the
>>> > > > appropriate prediction intervals on my estimates when the model
>>> > > > includes random slopes (specifically when struct = "GEN" is
>>> included).
>>> > From
>>> > > > what I gather from the help files, this comes down to appropriately
>>> > > > specifying the `tau2.levels` and `gamma2.levels` arguments within
>>> > > > predict.rma() - however, I am unsure how to get this to work.
>>> > > >
>>> > > > In the code below I've tried to demonstrate my issue and explore
>>> the
>>> > other
>>> > > > solutions I'm aware of through the orchaRd package). My end goal
>>> > requires
>>> > > > adjusting for both continuous and categorical moderators so
>>> ideally the
>>> > > > solution stays within the emmprep() / emmeans() framework (which
>>> makes
>>> > > > proportional weights and setting non-focal continuous variables at
>>> the
>>> > mean
>>> > > > easy), but I realize this may not be possible.
>>> > > >
>>> > > > Thank you very much in advance!
>>> > > >
>>> > > > Zac
>>> > > >
>>> > > > # SETUP
>>> > -------------------------------------------------------------------
>>> > > > #load packages and data
>>> > > > library(metafor)
>>> > > > library(tidyverse)
>>> > > > library(orchaRd)
>>> > > > library(emmeans)
>>> > > >
>>> > > > data("dat.konstantopoulos2011")
>>> > > > df=dat.konstantopoulos2011%>%
>>> > > > as.data.frame()%>%
>>> > > > mutate(factor=sample(c("A","B"),nrow(df),replace=TRUE))
>>> > > >
>>> > > > # RANDOM INTERCEPT
>>> > --------------------------------------------------------
>>> > > > #fit random intercept model
>>> > > > m1=http://rma.mv(yi ~ year+factor, vi,
>>> > > > random = list(~1|study, ~1|district, ~1|school),
>>> > > > data = df, method = "REML")
>>> > > >
>>> > > > #extract estimates with predict.rma
>>> > > > predict(m1,
>>> > > > newmods =cbind(rep(c(1976:2000),2),
>>> > > > rep(c(0,1),each=25)),
>>> > > > addx = TRUE) # works fine
>>> > > >
>>> > > > #extract estimates with emmprep
>>> > > > emmprep(m1,
>>> > > > at=list(year=1976:2000))%>%
>>> > > > emmeans(~year,
>>> > > > weights = "prop")%>%
>>> > > > pred_interval_esmeans(model=m1,
>>> > > > mm=.,
>>> > > > mod="year") # works fine
>>> > > >
>>> > > > #extract estimates with orchaRd
>>> > > > mod_results(m1,
>>> > > > mod = "year",
>>> > > > group = "study") # works fine
>>> > > >
>>> > > > # RANDOM SLOPE (x1)
>>> > -------------------------------------------------------
>>> > > > #fit single random slope model
>>> > > > m2=http://rma.mv(yi ~ year+factor, vi,
>>> > > > random = list(~year|study, ~1|district, ~1|school),
>>> > > > data = df, method = "REML",struct = "GEN",
>>> > > > control = list(optimizer="bobyqa"))
>>> > > >
>>> > > > #extract estimates with predict.rma
>>> > > > predict(m2, newmods =cbind(rep(c(1976:2000),2),
>>> > > > rep(c(0,1),each=25)),
>>> > > > addx = TRUE) # no prediction intervals
>>> > > >
>>> > > > predict(m2, newmods =cbind(rep(c(1976:2000),2),
>>> > > > rep(c(0,1),each=25)),
>>> > > > addx = TRUE,
>>> > > > tau2.levels = levels(year), # try adding tau2.levels
>>> > > > ) # still no prediction intervals
>>> > > >
>>> > > > #extract estimates with emmprep
>>> > > > emmprep(m2,
>>> > > > at=list(year=1976:2000))%>%
>>> > > > emmeans(~year,
>>> > > > weights = "prop")%>%
>>> > > > pred_interval_esmeans(model=m2,
>>> > > > mm=.,
>>> > > > mod="year") # works but gives an error
>>> > > >
>>> > > > #extract estimates with orchaRd
>>> > > > mod_results(m2,
>>> > > > mod = "year",
>>> > > > group = "study") # works but gives an error
>>> > > >
>>> > > >
>>> > > > # RANDOM SLOPE (x2)
>>> > -------------------------------------------------------
>>> > > > #fit double random slope model
>>> > > > m3=http://rma.mv(yi ~ year+factor, vi,
>>> > > > random = list(~year|study, ~year|district, ~1|school),
>>> > > > data = df, method = "REML",struct = c("GEN","GEN"),
>>> > > > control = list(optimizer="bobyqa"))
>>> > > >
>>> > > > #extract estimates with predict.rma
>>> > > > predict(m3, newmods =cbind(rep(c(1976:2000),2),
>>> > > > rep(c(0,1),each=25)),
>>> > > > addx = TRUE) # no prediction intervals
>>> > > >
>>> > > > predict(m3, newmods =cbind(rep(c(1976:2000),2),
>>> > > > rep(c(0,1),each=25)),
>>> > > > addx = TRUE,
>>> > > > tau2.levels = levels(year),# try adding tau2.levels
>>> > > > gamma2.levels = levels(year)# try adding gamma2.levels
>>> > > > ) # still no prediction intervals
>>> > > >
>>> > > > #extract estimates with emmprep
>>> > > > emmprep(m3,
>>> > > > at=list(year=1976:2000))%>%
>>> > > > emmeans(~year,
>>> > > > weights = "prop")%>%
>>> > > > pred_interval_esmeans(model=m3,
>>> > > > mm=.,
>>> > > > mod="year") # works but gives an error
>>> > > >
>>> > > > #extract estimates with orchaRd
>>> > > > mod_results(m3,
>>> > > > mod = "year",
>>> > > > group = "study") # works but gives an error
>>> >
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> R-sig-meta-analysis mailing list @ R-sig-meta-analysis using r-project.org
>>> To manage your subscription to this mailing list, go to:
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
>>>
>>
[[alternative HTML version deleted]]
More information about the R-sig-meta-analysis
mailing list