[R-meta] Clarifications about workflow for complex meta-analyses with dependent effect sizes

Viechtbauer, Wolfgang (NP) wolfgang.viechtbauer at maastrichtuniversity.nl
Tue Nov 7 10:16:16 CET 2023


Dear Alberto,

Please see below for some responses.

Best,
Wolfgang

> -----Original Message-----
> From: R-sig-meta-analysis <r-sig-meta-analysis-bounces using r-project.org> On Behalf
> Of Alberto Enrico Maraolo via R-sig-meta-analysis
> Sent: Friday, November 3, 2023 23:29
> To: r-sig-meta-analysis using r-project.org
> Cc: Alberto Enrico Maraolo <albertomaraolo using mail.com>
> Subject: [R-meta] Clarifications about workflow for complex meta-analyses with
> dependent effect sizes
>
> Dear all,
>
> this is my first time writing a message here, but I have already gleaned a lot
> of useful info from previous mails and threads.
>
> I am seeking advice about a multivariate meta-analysis with metafor.
> I have a dataset very similar to dat.ishak2007 in its nested structure, but the
> effect size is different (LogOR, related to vaccine effectiveness evaluated at
> several time points).
> Therefore, I guess that a multivariate model with a heteroscedastic AR(1)
> structure for the true effects would serve the purpose, as also well explained
> in Musekiwa et al. (PLoS ONE 2016, models 4-6).
>
> On the web, I found the following code from the workshop "Exploring the Limits
> of Advanced Meta-Analysis" by Wolfgang Viechtbauer:
>
> res <- rma.mv(yi, V, mods = ~ factor(time) - 1, random = ~ time | study,
>               struct="HAR", data=dat, digits=2)
>
> Moreover, nice plotting can be done through:
>
> par(mar=c(4,4,2,2))
> with(dat, interaction.plot(time, study, yi, type="b", pch=19, col="gray70",
>      lty="solid", xaxt="n", legend=FALSE, bty="l",
>      xlab="Time Point", ylab="Mean Difference"))
> axis(side=1, at=1:4, lab=c("1 (3 months)", "2 (6 months)",
>      "3 (12 months)", "4 (12+ months)"))
> points(1:4, coef(res), type="o", pch=19, lwd=4, cex=2)
>
> The first, more trivial question is how to derive prediction intervals for the
> 4 factorized timepoints and how to plot them as error bars.

You can get prediction intervals for the 4 timepoints with:

# one row of newmods per timepoint; tau2.levels matches each row to its tau^2
predict(res, newmods=diag(4), tau2.levels=1:4)

To add these to the plot, you could do:

pred <- predict(res, newmods=diag(4), tau2.levels=1:4)
# pi.lb / pi.ub are the prediction interval bounds, drawn here as error bars
arrows(1:4, pred$pi.lb, 1:4, pred$pi.ub, lwd=3, code=3, angle=90, length=0.1)

> But the more important question: how can one derive, in a way that is easy for
> a non-biostatistician, the V matrix, that is, how to choose the phi value in:
>
> V <- vcalc(vi, cluster=study, time1=time, data=dat, phi=0.8)
>
> (0.8 is the value used in the example related to dat.ishak2007.)

The 0.8 in the presentation was chosen a bit arbitrarily. In the example code here:

https://wviechtb.github.io/metadat/reference/dat.ishak2007.html

V is calculated using phi=0.97.

Technically, there is a different value of the autocorrelation coefficient for each study. If one had the raw data for a study, one could easily compute this. In practice, however, the raw data are often not available. Often, we just use a 'guesstimate' (which may be based on one or two studies for which the raw data are actually available) and assume that it applies to all studies.
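
For illustration, a minimal sketch of such a computation, assuming hypothetical subject-level data from a single study with a continuous outcome measured at two consecutive time points (rawdat, y1, and y2 are made up for this example):

# hypothetical raw data from one study: one row per subject, with the
# outcome measured at two consecutive time points
rawdat <- data.frame(y1 = c(0.4, 1.2, 0.8, 0.3, 1.0, 0.7),
                     y2 = c(0.5, 1.1, 0.9, 0.4, 1.2, 0.6))

# the observed lag-1 correlation could then serve as a guesstimate for phi
cor(rawdat$y1, rawdat$y2)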

In the actual analyses of these data, the authors (Ishak et al., 2007) estimated the value of phi from the studies at hand. In essence, we just profile the likelihood over different values of phi and check for which value the likelihood is maximized:

phis <- seq(0.8, 0.99, by=0.01)

# profile the log likelihood over the grid of phi values
lls <- sapply(phis, function(phi) {
   V <- vcalc(vi, cluster=study, time1=time, data=dat, phi=phi)
   res <- rma.mv(yi, V, mods = ~ factor(time) - 1, random = ~ time | study,
                 struct="HAR", data=dat, digits=2)
   logLik(res)
})

plot(phis, lls, type="o", pch=21, bg="lightgray")
abline(v=phis[which.max(lls)])
phis[which.max(lls)]

This leads to the estimate of 0.97. In my experience, trying to do this with other datasets often doesn't work so well and the estimate of the correlation drifts towards -1 or +1. In fact, the estimated value above seems awfully high to me, which is why I just (somewhat naively) used 0.8 in the presentation. However, the assumed value of phi has only a minor impact on the results, so (at least in this example) it doesn't matter too much what we use in the first place.
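
One can check this directly by refitting the model for a few assumed values of phi and comparing the coefficients; a quick sketch (the particular grid of values is just for illustration):

# refit the model under several assumed phi values and compare the estimates
sapply(c(0.6, 0.8, 0.97), function(phi) {
   V <- vcalc(vi, cluster=study, time1=time, data=dat, phi=phi)
   coef(rma.mv(yi, V, mods = ~ factor(time) - 1, random = ~ time | study,
               struct="HAR", data=dat))
})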

In general, whenever we construct a V matrix that is really just an approximation, I would recommend using cluster-robust inference methods as the last step:

robust(res, cluster=study, clubSandwich=TRUE)

Here, the SEs are barely changed and the degrees of freedom are quite high, so this provides evidence that the assumed model is an adequate approximation.

> Of course, I have read that this is tricky and that there are some complex
> equations and calculations in SAS, but from my point of view an easier
> alternative would be very valuable.
>
> As an alternative, could I opt for a three-level meta-analysis as done by Wu
> and colleagues (https://pubmed.ncbi.nlm.nih.gov/36780914/)? In the
> Supplementary Material, they shared this code, following the advice in Chapter
> 10 of "Doing Meta-Analysis with R":
>   cov.mod2 <- rma.mv(Log.RR,
>                      var.log,
>                      # each effect size within a study is numbered consecutively
>                      # (1,2,3 in the first study; 4,5,6 in the next, and so on)
>                      # in the variable effect.number
>                      random = ~ 1 | Study.ID/effect.number,
>                      test = "t",
>                      data = COVEND2,
>                      method = "REML",
>                      # FUP stands for the factorized time interval (follow-up)
>                      mods = ~ FUP,
>                      control = list(maxiter=10^6))
>
> In this case, could FUP (time as factor) be used as the grouping variable
> instead of effect.number?

I don't know this paper, but if they have repeated Log.RR values based on the same sample of subjects, then this model seems to ignore that dependency (since it doesn't use a proper V matrix, just the var.log values for the sampling variances, in essence assuming that V is diagonal). Also, 'random = ~ 1 | Study.ID/effect.number' implies that the correlation between multiple effect sizes within studies is constant, which is often not so sensible for longitudinal data (where the correlation tends to become weaker for effect sizes further apart in time).
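
To illustrate, here is a rough sketch of how the multivariate approach from above could be applied to data with these variable names. This is just a sketch under two assumptions not in the original analysis: that FUP can be coerced to consecutive numeric time points and that phi=0.8 is a plausible guesstimate:

library(metafor)

# assumption: FUP codes consecutive time points; vcalc() needs a numeric time
COVEND2$time <- as.numeric(COVEND2$FUP)

# approximate V matrix with a guesstimated autocorrelation for the sampling errors
V <- vcalc(var.log, cluster=Study.ID, time1=time, data=COVEND2, phi=0.8)

# heteroscedastic AR(1) structure for the true effects across time points
res <- rma.mv(Log.RR, V, mods = ~ factor(time) - 1,
              random = ~ time | Study.ID, struct="HAR", data=COVEND2)

# cluster-robust inference as a safeguard against misspecification of V
robust(res, cluster=Study.ID, clubSandwich=TRUE)

This mirrors the dat.ishak2007 workflow above, with the correlation between effect sizes decaying as time points grow further apart.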

> I hope my request for help is clear.
>
> Warm regards,
> Alberto
>
> Alberto Enrico Maraolo, MD, MSc (Antimicrobial Stewardship, Evidence Synthesis),
> FESCMID
>
> Infectious Diseases Specialist, Member of the Steering Committee of SIMIT (ID
> Italian Society)
>
> Cotugno Hospital, AORN dei Colli, Naples, Italy
> mail: albertomaraolo using mail.com
>
> Division I - Emerging and Highly Contagious Infectious Diseases (formerly
> Infectious Diseases with a Neurological Focus)

