[R-meta] Multiple time point meta-analysis in metafor

Mon Sep 18 06:48:55 CEST 2017

Hi Wolfgang,

Thanks for the detailed response, this has been a huge help.

Considering the difficulties in estimating effect dependence, I’m thinking of using cluster-robust inferences. I actually have 29 studies in total (I only included 10 in my original question for illustrative purposes).

When it comes to reporting results, I will report each of the four models from the cluster-robust test (outcomes B and W, time 1 and time 2) and test/estimate contrasts, but what else would be normally reported in this circumstance? I am thinking of reporting the following from the random/mixed-effects model (assuming a diagonal V matrix): tau^2 for each model, profile plots for tau^2, test for residual heterogeneity, test of moderators, and Egger’s regression test, and funnel plot. Is there anything else I should report here?

Many thanks,
Daniel

> On 14 Sep 2017, at 9:35 pm, Viechtbauer Wolfgang (SP) <wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
> 
> Hi Daniel,
> 
> In essence, the raw data in each study consist of 8 columns of the form:
> 
> Group:      T        C
> Outcome:  A   B    A   B
> Time:    1 2 1 2  1 2 1 2
> 
> So, there is in principle an 8x8 var-cov matrix corresponding to these data:
> 
> [ s2_TA1                      |                             ]
> [        s2_TA2               |                             ]
> [               s2_TB1        |                             ]
> [                      s2_TB2 |                             ]
> [-----------------------------------------------------------]
> [                             | s2_CA1                      ]
> [                             |        s2_CA2               ]
> [                             |               s2_CB1        ]
> [                             |                      s2_CB2 ]
> 
> In the upper left and lower right block, there are also covariances. The upper right and lower left block contain all 0s (since the groups are independent). I assume you know the variances (i.e., the diagonal), since those are needed in order to compute the SMD values. So, you computed:
> 
> dA1 = (mean_TA1 - mean_CA1) / s_pooled_TA1_CA1
> dA2 = (mean_TA2 - mean_CA2) / s_pooled_TA2_CA2
> dB1 = (mean_TB1 - mean_CB1) / s_pooled_TB1_CB1
> dB2 = (mean_TB2 - mean_CB2) / s_pooled_TB2_CB2
> 
> For these 4 SMD values, there is a corresponding 4x4 var-cov matrix. The diagonal of that matrix are the variances you give below. The covariances in that matrix are a function of the covariances/correlations from the 8x8 matrix above. Note that there isn't just one within-study correlation -- there are 6 of them (6 covariances/correlations in a 4x4 matrix). And that 0.97 you mentioned comes from a very specific example -- it isn't applicable here.
> 
> It would take a bit of work to figure out the necessary equations. Some hints as to how one could tackle this can be found on James' blog; in particular:
> 
> http://jepusto.github.io/alternative-formulas-for-the-SMD
> http://jepusto.github.io/distribution-of-sample-variances
> 
> But this would be pretty tedious work. Note that actually computing the covariances would require knowing or guestimating the correlations in that 8x8 matrix above.
> 
> Alternatively, you could do something along the lines of:
> 
> http://jepusto.github.io/imputing-covariance-matrices-for-multi-variate-meta-analysis
> 
> Here, we don't use the 'exact' equations to compute the covariances, but directly assume that we know (or have a good guess) about the correlations among the SMD values. Note that you would still need to guestimate 6 correlations. That might be the more feasible approach here.
> 
> So, you first have to restructure your data into a long format. Then you construct a 4x4 correlation matrix for the SMD values. And then you construct the full V matrix. This should do this:
> 
> dat.long <- data.frame(
> yi = unlist(dat[,c("y1i.w", "y2i.w", "y1i.p", "y2i.p")]),
> vi = unlist(dat[,c("v1i.w", "v2i.w", "v1i.p", "v2i.p")]))
> dat.long$study <- 1:10
> dat.long$outcome <- rep(c("W","B"), each=20)
> dat.long$time    <- rep(c(1,2), each=10)
> rownames(dat.long) <- 1:nrow(dat.long)
> dat.long <- dat.long[order(dat.long$study, dat.long$outcome, dat.long$time),]
> dat.long$type <- 1:4
> dat.long
> 
> R <- matrix(c( 1, .7, .8, .6,
>              .7,  1, .6, .8,
>              .8, .6,  1, .7,
>              .6, .8, .7,  1), nrow=4, ncol=4)
> R <- diag(10) %x% R
> V <- diag(sqrt(dat.long$vi)) %*% R %*% diag(sqrt(dat.long$vi))
> 
> Note that the R matrix above assumes a correlation of 0.7 for Time 1 vs Time 2 for both outcomes, a correlation of 0.8 for Outcome B vs Outcome W for both time points, and a correlation of 0.6 for Time 1 Outcome B vs Time 2 Outcome W and Time 2 Outcome B vs Time 1 Outcome W. You need to adjust this to whatever you think makes sense for your data.
> 
> Now as a starter, you can do:
> 
> res <- rma.mv(yi, V, mods = ~ outcome:factor(time) - 1, data=dat.long)
> res
> 
> With 'mods = ~ outcome:factor(time) - 1', you directly get the estimated pooled SMD for each outcome-time combination. Then you can test/estimate contrasts with predict() and anova(). For example:
> 
> predict(res, newmods=c(1,-1,0,0))
> anova(res, L=c(1,-1,0,0))
> 
> But the model above is a fixed-effects model. If you want to make generalizable inferences, then you need a random/mixed-effects model. You could try a fully unstructured var-cov matrix for the random effects:
> 
> res <- rma.mv(yi, V, mods = ~ outcome:factor(time) - 1, random = ~ factor(type) | study, struct="UN", data=dat.long)
> res
> 
> But this is probably asking too much with just 10 studies. Maybe:
> 
> res <- rma.mv(yi, V, mods = ~ outcome:factor(time) - 1, random = ~ factor(type) | study, struct="HCS", data=dat.long)
> res
> 
> This allows for different amounts of heterogeneity for the 4 SMD types, but assumes a single correlation among them. One could also try other structures.
> 
> Alternatively, if the construction of V is too tedious for you, you could just assume a diagonal V matrix and then use cluster-robust inferences. Like this:
> 
> res <- rma.mv(yi, vi, mods = ~ outcome:factor(time) - 1, random = ~ factor(type) | study, struct="HCS", data=dat.long)
> res
> sav <- robust(res, cluster=dat.long$study)
> sav
> 
> Note that you don't need V here, just the variances (which you already have). I am somewhat doubtful though that 10 studies is enough for this to be justifiable. The cluster-robust approach relies on asymptotics and 10 studies may not be enough for that. You might also want to use the 'clubSandwich' package for this (https://cran.r-project.org/web/packages/clubSandwich/index.html), since it includes some small-sample adjustments to the cluster-robust approach that might be beneficial here.
> 
> Best,
> Wolfgang
> 
> -- 
> Wolfgang Viechtbauer, Ph.D., Statistician | Department of Psychiatry and 
> Neuropsychology | Maastricht University | P.O. Box 616 (VIJV1) | 6200 MD 
> Maastricht, The Netherlands | +31 (43) 388-4170 | http://www.wvbauer.com 
> 
> -----Original Message-----
> From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-project.org] On Behalf Of Daniel Quintana
> Sent: Thursday, 14 September, 2017 12:50
> To: r-sig-meta-analysis at r-project.org
> Subject: [R-meta] Multiple time point meta-analysis in metafor
> 
> Hi all,
> 
> I have a question regarding a multiple time point meta-analysis that I would like to conduct using metafor.
> 
> For this meta-analysis of a treatment for psychiatric illness, I’m interested in two different treatment outcomes (Outcome A and Outcome B) at two time points (time 1 and time 2). Most studies in this area compare an active treatment with a placebo using a between-participants design, and then assess both Outcome A and Outcome B over two time points.
> 
> I have estimated the standardised mean difference (SMD) and variance for both outcomes and both time points for each study, which can be found in the following data frame:
> 
> study <- c("study 1", "study 2",  "study 3",  "study 4",  "study 5",  "study 6",  "study 7",  "study 8",  "study 9",  "study 10")
> y1i.w  <- c (0.332, -0.421, 0.129, 0.000, 2.836, 0.267, 0.114, 0.051, 0.207, 0.132)
> v1i.w <- c (0.043, 0.114, 0.034, 0.052, 0.078, 0.059, 0.027, 0.108, 0.030,  0.200)
> y2i.w <- c (0.231, NA, -0.199, 0.000, 0.933, 0.005, 0.255, NA, 0.136, NA)
> v2i.w <- c (0.065, NA, 0.034, 0.058, 0.048, 0.049, 0.033, NA, 0.030, NA)
> y1i.p <- c (-0.226, 0.386, -0.179,  -0.170, -3.137, 0.069, 0.149, 0.120, NA, 0.663)
> v1i.p <- c (0.043, 0.114, 0.034, 0.053, 0.087, 0.058, 0.027, 0.108, NA, 0.211)
> y2i.p <- c (-0.191, NA, -0.199, -0.218,  -1.550, -0.272, 0.130, NA, -0.106, NA)
> v2i.p <- c (0.065, NA, 0.034, 0.059, 0.056, 0.049, 0.033, NA, 0.030, NA)
> dat <- data.frame(study, y1i.w, v1i.w, y2i.w, v2i.w, y1i.p, v1i.p, y2i.p, v2i.p)
> 
> Here, y1i.w = SMD at time 1 for Outcome A, v1i.w = variance at time 1 for Outcome A,  y2i.w = SMD at time 2 for Outcome A, v2i.w = variance at time 2 for Outcome A, y1i.p = SMD at time 1 for Outcome B, v1i.p = variance at time 1 for Outcome B,  y2i.p = SMD at time 2 for Outcome B, v2i.p = variance at time 2 for Outcome B.
> 
> Ideally, I would like to investigate effects at each time point (Outcome A and Outcome B separately), compare effects between timepoints (Outcome A and Outcome B separately), and compare effects at each timepoint (Outcome A vs. Outcome B).
> 
> There’s no data available for within-study correlations, so I’m assuming this is 0.97. I’ve tried to follow the “dat.ishak2007" example in the metafor package documentation to create the V matrix, but I can’t figure out how to create this matrix for the dataset I have.
> 
> I would appreciate any assistance!
> 
> Kind regards,
> Daniel Quintana