[R-meta] Multiple time point meta-analysis in metafor
Daniel Quintana
daniel.quintana at medisin.uio.no
Fri Sep 22 12:02:26 CEST 2017
Hi Wolfgang,
Thank you again for your detailed response to my original question.
I just have a second brief follow-up question. In my original question, I was interested in the impact of a psychiatric treatment on two different treatment outcomes (Outcome “B" and Outcome “W") at two time points (time 1 and time 2). I have now come across a few studies that include an additional treatment arm (i.e., treatment 1 vs. placebo and treatment 2 vs. placebo) that I would like to include.
Here is the updated data in long format (studies 30, 31, and 32 provide an additional treatment arm, thus providing an addition set of effect sizes and variances).
yi <- c(0.226, 0.191, 0.332, 0.231, -0.386, NA,
-0.421, NA, 0.179, 0.199, 0.129, -0.199, 0.17, 0.218, 0, 0, 3.137,
1.55, 2.836, 0.933, -0.069, 0.272, 0.267, 0.005, -0.149, -0.13,
0.114, 0.255, -0.12, NA, 0.051, NA, NA, 0.106, 0.207, 0.136,
-0.663, NA, 0.132, NA, 0.123, NA, 0.368, NA, 0.263, NA, 0.718,
NA, -0.408, -0.472, -0.131, -0.55, -0.539, 0.015, 0.267, 0.983,
-0.117, NA, 0.006, NA, -1.113, NA, -0.502, NA, 0, 0, -0.094,
0.043, 0.725, NA, 0.584, NA, -0.466, NA, 0.177, NA, -0.311, NA,
0.103, NA, 0.387, 0.568, 0, 0.099, 0.153, NA, 1.123, NA, -0.446,
1.145, 1.011, 1.076, -0.034, NA, 0.141, NA, 0.752, 0.657, 0.082,
0.366, 0.376, NA, 0.211, NA, 0.128, NA, -0.621, NA, -0.197, 0.241,
0.417, 0.417, -0.596, -0.552, 0.407, 0.417, -0.055, 0.281, -0.18,
-0.04, -0.218, -0.167, 0.161, 0.246, 0.283, NA, 0, NA, 0.469,
NA, 0.253, NA, -0.332, 0.902, -2.423, 0.423, -1.136, 0.439, -1.628,
-0.613)
vi <- c(0.043, 0.065, 0.043, 0.065, 0.114, NA, 0.114,
NA, 0.034, 0.034, 0.034, 0.034, 0.053, 0.059, 0.052, 0.058, 0.087,
0.056, 0.078, 0.048, 0.058, 0.049, 0.059, 0.049, 0.027, 0.033,
0.027, 0.033, 0.108, NA, 0.108, NA, NA, 0.03, 0.03, 0.03, 0.211,
NA, 0.2, NA, 0.104, NA, 0.106, NA, 0.104, NA, 0.109, NA, 0.227,
0.228, 0.223, 0.231, 0.198, 0.191, 0.193, 0.214, 0.417, NA, 0.417,
NA, 0.481, NA, 0.443, NA, 0.056, 0.059, 0.056, 0.059, 0.345,
NA, 0.338, NA, 0.137, NA, 0.134, NA, 0.239, NA, 0.236, NA, 0.075,
0.084, 0.073, 0.081, 0.05, NA, 0.058, NA, 0.035, 0.042, 0.038,
0.041, 0.076, NA, 0.076, NA, 0.07, 0.074, 0.066, 0.072, 0.082,
NA, 0.081, NA, 0.161, NA, 0.168, NA, 0.167, 0.122, 0.199, 0.199,
0.182, 0.202, 0.178, 0.199, 0.06, 0.038, 0.06, 0.037, 0.051,
0.038, 0.051, 0.038, 0.168, NA, 0.167, NA, 0.179, NA, 0.176,
NA, 0.035, 0.036, 0.059, 0.033, 0.04, 0.038, 0.045, 0.039)
study <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5,
5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 9,
9, 9, 9, 10, 10, 10, 10, 11, 11, 11, 11, 12, 12,
12, 12, 13, 13, 13, 13, 14, 14, 14, 14, 15, 15, 15,
15, 16, 16, 16, 16, 17, 17, 17, 17, 18, 18, 18, 18,
19, 19, 19, 19, 20, 20, 20, 20, 21, 21, 21, 21, 22,
22, 22, 22, 23, 23, 23, 23, 24, 24, 24, 24, 25, 25,
25, 25, 26, 26, 26, 26, 27, 27, 27, 27, 28, 28, 28,
28, 29, 29, 29, 29, 30, 30, 30, 30, 30, 30, 30, 30,
31, 31, 31, 31, 31, 31, 31, 31, 32, 32, 32, 32, 32,
32, 32, 32)
outcome <- rep(c("B", "B", "W", "W"), 35)
time <- rep(c("1", "2"), 70)
type <- rep(c(1,2,3,4), 35)
dat <- data.frame(yi, vi, study, outcome, time, type)
If I were to perform cluster-robust analysis, would the following script be correct in order to also account for the dependency of the two different treatment arm effects (both for outcome “B” and outcome “W) at both time points?
res <- rma.mv(yi, vi, mods = ~ outcome:factor(time) - 1, random = ~ factor(type) | study, struct="HCS", data=dat)
res
sav <- robust(res, cluster=dat$study)
sav
Many thanks,
Daniel
On 14 Sep 2017, at 9:35 pm, Viechtbauer Wolfgang (SP) <wolfgang.viechtbauer at maastrichtuniversity.nl<mailto:wolfgang.viechtbauer at maastrichtuniversity.nl>> wrote:
Hi Daniel,
In essence, the raw data in each study consist of 8 columns of the form:
Group: T C
Outcome: A B A B
Time: 1 2 1 2 1 2 1 2
So, there is in principle an 8x8 var-cov matrix corresponding to these data:
[ s2_TA1 | ]
[ s2_TA2 | ]
[ s2_TB1 | ]
[ s2_TB2 | ]
[-----------------------------------------------------------]
[ | s2_CA1 ]
[ | s2_CA2 ]
[ | s2_CB1 ]
[ | s2_CB2 ]
In the upper left and lower right block, there are also covariances. The upper right and lower left block contain all 0s (since the groups are independent). I assume you know the variances (i.e., the diagonal), since those are needed in order to compute the SMD values. So, you computed:
dA1 = (mean_TA1 - mean_CA1) / s_pooled_TA1_CA1
dA2 = (mean_TA2 - mean_CA2) / s_pooled_TA2_CA2
dB1 = (mean_TB1 - mean_CB1) / s_pooled_TB1_CB1
dB2 = (mean_TB2 - mean_CB2) / s_pooled_TB2_CB2
For these 4 SMD values, there is a corresponding 4x4 var-cov matrix. The diagonal of that matrix are the variances you give below. The covariances in that matrix are a function of the covariances/correlations from the 8x8 matrix above. Note that there isn't just one within-study correlation -- there are 6 of them (6 covariances/correlations in a 4x4 matrix). And that 0.97 you mentioned comes from a very specific example -- it isn't applicable here.
It would take a bit of work to figure out the necessary equations. Some hints as to how one could tackle this can be found on James' blog; in particular:
http://jepusto.github.io/alternative-formulas-for-the-SMD
http://jepusto.github.io/distribution-of-sample-variances
But this would be pretty tedious work. Note that actually computing the covariances would require knowing or guestimating the correlations in that 8x8 matrix above.
Alternatively, you could do something along the lines of:
http://jepusto.github.io/imputing-covariance-matrices-for-multi-variate-meta-analysis
Here, we don't use the 'exact' equations to compute the covariances, but directly assume that we know (or have a good guess) about the correlations among the SMD values. Note that you would still need to guestimate 6 correlations. That might be the more feasible approach here.
So, you first have to restructure your data into a long format. Then you construct a 4x4 correlation matrix for the SMD values. And then you construct the full V matrix. This should do this:
dat.long <- data.frame(
yi = unlist(dat[,c("y1i.w", "y2i.w", "y1i.p", "y2i.p")]),
vi = unlist(dat[,c("v1i.w", "v2i.w", "v1i.p", "v2i.p")]))
dat.long$study <- 1:10
dat.long$outcome <- rep(c("W","B"), each=20)
dat.long$time <- rep(c(1,2), each=10)
rownames(dat.long) <- 1:nrow(dat.long)
dat.long <- dat.long[order(dat.long$study, dat.long$outcome, dat.long$time),]
dat.long$type <- 1:4
dat.long
R <- matrix(c( 1, .7, .8, .6,
.7, 1, .6, .8,
.8, .6, 1, .7,
.6, .8, .7, 1), nrow=4, ncol=4)
R <- diag(10) %x% R
V <- diag(sqrt(dat.long$vi)) %*% R %*% diag(sqrt(dat.long$vi))
Note that the R matrix above assumes a correlation of 0.7 for Time 1 vs Time 2 for both outcomes, a correlation of 0.8 for Outcome B vs Outcome W for both time points, and a correlation of 0.6 for Time 1 Outcome B vs Time 2 Outcome W and Time 2 Outcome B vs Time 1 Outcome W. You need to adjust this to whatever you think makes sense for your data.
Now as a starter, you can do:
res <- rma.mv(yi, V, mods = ~ outcome:factor(time) - 1, data=dat.long)
res
With 'mods = ~ outcome:factor(time) - 1', you directly get the estimated pooled SMD for each outcome-time combination. Then you can test/estimate contrasts with predict() and anova(). For example:
predict(res, newmods=c(1,-1,0,0))
anova(res, L=c(1,-1,0,0))
But the model above is a fixed-effects model. If you want to make generalizable inferences, then you need a random/mixed-effects model. You could try a fully unstructured var-cov matrix for the random effects:
res <- rma.mv(yi, V, mods = ~ outcome:factor(time) - 1, random = ~ factor(type) | study, struct="UN", data=dat.long)
res
But this is probably asking too much with just 10 studies. Maybe:
res <- rma.mv(yi, V, mods = ~ outcome:factor(time) - 1, random = ~ factor(type) | study, struct="HCS", data=dat.long)
res
This allows for different amounts of heterogeneity for the 4 SMD types, but assumes a single correlation among them. One could also try other structures.
Alternatively, if the construction of V is too tedious for you, you could just assume a diagonal V matrix and then use cluster-robust inferences. Like this:
res <- rma.mv(yi, vi, mods = ~ outcome:factor(time) - 1, random = ~ factor(type) | study, struct="HCS", data=dat.long)
res
sav <- robust(res, cluster=dat.long$study)
sav
Note that you don't need V here, just the variances (which you already have). I am somewhat doubtful though that 10 studies is enough for this to be justifiable. The cluster-robust approach relies on asymptotics and 10 studies may not be enough for that. You might also want to use the 'clubSandwich' package for this (https://cran.r-project.org/web/packages/clubSandwich/index.html), since it includes some small-sample adjustments to the cluster-robust approach that might be beneficial here.
Best,
Wolfgang
--
Wolfgang Viechtbauer, Ph.D., Statistician | Department of Psychiatry and
Neuropsychology | Maastricht University | P.O. Box 616 (VIJV1) | 6200 MD
Maastricht, The Netherlands | +31 (43) 388-4170 | http://www.wvbauer.com
-----Original Message-----
From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-project.org] On Behalf Of Daniel Quintana
Sent: Thursday, 14 September, 2017 12:50
To: r-sig-meta-analysis at r-project.org
Subject: [R-meta] Multiple time point meta-analysis in metafor
Hi all,
I have a question regarding a multiple time point meta-analysis that I would like to conduct using metafor.
For this meta-analysis of a treatment for psychiatric illness, I’m interested in two different treatment outcomes (Outcome A and Outcome B) at two time points (time 1 and time 2). Most studies in this area compare an active treatment with a placebo using a between-participants design, and then assess both Outcome A and Outcome B over two time points.
I have estimated the standardised mean difference (SMD) and variance for both outcomes and both time points for each study, which can be found in the following data frame:
study <- c("study 1", "study 2", "study 3", "study 4", "study 5", "study 6", "study 7", "study 8", "study 9", "study 10")
y1i.w <- c (0.332, -0.421, 0.129, 0.000, 2.836, 0.267, 0.114, 0.051, 0.207, 0.132)
v1i.w <- c (0.043, 0.114, 0.034, 0.052, 0.078, 0.059, 0.027, 0.108, 0.030, 0.200)
y2i.w <- c (0.231, NA, -0.199, 0.000, 0.933, 0.005, 0.255, NA, 0.136, NA)
v2i.w <- c (0.065, NA, 0.034, 0.058, 0.048, 0.049, 0.033, NA, 0.030, NA)
y1i.p <- c (-0.226, 0.386, -0.179, -0.170, -3.137, 0.069, 0.149, 0.120, NA, 0.663)
v1i.p <- c (0.043, 0.114, 0.034, 0.053, 0.087, 0.058, 0.027, 0.108, NA, 0.211)
y2i.p <- c (-0.191, NA, -0.199, -0.218, -1.550, -0.272, 0.130, NA, -0.106, NA)
v2i.p <- c (0.065, NA, 0.034, 0.059, 0.056, 0.049, 0.033, NA, 0.030, NA)
dat <- data.frame(study, y1i.w, v1i.w, y2i.w, v2i.w, y1i.p, v1i.p, y2i.p, v2i.p)
Here, y1i.w = SMD at time 1 for Outcome A, v1i.w = variance at time 1 for Outcome A, y2i.w = SMD at time 2 for Outcome A, v2i.w = variance at time 2 for Outcome A, y1i.p = SMD at time 1 for Outcome B, v1i.p = variance at time 1 for Outcome B, y2i.p = SMD at time 2 for Outcome B, v2i.p = variance at time 2 for Outcome B.
Ideally, I would like to investigate effects at each time point (Outcome A and Outcome B separately), compare effects between timepoints (Outcome A and Outcome B separately), and compare effects at each timepoint (Outcome A vs. Outcome B).
There’s no data available for within-study correlations, so I’m assuming this is 0.97. I’ve tried to follow the “dat.ishak2007" example in the metafor package documentation to create the V matrix, but I can’t figure out how to create this matrix for the dataset I have.
I would appreciate any assistance!
Kind regards,
Daniel Quintana
[[alternative HTML version deleted]]
More information about the R-sig-meta-analysis
mailing list