[R-meta] Multiple time point meta-analysis in metafor
Viechtbauer Wolfgang (SP)
wolfgang.viechtbauer at maastrichtuniversity.nl
Sat Sep 23 23:55:28 CEST 2017
This all sounds like things one should indeed report. The profile plots are probably of minor interest to most readers, but those can always go in an appendix. I think it is also a good idea to examine the data for outliers/influential estimates/clusters/studies and consider sensitivity analyses if any are found.
From: Daniel Quintana [mailto:daniel.quintana at medisin.uio.no]
Sent: Monday, 18 September, 2017 6:49
To: Viechtbauer Wolfgang (SP)
Cc: r-sig-meta-analysis at r-project.org
Subject: Re: Multiple time point meta-analysis in metafor
Thanks for the detailed response, this has been a huge help.
Considering the difficulties in estimating effect dependence, I’m thinking of using cluster-robust inferences. I actually have 29 studies in total (I only included 10 in my original question for illustrative purposes).
When it comes to reporting results, I will report each of the four models from the cluster-robust test (outcomes B and W, time 1 and time 2) and test/estimate contrasts, but what else would be normally reported in this circumstance? I am thinking of reporting the following from the random/mixed-effects model (assuming a diagonal V matrix): tau^2 for each model, profile plots for tau^2, test for residual heterogeneity, test of moderators, and Egger’s regression test, and funnel plot. Is there anything else I should report here?
> On 14 Sep 2017, at 9:35 pm, Viechtbauer Wolfgang (SP) <wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
> Hi Daniel,
> In essence, the raw data in each study consist of 8 columns of the form:
> Group: T C
> Outcome: A B A B
> Time: 1 2 1 2 1 2 1 2
> So, there is in principle an 8x8 var-cov matrix corresponding to these data:
> [ s2_TA1 | ]
> [ s2_TA2 | ]
> [ s2_TB1 | ]
> [ s2_TB2 | ]
> [ | s2_CA1 ]
> [ | s2_CA2 ]
> [ | s2_CB1 ]
> [ | s2_CB2 ]
> In the upper left and lower right block, there are also covariances. The upper right and lower left block contain all 0s (since the groups are independent). I assume you know the variances (i.e., the diagonal), since those are needed in order to compute the SMD values. So, you computed:
> dA1 = (mean_TA1 - mean_CA1) / s_pooled_TA1_CA1
> dA2 = (mean_TA2 - mean_CA2) / s_pooled_TA2_CA2
> dB1 = (mean_TB1 - mean_CB1) / s_pooled_TB1_CB1
> dB2 = (mean_TB2 - mean_CB2) / s_pooled_TB2_CB2
> For these 4 SMD values, there is a corresponding 4x4 var-cov matrix. The diagonal of that matrix are the variances you give below. The covariances in that matrix are a function of the covariances/correlations from the 8x8 matrix above. Note that there isn't just one within-study correlation -- there are 6 of them (6 covariances/correlations in a 4x4 matrix). And that 0.97 you mentioned comes from a very specific example -- it isn't applicable here.
> It would take a bit of work to figure out the necessary equations. Some hints as to how one could tackle this can be found on James' blog; in particular:
> But this would be pretty tedious work. Note that actually computing the covariances would require knowing or guestimating the correlations in that 8x8 matrix above.
> Alternatively, you could do something along the lines of:
> Here, we don't use the 'exact' equations to compute the covariances, but directly assume that we know (or have a good guess) about the correlations among the SMD values. Note that you would still need to guestimate 6 correlations. That might be the more feasible approach here.
> So, you first have to restructure your data into a long format. Then you construct a 4x4 correlation matrix for the SMD values. And then you construct the full V matrix. This should do this:
> dat.long <- data.frame(
> yi = unlist(dat[,c("y1i.w", "y2i.w", "y1i.p", "y2i.p")]),
> vi = unlist(dat[,c("v1i.w", "v2i.w", "v1i.p", "v2i.p")]))
> dat.long$study <- 1:10
> dat.long$outcome <- rep(c("W","B"), each=20)
> dat.long$time <- rep(c(1,2), each=10)
> rownames(dat.long) <- 1:nrow(dat.long)
> dat.long <- dat.long[order(dat.long$study, dat.long$outcome, dat.long$time),]
> dat.long$type <- 1:4
> R <- matrix(c( 1, .7, .8, .6,
> .7, 1, .6, .8,
> .8, .6, 1, .7,
> .6, .8, .7, 1), nrow=4, ncol=4)
> R <- diag(10) %x% R
> V <- diag(sqrt(dat.long$vi)) %*% R %*% diag(sqrt(dat.long$vi))
> Note that the R matrix above assumes a correlation of 0.7 for Time 1 vs Time 2 for both outcomes, a correlation of 0.8 for Outcome B vs Outcome W for both time points, and a correlation of 0.6 for Time 1 Outcome B vs Time 2 Outcome W and Time 2 Outcome B vs Time 1 Outcome W. You need to adjust this to whatever you think makes sense for your data.
> Now as a starter, you can do:
> res <- rma.mv(yi, V, mods = ~ outcome:factor(time) - 1, data=dat.long)
> With 'mods = ~ outcome:factor(time) - 1', you directly get the estimated pooled SMD for each outcome-time combination. Then you can test/estimate contrasts with predict() and anova(). For example:
> predict(res, newmods=c(1,-1,0,0))
> anova(res, L=c(1,-1,0,0))
> But the model above is a fixed-effects model. If you want to make generalizable inferences, then you need a random/mixed-effects model. You could try a fully unstructured var-cov matrix for the random effects:
> res <- rma.mv(yi, V, mods = ~ outcome:factor(time) - 1, random = ~ factor(type) | study, struct="UN", data=dat.long)
> But this is probably asking too much with just 10 studies. Maybe:
> res <- rma.mv(yi, V, mods = ~ outcome:factor(time) - 1, random = ~ factor(type) | study, struct="HCS", data=dat.long)
> This allows for different amounts of heterogeneity for the 4 SMD types, but assumes a single correlation among them. One could also try other structures.
> Alternatively, if the construction of V is too tedious for you, you could just assume a diagonal V matrix and then use cluster-robust inferences. Like this:
> res <- rma.mv(yi, vi, mods = ~ outcome:factor(time) - 1, random = ~ factor(type) | study, struct="HCS", data=dat.long)
> sav <- robust(res, cluster=dat.long$study)
> Note that you don't need V here, just the variances (which you already have). I am somewhat doubtful though that 10 studies is enough for this to be justifiable. The cluster-robust approach relies on asymptotics and 10 studies may not be enough for that. You might also want to use the 'clubSandwich' package for this (https://cran.r-project.org/web/packages/clubSandwich/index.html), since it includes some small-sample adjustments to the cluster-robust approach that might be beneficial here.
More information about the R-sig-meta-analysis