[R-meta] Computing var-covariance matrix with correlations of six non-independent outcomes

Viechtbauer, Wolfgang (SP) wolfgang.viechtbauer at maastrichtuniversity.nl
Wed Jul 8 10:19:40 CEST 2020


Hi Mika,

Just for completeness' sake, there was one omission and one error in the code I posted. The omitted line is:

meta <- data.frame(meta) # to turn your matrix into a dataframe

And this line should have been:

corlist <- mapply(function(rmat, mots) rmat[mots,mots], corlist, mots)
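
As a quick sanity check after these corrections (just a small sketch, using the objects meta and corlist created in the code further below), each element of corlist should now match the motivation levels that the corresponding study actually reports, with no NA entries left:

# dimension of each (subsetted) correlation matrix vs. number of effects per study
sapply(corlist, NROW)
table(meta$study)
# no NAs should remain after subsetting to the observed levels
sapply(corlist, function(x) any(is.na(x)))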

As for your questions:

1) Running separate analyses: There is nothing inherently wrong with that, but the multivariate model has the potential to yield slightly more precise estimates of the 6 average effects, since the outcomes 'borrow information' from each other. Using the subset of the data you posted, let's compare:

# multivariate model
res <- rma.mv(g, V, mods = ~ factor(motivation) - 1, random = ~ factor(motivation) | study, struct="UN", data=meta)
res

# separate RE models for each level of motivation
sep <- lapply(1:6, function(i) rma(g, v, data=meta, subset=motivation==i))
sep <- do.call(rbind, lapply(sep, function(x) cbind(coef(summary(x)), tau2=x$tau2)))
rownames(sep) <- paste0("motivation", 1:6)

# compare results
res
round(sep, 4)

As you will see, the SEs of several estimates (but not all) are lower in the MV model than in the separate models.
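
To see this directly, here is a small sketch (using the res and sep objects created above) that puts the standard errors side by side:

# SEs of the 6 average effects: multivariate model vs. separate models
round(data.frame(se_mv = res$se, se_sep = sep[,"se"]), 4)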

Also, if you want to compare effects, you really need the MV model. For example, I can do this to test whether the effect is different for outcomes 1 and 2:

anova(res, L=c(1,-1,0,0,0,0))

But you cannot do this when you have fitted separate models, because the estimates from those separate models (e.g., the ones for outcomes 1 and 2) are not independent.
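
If one wants to go beyond a single contrast, here is a sketch of how all 6*5/2 = 15 pairwise comparisons could be tested in one go with the multivariate model (the contrast matrix below is just one way to construct them):

# one row per pair of outcomes: +1 for the first outcome, -1 for the second
pairs <- t(combn(6, 2))
contr <- matrix(0, nrow=nrow(pairs), ncol=6)
contr[cbind(seq_len(nrow(pairs)), pairs[,1])] <-  1
contr[cbind(seq_len(nrow(pairs)), pairs[,2])] <- -1
anova(res, L=contr)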

2) Why do people propose fitting a multilevel model (with random effects for study and estimates within studies)?

I can only speculate about this. I suspect it is a combination of the following things.

For one, people find it difficult to assemble the correlations needed to construct the V matrix for a proper multivariate model. Moreover, this step is not easily automated and requires some programming.

Also, people may have learned in their multilevel courses that one can deal with dependencies by adding appropriate random effects to the model. That is true to some extent, but there are two levels of dependency in the present (i.e., meta-analytic) context: dependencies between the sampling errors (which are captured by the covariances in the V matrix) and dependencies between the underlying true effects (which we can capture by adding correlated random effects within studies to the model). People may not be aware of this additional complexity and hence may think that just adding random effects is sufficient.
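
To make the distinction between the two levels concrete, here is a purely illustrative sketch (using the objects from the code above) of models that account for only one of them, in contrast to the full multivariate model that accounts for both:

# (1) only the sampling-error dependence: use the V matrix, but no within-study random effects
rma.mv(g, V, mods = ~ factor(motivation) - 1, data=meta)

# (2) only the true-effect dependence: use just the variances v, but add correlated random effects
rma.mv(g, v, mods = ~ factor(motivation) - 1, random = ~ factor(motivation) | study, struct="UN", data=meta)

# the model 'res' fitted earlier uses V *and* the correlated random effects, so it accounts for both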

Furthermore, there are some papers that discuss the use of the multilevel approach for meta-analyzing correlated outcomes:

Moeyaert, M., Ugille, M., Beretvas, S. N., Ferron, J., Bunuan, R., & Van den Noortgate, W. (2017). Methods for dealing with multiple outcomes in meta-analysis: A comparison between averaging effect sizes, robust variance estimation and multilevel meta-analysis. International Journal of Social Research Methodology, 20(6), 559-572.

Van den Noortgate, W., López-López, J. A., Marín-Martínez, F., & Sánchez-Meca, J. (2013). Three-level meta-analysis of dependent effect sizes. Behavior Research Methods, 45(2), 576-594.

Van den Noortgate, W., López-López, J. A., Marín-Martínez, F., & Sánchez-Meca, J. (2015). Meta-analysis of multiple outcomes: A multilevel approach. Behavior Research Methods, 47(4), 1274-1294.

Based on the simulations in these papers, the multilevel approach may be 'good enough' under certain conditions.

In the present example, that would be:

mlm <- rma.mv(g, v, mods = ~ factor(motivation) - 1, random = ~ 1 | study/outcome, data=meta)
mlm

There are three essential differences compared to the multivariate model we fitted earlier:

a) We ignore the covariances in V and treat the sampling errors as independent.
b) The three-level model implies that the correlation of the underlying true effects within studies is the same for all 6*5/2 = 15 pairs of outcomes.
c) The model implies that the amount of heterogeneity is the same for all 6 outcomes.

Since the study-level variance component is estimated to be 0 here, the model actually implies that there is no correlation between the true effects for the different outcomes within studies.
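
One can see this directly by inspecting the estimated variance components of the multilevel model fitted above:

# first element: study-level variance; second element: outcome-within-study variance
mlm$sigma2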

Therefore, the results would be the same as fitting 6 separate models in which we constrain the amount of heterogeneity to be equal to the outcome-level variance component from the multilevel model. We can easily check this:

sep <- lapply(1:6, function(i) rma(g, v, data=meta, subset=motivation==i, tau2=mlm$sigma2[2]))
sep <- do.call(rbind, lapply(sep, function(x) cbind(coef(summary(x)), tau2=x$tau2)))
rownames(sep) <- paste0("motivation", 1:6)
round(sep, 4)

You will see that the results are indeed exactly the same as those from the multilevel model.
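
A quick way to put the two side by side (a small sketch, using the objects from above) is to print the coefficient table of the multilevel model in the same rounded form:

round(coef(summary(mlm)), 4)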

Nobody can say which result is "correct" here, but the multilevel model makes a number of simplifying assumptions that I would want to avoid if possible.
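
If the fully unstructured model asks too much of the data, there are also intermediate options between these two extremes. As a sketch (the struct options below are standard choices in rma.mv()), one could keep the V matrix but use a simpler structure for the correlated random effects and compare the models on fit criteria:

# compound symmetry: one variance and one common correlation for the true effects
res.cs  <- rma.mv(g, V, mods = ~ factor(motivation) - 1, random = ~ factor(motivation) | study, struct="CS", data=meta)

# heteroscedastic compound symmetry: outcome-specific variances, one common correlation
res.hcs <- rma.mv(g, V, mods = ~ factor(motivation) - 1, random = ~ factor(motivation) | study, struct="HCS", data=meta)

# compare fit (all models have the same fixed effects, so the default REML fits are comparable)
fitstats(res, res.cs, res.hcs)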

Best,
Wolfgang

>-----Original Message-----
>From: Mika Manninen [mailto:mixu89 using gmail.com]
>Sent: Wednesday, 08 July, 2020 9:25
>To: Viechtbauer, Wolfgang (SP)
>Cc: James Pustejovsky; r-sig-meta-analysis using r-project.org
>Subject: Re: [R-meta] Computing var-covariance matrix with correlations of
>six non-independent outcomes
>
>Dear Wolfgang and James,
>
>Thank you so much for your replies and I just want to say that I can only
>admire your attitude and willingness to help and explain. I was banging my
>head against the wall for a while there trying to figure out how the matrix
>should be created. I owe you one!
>
>Two quick questions. The multivariate approach with the var-cov matrix seems
>to make the most sense for this analysis, but if one was to run six separate
>analyses for the outcomes (like I did the first time around), how
>problematic would that be ? Also, can you estimate as to why a 3-level
>approach was recommended to me (just adding hierarchical study level to the
>analysis)? It is hard for me to see the rationale behind this.
>
>I hope you have a great day,
>
>Mika
>
>Tue 7 Jul 2020 at 23:45, Viechtbauer, Wolfgang (SP)
>(wolfgang.viechtbauer using maastrichtuniversity.nl) wrote:
>I swear that James and I did not coordinate our responses. But this is
>seriously scary how similar our responses were.
>
>>-----Original Message-----
>>From: James Pustejovsky [mailto:jepusto using gmail.com]
>>Sent: Tuesday, 07 July, 2020 22:41
>>To: Viechtbauer, Wolfgang (SP)
>>Cc: Mika Manninen; r-sig-meta-analysis using r-project.org
>>Subject: Re: [R-meta] Computing var-covariance matrix with correlations of
>>six non-independent outcomes
>>
>>Ha! Following best practice for systematic reviews, a response to your
>>question has been provided by two coders, working independently. I think our
>>reliability is pretty decent too.
>>
>>On Tue, Jul 7, 2020 at 3:33 PM Viechtbauer, Wolfgang (SP)
>><wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>>Dear Mika,
>>
>>Strictly speaking, the correct way to compute the covariances between
>>multiple Cohen's d (or Hedges' g) values would require using methods that
>>are described in this chapter:
>>
>>Gleser, L. J., & Olkin, I. (2009). Stochastically dependent effect sizes. In
>>H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research
>>synthesis and meta-analysis (2nd ed., pp. 357–376). New York: Russell Sage
>>Foundation.
>>
>>In particular, equation (19.27) gives the covariance between two d values
>>for two different outcomes computed based on the same sample of subjects
>>(and we assume that the correlation between the two outcomes is the same for
>>subjects in the treatment and the control group). An example illustrating an
>>application of this is provided here:
>>
>>http://www.metafor-project.org/doku.php/analyses:gleser2009#multiple-endpoint_studies
>>
>>although the code for the example is a bit simpler because each study has
>>exactly 2 d values.
>>
>>However, one can take a simpler approach that gives very similar results.
>>Namely, if v_1 and v_2 are the sampling variances of d_1 and d_2 and r is
>>the correlation between the two outcomes, then r * sqrt(v_1 * v_2) is
>>usually a close approximation to what (19.27) would provide.
>>
>>Based on your example, this can be computed here as follows:
>>
>># this is only required because you coded missings as "NA" (and not as NA)
>>corlist <- lapply(corlist, function(x) matrix(as.numeric(x), nrow=6,
>>ncol=6))
>>
>># we only need a correlation matrix for the levels of motivation actually observed
>>mots <- split(meta$motivation, meta$study)
>>corlist <- mapply(function(rmat, mots) R[mots,mots], corlist, mots)
>>corlist
>>
>>Vfun <- function(vi,rmat) {
>>   if (length(vi) == 1L) {
>>      matrix(vi)
>>   } else {
>>      S <- diag(sqrt(vi))
>>      S %*% rmat %*% S
>>   }
>>}
>>
>>V <- mapply(Vfun, split(meta$v, meta$study), corlist)
>>
>>Now V is the var-cov matrix of your SMD values, so you could proceed with a
>>multivariate model with:
>>
>>res <- rma.mv(g, V, mods = ~ factor(motivation) - 1, random = ~
>>factor(motivation) | study, struct="UN", data=meta)
>>res
>>
>>This is a very general model allowing each level of motivation to have its
>>own heterogeneity variance component and each pair to have its own
>>correlation. I wouldn't recommend this with only 8 studies, but with 25
>>studies, this might be more sensible.
>>
>>Best,
>>Wolfgang
>>
>>>-----Original Message-----
>>>From: Mika Manninen [mailto:mixu89 using gmail.com]
>>>Sent: Tuesday, 07 July, 2020 21:53
>>>To: James Pustejovsky
>>>Cc: Viechtbauer, Wolfgang (SP); r-sig-meta-analysis using r-project.org
>>>Subject: Re: [R-meta] Computing var-covariance matrix with correlations of
>>>six non-independent outcomes
>>>
>>>James,
>>>
>>>Thank you so much for replying.
>>>
>>>The dataset (8 studies out of 25 included) can be created with the following
>>>code:
>>>
>>>###Creating vectors for the outcome, study, motivation, effect size, and variance estimate
>>>
>>>#Total effects/outcomes
>>>outcome <- c(1:28)
>>>
>>>#Study
>>>study <- c(1,1,1,1,1,2,2,2,2,3,3,3,3,3,4,4,4,4,4,5,6,7,8,8,8,8,8,8)
>>>
>>>#Six different motivation types measured in the studies
>>>motivation <-
>>>c(1,3,4,5,6,1,3,4,5,1,3,4,5,6,1,3,4,5,6,6,6,6,1,2,3,4,5,6)
>>>
>>>#Hedges g:s
>>>g <- c(0.6068, 0.0603, 0.2684, -0.0886, 0.0415, 1.592, 1.4031, 0.7928,
>>>       0.2013, 0.541, 0.1169, 0.3129, -0.0275, -0.3536, 1.5886, 2.7218,
>>>       -1.6273, -0.4375, -1.0695, -0.1247, -0.1038, -0.2706, 0.2045,
>>>       -0.2701, 0.3763, -0.7371, -0.0666, 0.2789)
>>>
>>>#Variance estimates
>>>v <- c(0.0162, 0.0155, 0.0157, 0.0155, 0.0155, 0.1889, 0.17875984, 0.15484225,
>>>       0.14432401, 0.0329, 0.0318, 0.0322, 0.0318, 0.0323, 0.1886, 0.2758,
>>>       0.1909, 0.147, 0.164, 0.0067, 0.0028, 0.004, 0.0726, 0.0729, 0.0735,
>>>       0.0771, 0.0723, 0.073)
>>>
>>>###Dataset matrix with different levels and motivation types
>>>meta <- cbind(outcome, study, motivation, g, v)
>>>View(meta)
>>>
>>>I created 6x6 correlation matrices (with 15 unique correlations) for the
>>>eight included studies. I am not sure if this is the most sensible approach
>>>as only a few studies have measured all six outcomes. Value NA reflects the
>>>absence of a correlation (absence of that one or both of the motivation
>>>types in the study). The row and column numbers (1-6) correspond to the
>>>variable motivation in the created dataset.
>>>
>>>### Correlation matrices for studies 1-8
>>>
>>>study1c <- rbind(c(1, "NA", .96, .34, -.33, -.66), c("NA", "NA", "NA", "NA", "NA", "NA"),
>>>                 c(.96, "NA", 1, .55, -.12, -.50), c(.34, "NA", .55, 1, .53, .05),
>>>                 c(-.33, "NA", -.12, .53, 1, .72), c(-.66, "NA", -.50, .05, .72, 1))
>>>
>>>study2c <- rbind(c(1, "NA", .96, .34, -.33, "NA"), c("NA", "NA", "NA", "NA", "NA", "NA"),
>>>                 c(.96, "NA", 1, .55, -.12, "NA"), c(.34, "NA", .55, 1, .53, "NA"),
>>>                 c(-.33, "NA", -.12, .53, 1, "NA"), c("NA", "NA", "NA", "NA", "NA", "NA"))
>>>
>>>study3c <- rbind(c(1, "NA", .96, .34, -.33, -.66), c("NA", "NA", "NA", "NA", "NA", "NA"),
>>>                 c(.96, "NA", 1, .55, -.12, -.50), c(.34, "NA", .55, 1, .53, .05),
>>>                 c(-.33, "NA", -.12, .53, 1, .72), c(-.66, "NA", -.50, .05, .72, 1))
>>>
>>>study4c <- rbind(c(1, "NA", .85, .35, .06, -.44), c("NA", "NA", "NA", "NA", "NA", "NA"),
>>>                 c(.85, "NA", 1, .59, .46, -.14), c(.35, "NA", .59, 1, .71, .27),
>>>                 c(.06, "NA", .46, .71, 1, .52), c(-.44, "NA", -.14, .27, .52, 1))
>>>
>>>study5c <- rbind(c("NA", "NA", "NA", "NA", "NA", "NA"), c("NA", "NA", "NA", "NA", "NA", "NA"),
>>>                 c("NA", "NA", "NA", "NA", "NA", "NA"), c("NA", "NA", "NA", "NA", "NA", "NA"),
>>>                 c("NA", "NA", "NA", "NA", "NA", "NA"), c("NA", "NA", "NA", "NA", "NA", 1))
>>>
>>>study6c <- rbind(c("NA", "NA", "NA", "NA", "NA", "NA"), c("NA", "NA", "NA", "NA", "NA", "NA"),
>>>                 c("NA", "NA", "NA", "NA", "NA", "NA"), c("NA", "NA", "NA", "NA", "NA", "NA"),
>>>                 c("NA", "NA", "NA", "NA", "NA", "NA"), c("NA", "NA", "NA", "NA", "NA", 1))
>>>
>>>study7c <- rbind(c("NA", "NA", "NA", "NA", "NA", "NA"), c("NA", "NA", "NA", "NA", "NA", "NA"),
>>>                 c("NA", "NA", "NA", "NA", "NA", "NA"), c("NA", "NA", "NA", "NA", "NA", "NA"),
>>>                 c("NA", "NA", "NA", "NA", "NA", "NA"), c("NA", "NA", "NA", "NA", "NA", 1))
>>>
>>>study8c <- rbind(c(1, .79, .84, .24, -.20, -.50), c(.79, 1, .93, .47, .00, -.31),
>>>                 c(.84, .93, 1, .57, -.07, -.52), c(.24, .47, .57, 1, .43, .01),
>>>                 c(-.20, .00, -.07, .43, 1, .55), c(-.50, -.31, -.52, .01, .55, 1))
>>>
>>>#list with all the correlations matrices
>>>
>>>corlist <- list(study1c, study2c, study3c, study4c, study5c, study6c,
>>>study7c, study8c)
>>>
>>>Mika
>>>
>>>Tue 7 Jul 2020 at 19:54, James Pustejovsky (jepusto using gmail.com)
>>>wrote:
>>>Hi Mika,
>>>
>>>To add to Wolfgang's question, could you tell us a little bit more about how
>>>you have structured the data on correlations between types of motivation? Is
>>>it just one correlation matrix (6X6 matrix, with 1 + 2 + 3 + 4 + 5 = 15
>>>unique correlations)? Or is it study-specific?
>>>
>>>This sort of calculation is a bit tricky to carry out so I am not surprised
>>>that you haven't found a solution in the listserv archives. If you are
>>>willing to share your dataset (or a subset thereof, say 3-4 studies worth of
>>>data), it may make it easier for us to help problem solve.
>>>
>>>James
>>>
>>>On Tue, Jul 7, 2020 at 11:42 AM Viechtbauer, Wolfgang (SP)
>>><wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>>>Dear Mika,
>>>
>>>What effect size measure are you using for the meta-analysis?
>>>
>>>Best,
>>>Wolfgang
>>>
>>>>-----Original Message-----
>>>>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org]
>>>>On Behalf Of Mika Manninen
>>>>Sent: Tuesday, 07 July, 2020 18:10
>>>>To: r-sig-meta-analysis using r-project.org
>>>>Subject: [R-meta] Computing var-covariance matrix with correlations of six
>>>>non-independent outcomes
>>>>
>>>>Hello,
>>>>
>>>>I am doing a meta-analysis looking at the effect of a teaching intervention
>>>>(versus control) on six types of motivation/behavioral regulation.
>>>>Theoretically and empirically these constructs form a continuum in which
>>>>the continuum neighbors are most strongly positively correlated and the
>>>>furthest from one another most negatively correlated.
>>>>
>>>>I have 95 effects. These effects come from 25 studies, each reporting
>>>>scores for between 1-6 motivation types. The number of effects per
>>>>motivation ranges from 22 to 13. In some studies, they have measured only
>>>>one or two types whereas in others they have measured 5 or all 6 types of
>>>>motivation.
>>>>
>>>>I originally ran a separate random-effects meta-analysis for all the six
>>>>outcomes. However, I got feedback that the dependency of the motivation
>>>>types should be taken into account and a 3-level meta-analysis was
>>>>recommended. After looking into it, the 3-level model seems to be a worse
>>>>approach than the multivariate approach.
>>>>
>>>>As is not usually the case, I have succeeded in gathering all correlations
>>>>between all the motivation types for all studies (some from original
>>>>reporting and some from previous meta-analysis findings).
>>>>
>>>>My question is, how do I compute the V-matrix for this data in order to run
>>>>the multivariate analysis? I read the whole archive but I could not find a
>>>>clear answer to the problem.
>>>>
>>>>Thank you very much in advance,
>>>>
>>>>Mika

