[R-meta] multi-level meta-analysis with dependent effect sizes

James Pustejovsky jepusto using gmail.com
Thu Sep 20 21:12:30 CEST 2018


Andrew,

You mentioned that you've figured out how to do this, but I figured I would
share my approach anyway in case others find it useful.

Cheers,
James

# Function to compute the full variance-covariance matrix of raw mean
# differences:

V_mean_diff <- function(cluster, se_t, se_c,
                        return_list = identical(as.factor(cluster),
                                                sort(as.factor(cluster)))) {

  se_t_list <- split(se_t, cluster)
  se_c_list <- split(se_c, cluster)

  # Within each contrast group, Var(md_i) = se_t_i^2 + se_c^2 and
  # Cov(md_i, md_j) = se_c^2 (the shared control-group variance), so each
  # block is se_c^2 everywhere plus a diagonal of treatment-arm variances.
  # Note that this assumes se_c is constant within each cluster, as it is
  # when the effects truly share one control group.
  vcov_list <- Map(function(se_A, se_B) se_B^2 + diag(se_A^2, nrow = length(se_A)),
                   se_A = se_t_list, se_B = se_c_list)

  if (return_list) {
    return(vcov_list)
  } else {
    # assemble the blocks, then put rows/columns back in the original order
    vcov_mat <- metafor::bldiag(vcov_list)
    cluster_index <- order(order(cluster))
    return(vcov_mat[cluster_index, cluster_index])
  }
}
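
# As a quick sanity check on the structure (a sketch with made-up numbers):
# for two effects sharing a control with se_c = 0.3 and treatment-arm SEs of
# 0.2 and 0.25, the block should have se_c^2 + se_t^2 on the diagonal and
# se_c^2 off the diagonal.

V_mean_diff(cluster = c("A", "A"), se_t = c(0.2, 0.25), se_c = c(0.3, 0.3))
# $A
#      [,1]   [,2]
# [1,] 0.13 0.0900
# [2,] 0.09 0.1525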

# Here's an example dataset (based on the one you sent me off-list):

example_data <- data.frame(
  stringsAsFactors = FALSE,
  sid = c(1L, 1L, 1L, 1L, 1L, 4L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
          10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
          10L, 10L),
  contrast = c("1A", "1B", "1B", "1C", "1D", "4A",
               "7A", "7A", "7B", "7B", "7C", "7C", "7D", "7D",
               "10A", "10B", "10C", "10E", "10F", "10G", "10H", "10I",
               "10J", "10K", "10L", "10N", "10O", "10P"),
  sampletype = c("invmusc", "invmusc", "invmusc", "crwhole", "plant", "vbone",
                 "fmusc", "fmusc", "fmusc", "fmusc",
                 "crwhole", "crwhole", "crwhole", "crwhole",
                 "plant", "plant", "plant", "plant", "plant", "plant", "plant",
                 "plant", "plant", "plant", "plant", "plant", "plant", "plant"),
  nc = c(3L, 10L, 10L, 10L, 6L, 10L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
         2L, 2L, 2L, 2L, 4L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 4L, 2L),
  mc = c(9, 4.345, 4.345, 4.353, 1.58, 16.6, 11.995, 11.995, 10.564, 10.564,
         11.461, 11.461, 10.392, 10.392, 6.3, 1.3, 1.3, 4.4, 4.7, 7.1, 5.4,
         4.5, 5.9, -2.3, 3, 1.7, -5, 1.3),
  sdc = c(0.872066511, 0.256523964, 0.256523964, 0.276130085, 0.75934182, 0.1,
          0.296984848, 0.296984848, 0.042751676, 0.042751676, 0.122824448,
          0.122824448, 0.241830519, 0.241830519, 0.494974747, 0.919238816,
          1.767766953, 0.070710678, 2.303, 0.070710678, 1.728, 0.989949494,
          0.353553391, 0.070710678, 0.494974747, 0.070710678, 1.7,
          0.212132034),
  ns = c(3L, 10L, 10L, 10L, 6L, 10L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
         2L, 2L, 2L, 2L, 4L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 4L, 2L),
  ms = c(11.9, 4.811, 5.355, 4.457, -0.24, 16.6, 11.764, 11.835, 10.87,
         10.714, 11.38, 11.439, 10.246, 10.31, 6.1, 1, 1.1, 4.1, 4.9, 7.5,
         4.9, 4.4, 6.1, -2.9, 1.6, 1.6, -4.4, 1.5),
  sds = c(0.916787871, 0.762108916, 0.752622083, 0.695701085, 0.881816307,
          0.1, 0.165462987, 0.195161472, 0.296984848, 0.357796031,
          0.277185858, 0.144249783, 0.371938167, 0.370523953, 1.202081528,
          0.919238816, 2.786000718, 0.070710678, 2.254, 0.636396103, 1.47,
          1.060660172, 0.212132034, 0.424264069, 1.13137085, 0.070710678,
          1.144, 1.060660172)
)

# The variables in the data are as follows:

# sid - study identifier

# contrast - grouping variable identifying effect sizes that share a control

# sampletype - type of sample

# nc/mc/sdc - n/mean/standard deviation for the control sample

# ns/ms/sds - n/mean/standard deviation for the acid-treated sample


# fix the sort-order of the grouping variable

example_data$contrast <- factor(example_data$contrast,
                                levels = unique(example_data$contrast))


# compute mean difference

example_data$md <- with(example_data, ms - mc)


# compute variance of mean difference, to check against vcov matrix

example_data$V_md <- with(example_data, sdc^2 / nc + sds^2 / ns)


# compute vcov as block-diagonal matrix list

vcov_list <- with(example_data, V_mean_diff(cluster = contrast,
                                            se_c = sdc / sqrt(nc),
                                            se_t = sds / sqrt(ns)))


# compute vcov as full matrix

vcov_mat <- with(example_data, V_mean_diff(cluster = contrast,
                                           se_c = sdc / sqrt(nc),
                                           se_t = sds / sqrt(ns),
                                           return_list = FALSE))


# check that results are consistent with hand calculation

all.equal(example_data$V_md, diag(vcov_mat))  # should return TRUE
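
# A further spot check (just a sketch): within a contrast group, the
# off-diagonal covariance should equal the control group's sampling
# variance, sdc^2 / nc. For contrast 7A:

vcov_list[["7A"]]
with(subset(example_data, contrast == "7A"), sdc[1]^2 / nc[1])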



# fit multi-level meta-analysis model

library(metafor)


rma.mv(md ~ 0 + factor(sampletype),
       V = vcov_list,
       random = ~ 1 | sid / contrast,
       data = example_data,
       method = "REML")
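
# If you also want cluster-robust (sandwich) standard errors on top of the
# model-based ones -- e.g., as a guard against mis-specification, as
# discussed in the thread below -- clubSandwich works on rma.mv fits. A
# minimal sketch, storing the model above in an object first and clustering
# at the study level (clustering at the contrast level, as in the thread
# below, is another defensible choice):

library(clubSandwich)

res <- rma.mv(md ~ 0 + factor(sampletype),
              V = vcov_list,
              random = ~ 1 | sid / contrast,
              data = example_data,
              method = "REML")

coef_test(res, vcov = "CR2", cluster = example_data$sid)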


On Tue, Sep 11, 2018 at 1:47 PM James Pustejovsky <jepusto using gmail.com> wrote:

> Andrew,
>
> Your description of the data structure is very clear. (Bravo for that!
> Many meta-analyses that I have reviewed---even quite recent ones---don't do
> a great job of describing the hierarchical/multivariate structure of the
> data, even though it's quite important.) Based on your responses, I think
> the best thing to do would be to use the Gleser & Olkin approach to
> approximate the covariances among effect size estimates that share a common
> control condition. The reason I think it's best is that you should have all
> the information you need in order to calculate the covariances (when the
> effect sizes are differences in means, all that is needed are means, SDs,
> and Ns per cell), and so there's not a strong rationale for using other
> approximations. For example, the r = 0.7 you used with
> impute_covariance_matrix() is likely to be a bit too high. Consider an
> example where two treatment cells A and B and the control cell C all have
> equal sample size n, and all SDs equal S. Then
>
> Var(A - C) = 2 * S^2 / n
> Var(B - C) = 2 * S^2 / n
> Cov(A - C, B - C) = S^2 / n
> And so Cor(A - C, B - C) = 0.5, rather than 0.7.
>
> A further reason for going this route is that using good estimates of
> sampling covariances is the only way I know of to get defensible,
> interpretable estimates of the random effects variance components of the
> model. If you use something along the lines of your model D,
>
> nmtD <- rma.mv(yi=md, V=vcvndat,  mods = ~ factor(sampletype) + factor(A)
> ... + factor(H), random = ~1|sid/contrast, data=ndat, method="REML")
>
> (but with vcvndat based on Gleser & Olkin), then you would be able to
> interpret the random effects variances on sid and contrast. You noted that
> one of the variance components (not sure which one is sigma^2.1 versus
> sigma^2.2) does seem to be sensitive to the assumed r, so Gleser & Olkin
> would be a way to pin that down. It would also let you explore other
> potential specifications of the random effects structure, such as allowing
> variance components to differ across levels of sampletype (or other
> moderators).
>
> All that said, computing the full Gleser & Olkin vcov matrix is a bit
> inconvenient because you have to work with block diagonal matrices. If you
> send a smallish example dataset, I can try to provide some example code for
> how to do these calculations.
>
> James
>
> On Tue, Sep 11, 2018 at 10:54 AM Andrew Guerin <
> Andrew.Guerin using newcastle.ac.uk> wrote:
>
>> Hi James,
>>
>>
>> Thanks for the rapid response.
>>
>>
>> The study is looking at how a preparation technique (acid treatment)
>> affects particular measurements (stable isotope signatures) in sample
>> materials. Experiments compare the isotope signatures of replicates with
>> and without acid treatment - so it is the differences, rather than the
>> absolute values, which are of interest (so of your two suggestions, it
>> would have to be option 1).
>>
>>
>> The hierarchical dependence arises as you have guessed. All the included
>> studies are explicit tests of acid treatment. While some studies look at
>> the effects on just one type of sample material (say, crab muscle or
>> fish scales), others might test a selection of materials from various
>> sources. So there can be multiple values of 'sampletype' within a study.
>> The other moderators (A-H) concern potentially influential differences in
>> the experimental methods used (e.g. type of acid, method of application, how
>> samples were treated before acid exposure, etc). 'Sampletype' is the main
>> moderator of interest - the idea is to be able to say whether or not acid
>> treatment is desirable/necessary for a particular type of material (an
>> overall estimate of effect size for all types of sample combined is not
>> very interesting). The other moderators are less fundamentally interesting,
>> they are there mainly to account for differences in the methods used in the
>> studies included in the analysis, though of course it is interesting to be
>> able to report whether they make any difference.
>>
>>
>> In terms of variation in sampletype and the other moderators, you could
>> put the studies into 4 groups:
>>
>>
>> 1) Single sample type, fixed protocol (all moderators A-H constant; 10
>> studies)
>>
>> 2) Single sample type, multiple protocols (generally one or two
>> moderators will vary, others will remain constant; 2 studies)
>>
>> 3) Multiple sample types, fixed protocol (the most common - 25 studies)
>>
>> 4) Multiple sample types, multiple protocols (such studies are a lot of
>> work, so they are generally limited to 4 or fewer sample types, and a small
>> number of variations in one or two moderators; 7 studies)
>>
>>
>> So there is a bit of both between-study and within-study variation in the
>> moderators. For the two most common types of study (1 and 3) since there is
>> only one protocol there will be multiple independent
>> treatment/control pairs, so no 'multiple treatments' type dependence
>> (though still potential hierarchical dependence). For type 2 there will
>> be one group of treatments per study, but all compared to a single control.
>> Studies in group 4 will presumably exhibit both types of dependence.
>>
>>
>> Cheers,
>>
>> Andrew
>>
>>
>>
>>
>>
>> ------------------------------
>> *From:* James Pustejovsky <jepusto using gmail.com>
>> *Sent:* 11 September 2018 15:23
>> *To:* Andrew Guerin
>> *Cc:* r-sig-meta-analysis using r-project.org
>> *Subject:* Re: [R-meta] multi-level meta-analysis with dependent effect
>> sizes
>>
>> Andrew,
>>
>> In your study, how does the hierarchical dependence arise? Is it that
>> some studies report results from multiple (presumably independent) samples?
>>
>> If so, then I would argue that the best thing to do might be one of the
>> following:
>>
>> 1. Go full Gleser-Olkin. If multiple-treatments dependence is the *only*
>> type of sampling dependence in your data, then it would be better just to
>> work through the calculation of the covariances. This would let you then
>> treat the sampling variance-covariance matrix as known, and you could avoid
>> the need for cluster-robust standard errors altogether (unless there are
>> other types of mis-specification to worry about, in which case clustered
>> SEs could still be useful).
>>
>> 2. Since you have a common outcome scale, you could also consider using
>> the mean outcome as the ES. If different treatment groups within a study
>> are independent (i.e., a between-subjects design), then modeling the means
>> directly would let you avoid dealing with the Gleser-Olkin business.
>>
>> Generally, which of these (or the other approaches you described) is most
>> defensible will depend on the goals of your analysis and, in particular,
>> whether the moderators of interest involve between-study variation,
>> within-study variation, or both. For moderators that vary between-study, I
>> have found (based on personal experience) that the approach to
>> modeling/handling dependence tends not to matter much. But for moderators
>> where there is within-study variation, it is more salient.
>>
>> Does sampletype vary within study? And what about the other factors?
>>
>> James
>>
>> On Tue, Sep 11, 2018 at 8:30 AM Andrew Guerin <
>> Andrew.Guerin using newcastle.ac.uk> wrote:
>>
>> I am new to meta-analysis, and have benefited greatly from the excellent
>> documentation and resources for meta-analysis in R, especially the websites
>> for metafor and clubSandwich. I am planning some extensive future work
>> using meta-analysis, so thought it would be useful to get some experience
>> of the whole process by starting with something relatively simple (haha)
>> and trying to work it up for a publication. I would like some advice on
>> whether I am following the correct approach.
>>
>>
>> I selected a topic with which I am quite familiar and which has a
>> relatively limited literature: I did some thorough literature searching and
>> came up with ~70 relevant studies, which I will use for a few models
>> looking at different aspects (I'll focus on one here). Once I had coded all
>> the data it became clear that the analysis was not going to be as simple as
>> I had hoped, since the data have two kinds of dependence that should be
>> accounted for:
>>
>>
>> 1. Hierarchical dependence - a lot of the studies contribute multiple
>> effect sizes.
>>
>> 2. Many of the effect sizes share controls - different treatments are
>> compared against a common 'untreated' control - Gleser and Olkin's (2009)
>> 'multiple treatments' dependence.
>>
>>
>> For this analysis, there are a total of 302 effect sizes, grouped into
>> 244 'contrast' groups which share a control, from 43 studies. The 244
>> contrast groups contain 1-8 effect sizes.
>>
>>
>> There are various moderators. The most important is 'sampletype' -
>> expected to be significant, as it is expected that treatment will have
>> different effects on different sample types. The other factors all relate
>> to variations in the experimental treatments used, to keep things general
>> I'll just call them "A" to "H".
>>
>>
>> ndat = data frame
>>
>> md = effect size measure (raw mean difference, since all data are on a
>> common scale, with a limited range)
>>
>> mdv = sampling variation for mean difference, obtained using escalc().
>>
>> sid = study id (a simple numerical identifier)
>>
>> contrast = identifier showing which effect size measures / treatments
>> share a control (e.g. study one contributes 5 effect sizes, with values of
>> 'contrast' 1A, 1B, 1B, 1C, 1D, indicating that of the 5 effect sizes in
>> study 1, 2 share a control).
>>
>>
>> As I see it I have the following options for how to proceed:
>>
>>
>> A) conduct a completely 'cluster naive' mixed effects meta-analysis
>>
>>
>> nmtA <- rma(md, mdv,  mods = ~ factor(sampletype) + factor(A) ... +
>> factor(H), data=ndat, method="REML")
>>
>> anova(nmtA, btt=20:21) ## example hypothesis test for moderator 'D'
>>
>>
>> B) ignore the hierarchical dependence (i.e. assume that any samples with
>> different values of 'contrast' within the same study are independent), but
>> account for the correlated errors within clusters / contrast groups by RVE
>> in robumeta, using clubSandwich for the hypothesis tests.
>>
>>
>> nmtB <- robu(md ~ factor(sampletype) + factor(A) ... + factor(H),
>> data=ndat, var.eff.size=mdv, rho = 0.8, modelweights="CORR",
>> studynum=contrast)
>>
>> Wald_test(nmtB, constraints = 20:21, vcov="CR2") ## example hypothesis
>> test for moderator 'D'
>>
>>
>>
>> C) still ignoring the hierarchical dependence, but now explicitly
>> specifying the random effects in rma.mv, after imputing a
>> variance-covariance matrix. Robust parameter estimates and hypothesis test
>> are then carried out using clubSandwich.
>>
>>
>> vcvndat <- impute_covariance_matrix(vi = ndat$mdv, cluster=ndat$contrast,
>> r=0.7)
>>
>> nmtC <- rma.mv(yi=md, V=vcvndat,  mods = ~ factor(sampletype) +
>> factor(A) ... + factor(H), random = ~1|contrast, data=ndat, method="REML")
>>
>> nmtC_robust <- coef_test(nmtC, vcov="CR2")
>>
>> Wald_test(nmtC, constraints = 20:21, vcov="CR2") ## example hypothesis
>> test for moderator 'D'
>>
>>
>> D) attempting to account for the hierarchical and multiple-treatments
>> dependence via a multi-level model, setting random effect structure as
>> random = ~ 1| sid/contrast, and then using clubSandwich for RVE and
>> hypothesis tests.
>>
>>
>> vcvndat <- impute_covariance_matrix(vi = ndat$mdv, cluster=ndat$contrast,
>> r=0.7)
>>
>> nmtD <- rma.mv(yi=md, V=vcvndat,  mods = ~ factor(sampletype) +
>> factor(A) ... + factor(H), random = ~1|sid/contrast, data=ndat,
>> method="REML")
>>
>> nmtD_robust <- coef_test(nmtD, vcov="CR2", cluster=ndat$contrast)
>>
>> Wald_test(nmtD, constraints = 20:21, vcov="CR2", cluster=ndat$contrast)
>> ## example hypothesis test for moderator 'D'
>>
>>
>> My main question really is which (if any) of these approaches is the best
>> way forward? For this particular dataset, the outcomes (at least in terms
>> of which moderators are significant/non-significant, and most parameter
>> estimates) are pretty similar for all the options (including some other
>> options that I tried that I have not mentioned here). All find 'sampletype'
>> highly significant, and find similar sets of the other moderators to be
>> non-significant (except for the naive analysis A, which finds more
>> moderators to be significant). B, C, and D are all very similar. Although D
>> is the only one that explicitly accounts for the hierarchical dependence,
>> does the fact that the outcomes from B and C are substantially similar mean
>> that this could be safely overlooked for this analysis?
>>
>>
>> I have not yet attempted to build my own vcov matrix using the Gleser and
>> Olkin (2009) formula. Is this worth doing instead of using
>> impute_covariance_matrix? I picked 0.7 as a plausible value for r in
>> impute_covariance_matrix(), but I have raw data from one study that might
>> allow me to derive a better estimate. I did investigate the effect of
>> varying r from 0-1. It does not seem to have much impact on parameter
>> estimates, the significance/non-significance of moderators, or the estimate
>> of sigma^2.2 in the multi-level model. The intercept moves about a little,
>> but the main effect is that when r is less than about 0.5, sigma^2.1 is
>> basically 0. Does this indicate a problem?
>>
>>
>> Finally I was wondering whether it is wise to account for multiple
>> comparisons when evaluating the hypothesis tests. Since I have several
>> factors, I repeat Wald_test() several times. Models nmtB, nmtC, and nmtD
>> all have moderators (aside from 'sampletype') that are significant at p <
>> 0.05, but with p values only just under 0.05. Adjustment for multiple
>> comparisons (e.g. using p.adjust(method="fdr")) shifts the outcomes of all
>> these tests to p > 0.05.
>>
>>
>> I would be grateful for any advice on the above.
>>
>>
>> Many thanks,
>>
>> Andrew
>>



