[R-meta] Var-cov structure in multilevel/multivariate meta-analysis

Sat Mar 23 16:40:58 CET 2019

Fabian,

That will not work. Cluster-robust standard errors require that the clusters are independent, and so clustering at the level of the sample will fail to capture dependence at higher levels (studies and papers). I think the appropriate approach here is to cluster at the highest level (papers).
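A minimal sketch of what clustering at the paper level could look like, assuming the metafor and clubSandwich packages are installed. The toy data, the object names (`dat`, `V`, `fit`, `robust`), and the simplified two-level random-effects structure are illustrative assumptions, not the actual database or model from this thread. For simplicity the simulated sampling errors are independent, even though `V` assumes r = 0.7 within samples.

```r
library(metafor)
library(clubSandwich)

set.seed(1)

# Hypothetical structure: 40 papers, 1-2 samples per paper, 1-3 effects per sample
n_papers <- 40
samples_per_paper <- sample(1:2, n_papers, replace = TRUE)
paper_of_sample <- rep(seq_len(n_papers), times = samples_per_paper)
n_samples <- length(paper_of_sample)
es_per_sample <- sample(1:3, n_samples, replace = TRUE)

dat <- data.frame(
  paper_id  = rep(paper_of_sample, times = es_per_sample),
  sample_id = rep(seq_len(n_samples), times = es_per_sample)
)
dat$es_id <- seq_len(nrow(dat))
dat$vi <- runif(nrow(dat), 0.02, 0.10)

# Simulated estimates: paper-level and estimate-level heterogeneity plus sampling error
u_paper <- rnorm(n_papers, 0, 0.2)
dat$yi <- 0.3 + u_paper[dat$paper_id] +
  rnorm(nrow(dat), 0, 0.1) + rnorm(nrow(dat), 0, sqrt(dat$vi))

# Check that every sample belongs to exactly one paper (samples nested in papers)
nested <- all(tapply(dat$paper_id, dat$sample_id,
                     function(x) length(unique(x))) == 1)

# "Best guess" sampling covariance: correlated errors within samples only
V <- impute_covariance_matrix(vi = dat$vi, cluster = dat$sample_id, r = 0.7)

# Working model with random effects for papers and individual estimates
fit <- rma.mv(yi, V = V, random = ~ 1 | paper_id/es_id, data = dat)

# Cluster-robust (CR2) test, clustering at the highest level: papers
robust <- coef_test(fit, vcov = "CR2", cluster = dat$paper_id)
print(robust)
```

Relative to the call quoted below, only the `cluster` argument of `coef_test()` changes. The working covariance matrix can still be imputed at the sample level, since that is where the sampling errors are correlated.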

James

> On Mar 22, 2019, at 5:41 PM, Fabian Schellhaas <fabian.schellhaas using yale.edu> wrote:
> 
> Hi Wolfgang and mailing list,
> 
> I would like to follow up on point (3) from this old thread. Quick
> refresher of the data structure: We have (approximately) 650 effects from
> 400 treatment-control comparisons, which come from 325 independent samples,
> nested in 275 studies from 200 papers. Many of the samples include more
> than one treatment-control comparison and evaluate their effect on more
> than one outcome measure, resulting in correlated residuals clustered at
> the level of independent samples.
> 
> First, we model the hierarchical dependence of the true effects. LRTs
> indicate that model fit is improved significantly by adding random effects
> for treatment-control comparisons (relative to a single-level model) and
> further improved by adding random effects for papers (relative to the
> two-level model). Adding random effects for samples and studies did not
> improve on the two-level model. So in short, we include random effects for
> papers, treatment-control comparisons, and individual estimates, skipping
> studies and samples. Second, to deal with the nonindependent residuals, due
> to multiple comparisons and multiple outcome measures, we use
> cluster-robust variance estimation. In our data, residuals are clustered at
> the level of independent samples.
> 
> As such, we could fit a model as follows:
> vcv <- clubSandwich::impute_covariance_matrix(vi = data$vi,
>     cluster = data$sample_id, r = 0.7)
> m <- metafor::rma.mv(yi, V = vcv, random = ~ 1 | paper_id/comp_id/es_id,
>     data = data)
> clubSandwich::coef_test(m, cluster = data$sample_id, vcov = "CR2")
> 
> Are there any problems with computing robust standard errors at a level of
> clustering (here: samples) that does not correspond to the levels at which
> hierarchical dependence of the true effects is modeled (here: papers and
> treatment-control comparisons)? If so, what would be a better approach?
> 
> Many thanks!
> Fabian
> 
> 
> On Fri, Oct 5, 2018 at 11:37 AM Viechtbauer, Wolfgang (SP) <
> wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
> 
>> Hi Fabian,
>> 
>> (very clear description of the structure, thanks!)
>> 
>> 1) Your approach sounds sensible.
>> 
>> 2) If you are going to use cluster-robust inference methods in the end
>> anyway, then getting the var-cov matrix of the sampling errors 'exactly
>> right' is probably not crucial. It can be a huge pain constructing the
>> var-cov matrix, especially when dealing with complex data structures as you
>> describe. So, sticking to the "best guess" approach is probably defensible.
>> 
>> 3) It is difficult to give general advice, but it is certainly possible to
>> add random effects for samples, studies, and papers (plus random effects
>> for the individual estimates) here. One can probably skip a level if the
>> number of units at a particular level is not much higher than the number of
>> units at the next level (the two variance components are then hard to
>> distinguish). So, for example, 200 studies in 180 papers is quite similar,
>> so one could probably leave out the studies level and only add random
>> effects for papers (plus for samples and the individual estimates). You can
>> also run likelihood ratio tests to compare models to see if adding random
>> effects at the studies level actually improves the model fit significantly.
>> 
>> Best,
>> Wolfgang
>> 
>> -----Original Message-----
>> From: R-sig-meta-analysis [mailto:
>> r-sig-meta-analysis-bounces using r-project.org] On Behalf Of Fabian Schellhaas
>> Sent: Thursday, 27 September, 2018 23:55
>> To: r-sig-meta-analysis using r-project.org
>> Subject: [R-meta] Var-cov structure in multilevel/multivariate
>> meta-analysis
>> 
>> Dear all,
>> 
>> My meta-analytic database consists of 350+ effect size estimates, drawn
>> from 240+ samples, which in turn were drawn from 200+ studies, reported in
>> 180+ papers. Papers report results from 1-3 studies each, studies report
>> results from 1-2 samples each, and samples contribute 1-6 effect sizes
>> each. Multiple effects per sample are possible due to (a) multiple
>> comparisons, such that more than one treatment is compared to the same
>> control group, (b) multiple outcomes, such that more than one outcome is
>> measured within the same sample, or (c) both. We coded for a number of
>> potential moderators, which vary between samples, within samples, or both.
>> I included an example of the data below.
>> 
>> There are two main sources of non-independence: First, there is
>> hierarchical dependence of the true effects, insofar as effects nested in
>> the same sample (and possibly those nested in the same study and paper) are
>> correlated. Second, there is dependence arising from correlated sampling
>> errors when effect-size estimates are drawn from the same set of
>> respondents. This is the case whenever a sample contributes more than one
>> effect, i.e. when there are multiple treatments and/or multiple outcomes.
>> 
>> To model these data, I start by constructing a “best guess” of the var-cov
>> matrices following James Pustejovsky's approach (e.g.,
>> https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2017-August/000094.html),
>> treating samples in my database as independent clusters. Then, I use these
>> var-cov matrices to construct the multilevel/multivariate meta-analytic
>> model. To account for the misspecification of the var-cov structure, I
>> perform all coefficient and moderator tests using cluster-robust variance
>> estimation. This general approach has also been recommended on this mailing
>> list and allows me (I think) to use all available data, test all my
>> moderators, and estimate all parameters with an acceptable degree of
>> precision.
>> 
>> My questions:
>> 
>> 1. Is this approach advisable, given the nature of my data? Any problems I
>> missed?
>> 
>> 2. Most manuscripts don’t report the correlations between multiple
>> outcomes, thus preventing the precise calculation of covariances for this
>> type of dependent effect size. By contrast, it appears to be fairly
>> straightforward to calculate the covariances between multiple-treatment
>> effects (i.e., those sharing a control group), as per Gleser and Olkin
>> (2009). Given my data, is there a practical way to construct the var-cov
>> matrices using a combination of “best guesses” (when correlations cannot be
>> computed) and precise computations (when they can be computed via Gleser
>> and Olkin)? I should note that I’d be happy to just stick with the “best
>> guess” approach entirely, but as Wolfgang Viechtbauer pointed out (
>> https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2017-August/000131.html),
>> only a better approximation of the var-cov structure can improve precision
>> of the fixed-effects estimates. That's why I'm exploring this option.
>> 
>> 3. How would I best determine for which hierarchical levels to specify
>> random effects? I certainly expect the true effects within the same set of
>> respondents to be correlated, so would at least add a random effect for
>> sample. Beyond that (i.e., study, paper, and so forth) I’m not so sure.
>> 
>> Cheers,
>> 
>> Fabian
>> 
>> ### Database example:
>> 
>> Paper 1 contributes two studies (one containing just one sample, the other
>> containing two samples), evaluating the effect of treatment vs. control on
>> one outcome. Paper 2 contributes one study containing one sample,
>> evaluating the effect of two treatments (relative to the same control) on
>> two separate outcomes each. The first moderator varies between samples, the
>> second moderator varies both between and within samples.
>> 
>> paper  study  sample  comp  es  yi   vi   mod1  mod2
>> 1      1      1       1     1   0.x  0.x  A     A
>> 1      2      2       2     2   0.x  0.x  B     B
>> 1      2      3       3     3   0.x  0.x  A     B
>> 2      3      4       4     4   0.x  0.x  B     A
>> 2      3      4       4     5   0.x  0.x  B     C
>> 2      3      4       5     6   0.x  0.x  B     A
>> 2      3      4       5     7   0.x  0.x  B     C
>> 
>> ---
>> Fabian Schellhaas | Ph.D. Candidate | Department of Psychology | Yale
>> University
>> 
> 