[R-meta] Var-cov structure in multilevel/multivariate meta-analysis

Fabian Schellhaas fabian.schellhaas using yale.edu
Sat Mar 23 17:10:46 CET 2019


Hi James,

Thanks a lot for the quick reply -- this makes sense. I just checked and
clustering at a higher level instead (e.g., study) actually reduces the
variance component for papers so much that retaining it does not improve
model fit, so I suppose that random effect would no longer be needed? More
generally, is it a correct rule of thumb to specify clustering at the
highest/outermost level, and then model hierarchical dependence at lower
levels by adding random effects as needed?
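
To make sure I'm thinking about this correctly, under that rule of thumb the
model would look roughly as follows (a sketch, reusing the objects from my
earlier message below; V is still imputed within independent samples, since
that is where the sampling errors are correlated):

# best-guess V: sampling errors are correlated only within independent samples
vcv <- clubSandwich::impute_covariance_matrix(vi = data$vi,
  cluster = data$sample_id, r = 0.7)
# random effects for comparisons and estimates; the paper-level random
# effect is dropped here because retaining it no longer improves model fit
m <- metafor::rma.mv(yi, V = vcv, random = ~ 1 | comp_id/es_id, data = data)
# robust standard errors clustered at the highest level (papers)
clubSandwich::coef_test(m, cluster = data$paper_id, vcov = "CR2")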

Thanks!
Fabian


On Sat, Mar 23, 2019 at 11:41 AM James Pustejovsky <jepusto using gmail.com>
wrote:

> Fabian,
>
> That will not work. Cluster-robust standard errors require that the
> clusters are independent, and so clustering at the level of the sample will
> fail to capture dependence at higher levels (studies and papers). I think
> the appropriate approach here is to cluster at the highest level (papers).
>
> James
>
>
> > On Mar 22, 2019, at 5:41 PM, Fabian Schellhaas <fabian.schellhaas using yale.edu> wrote:
> >
> > Hi Wolfgang and mailing list,
> >
> > I would like to follow up on point (3) from this old thread. Quick
> > refresher of the data structure: We have (approximately) 650 effects from
> > 400 treatment-control comparisons, which come from 325 independent
> > samples, nested in 275 studies from 200 papers. Many of the samples
> > include more than one treatment-control comparison and evaluate their
> > effect on more than one outcome measure, resulting in correlated
> > residuals clustered at the level of independent samples.
> >
> > First, we model the hierarchical dependence of the true effects. LRTs
> > indicate that model fit is improved significantly by adding random
> > effects for treatment-control comparisons (relative to a single-level
> > model) and further improved by adding random effects for papers (relative
> > to the two-level model). Adding random effects for samples and studies
> > did not improve on the two-level model. So in short, we include random
> > effects for papers, treatment-control comparisons, and individual
> > estimates, skipping studies and samples. Second, to deal with the
> > nonindependent residuals, due to multiple comparisons and multiple
> > outcome measures, we use cluster-robust variance estimation. In our data,
> > residuals are clustered at the level of independent samples.
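> >
> > In code, the sequence of comparisons looked roughly like this (a sketch;
> > vcv is the imputed var-cov matrix defined below, and the model names are
> > just for illustration):
> >
> > m1 <- metafor::rma.mv(yi, V = vcv, random = ~ 1 | es_id, data = data)
> > m2 <- metafor::rma.mv(yi, V = vcv, random = ~ 1 | comp_id/es_id,
> >   data = data)
> > m3 <- metafor::rma.mv(yi, V = vcv,
> >   random = ~ 1 | paper_id/comp_id/es_id, data = data)
> > anova(m1, m2) # adding comparisons improves fit
> > anova(m2, m3) # adding papers improves fit further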
> >
> > As such, we could fit a model as follows:
> > vcv <- clubSandwich::impute_covariance_matrix(vi = data$vi,
> >   cluster = data$sample_id, r = 0.7)
> > m <- metafor::rma.mv(yi, V = vcv,
> >   random = ~ 1 | paper_id/comp_id/es_id, data = data)
> > clubSandwich::coef_test(m, cluster = data$sample_id, vcov = "CR2")
> >
> > Are there any problems with computing robust standard errors at a level
> > of clustering (here: samples) that does not correspond to the levels at
> > which hierarchical dependence of the true effects is modeled (here:
> > papers and treatment-control comparisons)? If so, what would be a better
> > approach?
> >
> > Many thanks!
> > Fabian
> >
> >
> > On Fri, Oct 5, 2018 at 11:37 AM Viechtbauer, Wolfgang (SP) <
> > wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
> >
> >> Hi Fabian,
> >>
> >> (very clear description of the structure, thanks!)
> >>
> >> 1) Your approach sounds sensible.
> >>
> >> 2) If you are going to use cluster-robust inference methods in the end
> >> anyway, then getting the var-cov matrix of the sampling errors 'exactly
> >> right' is probably not crucial. It can be a huge pain constructing the
> >> var-cov matrix, especially when dealing with complex data structures as
> >> you describe. So, sticking to the "best guess" approach is probably
> >> defensible.
> >>
> >> 3) It is difficult to give general advice, but it is certainly possible
> >> to add random effects for samples, studies, and papers (plus random
> >> effects for the individual estimates) here. One can probably skip a
> >> level if the number of units at a particular level is not much higher
> >> than the number of units at the next level (the two variance components
> >> are then hard to distinguish). So, for example, 200 studies in 180
> >> papers is quite similar, so one could probably leave out the studies
> >> level and only add random effects for papers (plus for samples and the
> >> individual estimates). You can also run likelihood ratio tests to
> >> compare models to see if adding random effects at the studies level
> >> actually improves the model fit significantly.
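> >>
> >> For example (a sketch, with metafor loaded and using hypothetical
> >> variable names):
> >>
> >> res1 <- rma.mv(yi, V, random = ~ 1 | paper/sample/es_id, data = dat)
> >> res2 <- rma.mv(yi, V, random = ~ 1 | paper/study/sample/es_id, data = dat)
> >> anova(res1, res2) # LRT: does the studies level improve the fit?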
> >>
> >> Best,
> >> Wolfgang
> >>
> >> -----Original Message-----
> >> From: R-sig-meta-analysis
> >> [mailto:r-sig-meta-analysis-bounces using r-project.org] On Behalf Of
> >> Fabian Schellhaas
> >> Sent: Thursday, 27 September, 2018 23:55
> >> To: r-sig-meta-analysis using r-project.org
> >> Subject: [R-meta] Var-cov structure in multilevel/multivariate
> >> meta-analysis
> >>
> >> Dear all,
> >>
> >> My meta-analytic database consists of 350+ effect size estimates, drawn
> >> from 240+ samples, which in turn were drawn from 200+ studies, reported
> >> in 180+ papers. Papers report results from 1-3 studies each, studies
> >> report results from 1-2 samples each, and samples contribute 1-6 effect
> >> sizes each. Multiple effects per sample are possible due to (a) multiple
> >> comparisons, such that more than one treatment is compared to the same
> >> control group, (b) multiple outcomes, such that more than one outcome is
> >> measured within the same sample, or (c) both. We coded for a number of
> >> potential moderators, which vary between samples, within samples, or
> >> both. I included an example of the data below.
> >>
> >> There are two main sources of non-independence: First, there is
> >> hierarchical dependence of the true effects, insofar as effects nested
> >> in the same sample (and possibly those nested in the same study and
> >> paper) are correlated. Second, there is dependence arising from
> >> correlated sampling errors when effect-size estimates are drawn from the
> >> same set of respondents. This is the case whenever a sample contributes
> >> more than one effect, i.e. when there are multiple treatments and/or
> >> multiple outcomes.
> >>
> >> To model these data, I start by constructing a “best guess” of the
> >> var-cov matrices following James Pustejovsky's approach (e.g.,
> >> https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2017-August/000094.html),
> >> treating samples in my database as independent clusters. Then, I use
> >> these var-cov matrices to construct the multilevel/multivariate
> >> meta-analytic model. To account for the misspecification of the var-cov
> >> structure, I perform all coefficient and moderator tests using
> >> cluster-robust variance estimation. This general approach has also been
> >> recommended on this mailing list and allows me (I think) to use all
> >> available data, test all my moderators, and estimate all parameters with
> >> an acceptable degree of precision.
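> >>
> >> In code, I imagine this looking roughly as follows (a sketch; dat, the
> >> id variables, and mod1 are placeholders for my actual data):
> >>
> >> vcv <- clubSandwich::impute_covariance_matrix(vi = dat$vi,
> >>   cluster = dat$sample_id, r = 0.7) # "best guess" at the correlation
> >> m <- metafor::rma.mv(yi, V = vcv, mods = ~ mod1,
> >>   random = ~ 1 | sample_id/es_id, data = dat) # levels TBD, see Q3 below
> >> clubSandwich::coef_test(m, cluster = dat$sample_id, vcov = "CR2")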
> >>
> >> My questions:
> >>
> >> 1. Is this approach advisable, given the nature of my data? Any
> >> problems I missed?
> >>
> >> 2. Most manuscripts don’t report the correlations between multiple
> >> outcomes, thus preventing the precise calculation of covariances for
> >> this type of dependent effect size. By contrast, it appears to be fairly
> >> straightforward to calculate the covariances between multiple-treatment
> >> effects (i.e., those sharing a control group), as per Gleser and Olkin
> >> (2009). Given my data, is there a practical way to construct the var-cov
> >> matrices using a combination of “best guesses” (when correlations cannot
> >> be computed) and precise computations (when they can be computed via
> >> Gleser and Olkin)? I should note that I’d be happy to just stick with
> >> the “best guess” approach entirely, but as Wolfgang Viechtbauer pointed
> >> out (
> >> https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2017-August/000131.html),
> >> only a better approximation of the var-cov structure can improve
> >> precision of the fixed-effects estimates. That's why I'm exploring this
> >> option.
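> >>
> >> Concretely, I'm imagining something along these lines (a sketch; the
> >> list indexing and cov_go are hypothetical, with cov_go standing in for
> >> a covariance computed exactly via Gleser and Olkin):
> >>
> >> # best-guess blocks, one matrix per independent sample
> >> V_list <- clubSandwich::impute_covariance_matrix(vi = dat$vi,
> >>   cluster = dat$sample_id, r = 0.7, return_list = TRUE)
> >> # overwrite entries where the exact covariance is computable, e.g. if
> >> # effects 1 and 3 of sample "4" share a control group
> >> V_list[["4"]][1, 3] <- V_list[["4"]][3, 1] <- cov_go
> >> # assemble the block-diagonal V (assumes data are sorted by sample)
> >> V <- metafor::bldiag(V_list)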
> >>
> >> 3. How would I best determine for which hierarchical levels to specify
> >> random effects? I certainly expect the true effects within the same set
> >> of respondents to be correlated, so would at least add a random effect
> >> for sample. Beyond that (i.e., study, paper, and so forth) I’m not so
> >> sure.
> >>
> >> Cheers,
> >>
> >> Fabian
> >>
> >> ### Database example:
> >>
> >> Paper 1 contributes two studies (one containing just one sample, the
> >> other containing two samples), each evaluating the effect of treatment
> >> vs. control on one outcome. Paper 2 contributes one study containing
> >> one sample, evaluating the effect of two treatments (relative to the
> >> same control) on two separate outcomes each. The first moderator varies
> >> between samples; the second moderator varies both between and within
> >> samples.
> >>
> >> paper  study  sample  comp  es  yi   vi   mod1  mod2
> >> 1      1      1       1     1   0.x  0.x  A     A
> >> 1      2      2       2     2   0.x  0.x  B     B
> >> 1      2      3       3     3   0.x  0.x  A     B
> >> 2      3      4       4     4   0.x  0.x  B     A
> >> 2      3      4       4     5   0.x  0.x  B     C
> >> 2      3      4       5     6   0.x  0.x  B     A
> >> 2      3      4       5     7   0.x  0.x  B     C
> >>
> >> ---
> >> Fabian Schellhaas | Ph.D. Candidate | Department of Psychology | Yale
> >> University
> >>