[R-meta] Var-cov structure in multilevel/multivariate meta-analysis

Fabian Schellhaas fabian.schellhaas using yale.edu
Sat Mar 23 17:39:05 CET 2019


Hi James,

I suppose the variance components change because we're now imputing the
variance-covariance matrix at a higher level of clustering. My
understanding has been that the vcov imputation needs to be done at the
same level of clustering that is later used to estimate the
cluster-robust standard errors, but I'm keen to hear your take. I
included more details below.

Initially, the model was (incorrectly) fitted as follows:
vcv <- clubSandwich::impute_covariance_matrix(vi = data$vi,
  cluster = data$sample_id, r = 0.7)
m <- metafor::rma.mv(yi, V = vcv, random = ~ 1 | paper_id/comp_id/es_id,
  data = data)

Variance components: sigma^2.1 = 0.0068, sigma^2.2 = 0.0087, sigma^2.3 = 0.0137
Comparing this model to a model with random effects only for comparisons
and estimates: LRT = 15.11, p < .001
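
For reference, this LRT refits the model without the paper-level random
effect and compares the nested fits -- a minimal sketch, reusing the
objects defined above:

m0 <- metafor::rma.mv(yi, V = vcv, random = ~ 1 | comp_id/es_id, data = data)
anova(m, m0)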

Now, fitting the model with the vcov matrix imputed at the level of papers:
vcv <- clubSandwich::impute_covariance_matrix(vi = data$vi,
  cluster = data$paper_id, r = 0.7)
m <- metafor::rma.mv(yi, V = vcv, random = ~ 1 | paper_id/comp_id/es_id,
  data = data)

Variance components: sigma^2.1 = 0.0012, sigma^2.2 = 0.0131, sigma^2.3 = 0.0142
Comparing this model to a model with random effects only for comparisons
and estimates: LRT = 0.59, p = .443
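
If that's right, the robust standard errors would then be computed with
clustering at the paper level, as you suggested -- e.g.:

clubSandwich::coef_test(m, cluster = data$paper_id, vcov = "CR2")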

Many thanks,
Fabian

---
Fabian M. H. Schellhaas | Ph.D. Candidate | Department of Psychology | Yale University


On Sat, Mar 23, 2019 at 12:18 PM James Pustejovsky <jepusto using gmail.com> wrote:

> Fabian,
>
> I don’t follow the new results you’ve just described. I meant clustering
> as in the clustering variable specified when computing robust standard
> errors. Switching this to a higher level should not affect your variance
> component estimates at all—only the SEs of the fixed effects will change.
> So I’m not sure what you’ve modified that leads to changes in the variance
> component estimates.
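>
> To be concrete: the robust standard errors are computed from the fitted
> model without refitting it, so for a given fit m the variance component
> estimates are identical across calls like the following (a sketch,
> reusing the object names from your code):
>
> clubSandwich::coef_test(m, cluster = data$sample_id, vcov = "CR2")
> clubSandwich::coef_test(m, cluster = data$paper_id, vcov = "CR2")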
>
> James
>
> On Mar 23, 2019, at 11:10 AM, Fabian Schellhaas <fabian.schellhaas using yale.edu> wrote:
>
> Hi James,
>
> Thanks a lot for the quick reply -- this makes sense. I just checked and
> clustering at a higher level instead (e.g., study) actually reduces the
> variance component for papers so much that retaining it does not improve
> model fit, so I suppose that random effect would no longer be needed? More
> generally, is it a correct rule of thumb to specify clustering at the
> highest/outermost level, and then model hierarchical dependence at lower
> levels by adding random effects as needed?
>
> Thanks!
> Fabian
>
>
> On Sat, Mar 23, 2019 at 11:41 AM James Pustejovsky <jepusto using gmail.com> wrote:
>
>> Fabian,
>>
>> That will not work. Cluster-robust standard errors require that the
>> clusters are independent, and so clustering at the level of the sample will
>> fail to capture dependence at higher levels (studies and papers). I think
>> the appropriate approach here is to cluster at the highest level (papers).
>>
>> James
>>
>>
>> > On Mar 22, 2019, at 5:41 PM, Fabian Schellhaas <fabian.schellhaas using yale.edu> wrote:
>> >
>> > Hi Wolfgang and mailing list,
>> >
>> > I would like to follow up on point (3) from this old thread. Quick
>> > refresher of the data structure: We have (approximately) 650 effects
>> > from 400 treatment-control comparisons, which come from 325 independent
>> > samples, nested in 275 studies from 200 papers. Many of the samples
>> > include more than one treatment-control comparison and evaluate their
>> > effect on more than one outcome measure, resulting in correlated
>> > residuals clustered at the level of independent samples.
>> >
>> > First, we model the hierarchical dependence of the true effects. LRTs
>> > indicate that model fit is improved significantly by adding random
>> > effects for treatment-control comparisons (relative to a single-level
>> > model) and further improved by adding random effects for papers
>> > (relative to the two-level model). Adding random effects for samples
>> > and studies did not improve on the two-level model. So in short, we
>> > include random effects for papers, treatment-control comparisons, and
>> > individual estimates, skipping studies and samples. Second, to deal
>> > with the nonindependent residuals due to multiple comparisons and
>> > multiple outcome measures, we use cluster-robust variance estimation.
>> > In our data, residuals are clustered at the level of independent
>> > samples.
>> >
>> > As such, we could fit a model as follows:
>> >
>> > vcv <- clubSandwich::impute_covariance_matrix(vi = data$vi,
>> >   cluster = data$sample_id, r = 0.7)
>> > m <- metafor::rma.mv(yi, V = vcv, random = ~ 1 | paper_id/comp_id/es_id,
>> >   data = data)
>> > clubSandwich::coef_test(m, cluster = data$sample_id, vcov = "CR2")
>> >
>> > Are there any problems with computing robust standard errors at a
>> > level of clustering (here: samples) that does not correspond to the
>> > levels at which the hierarchical dependence of the true effects is
>> > modeled (here: papers and treatment-control comparisons)? If so, what
>> > would be a better approach?
>> >
>> > Many thanks!
>> > Fabian
>> >
>> >
>> > On Fri, Oct 5, 2018 at 11:37 AM Viechtbauer, Wolfgang (SP) <wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>> >
>> >> Hi Fabian,
>> >>
>> >> (very clear description of the structure, thanks!)
>> >>
>> >> 1) Your approach sounds sensible.
>> >>
>> >> 2) If you are going to use cluster-robust inference methods in the
>> >> end anyway, then getting the var-cov matrix of the sampling errors
>> >> 'exactly right' is probably not crucial. It can be a huge pain
>> >> constructing the var-cov matrix, especially when dealing with complex
>> >> data structures as you describe. So, sticking to the "best guess"
>> >> approach is probably defensible.
>> >>
>> >> 3) It is difficult to give general advice, but it is certainly
>> >> possible to add random effects for samples, studies, and papers (plus
>> >> random effects for the individual estimates) here. One can probably
>> >> skip a level if the number of units at a particular level is not much
>> >> higher than the number of units at the next level (the two variance
>> >> components are then hard to distinguish). So, for example, 200 studies
>> >> in 180 papers is quite similar, so one could probably leave out the
>> >> studies level and only add random effects for papers (plus for samples
>> >> and the individual estimates). You can also run likelihood ratio tests
>> >> to compare models to see if adding random effects at the studies level
>> >> actually improves the model fit significantly.
>> >>
>> >> Best,
>> >> Wolfgang
>> >>
>> >> -----Original Message-----
>> >> From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org] On Behalf Of Fabian Schellhaas
>> >> Sent: Thursday, 27 September, 2018 23:55
>> >> To: r-sig-meta-analysis using r-project.org
>> >> Subject: [R-meta] Var-cov structure in multilevel/multivariate meta-analysis
>> >>
>> >> Dear all,
>> >>
>> >> My meta-analytic database consists of 350+ effect size estimates,
>> >> drawn from 240+ samples, which in turn were drawn from 200+ studies,
>> >> reported in 180+ papers. Papers report results from 1-3 studies each,
>> >> studies report results from 1-2 samples each, and samples contribute
>> >> 1-6 effect sizes each. Multiple effects per sample are possible due
>> >> to (a) multiple comparisons, such that more than one treatment is
>> >> compared to the same control group, (b) multiple outcomes, such that
>> >> more than one outcome is measured within the same sample, or (c)
>> >> both. We coded for a number of potential moderators, which vary
>> >> between samples, within samples, or both. I included an example of
>> >> the data below.
>> >>
>> >> There are two main sources of non-independence: First, there is
>> >> hierarchical dependence of the true effects, insofar as effects
>> >> nested in the same sample (and possibly those nested in the same
>> >> study and paper) are correlated. Second, there is dependence arising
>> >> from correlated sampling errors when effect-size estimates are drawn
>> >> from the same set of respondents. This is the case whenever a sample
>> >> contributes more than one effect, i.e., when there are multiple
>> >> treatments and/or multiple outcomes.
>> >>
>> >> To model these data, I start by constructing a “best guess” of the
>> >> var-cov matrices following James Pustejovsky's approach (e.g.,
>> >> https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2017-August/000094.html),
>> >> treating samples in my database as independent clusters. Then, I use
>> >> these var-cov matrices to construct the multilevel/multivariate
>> >> meta-analytic model. To account for the misspecification of the
>> >> var-cov structure, I perform all coefficient and moderator tests
>> >> using cluster-robust variance estimation. This general approach has
>> >> also been recommended on this mailing list and allows me (I think) to
>> >> use all available data, test all my moderators, and estimate all
>> >> parameters with an acceptable degree of precision.
>> >>
>> >> My questions:
>> >>
>> >> 1. Is this approach advisable, given the nature of my data? Any
>> >> problems I missed?
>> >>
>> >> 2. Most manuscripts don’t report the correlations between multiple
>> >> outcomes, thus preventing the precise calculation of covariances for
>> >> this type of dependent effect size. By contrast, it appears to be
>> >> fairly straightforward to calculate the covariances between
>> >> multiple-treatment effects (i.e., those sharing a control group), as
>> >> per Gleser and Olkin (2009). Given my data, is there a practical way
>> >> to construct the var-cov matrices using a combination of “best
>> >> guesses” (when correlations cannot be computed) and precise
>> >> computations (when they can be computed via Gleser and Olkin)? I
>> >> should note that I’d be happy to just stick with the “best guess”
>> >> approach entirely, but as Wolfgang Viechtbauer pointed out
>> >> (https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2017-August/000131.html),
>> >> only a better approximation of the var-cov structure can improve
>> >> precision of the fixed-effects estimates. That's why I'm exploring
>> >> this option.
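>> >>
>> >> A hypothetical sketch of what I have in mind (V4 is an invented
>> >> object standing in for an exact Gleser-Olkin block, and the data are
>> >> assumed to be sorted by sample):
>> >>
>> >> V_list <- clubSandwich::impute_covariance_matrix(vi = data$vi,
>> >>   cluster = data$sample_id, r = 0.7, return_list = TRUE)
>> >> V_list[[4]] <- V4  # replace one sample's block with the exact version
>> >> V <- do.call(metafor::bldiag, V_list)  # reassemble the block-diagonal V
>> >>
>> >> The resulting V would then be passed to metafor::rma.mv().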
>> >>
>> >> 3. How would I best determine for which hierarchical levels to
>> >> specify random effects? I certainly expect the true effects within
>> >> the same set of respondents to be correlated, so would at least add a
>> >> random effect for sample. Beyond that (i.e., study, paper, and so
>> >> forth) I'm not so sure.
>> >>
>> >> Cheers,
>> >>
>> >> Fabian
>> >>
>> >> ### Database example:
>> >>
>> >> Paper 1 contributes two studies (one containing just one sample, the
>> >> other containing two samples), each evaluating the effect of treatment
>> >> vs. control on one outcome. Paper 2 contributes one study containing
>> >> one sample, evaluating the effect of two treatments (relative to the
>> >> same control) on two separate outcomes each. The first moderator
>> >> varies between samples; the second moderator varies both between and
>> >> within samples.
>> >>
>> >> paper  study  sample  comp  es  yi   vi   mod1  mod2
>> >> 1      1      1       1     1   0.x  0.x  A     A
>> >> 1      2      2       2     2   0.x  0.x  B     B
>> >> 1      2      3       3     3   0.x  0.x  A     B
>> >> 2      3      4       4     4   0.x  0.x  B     A
>> >> 2      3      4       4     5   0.x  0.x  B     C
>> >> 2      3      4       5     6   0.x  0.x  B     A
>> >> 2      3      4       5     7   0.x  0.x  B     C
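>> >>
>> >> In R, this example corresponds to a data frame along these lines (a
>> >> sketch; yi and vi are left as NA placeholders):
>> >>
>> >> dat <- data.frame(
>> >>   paper  = c(1, 1, 1, 2, 2, 2, 2),
>> >>   study  = c(1, 2, 2, 3, 3, 3, 3),
>> >>   sample = c(1, 2, 3, 4, 4, 4, 4),
>> >>   comp   = c(1, 2, 3, 4, 4, 5, 5),
>> >>   es     = 1:7,
>> >>   yi     = NA_real_,
>> >>   vi     = NA_real_,
>> >>   mod1   = c("A", "B", "A", "B", "B", "B", "B"),
>> >>   mod2   = c("A", "B", "B", "A", "C", "A", "C")
>> >> )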
>> >>
>> >> ---
>> >> Fabian Schellhaas | Ph.D. Candidate | Department of Psychology | Yale University
>> >>
>> >
>>
>



