[R-meta] Different estimates for multilevel meta-analysis when using rma.mv with mods or subset

Janis Zickfeld jhz|ck|e|d @end|ng |rom gm@||@com
Wed Feb 8 21:39:50 CET 2023


Hi all,

my question has already been raised before (e.g.,
https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2018-April/000774.html),
but the answers did not help me understand the main problem or resolve
the differences in the output, so I'm posting it again.

We are conducting a meta-analysis with effect sizes ("esid") nested
within studies or samples ("study"), which are in turn nested within
publications ("id"). There are 344 effect sizes in total across 110
publications and 160 study-publication combinations. 48.75% of the
studies include more than one effect size, and of these effects 26.82%
are dependent because they either use the same sample or the same
control treatment to calculate the effect. We therefore constructed an
approximate variance-covariance matrix to account for this. Some of
the moderators use only a subset of the data because they have missing
values (as in the example below).
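
For context, this is roughly how we built the approximate V matrix. A
minimal sketch with toy data and an assumed within-sample correlation
of rho = 0.6 (the data frame and rho value here are made up for
illustration, not taken from our dataset), using metafor::vcalc() to
treat effects from the same id/study combination as correlated:

```r
library(metafor)

dat <- data.frame(
  id    = c(1, 1, 1, 2, 2),          # publication
  study = c(1, 1, 2, 1, 1),          # sample within publication
  esid  = c(1, 2, 1, 1, 2),          # effect size within sample
  yi    = c(0.10, 0.20, -0.05, 0.30, 0.25),
  vi    = c(0.04, 0.05, 0.03, 0.06, 0.04)
)
dat$sample <- paste(dat$id, dat$study, sep = ".")  # unique sample id

# within a sample: cov(e_i, e_j) = rho * sqrt(vi_i * vi_j);
# across samples the covariance is 0 (block-diagonal V)
V <- vcalc(vi, cluster = sample, obs = esid, rho = 0.6, data = dat)
```

The resulting V can then be passed to rma.mv() in place of the vector
of sampling variances.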

My main problem is that I get different estimates when running the
rma.mv model with a categorical moderator via 'mods' than when
subsetting the data via the 'subset' argument. I have seen this
discussed here
(http://www.metafor-project.org/doku.php/tips:comp_two_independent_estimates)
and the main difference seems to be that 'mods' assumes the same
residual heterogeneity across groups, whereas 'subset' allows for
different values of tau^2. However, that example does not cover a
multilevel meta-analysis.
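
If the common-tau^2 assumption is the issue, I understand metafor can
also estimate group-specific heterogeneity within a single moderator
model via struct = "DIAG". A minimal sketch on simulated two-level
data (only for illustration -- this is not our three-level structure,
and all names and values are made up):

```r
library(metafor)
set.seed(1)

# 20 clusters, 2 estimates each; clusters belong to group "a" or "b"
dat <- data.frame(
  id  = rep(1:20, each = 2),
  grp = factor(rep(c("a", "b"), each = 20))
)
dat$yi <- ifelse(dat$grp == "a", -0.15, 0.20) + rnorm(nrow(dat), 0, 0.15)
dat$vi <- 0.04

# struct = "DIAG" estimates a separate tau^2 for each level of grp,
# while still fitting both group means in one model
res <- rma.mv(yi, vi, mods = ~ grp - 1,
              random = ~ grp | id, struct = "DIAG",
              data = dat, method = "REML")
```

But I am not sure how this generalizes to the id/study/esid structure
above.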

When running 'mods' using the following syntax:

rma.mv(yi, V, random = ~ 1 | id/study/esid, data = cc_d,
       method = "REML", mods = ~ factor(target_commitment) - 1)

I get:

Multivariate Meta-Analysis Model (k = 302; method: REML)

Variance Components:

            estim    sqrt  nlvls  fixed        factor
sigma^2.1  0.0346  0.1861     96     no            id
sigma^2.2  0.0000  0.0000    138     no      id/study
sigma^2.3  0.0356  0.1887    302     no  id/study/esid

Test for Residual Heterogeneity:
QE(df = 300) = 2213.7041, p-val < .0001

Test of Moderators (coefficients 1:2):
QM(df = 2) = 54.1962, p-val < .0001

Model Results:

                              estimate      se     zval    pval    ci.lb    ci.ub
factor(target_commitment)1     -0.1551  0.0315  -4.9258  <.0001  -0.2169  -0.0934  ***
factor(target_commitment)5      0.1878  0.0420   4.4770  <.0001   0.1056   0.2701  ***


However, running:

rma.mv(yi, V, random = ~ 1 | id/study/esid, data = cc_d,
       method = "REML", subset = target_commitment == 1)

I obtain:

Multivariate Meta-Analysis Model (k = 215; method: REML)

Variance Components:

            estim    sqrt  nlvls  fixed        factor
sigma^2.1  0.0401  0.2002     67     no            id
sigma^2.2  0.0000  0.0000     93     no      id/study
sigma^2.3  0.0459  0.2142    215     no  id/study/esid

Test for Heterogeneity:
Q(df = 214) = 1876.2588, p-val < .0001

Model Results:

                   estimate      se     tval   df    pval    ci.lb    ci.ub
targetcommitment1   -0.1747  0.0352  -4.9607  214  <.0001  -0.2441  -0.1053  ***


Multivariate Meta-Analysis Model (k = 87; method: REML)

Variance Components:

            estim    sqrt  nlvls  fixed        factor
sigma^2.1  0.0294  0.1713     33     no            id
sigma^2.2  0.0000  0.0000     54     no      id/study
sigma^2.3  0.0116  0.1077     87     no  id/study/esid

Test for Heterogeneity:
Q(df = 86) = 302.2812, p-val < .0001

Model Results:

                   estimate      se    tval  df    pval   ci.lb   ci.ub
targetcommitment5    0.2407  0.0397  6.0606  86  <.0001  0.1618  0.3197  ***


So there is a difference between -.155 and -.175, as well as between
.188 and .241. These particular differences are modest, but for other
moderators the differences are considerably larger.
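
For reference, the two subset estimates can be compared with a
Wald-type test, as described on the metafor tips page linked above; a
base-R sketch plugging in the numbers from the output:

```r
# subset estimates and standard errors from the two models above
est1 <- -0.1747; se1 <- 0.0352   # target_commitment == 1
est2 <-  0.2407; se2 <- 0.0397   # target_commitment == 5

# Wald-type test for the difference between two independent estimates
zval <- (est1 - est2) / sqrt(se1^2 + se2^2)
pval <- 2 * pnorm(abs(zval), lower.tail = FALSE)
round(c(zval = zval, pval = pval), 4)
```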

I tried to follow the suggestions in one of the previous threads
(https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2018-April/000774.html),
but this did not reduce the difference for me.

My main question is which of the two approaches is preferable (and
whether there is any empirical or statistical basis for the choice).
Is this mostly a matter of preference, or is one approach superior in
the present case?

I could also post a link to the data if that would be helpful for
reproducing the findings.

I apologize for double posting this issue and hope that someone can
clarify which option I should choose.

Best wishes,
Janis



More information about the R-sig-meta-analysis mailing list