[R-meta] Different estimates for multilevel meta-analysis when using rma.mv with mods or subset

James Pustejovsky jepusto at gmail.com
Fri Feb 10 03:14:29 CET 2023


Hi Janis,

There are potentially two things going on here. First, when estimating
separate models within each subgroup, you allow the heterogeneity (variance
components) to differ by subgroup. This in turn leads to a different
weighting of the individual effect size estimates in the calculation of the
overall average effect sizes than the weighting used in the moderator
analysis. That difference in weighting might be enough to account for the
swings in the average ES for each category. To determine which model is
more appropriate, you could use fit statistics or a likelihood ratio test.
To get such information, you'll need to find a way to express the subgroup
analyses in terms of a single model. You can do this as follows:

1. Make a factor variable from targetcommitment so that you don't have to
repeatedly calculate it:
cc_d$tc_fac <- factor(cc_d$targetcommitment)

2. If you created the V matrix using metafor::vcalc(), set the subgroup
argument to the moderator variable:
Vsub <- vcalc(..., subgroup = cc_d$tc_fac, ...)

3. Fit a model using this new V matrix, specifying a random-effects
structure with independent effects and separate variance components in
each subgroup:
rma.mv(yi, V = Vsub,
       mods = ~ tc_fac - 1,
       random = list(~ tc_fac | id, ~ tc_fac | id:esid),
       struct = c("DIAG", "DIAG"),
       data = cc_d, method = "REML")
Note that I've omitted the middle level of random effects because there
doesn't seem to be any variance there after accounting for the id and esid
levels. The results of (3) should be identical (or nearly so) to the
results from the subgroup analysis. But now they're embedded within one
model, so you can get fit statistics or conduct a likelihood ratio test to
compare this model against the model that assumes homogeneous variance
components across subgroups.
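
For concreteness, here is a minimal sketch of steps 2 and 3 plus the
comparison. The column names vi and esid and the value rho = 0.6 are
placeholders for whatever you actually used when building your V matrix:

library(metafor)

## Step 2 (hypothetical column names): build a V matrix that treats
## effect sizes in different subgroups as independent; rho is the
## assumed correlation among dependent estimates within a cluster.
Vsub <- vcalc(vi, cluster = id, obs = esid, subgroup = tc_fac,
              rho = 0.6, data = cc_d)

## Step 3: separate variance components for each level of tc_fac
fit_het <- rma.mv(yi, V = Vsub, mods = ~ tc_fac - 1,
                  random = list(~ tc_fac | id, ~ tc_fac | id:esid),
                  struct = c("DIAG", "DIAG"),
                  data = cc_d, method = "REML")

## Constrained model: homogeneous variance components across subgroups
fit_hom <- rma.mv(yi, V = Vsub, mods = ~ tc_fac - 1,
                  random = list(~ tc_fac | id, ~ tc_fac | id:esid),
                  struct = c("ID", "ID"),
                  data = cc_d, method = "REML")

## Fit statistics and a likelihood ratio test (both models have the same
## fixed effects, so the REML-based comparison is appropriate)
fitstats(fit_hom)
fitstats(fit_het)
anova(fit_hom, fit_het)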

The second thing that might be going on is that your data seem to include
a few publications that have both levels of the targetcommitment variable.
Based on the reported counts from your output, it looks like there might be
four publications (reporting a total of nine studies) that have both
levels. Because of this structure, the first model you use for the
moderator analysis will calculate the average effect size estimate for each
level of targetcommitment based in part on the effect size estimates *for
the other category* from the four publications / nine studies that include
both levels. This has the effect of moving the averages toward each other:
-.175 and .241 get pulled inward to -.155 and .188.
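
As a quick hypothetical check (assuming cc_d has columns id and
targetcommitment coded 1/5), you can tabulate which publications
contribute effect sizes at both levels:

## Cross-tabulate publications by level of targetcommitment
tab <- table(cc_d$id, cc_d$targetcommitment)

## Publications with at least one effect size at each level
both_levels <- rownames(tab)[tab[, "1"] > 0 & tab[, "5"] > 0]
length(both_levels)  # should be 4 if the counts above are right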

If you're particularly interested in figuring out the difference between
levels of targetcommitment, the question is then whether this "pulling in"
is a reasonable thing to do. You could consider this on a conceptual level:
is it reasonable to put special emphasis on the results from these four
publications in estimating the difference between levels of
targetcommitment? If there's not a clear conceptual rationale, then you
might consider fitting a slightly different model, which includes the
study-mean variable and the study-mean-centered variable as separate terms.
For targetcommitment, this would mean calculating the average of the dummy
variable (targetcommitment == 5) for each study (this gives you the first
predictor, call it tc_study) and then subtracting that average from the
original dummy variable (this gives you the second predictor, call it
tc_within). The model should then use the moderators:
~ 1 + tc_study + tc_within
Tanner-Smith and Tipton (2014; https://doi.org/10.1002/jrsm.1091) recommend
this as a generic strategy for meta-regression of dependent effect sizes.
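
A minimal sketch of this decomposition, assuming cc_d contains id, study,
and targetcommitment coded 1/5 (the helper column names are placeholders):

## Dummy variable for the higher level of targetcommitment
cc_d$tc_dummy <- as.numeric(cc_d$targetcommitment == 5)

## Study-level mean of the dummy (the between-study predictor) ...
cc_d$tc_study <- ave(cc_d$tc_dummy, cc_d$id, cc_d$study)

## ... and its study-mean-centered version (the within-study predictor)
cc_d$tc_within <- cc_d$tc_dummy - cc_d$tc_study

rma.mv(yi, V,
       mods = ~ 1 + tc_study + tc_within,
       random = ~ 1 | id/study/esid,
       data = cc_d, method = "REML")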

James


On Wed, Feb 8, 2023 at 2:40 PM Janis Zickfeld via R-sig-meta-analysis
<r-sig-meta-analysis at r-project.org> wrote:

> Hi all,
>
> my question has already been raised before (e.g.,
> https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2018-April/000774.html
> ),
> but the answers did not help me understand the main problem or
> resolve the differences in the output, so I'm posting it again.
>
> We are conducting a meta-analysis and have effect sizes ("esid")
> nested within studies or samples ("study") nested within publications
> ("id"). There are 344 effect sizes in total across 110 publications
> and 160 study-publication combinations. 48.75% of studies include
> more than one effect size, and of these, 26.82% are dependent effects
> because they either use the same sample or the same control treatment
> to calculate the effect. Therefore, we have constructed
> an approximate variance-covariance matrix to account for this. Some of
> the moderators only use a subset of this data, as they have missing
> values (as in the example below).
>
> My main problem is that I get different estimates when running the
> rma.mv model with factorial moderators using 'mods' versus subsetting
> via the 'subset' argument. I have seen this discussed here
> (
> http://www.metafor-project.org/doku.php/tips:comp_two_independent_estimates
> )
> and it seems that the main difference is that 'mods' uses the same
> residual heterogeneity, whereas 'subset' allows for different levels
> of tau^2. However, this example does not discuss a multilevel
> meta-analysis.
>
> When running 'mods' using the following syntax:
>
> rma.mv(yi, V, random = ~ 1 | id/study/esid, data = cc_d, method =
> "REML", mods = ~factor(target_commitment) - 1)
>
> I get:
>
> Multivariate Meta-Analysis Model (k = 302; method: REML)
>
> Variance Components:
>
>             estim    sqrt  nlvls  fixed        factor
> sigma^2.1  0.0346  0.1861     96     no            id
> sigma^2.2  0.0000  0.0000    138     no      id/study
> sigma^2.3  0.0356  0.1887    302     no  id/study/esid
>
> Test for Residual Heterogeneity:
> QE(df = 300) = 2213.7041, p-val < .0001
>
> Test of Moderators (coefficients 1:2):
> QM(df = 2) = 54.1962, p-val < .0001
>
> Model Results:
>
>                              estimate      se     zval    pval    ci.lb    ci.ub
> factor(target_commitment)1    -0.1551  0.0315  -4.9258  <.0001  -0.2169  -0.0934  ***
> factor(target_commitment)5     0.1878  0.0420   4.4770  <.0001   0.1056   0.2701  ***
>
>
> However, running:
>
> rma.mv(yi, V, random = ~ 1 | id/study/esid, data = cc_d, method =
> "REML", subset = target_commitment == 1)
>
> I obtain:
>
> Multivariate Meta-Analysis Model (k = 215; method: REML)
>
> Variance Components:
>
>             estim    sqrt  nlvls  fixed        factor
> sigma^2.1  0.0401  0.2002     67     no            id
> sigma^2.2  0.0000  0.0000     93     no      id/study
> sigma^2.3  0.0459  0.2142    215     no  id/study/esid
>
> Test for Heterogeneity:
> Q(df = 214) = 1876.2588, p-val < .0001
>
> Model Results:
>
>                    estimate      se     tval   df    pval    ci.lb    ci.ub
> targetcommitment1   -0.1747  0.0352  -4.9607  214  <.0001  -0.2441  -0.1053  ***
>
>
> Multivariate Meta-Analysis Model (k = 87; method: REML)
>
> Variance Components:
>
>             estim    sqrt  nlvls  fixed        factor
> sigma^2.1  0.0294  0.1713     33     no            id
> sigma^2.2  0.0000  0.0000     54     no      id/study
> sigma^2.3  0.0116  0.1077     87     no  id/study/esid
>
> Test for Heterogeneity:
> Q(df = 86) = 302.2812, p-val < .0001
>
> Model Results:
>
>                    estimate      se    tval  df    pval   ci.lb   ci.ub
> targetcommitment5     0.2407  0.0397  6.0606  86  <.0001  0.1618  0.3197  ***
>
>
> So there is a difference between -.155 and -.175, as well as between
> .188 and .241. These are not the biggest differences, but I have other
> moderators for which the differences are larger.
>
> I tried to follow the suggestions of one of the previous questions
> (https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2018-April/000774.html
> ),
> but this didn't reduce the difference for me.
>
> My main question is which of the two approaches is preferable (and
> whether there is any empirical basis for choosing)? Is it a matter of
> preference here, or is one approach superior in the present case?
>
> I could also post a link to the data if that would be helpful for
> reproducing the findings.
>
> I apologize for double posting this issue and hope that maybe someone
> can clarify which option I should choose.
>
> Best wishes,
> Janis
>



