[R-meta] Different estimates for multilevel meta-analysis when using rma.mv with mods or subset
Janis Zickfeld
jhz|ck|e|d @end|ng |rom gm@||@com
Mon Feb 13 08:53:55 CET 2023
Hi James,
thank you so much for the detailed response. This was really helpful
and helped me to get a better grasp of what is going on in the data.
I conducted the LRTs and model fits are quite similar for all models I ran.
Best wishes,
Janis
On Fri, Feb 10, 2023 at 3:14 AM James Pustejovsky <jepusto using gmail.com> wrote:
>
> Hi Janis,
>
> There are potentially two things going on here. First, when estimating separate models within each subgroup, you allow the heterogeneity (variance components) to differ by subgroup. This in turn leads to a different weighting of the individual effect size estimates in the calculation of the overall average effect sizes than the weighting used in the moderator analysis. That difference in weighting might be enough to account for the swings in the average ES for each category. To determine which model is more appropriate, you could use fit statistics or a likelihood ratio test. To get such information, you'll need to find a way to express the subgroup analyses in terms of a single model. You can do this as follows:
>
> 1. Make a factor variable from targetcommitment so that you don't have to repeatedly calculate it:
> cc_d$tc_fac <- factor(cc_d$targetcommitment)
>
> 2. If you created the V matrix using metafor::vcalc(), set the subgroup argument to the moderator variable:
> Vsub <- vcalc(..., subgroup = cc_d$tc_fac, ...)
>
> 3. Fit a model using this new V matrix, specifying a random effects structure that allows independent effects in each subgroup:
> # struct = "DIAG" estimates a separate variance component for each level of tc_fac
> rma.mv(yi, V = Vsub,
>        mods = ~ tc_fac - 1,
>        random = list(~ tc_fac | id, ~ tc_fac | id:esid), struct = c("DIAG","DIAG"),
>        data = cc_d, method = "REML")
> Note that I've omitted the middle level of random effects because there doesn't seem to be any variance there after accounting for the id and esid levels. The results of (3) should be identical (or nearly so) to the results from the subgroup analysis. But now they're embedded within one model, so you can get fit statistics or conduct a likelihood ratio test to compare this model against the model that assumes homogeneous variance components across subgroups.
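
A minimal sketch of the likelihood ratio test described above, run on simulated data rather than cc_d. The data frame, the true subgroup means, and the heterogeneity values are all made up for illustration, and the model is simplified to a single publication-level random effect. It relies on metafor's documented struct options, where "ID" constrains all levels of the inner factor to one variance and "DIAG" estimates one variance per level.

```r
library(metafor)

set.seed(42)

# Hypothetical data: 40 publications with 3 effect sizes each;
# publications 1-20 fall in subgroup "1", publications 21-40 in subgroup "5".
k_pub <- 40
k_es  <- 3
dat <- data.frame(id = rep(seq_len(k_pub), each = k_es))
dat$tc_fac <- factor(ifelse(dat$id <= 20, "1", "5"))
dat$vi <- runif(nrow(dat), 0.01, 0.05)

# Between-publication SD differs by subgroup (0.10 vs 0.40) by construction
tau <- ifelse(dat$tc_fac == "1", 0.10, 0.40)
u   <- rnorm(k_pub)[dat$id] * tau
dat$yi <- ifelse(dat$tc_fac == "1", -0.2, 0.2) + u +
  rnorm(nrow(dat), sd = sqrt(dat$vi))

# Homogeneous model: one variance component shared by both subgroups
fit_hom <- rma.mv(yi, vi, mods = ~ tc_fac - 1,
                  random = ~ tc_fac | id, struct = "ID",
                  data = dat, method = "REML")

# Heterogeneous model: a separate variance component per subgroup
fit_het <- rma.mv(yi, vi, mods = ~ tc_fac - 1,
                  random = ~ tc_fac | id, struct = "DIAG",
                  data = dat, method = "REML")

# Both models share the same fixed effects, so the REML fits are comparable
lrt <- anova(fit_hom, fit_het)
print(lrt)
```

With real data the same comparison would use the full random-effects structure and the V matrix from the analysis; the sketch only shows the shape of the comparison.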
>
> The second thing that might be going on is that your data seem to include a few publications that have both levels of the targetcommitment variable. Based on the reported counts from your output, it looks like there might be four publications (reporting a total of nine studies) that have both levels. Because of this structure, the first model you use for moderator analysis will calculate average effect size estimates for each level of targetcommitment based in part on the effect size estimates *for the other category* from the four publications / nine studies that include both levels. This has the effect of moving the averages towards each other: -.175 and .241 get pulled inward to -.155 and .188.
>
> If you're particularly interested in figuring out the difference between levels of targetcommitment, the question is then whether this "pulling in" is a reasonable thing to do. You could consider this on a conceptual level: is it reasonable to put special emphasis on the results from these four publications in estimating the difference between levels of targetcommitment? If there's not a clear conceptual rationale, then you might consider fitting a slightly different model, which includes the study-mean variable and the study-mean-centered variable as separate terms. For targetcommitment, this would mean calculating the average of the dummy variable (targetcommitment == 5) for each study (this gives you the first predictor, call it tc_study) and then subtracting the average from the original dummy variable (this gives you the second predictor, call it tc_within). The model should then use the moderators:
> ~ 1 + tc_study + tc_within
> Tanner-Smith and Tipton (2014; https://doi.org/10.1002/jrsm.1091) recommend this as a generic strategy for meta-regression of dependent effect sizes.
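
A minimal base-R sketch of how the two predictors could be computed. The toy data frame cc_d_demo and its values are hypothetical. Using level 5 as the indicator level is an assumption based on the levels 1 and 5 shown in the output of the original post, and the clustering variable (here study) could equally be the publication id.

```r
# Hypothetical toy data: three studies, some reporting both levels
cc_d_demo <- data.frame(
  study            = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  targetcommitment = c(1, 5, 5, 1, 1, 5, 1, 5, 5)
)

# Dummy variable: 1 if the effect size belongs to level 5, else 0
tc_dummy <- as.numeric(cc_d_demo$targetcommitment == 5)

# Study-mean predictor: ave() returns the per-study mean for every row
cc_d_demo$tc_study <- ave(tc_dummy, cc_d_demo$study)

# Within-study predictor: deviation of each row from its study mean
cc_d_demo$tc_within <- tc_dummy - cc_d_demo$tc_study

print(cc_d_demo)
```

Studies that report only one level get tc_within = 0 on every row, so only studies reporting both levels inform the tc_within coefficient.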
>
> James
>
>
> On Wed, Feb 8, 2023 at 2:40 PM Janis Zickfeld via R-sig-meta-analysis <r-sig-meta-analysis using r-project.org> wrote:
>>
>> Hi all,
>>
>> my question has already been raised before (e.g.,
>> https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2018-April/000774.html),
>> but the answers did not help me understand the main problem or
>> resolve the differences in the output, which is why I'm posting it again.
>>
>> We are conducting a meta-analysis and have effect sizes ("esid")
>> nested within studies or samples ("study") nested within publications
>> ("id"). There are 344 effect sizes in total across 110 publications
>> and 160 study-publication combinations. Of the studies, 48.75%
>> include more than one effect size, and of these, 26.82% are dependent
>> effects because they use either the same sample or the same control
>> treatment to calculate the effect. Therefore, we have constructed
>> an approximate variance-covariance matrix to account for this. Some of
>> the moderators only use a subset of this data, as they have missing
>> values (as in the example below).
>>
>> My main problem is now that I get different estimates when running the
>> rma.mv model with categorical moderators via 'mods' than when subsetting
>> the data via the 'subset' argument. I have seen this discussed here
>> (http://www.metafor-project.org/doku.php/tips:comp_two_independent_estimates)
>> and it seems that the main difference is that 'mods' uses the same
>> residual heterogeneity, whereas 'subset' allows for different levels
>> of tau^2. However, this example does not discuss a multilevel
>> meta-analysis.
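
A small base-R illustration of why a shared versus subgroup-specific tau^2 changes the subgroup averages. This is a simplified single-level analogue, not the multilevel model used in this thread, and all numbers (effect sizes, variances, tau^2 values) are made up. The helper re_mean() is a hypothetical name for the usual inverse-variance weighted mean.

```r
# Hypothetical effect sizes and sampling variances for two subgroups
yi  <- c(-0.30, -0.10, -0.25, 0.40, 0.05, 0.30)
vi  <- c( 0.01,  0.09,  0.02, 0.01, 0.10, 0.02)
grp <- c("a", "a", "a", "b", "b", "b")

# Random-effects weighted mean for a given between-study variance tau2
re_mean <- function(y, v, tau2) {
  w <- 1 / (v + tau2)      # inverse-variance weights
  sum(w * y) / sum(w)
}

# One tau2 shared by both subgroups (analogous to the 'mods' approach)
tau2_common <- 0.03
mean_common <- sapply(split(seq_along(yi), grp),
                      function(i) re_mean(yi[i], vi[i], tau2_common))

# A separate tau2 per subgroup (analogous to the 'subset' approach)
tau2_by_grp <- c(a = 0.005, b = 0.08)
mean_separate <- sapply(names(tau2_by_grp), function(g) {
  i <- grp == g
  re_mean(yi[i], vi[i], tau2_by_grp[[g]])
})

print(rbind(mean_common, mean_separate))
```

A larger tau^2 makes the weights more equal across estimates, so the subgroup average moves toward the unweighted mean; a smaller tau^2 lets the most precise estimates dominate.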
>>
>> When running 'mods' using the following syntax:
>>
>> rma.mv(yi, V, random = ~ 1 | id/study/esid, data = cc_d,
>>        method = "REML", mods = ~ factor(target_commitment) - 1)
>>
>> I get:
>>
>> Multivariate Meta-Analysis Model (k = 302; method: REML)
>>
>> Variance Components:
>>
>>             estim    sqrt  nlvls  fixed         factor
>> sigma^2.1  0.0346  0.1861     96     no             id
>> sigma^2.2  0.0000  0.0000    138     no       id/study
>> sigma^2.3  0.0356  0.1887    302     no  id/study/esid
>>
>> Test for Residual Heterogeneity:
>> QE(df = 300) = 2213.7041, p-val < .0001
>>
>> Test of Moderators (coefficients 1:2):
>> QM(df = 2) = 54.1962, p-val < .0001
>>
>> Model Results:
>>
>>                             estimate      se     zval    pval    ci.lb    ci.ub
>> factor(target_commitment)1   -0.1551  0.0315  -4.9258  <.0001  -0.2169  -0.0934  ***
>> factor(target_commitment)5    0.1878  0.0420   4.4770  <.0001   0.1056   0.2701  ***
>>
>>
>> However, running:
>>
>> rma.mv(yi, V, random = ~ 1 | id/study/esid, data = cc_d,
>>        method = "REML", subset = target_commitment == 1)
>>
>> I obtain:
>>
>> Multivariate Meta-Analysis Model (k = 215; method: REML)
>>
>> Variance Components:
>>
>>             estim    sqrt  nlvls  fixed         factor
>> sigma^2.1  0.0401  0.2002     67     no             id
>> sigma^2.2  0.0000  0.0000     93     no       id/study
>> sigma^2.3  0.0459  0.2142    215     no  id/study/esid
>>
>> Test for Heterogeneity:
>> Q(df = 214) = 1876.2588, p-val < .0001
>>
>> Model Results:
>>
>>                    estimate      se     tval   df    pval    ci.lb    ci.ub
>> targetcommitment1   -0.1747  0.0352  -4.9607  214  <.0001  -0.2441  -0.1053  ***
>>
>>
>> Multivariate Meta-Analysis Model (k = 87; method: REML)
>>
>> Variance Components:
>>
>>             estim    sqrt  nlvls  fixed         factor
>> sigma^2.1  0.0294  0.1713     33     no             id
>> sigma^2.2  0.0000  0.0000     54     no       id/study
>> sigma^2.3  0.0116  0.1077     87     no  id/study/esid
>>
>> Test for Heterogeneity:
>> Q(df = 86) = 302.2812, p-val < .0001
>>
>> Model Results:
>>
>>                    estimate      se    tval  df    pval   ci.lb   ci.ub
>> targetcommitment5    0.2407  0.0397  6.0606  86  <.0001  0.1618  0.3197  ***
>>
>>
>> So there is a difference between -.155 and -.175, as well as .188 and
>> .241. These are not the biggest differences, but I have other
>> moderators for which the differences are stronger.
>>
>> I tried to follow the suggestions of one of the previous questions
>> (https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2018-April/000774.html),
>> but this didn't reduce the difference for me.
>>
>> My main question would be which of the two approaches would be the
>> preferable one (and if there is any empirical basis for this)? Is it
>> more a choice of preference here or is one approach superior in the
>> present case?
>>
>> I could also post a link to the data if that would be helpful for
>> reproducing the findings.
>>
>> I apologize for double posting this issue and hope that maybe someone
>> can clarify which option I should choose.
>>
>> Best wishes,
>> Janis