# [R-meta] Clarification on an answer

James Pustejovsky jepu@to @end|ng |rom gm@||@com
Tue Feb 14 03:50:40 CET 2023

```I compared the sum of the number of studies (or number of publications)
across levels of the moderator to the total number of studies (or
publications). The difference is the number of studies (or publications)
that include both levels.

On Mon, Feb 13, 2023 at 8:16 PM Yuhang Hu <yh342 using nau.edu> wrote:

> Thank you, James, for the clarification. Regarding my second question, I
> was particularly interested in understanding how to determine the number of
> publications and their nested number of studies that have both levels from
> the output (likely from "nlvls" below), not the raw data?
>
> Thank you,
> Yuhang
>
> Full Model:
> >             estim    sqrt  nlvls  fixed        factor
> > sigma^2.1  0.0346  0.1861     96     no            id
> > sigma^2.2  0.0000  0.0000    138     no      id/study
> > sigma^2.3  0.0356  0.1887    302     no  id/study/esid
>
> Subgroup1:
> >                    estim    sqrt        nlvls  fixed        factor
> > sigma^2.1  0.0401  0.2002     67     no            id
> > sigma^2.2  0.0000  0.0000     93     no      id/study
> > sigma^2.3  0.0459  0.2142    215     no  id/study/esid
>
> Subgroup2:
> >             estim    sqrt  nlvls  fixed        factor
> > sigma^2.1  0.0294  0.1713     33     no            id
> > sigma^2.2  0.0000  0.0000     54     no      id/study
> > sigma^2.3  0.0116  0.1077     87     no  id/study/esid
>
> On Mon, Feb 13, 2023 at 3:43 PM James Pustejovsky <jepusto using gmail.com>
> wrote:
>
>> Hi Yuhang,
>>
>>
>> On your first question, yes you are correct it should be DIAG DIAG
>> (diagonal vcov matrices) rather than ID ID (identity matrices).
>>
>> On your second question, I think the easiest approach might be to just
>> count. For a given level of units (studies or samples), summarize of the
>> number of effects included in each unit by each value of the moderator.
>> Then count the number of units that have both values of the moderator (or
>> more generally, have multiple values of the moderator).
>>
>> Section 4.1 of the supplementary materials for Pustejovsky & Tipton
>> (2022) has some examples of summary tables that get at this question.
>> Supplementary materials pdf: https://osf.io/nyv4u
>> Code for supplementary materials: https://osf.io/mahc2
>>
>> James
>>
>>
>> On Mon, Feb 13, 2023 at 3:49 PM Yuhang Hu via R-sig-meta-analysis <
>> r-sig-meta-analysis using r-project.org> wrote:
>>
>>> Dear Meta-analysis Experts,
>>>
>>> I'm hoping to use the forwarded answer below in my model but need two
>>>
>>> First, bullet point # 3, says that we fit the below model and then
>>> "compare
>>> this model against the model that assumes homogeneous variance components
>>> across subgroups."
>>>
>>> rma.mv(yi, V = Vsub,
>>>               mods = ~ tc_fac - 1,
>>>               random = list(~ tc_fac | id, ~ tc_fac | id:esid), struct =
>>> c("ID","ID"),
>>>               data = cc_d, method = "REML")
>>>
>>> But the above model already assumes homogeneous variance components
>>> across
>>> subgroups, so I wonder if the intention was actually to set struct =
>>> c("DIAG","DIAG")?
>>>
>>> Second, the answer mentions "Based on the reported counts from your
>>> output,
>>> it looks like there might be four publications (reporting a total of nine
>>> studies) that have both levels."
>>>
>>> I was wondering how to obtain the number of publications and their nested
>>> number of studies that have both levels (so I can apply this to my
>>> model)?
>>>
>>> Thank you for your time,
>>> Yuhang
>>> ---------- Forwarded message ---------
>>> From: James Pustejovsky via R-sig-meta-analysis
>>> <r-sig-meta-analysis using r-project.org>
>>> Date: Thu, Feb 9, 2023 at 8:14 PM
>>> Subject: Re: [R-meta] Different estimates for multilevel meta-analysis
>>> when using rma.mv with mods or subset
>>> To: R Special Interest Group for Meta-Analysis
>>> <r-sig-meta-analysis using r-project.org>
>>>
>>> Hi Janis,
>>>
>>> There are potentially two things going on here. First, when estimating
>>> separate models within each subgroup, you allow the heterogeneity
>>> (variance
>>> components) to differ by subgroup. This in turn leads to a different
>>> weighting of the individual effect size estimates in the calculation of
>>> the
>>> overall average effect sizes than the weighting used in the moderator
>>> analysis. That difference in weighting might be enough to account for the
>>> swings in the average ES for each category. To determine which model is
>>> more appropriate, you could use fit statistics or a likelihood ratio
>>> test.
>>> To get such information, you'll need to find a way to express the
>>> subgroup
>>> analyses in terms of a single model. You can do this as follows:
>>>
>>> 1. Make a factor variable from targetcommitment so that you don't have to
>>> repeatedly calculate it:
>>> cc_d\$tc_fac <- factor(cc_d\$targetcommitment)
>>>
>>> 2. If you created the V matrix using metafor::vcalc(), set the subgroup
>>> argument to the moderator variable:
>>> Vsub <- vcalc(..., subgroup = cc_d\$tc_fac, ...)
>>>
>>> 3. Fit a model using this new V matrix, specifying a random effects
>>> structure that allows independent effects in each subgroup:
>>> rma.mv(yi, V = Vsub,
>>>               mods = ~ tc_fac - 1,
>>>               random = list(~ tc_fac | id, ~ tc_fac | id:esid), struct =
>>> c("ID","ID"),
>>>               data = cc_d, method = "REML")
>>> Note that I've omitted the middle level of random effects because there
>>> doesn't seem to be any variance there after accounting for the id and
>>> esid
>>> levels. The results of (3) should be identical (or nearly so) to the
>>> results from the subgroup analysis. But now they're embedded within one
>>> model, so you can get fit statistics or conduct a likelihood ratio test
>>> to
>>> compare this model against the model that assumes homogeneous variance
>>> components across subgroups.
>>>
>>> The second thing that might be going on is that your data seems to
>>> include
>>> a few publications that have both levels of the targetcommitment
>>> variable.
>>> Based on the reported counts from your output, it looks like there might
>>> be
>>> four publications (reporting a total of nine studies) that have both
>>> levels. Because of this structure, the first model you use for moderator
>>> analysis will calculated average effect size estimates for each level of
>>> targetcommitment based in part on the effect size estimates *for the
>>> other
>>> category* from the four publications / 9 studies that include both
>>> levels.
>>> This has the effect of moving the averages towards each other---that is
>>> -.175 and .241 get pulled inward to  -.155 and .188.
>>>
>>> If you're particularly interested in figuring out the difference between
>>> levels of targetcommitment, the question is then whether this "pulling
>>> in"
>>> is a reasonable thing to do. You could consider this on a conceptual
>>> level:
>>> is it reasonable to put special emphasis on the results from these four
>>> publications in estimating the difference between levels of
>>> targetcommitment? If there's not a clear conceptual rationale, then you
>>> might consider fitting a slightly different model, which includes the
>>> study-mean variable and the study-mean-centered variable as separate
>>> terms.
>>> For targetcommitment, this would mean calculating the average of the
>>> dummy
>>> variable (targetcommitment == 2) for each study (this gives you the first
>>> predictor, call it tc_study) and then subtracting the average from the
>>> original dummy variable (this gives you the second predictor, call it
>>> tc_within). The model should then use the moderators:
>>> ~ 1 + tc_study + tc_within
>>> Tanner-Smith and Tipton (2014; https://doi.org/10.1002/jrsm.1091)
>>> recommend
>>> this as a generic strategy for meta-regression of dependent effect sizes.
>>>
>>> James
>>>
>>>
>>> On Wed, Feb 8, 2023 at 2:40 PM Janis Zickfeld via R-sig-meta-analysis <
>>> r-sig-meta-analysis using r-project.org> wrote:
>>>
>>> > Hi all,
>>> >
>>> > my question has already been raised before (e.g.,
>>> >
>>> https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2018-April/000774.html
>>> > ),
>>> > but the answers did not help me understanding the main problem or
>>> > solve differences in the output, so that is why I'm posting it again.
>>> >
>>> > We are conducting a meta-analysis and have effect sizes ("esid")
>>> > nested within studies or samples ("study") nested within publications
>>> > ("id"). There are 344 effect sizes in total across 110 publications
>>> > and 160 study - publication combinations. There are 48.75% of studies
>>> > including more than one effect size and of these 26.82% are dependent
>>> > effects because they either use the same sample or the same control
>>> > treatment used to calculate the effect. Therefore, we have constructed
>>> > an approximate variance-covariance matrix to account for this. Some of
>>> > the moderators only use a subset of this data, as they have missing
>>> > values (as in the example below).
>>> >
>>> > My main problem is now that I get different estimates when running the
>>> > rma.mv model with factorial moderators using 'mods' or when subsetting
>>> > them via the 'subset' command. I have seen this discussed here
>>> > (
>>> >
>>>
>>> http://www.metafor-project.org/doku.php/tips:comp_two_independent_estimates
>>> > )
>>> > and it seems that the main difference is that 'mods' uses the same
>>> > residual heterogeneity, whereas 'subset' allows for different levels
>>> > of tau^2. However, this example does not discuss a multilevel
>>> > meta-analysis.
>>> >
>>> > When running 'mods' using the following syntax:
>>> >
>>> > rma.mv(yi, V, random = ~ 1 | id/study/esid, data = cc_d, method =
>>> > "REML", mods = ~factor(target_commitment) - 1)
>>> >
>>> > I get:
>>> >
>>> > Multivariate Meta-Analysis Model (k = 302; method: REML)
>>> >
>>> > Variance Components:
>>> >
>>> >             estim    sqrt  nlvls  fixed        factor
>>> > sigma^2.1  0.0346  0.1861     96     no            id
>>> > sigma^2.2  0.0000  0.0000    138     no      id/study
>>> > sigma^2.3  0.0356  0.1887    302     no  id/study/esid
>>> >
>>> > Test for Residual Heterogeneity:
>>> > QE(df = 300) = 2213.7041, p-val < .0001
>>> >
>>> > Test of Moderators (coefficients 1:2):
>>> > QM(df = 2) = 54.1962, p-val < .0001
>>> >
>>> > Model Results:
>>> >
>>> >                                              estimate         se
>>> > zval        pval        ci.lb    ci.ub
>>> > factor(target_commitment)1   -0.1551  0.0315  -4.9258  <.0001  -0.2169
>>> >  -0.0934  ***
>>> > factor(target_commitment)5    0.1878  0.0420   4.4770  <.0001   0.1056
>>> >   0.2701  ***
>>> >
>>> >
>>> > However, running:
>>> >
>>> > rma.mv(yi, V, random = ~ 1 | id/study/ID2, data = cc_d, method =
>>> > "REML", subset=target_commitment==1)
>>> >
>>> > I obtain:
>>> >
>>> > Multivariate Meta-Analysis Model (k = 215; method: REML)
>>> >
>>> > Variance Components:
>>> >
>>> >                    estim    sqrt        nlvls  fixed        factor
>>> > sigma^2.1  0.0401  0.2002     67     no            id
>>> > sigma^2.2  0.0000  0.0000     93     no      id/study
>>> > sigma^2.3  0.0459  0.2142    215     no  id/study/esid
>>> >
>>> > Test for Heterogeneity:
>>> > Q(df = 214) = 1876.2588, p-val < .0001
>>> >
>>> > Model Results:
>>> >
>>> >                                 estimate      se        tval
>>> > df      pval       ci.lb    ci.ub
>>> > targetcommitment1 -0.1747  0.0352  -4.9607  214  <.0001  -0.2441
>>> -0.1053
>>> > ***
>>> >
>>> >
>>> > Multivariate Meta-Analysis Model (k = 87; method: REML)
>>> >
>>> > Variance Components:
>>> >
>>> >             estim    sqrt  nlvls  fixed        factor
>>> > sigma^2.1  0.0294  0.1713     33     no            id
>>> > sigma^2.2  0.0000  0.0000     54     no      id/study
>>> > sigma^2.3  0.0116  0.1077     87     no  id/study/esid
>>> >
>>> > Test for Heterogeneity:
>>> > Q(df = 86) = 302.2812, p-val < .0001
>>> >
>>> > Model Results:
>>> >
>>> >                                  estimate      se    tval  df    pval
>>> >  ci.lb   ci.ub
>>> >   targetcommitment5 0.2407  0.0397  6.0606  86  <.0001  0.1618  0.3197
>>> ***
>>> >
>>> >
>>> > So there is a difference between -.155 and -.175, as well as .188 and
>>> > .241. These are not the biggest differences, but I have other
>>> > moderators for which the differences are stronger.
>>> >
>>> > I tried to follow the suggestions of one of the previous questions
>>> > (
>>> https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2018-April/000774.html
>>> > ),
>>> > but this didn't reduce the difference for me.
>>> >
>>> > My main question would be which of the two approaches would be the
>>> > preferable one (and if there is any empirical basis for this)? Is it
>>> > more a choice of preference here or is one approach superior in the
>>> > present case?
>>> >
>>> > I could also post a link to the data if that would be helpful for
>>> > reproducing the findings.
>>> >
>>> > I apologize for double posting this issue and hope that maybe someone
>>> > can clarify which option I should choose.
>>> >
>>> > Best wishes,
>>> > Janis
>>> >
>>> > _______________________________________________
>>> > R-sig-meta-analysis mailing list @ R-sig-meta-analysis using r-project.org
>>> > To manage your subscription to this mailing list, go to:
>>> > https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
>>>
>>>         [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> R-sig-meta-analysis mailing list @ R-sig-meta-analysis using r-project.org
>>> To manage your subscription to this mailing list, go to:
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
>>>
>>
>
> --
> Yuhang Hu (She/Her/Hers)
> Ph.D. Student in Applied Linguistics
> Department of English
> Northern Arizona University
>

[[alternative HTML version deleted]]

```