[R-meta] Clarification on an answer
James Pustejovsky
jepu@to @end|ng |rom gm@||@com
Mon Feb 13 23:42:56 CET 2023
Hi Yuhang,
Thanks for your questions.
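On your first question: yes, you are correct. It should be
struct = c("DIAG","DIAG") (diagonal vcov matrices, with a separate variance
component for each subgroup) rather than c("ID","ID") (identity matrices,
which impose a common variance component across subgroups).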
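For concreteness, the comparison could look something like the sketch
below. It is only an illustration under stated assumptions: the data are
simulated, the sampling errors are treated as independent (so V is just the
vector of sampling variances rather than a vcalc() matrix), and esid is
assumed to be unique across the whole data set, so that ~ tc_fac | esid
stands in for ~ tc_fac | id:esid. The names dat, fit_id, fit_diag, and lrt
are placeholders, not objects from the original analysis.

```r
library(metafor)

set.seed(20230213)

# Simulated data (an assumption for illustration): 60 publications with
# 4 effect sizes each; the first 30 are in subgroup "1", the rest in "5".
n_id  <- 60
k_per <- 4
dat <- data.frame(id = rep(seq_len(n_id), each = k_per))
dat$esid   <- seq_len(nrow(dat))                  # unique effect size id
dat$tc_fac <- factor(ifelse(dat$id <= 30, "1", "5"))
dat$vi     <- 0.02                                # known sampling variances

# True heterogeneity differs by subgroup at both levels
sd_id <- ifelse(dat$tc_fac == "1", 0.10, 0.40)
sd_es <- ifelse(dat$tc_fac == "1", 0.10, 0.30)
u_id  <- rnorm(n_id)[dat$id]                      # publication-level effects
dat$yi <- ifelse(dat$tc_fac == "1", -0.2, 0.2) +
  sd_id * u_id + sd_es * rnorm(nrow(dat)) + sqrt(dat$vi) * rnorm(nrow(dat))

# Homogeneous variance components across subgroups
fit_id <- rma.mv(yi, V = vi, mods = ~ tc_fac - 1,
                 random = list(~ tc_fac | id, ~ tc_fac | esid),
                 struct = c("ID", "ID"), data = dat, method = "REML")

# Separate variance components in each subgroup
fit_diag <- rma.mv(yi, V = vi, mods = ~ tc_fac - 1,
                   random = list(~ tc_fac | id, ~ tc_fac | esid),
                   struct = c("DIAG", "DIAG"), data = dat, method = "REML")

# Likelihood ratio test (both fits use REML with identical fixed effects)
lrt <- anova(fit_id, fit_diag)
print(lrt)
```

With real data you would pass the Vsub matrix from vcalc() as V and use
your own grouping variables. Fit statistics (AIC, BIC) for each model are
available from fitstats(fit_id) and fitstats(fit_diag).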
On your second question, I think the easiest approach might be to just
count. For a given level of units (studies or samples), summarize the
number of effects included in each unit by each value of the moderator.
Then count the number of units that have both values of the moderator (or
more generally, have multiple values of the moderator).
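As a rough sketch of that counting approach in base R (the toy data frame
and the column names id, study, and tc_fac are made up for illustration, so
substitute your own):

```r
# Toy data (hypothetical): publications (id), studies nested within
# publications (study), and a two-level moderator (tc_fac).
dat <- data.frame(
  id     = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4),
  study  = c(1, 1, 2, 1, 1, 1, 2, 2, 1, 1),
  tc_fac = factor(c(1, 5, 1, 1, 1, 5, 5, 5, 1, 5))
)

# Number of effects in each publication by moderator value
tab_pub <- table(dat$id, dat$tc_fac)
print(tab_pub)

# Publications contributing effects at more than one moderator value
n_values    <- rowSums(tab_pub > 0)
both_ids    <- names(n_values)[n_values > 1]
n_pubs_both <- length(both_ids)                       # 2 in the toy data

# Studies nested within those publications
in_both <- as.character(dat$id) %in% both_ids
n_studies_in_those <- nrow(unique(dat[in_both, c("id", "study")]))  # 3

# The same count with studies as the unit
study_key <- paste(dat$id, dat$study, sep = "/")
tab_study <- table(study_key, dat$tc_fac)
n_studies_both <- sum(rowSums(tab_study > 0) > 1)     # 2 in the toy data
```

The same pattern extends to moderators with more than two values, since
rowSums(tab > 0) counts how many distinct values each unit contributes.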
Section 4.1 of the supplementary materials for Pustejovsky & Tipton (2022)
has some examples of summary tables that get at this question.
Supplementary materials pdf: https://osf.io/nyv4u
Code for supplementary materials: https://osf.io/mahc2
James
On Mon, Feb 13, 2023 at 3:49 PM Yuhang Hu via R-sig-meta-analysis <
r-sig-meta-analysis using r-project.org> wrote:
> Dear Meta-analysis Experts,
>
> I'm hoping to use the forwarded answer below in my model but need two
> clarifications about the answer.
>
> First, bullet point #3 says that we fit the model below and then "compare
> this model against the model that assumes homogeneous variance components
> across subgroups."
>
> rma.mv(yi, V = Vsub,
>        mods = ~ tc_fac - 1,
>        random = list(~ tc_fac | id, ~ tc_fac | id:esid),
>        struct = c("ID","ID"),
>        data = cc_d, method = "REML")
>
> But the above model already assumes homogeneous variance components across
> subgroups, so I wonder if the intention was actually to set struct =
> c("DIAG","DIAG")?
>
> Second, the answer mentions "Based on the reported counts from your output,
> it looks like there might be four publications (reporting a total of nine
> studies) that have both levels."
>
> I was wondering how to obtain the number of publications and their nested
> number of studies that have both levels (so I can apply this to my model)?
>
> Thank you for your time,
> Yuhang
> ---------- Forwarded message ---------
> From: James Pustejovsky via R-sig-meta-analysis
> <r-sig-meta-analysis using r-project.org>
> Date: Thu, Feb 9, 2023 at 8:14 PM
> Subject: Re: [R-meta] Different estimates for multilevel meta-analysis
> when using rma.mv with mods or subset
> To: R Special Interest Group for Meta-Analysis
> <r-sig-meta-analysis using r-project.org>
>
> Hi Janis,
>
> There are potentially two things going on here. First, when estimating
> separate models within each subgroup, you allow the heterogeneity (variance
> components) to differ by subgroup. This in turn leads to a different
> weighting of the individual effect size estimates in the calculation of the
> overall average effect sizes than the weighting used in the moderator
> analysis. That difference in weighting might be enough to account for the
> swings in the average ES for each category. To determine which model is
> more appropriate, you could use fit statistics or a likelihood ratio test.
> To get such information, you'll need to find a way to express the subgroup
> analyses in terms of a single model. You can do this as follows:
>
> 1. Make a factor variable from targetcommitment so that you don't have to
> repeatedly calculate it:
> cc_d$tc_fac <- factor(cc_d$targetcommitment)
>
> 2. If you created the V matrix using metafor::vcalc(), set the subgroup
> argument to the moderator variable:
> Vsub <- vcalc(..., subgroup = cc_d$tc_fac, ...)
>
> 3. Fit a model using this new V matrix, specifying a random effects
> structure that allows independent effects in each subgroup:
> rma.mv(yi, V = Vsub,
>        mods = ~ tc_fac - 1,
>        random = list(~ tc_fac | id, ~ tc_fac | id:esid),
>        struct = c("ID","ID"),
>        data = cc_d, method = "REML")
> Note that I've omitted the middle level of random effects because there
> doesn't seem to be any variance there after accounting for the id and esid
> levels. The results of (3) should be identical (or nearly so) to the
> results from the subgroup analysis. But now they're embedded within one
> model, so you can get fit statistics or conduct a likelihood ratio test to
> compare this model against the model that assumes homogeneous variance
> components across subgroups.
>
> The second thing that might be going on is that your data seems to include
> a few publications that have both levels of the targetcommitment variable.
> Based on the reported counts from your output, it looks like there might be
> four publications (reporting a total of nine studies) that have both
> levels. Because of this structure, the first model you use for moderator
> analysis will calculate average effect size estimates for each level of
> targetcommitment based in part on the effect size estimates *for the other
> category* from the four publications / 9 studies that include both levels.
> This has the effect of moving the averages towards each other---that is
> -.175 and .241 get pulled inward to -.155 and .188.
>
> If you're particularly interested in figuring out the difference between
> levels of targetcommitment, the question is then whether this "pulling in"
> is a reasonable thing to do. You could consider this on a conceptual level:
> is it reasonable to put special emphasis on the results from these four
> publications in estimating the difference between levels of
> targetcommitment? If there's not a clear conceptual rationale, then you
> might consider fitting a slightly different model, which includes the
> study-mean variable and the study-mean-centered variable as separate terms.
> For targetcommitment, this would mean calculating the average of the dummy
> variable (targetcommitment == 2) for each study (this gives you the first
> predictor, call it tc_study) and then subtracting the average from the
> original dummy variable (this gives you the second predictor, call it
> tc_within). The model should then use the moderators:
> ~ 1 + tc_study + tc_within
> Tanner-Smith and Tipton (2014; https://doi.org/10.1002/jrsm.1091) recommend
> this as a generic strategy for meta-regression of dependent effect sizes.
>
> James
>
>
> On Wed, Feb 8, 2023 at 2:40 PM Janis Zickfeld via R-sig-meta-analysis <
> r-sig-meta-analysis using r-project.org> wrote:
>
> > Hi all,
> >
> > my question has already been raised before (e.g.,
> > https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2018-April/000774.html),
> > but the answers did not help me understand the main problem or
> > resolve the differences in the output, so that is why I'm posting it again.
> >
> > We are conducting a meta-analysis and have effect sizes ("esid")
> > nested within studies or samples ("study") nested within publications
> > ("id"). There are 344 effect sizes in total across 110 publications
> > and 160 study-publication combinations. Of the studies, 48.75%
> > include more than one effect size, and of these 26.82% are dependent
> > effects because they use either the same sample or the same control
> > treatment to calculate the effect. Therefore, we have constructed
> > an approximate variance-covariance matrix to account for this. Some of
> > the moderators only use a subset of this data, as they have missing
> > values (as in the example below).
> >
> > My main problem is now that I get different estimates when running the
> > rma.mv model with factorial moderators using 'mods' or when subsetting
> > them via the 'subset' command. I have seen this discussed here
> > (http://www.metafor-project.org/doku.php/tips:comp_two_independent_estimates)
> > and it seems that the main difference is that 'mods' uses the same
> > residual heterogeneity, whereas 'subset' allows for different levels
> > of tau^2. However, this example does not discuss a multilevel
> > meta-analysis.
> >
> > When running 'mods' using the following syntax:
> >
> > rma.mv(yi, V, random = ~ 1 | id/study/esid, data = cc_d,
> >        method = "REML", mods = ~ factor(target_commitment) - 1)
> >
> > I get:
> >
> > Multivariate Meta-Analysis Model (k = 302; method: REML)
> >
> > Variance Components:
> >
> >             estim    sqrt  nlvls  fixed  factor
> > sigma^2.1  0.0346  0.1861     96     no  id
> > sigma^2.2  0.0000  0.0000    138     no  id/study
> > sigma^2.3  0.0356  0.1887    302     no  id/study/esid
> >
> > Test for Residual Heterogeneity:
> > QE(df = 300) = 2213.7041, p-val < .0001
> >
> > Test of Moderators (coefficients 1:2):
> > QM(df = 2) = 54.1962, p-val < .0001
> >
> > Model Results:
> >
> >                             estimate      se     zval    pval    ci.lb    ci.ub
> > factor(target_commitment)1   -0.1551  0.0315  -4.9258  <.0001  -0.2169  -0.0934  ***
> > factor(target_commitment)5    0.1878  0.0420   4.4770  <.0001   0.1056   0.2701  ***
> >
> >
> > However, running:
> >
> > rma.mv(yi, V, random = ~ 1 | id/study/ID2, data = cc_d,
> >        method = "REML", subset = target_commitment == 1)
> >
> > I obtain:
> >
> > Multivariate Meta-Analysis Model (k = 215; method: REML)
> >
> > Variance Components:
> >
> >             estim    sqrt  nlvls  fixed  factor
> > sigma^2.1  0.0401  0.2002     67     no  id
> > sigma^2.2  0.0000  0.0000     93     no  id/study
> > sigma^2.3  0.0459  0.2142    215     no  id/study/esid
> >
> > Test for Heterogeneity:
> > Q(df = 214) = 1876.2588, p-val < .0001
> >
> > Model Results:
> >
> >                    estimate      se     tval   df    pval    ci.lb    ci.ub
> > targetcommitment1   -0.1747  0.0352  -4.9607  214  <.0001  -0.2441  -0.1053  ***
> >
> >
> > Multivariate Meta-Analysis Model (k = 87; method: REML)
> >
> > Variance Components:
> >
> >             estim    sqrt  nlvls  fixed  factor
> > sigma^2.1  0.0294  0.1713     33     no  id
> > sigma^2.2  0.0000  0.0000     54     no  id/study
> > sigma^2.3  0.0116  0.1077     87     no  id/study/esid
> >
> > Test for Heterogeneity:
> > Q(df = 86) = 302.2812, p-val < .0001
> >
> > Model Results:
> >
> >                    estimate      se    tval  df    pval   ci.lb   ci.ub
> > targetcommitment5    0.2407  0.0397  6.0606  86  <.0001  0.1618  0.3197  ***
> >
> >
> > So there is a difference between -.155 and -.175, as well as .188 and
> > .241. These are not the biggest differences, but I have other
> > moderators for which the differences are stronger.
> >
> > I tried to follow the suggestions of one of the previous questions
> > (https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2018-April/000774.html),
> > but this didn't reduce the difference for me.
> >
> > My main question would be which of the two approaches would be the
> > preferable one (and if there is any empirical basis for this)? Is it
> > more a choice of preference here or is one approach superior in the
> > present case?
> >
> > I could also post a link to the data if that would be helpful for
> > reproducing the findings.
> >
> > I apologize for double posting this issue and hope that maybe someone
> > can clarify which option I should choose.
> >
> > Best wishes,
> > Janis
> >
> > _______________________________________________
> > R-sig-meta-analysis mailing list @ R-sig-meta-analysis using r-project.org
> > To manage your subscription to this mailing list, go to:
> > https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
>