[R-meta] Multilevel model between-study variance

James Pustejovsky jepusto at gmail.com
Mon Jul 15 15:50:44 CEST 2024


Hi Frederik,

I offer some further comments inline below. I feel like I'm near the limit
of the input that I can offer responsibly over a mailing list (without
knowing quite a bit more about the context and aims of your study).
Ultimately, it's your judgment call how best to handle the
analysis, because you know the big-picture context. (I also like Michael's
suggestion from earlier to use sensitivity analysis to provide a fuller
picture of the results.)

James

On Mon, Jul 15, 2024 at 2:19 AM Frederik Zirn <frederik.zirn using uni-konstanz.de>
wrote:

> Hi James,
>
> I hope it is OK if I ask a further question. You wondered whether some of
> the reported effect sizes in the dissertation are different from the
> others. As explained, the dissertation measures several behaviors at the
> workplace - all indicating transfer success. Each of those behaviors is 1)
> self-reported vs a control group 2) third-party evaluated vs a control
> group. In addition, each is measured retrospectively, as in 3) "Compared
> to 3 months ago, I improved my behavior regarding..." (self-reported vs.
> CG) and 4) "Compared to 3 months ago, my supervisor changed his behavior
> regarding..." (third-party evaluation vs. CG).
> Thus, every identified transfer behavior is measured in 4 different ways.


Do the other included studies also use both self-reports and third-party
reports? Do the other studies use retrospective change measures? If the
answer to either of these is no, then that would seem to be a reasonable
basis for excluding the type of measure that is not represented except in
the dissertation (or for doing so as a sensitivity analysis). Or, you could
specify a model that includes moderators for self- versus
third-party-report and for retrospective versus contemporaneous measures.
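
In metafor, that kind of moderator model could be specified along the
following lines. This is only a sketch: the variable names (dat, yi, vi,
Study, ES_ID, report_type, timing) are hypothetical stand-ins for your own
data, and the assumed sampling correlation of rho = 0.6 is something you
would want to vary in sensitivity analyses.

library(metafor)
library(clubSandwich)

# working covariance matrix under an assumed correlation of rho = 0.6
# among effect sizes from the same study
V <- impute_covariance_matrix(dat$vi, cluster = dat$Study, r = 0.6)

# CHE model with moderators for self- vs third-party report and for
# retrospective vs contemporaneous measurement
che_mod <- rma.mv(yi ~ report_type + timing, V = V,
                  random = ~ 1 | Study/ES_ID, data = dat)

# cluster-robust (CR2) tests of the moderator coefficients
coef_test(che_mod, vcov = "CR2")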


> I thought about only including the first two measurements of every
> behavior and neglecting the retrospective change measurements (those
> resulted in high effect sizes, especially for the self-reported ones).
> However, I am unsure whether that would be justified as there is a control
> group for those measurements, which meets my inclusion criteria.


If you have pre-specified inclusion criteria and/or an analytic protocol,
then it does seem important to hew close to those and report whatever your
pre-specified analysis is. Further variations could still be reported as
sensitivity analyses.


> Considering for a moment that it would be justified: Leaving 3) and 4) out
> would reduce the included effect sizes from the dissertation from 25 to 13,
> resulting in a change in the between-study heterogeneity (no longer being
> zero)
>
>             estim    sqrt  nlvls  fixed       factor
> sigma^2.1  0.0133  0.1151     12     no        Study
> sigma^2.2  0.0661  0.2570     29     no  Study/ES_ID
>
> What is unclear to me: Why would the overall effect of the CHE model
> decrease marginally from 0.1995 to 0.1919 when I exclude the estimates with
> the largest effect sizes from the dissertation? Would that be once more due
> to those weird properties of the inverse-variance method?
>

I think it's not so much the negative weighting issue (which is the weird
aspect of inverse-variance weighting) but rather a property of the CHE
model. Because CHE allows for within-study heterogeneity, it tends to put
more weight on studies with a larger number of effect sizes. If, after
excluding the retrospective change measures, the dissertation study has a
small average effect size, then it will still get a lot of weight under the
CHE working model, which pulls the overall average effect downward.
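
If it helps to see this concretely, you can pull out the weight matrix
implied by the CHE working model and total it by study. Again a sketch,
assuming the fitted model che_mod and the data layout from above (and that
no rows were dropped in fitting, so the rows of dat line up with the
weight matrix):

# full weight matrix implied by the CHE working model
W <- weights(che_mod, type = "matrix")

# percent of total weight contributed by each study
study_weight <- tapply(rowSums(W), dat$Study, sum)
round(100 * study_weight / sum(study_weight), 1)

A study contributing many effect sizes will usually account for a large
share of these weights, so its average effect can move the overall
estimate noticeably.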


>
> Even with the proposed reduction, the effect size distribution remains
> very uneven (with more than a third coming from one study), which is why I
> am unsure whether it is appropriate to implement a CHE model. Do you think
> this is a valid argument to decide against the CHE model in my case and opt
> for a reductionist (aggregated) approach instead?


I think it is a pretty reasonable basis for reporting multiple working
models (such as CHE and aggregation). The reason I'm still reluctant to
dismiss CHE entirely is that all of this seems to be driven by substantial
heterogeneity of effects *at the within-study level*. Aggregation amounts
to ignoring that level of heterogeneity.
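
For the aggregation side of the comparison, metafor's aggregate() method
for escalc objects is one convenient route. A sketch, assuming dat is an
escalc object and using the same working correlation of rho = 0.6 as
above:

# collapse to one composite effect size per study
agg <- aggregate(dat, cluster = dat$Study, rho = 0.6)

# standard random-effects model on the aggregated data
agg_mod <- rma(yi, vi, data = agg)

# cluster-robust standard errors, for comparability with the CHE results
robust(agg_mod, cluster = agg$Study, clubSandwich = TRUE)

Reporting this alongside the CHE model would make explicit how sensitive
the conclusions are to the choice of working model.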


> Would it be a good idea to communicate transparently the decision against
> the CHE model in the article?
>

Certainly, transparent communication of rationale is always good.
