[R-meta] Moderator analysis: Subsample analysis vs. model without intercept in CHE models.

Fri Apr 28 17:12:01 CEST 2023

Hi Sebastian,

The two approaches you're looking at differ in multiple respects.

Your first approach looks at the subset where tq_ca == 1 and estimates a
model with both math and language outcomes. The effect sizes for different
subjects are modeled as correlated due to the study-level random effect
(which is common across outcomes) and to the assumed correlation between
the ES estimates as represented in V_mat_ca. Because of the correlation
between ES for different subjects, the average effects will be estimated by
"borrowing information" (or partially pooling) across subjects, which has
the effect of pulling the estimates towards each other a bit.
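To make the "borrowing information" idea concrete, here is a minimal base-R sketch with made-up numbers (not the data from this thread). It assumes the variance components are known and drops the effect-size-level random effect, so it only illustrates the mechanism and is not the CHE model itself. All names (y, subj, v, tau2, r, gls_means) are hypothetical.

```r
# Toy numbers (hypothetical, not from this thread's data).
# Study 1 reports math and language; study 2 only math; study 3 only language.
y    <- c(0.30, 0.10, 0.20, -0.05)
subj <- c("math", "lang", "math", "lang")
X    <- cbind(math = as.numeric(subj == "math"),
              lang = as.numeric(subj == "lang"))

v    <- 0.01   # sampling variance of each ES (assumed equal)
tau2 <- 0.02   # study-level variance, shared by both subjects (assumed known)
r    <- 0.6    # assumed sampling correlation within a study

# GLS estimate of the two subject means for a given total covariance matrix
gls_means <- function(Sigma) {
  W <- solve(Sigma)
  drop(solve(t(X) %*% W %*% X, t(X) %*% W %*% y))
}

# No borrowing: every ES treated as independent
Sigma_sep <- diag(v + tau2, 4)

# Borrowing: the two ES from study 1 covary through tau2 and r * v
Sigma_pool <- Sigma_sep
Sigma_pool[1, 2] <- Sigma_pool[2, 1] <- tau2 + r * v

est_sep  <- gls_means(Sigma_sep)
est_pool <- gls_means(Sigma_pool)
print(rbind(separate = est_sep, pooled = est_pool))
```

With the diagonal covariance matrix, each average is just the mean of the ES for that subject. Once the two ES from study 1 are allowed to covary, the language ES from that study also informs the math average (and vice versa), so the estimates shift.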

If you wanted to estimate average ES for each subject without the borrowing
of information, based only on the ES for that subject, you could do:

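library(metafor)
library(clubSandwich)

# Assumes daten_ca has a variable 'subject' coding the subject of each ES.
# Sampling errors are correlated only within study-by-subject subgroups.
V_mat_ca_sub <- impute_covariance_matrix(daten_ca$class_var,
                                         cluster = daten_ca$studynr,
                                         subgroup = daten_ca$subject,
                                         r = rho,
                                         smooth_vi = TRUE)
# Separate, uncorrelated random effects per subject at both levels
model_ca_sub <- rma.mv(r_gesamt_z, V_mat_ca_sub,
                       random = list(~ subject | studynr, ~ subject | nummer),
                       struct = c("DIAG", "DIAG"),
                       mods = ~ -1 + subject_math + subject_langall,
                       data = daten_ca)
robust(model_ca_sub, daten_ca$studynr)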

Comparing this to your first approach would let you isolate the
consequences of borrowing information.

Your second approach looks at each subject area in separate models, but
includes data from multiple teaching aspects. The effect sizes for
different teaching aspects are modeled as correlated due to the study-level
random effect (which is now common across teaching aspects) and to the
assumed correlation between ES estimates. Again, because of these
correlations, the average effects for each teaching aspect will be
estimated by borrowing information across teaching aspects. You could adapt
the code above but use subgroups by tq instead of by subject to isolate how
the borrowing of information affects the estimates from your second set of
models.
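A rough sketch of that adaptation for the math subset follows. It assumes every effect size belongs to exactly one teaching aspect, so the tq_* indicator columns can be collapsed into a single factor. The helper dummies_to_factor and the variable tq are hypothetical names introduced here. The clustering variables are copied from Syntax B as posted.

```r
# Hypothetical helper: collapse 0/1 indicator columns into one factor.
# Assumes each row has exactly one indicator equal to 1.
dummies_to_factor <- function(data, cols) {
  m <- as.matrix(data[, cols, drop = FALSE])
  stopifnot(all(rowSums(m) == 1))
  factor(cols[max.col(m, ties.method = "first")], levels = cols)
}

tq_cols <- c("tq_ca", "tq_cm", "tq_cont", "tq_pract", "tq_assess",
             "tq_sup_em", "tq_sup_learn", "tq_adapt", "tq_srl",
             "tq_all", "tq_other")

# Toy check of the helper (made-up rows, not the thread's data)
toy <- as.data.frame(diag(length(tq_cols))[c(1, 3, 1), ])
names(toy) <- tq_cols
toy_tq <- dummies_to_factor(toy, tq_cols)

# Runs only if daten_math and rho from Syntax B are in the workspace
if (exists("daten_math") && exists("rho")) {
  library(metafor)
  library(clubSandwich)
  daten_math$tq <- droplevels(dummies_to_factor(daten_math, tq_cols))
  # Sampling errors correlated only within cluster-by-aspect subgroups
  V_mat_math_sub <- impute_covariance_matrix(daten_math$class_var,
                                             cluster = daten_math$samplenr,
                                             subgroup = daten_math$tq,
                                             r = rho,
                                             smooth_vi = TRUE)
  # Separate, uncorrelated random effects per teaching aspect
  model_math_sub <- rma.mv(r_gesamt_z, V_mat_math_sub,
                           random = list(~ tq | studynr, ~ tq | nummer),
                           struct = c("DIAG", "DIAG"),
                           mods = ~ -1 + tq,
                           data = daten_math)
  print(robust(model_math_sub, daten_math$studynr))
}
```

With eleven teaching aspects this model has many variance components, so it may be hard to estimate for aspects that appear in only a few studies. The same steps would apply to daten_langall.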

So which approach is correct? I would argue that answering this question
requires subject-matter knowledge. Is it *theoretically* reasonable to
partially pool across subjects? Or to partially pool across teaching
aspects? Or both (i.e., fit all subjects and all teaching aspects in one
model)?

James

On Fri, Apr 28, 2023 at 5:38 AM Röhl, Sebastian via R-sig-meta-analysis <
r-sig-meta-analysis using r-project.org> wrote:

> Hello,
> I am currently conducting an analysis of about 500 ES in 50 teaching
> studies. Two categorical moderators appear here: the teaching subject (e.g.
> math, language...) and the teaching aspect (e.g. atmosphere, clarity...).
> I'm using a correlated and hierarchical effects model with
> "impute_covariance_matrix" from the clubSandwich package.
> I have a problem: different ways of conducting the moderator analysis
> give different estimates.
> For example, I would like to analyze how ES for atmosphere differ between
> math and language subjects. Simply selecting the subset via "subset =" in
> the rma.mv function is not possible, because the imputed covariance
> matrix then no longer has the correct dimensions.
>
> I have compared two possibilities (syntaxes at the end of this message):
> A: I select only the ES related to atmosphere in math and language
> subjects and impute the covariance matrix for them. Then I analyze them
> with the subject as moderator, using a model without intercept.
> Result:
>                  estimate     se¹    tval¹  df¹    pval¹   ci.lb¹   ci.ub¹
> subject_math       0.1262  0.0204   6.1888   12   <.0001   0.0818   0.1707  ***
> subject_langall   -0.0809  0.0102  -7.9658   12   <.0001  -0.1030  -0.0588  ***
> (these are nearly the same estimates as using two subsets for ES with
> "math & atmosphere" and "language & atmosphere" and conducting separate
> analyses.)
>
> B: I form two subsets for the subjects math and language, impute the
> covariance matrix and analyze "atmosphere" for the two subsets separately
> with the teaching aspects as moderator, using a model without intercept.
>
>                    estimate      se     tval    df    pval    ci.lb   ci.ub
> Math:      tq_ca     0.2088  0.0378   5.5213    17  <.0001   0.1290  0.2885  ***
> Language:  tq_ca    -0.0227  0.0230  -0.9863  2.42  0.4120  -0.1065  0.0614
>
> The numbers of effect sizes and studies are correct in the respective
> subsets and analyses.
> Noteworthy: an analysis of all teaching aspects across all subjects in an
> interceptless model yields an estimate of about 0.20 for "atmosphere". In
> contrast, if I select only the subset with atmosphere ES, the result is an
> estimate of 0.13.
> Where do these large differences in the estimates come from and what would
> be the correct approach?
>
> Thanks a lot for your help!
>
> Best,
> Sebastian
>
>
> Syntax A:
> daten_ca <- subset(daten, tq_ca == 1 &
>                      (subject_math == 1 | subject_langall == 1))
> V_mat_ca <- impute_covariance_matrix(daten_ca$class_var,
>                                      cluster = daten_ca$studynr,
>                                      r = rho,
>                                      smooth_vi = TRUE)
> model_ca <- rma.mv(r_gesamt_z, V_mat_ca,
>                    random = ~ 1 | studynr / nummer,
>                    mods = ~ -1 + subject_math + subject_langall,
>                    data = daten_ca)
> robust(model_ca, daten_ca$studynr)
>
> Syntax B:
> daten_math <- subset(daten, subject_math == 1)
> daten_langall <- subset(daten, subject_langall == 1)
> V_mat_math <- impute_covariance_matrix(daten_math$class_var,
>                                        cluster = daten_math$samplenr,
>                                        r = rho,
>                                        smooth_vi = TRUE)
> V_mat_langall <- impute_covariance_matrix(daten_langall$class_var,
>                                           cluster = daten_langall$samplenr,
>                                           r = rho,
>                                           smooth_vi = TRUE)
> model_math <- rma.mv(r_gesamt_z, V_mat_math,
>                      random = ~ 1 | studynr / nummer,
>                      mods = ~ -1 + tq_ca + tq_cm + tq_cont + tq_pract +
>                        tq_assess + tq_sup_em + tq_sup_learn + tq_adapt +
>                        tq_srl + tq_all + tq_other,
>                      data = daten_math)
> robust(model_math, daten_math$studynr)
>
> model_langall <- rma.mv(r_gesamt_z, V_mat_langall,
>                         random = ~ 1 | studynr / nummer,
>                         mods = ~ -1 + tq_ca + tq_cm + tq_cont + tq_pract +
>                           tq_assess + tq_sup_em + tq_sup_learn + tq_adapt +
>                           tq_srl + tq_all + tq_other,
>                         data = daten_langall)
> robust(model_langall, daten_langall$studynr)
>
> ****************************
> Dr. Sebastian Röhl
> Eberhard Karls Universität Tübingen
> Institute for Educational Science
> Tübingen School of Education (TüSE)
> Wilhelmstraße 31 / Room 302
> D-72074 Tübingen
> Germany
>
> Phone: +49 7071 29-75527
> Fax: +49 7071 29-35309
> Email: sebastian.roehl using uni-tuebingen.de
> Twitter: @sebastian_roehl  @ResTeacherEdu