[R-meta] Multivariate meta regression and predict for robust estimates
Reza Norouzian
rnorouzian using gmail.com
Thu Oct 21 19:47:48 CEST 2021
Hi Ivan,
Please see my answers inline.
>> With regards to the SATcoaching example, how so [how come the same level of test doesn't repeat in the same study]?
Note that studies in the SATcoaching data have (A) and (B) suffixes;
each of these counts as a separate study. As a result, in the
cross-tabulation of test-level (co-)occurrences below, the number of
studies (outside parentheses) containing each level of test (Math or
Verbal) or both levels together, and the corresponding number of
effect sizes/rows (in parentheses), are the same:
          Math     Verbal
  Math    29 (29)  -
  Verbal  20 (20)  38 (38)
This means that the same level of test does not repeat within the same
study (this answers your next question as well). There are 20 studies
in which Verbal and Math occur together, each exactly once. That's why
there is no need for a further level beyond study, and
"random = ~ test | study" suffices.
>> You mean no repetition of the same level of outcome occurs within the same sample, perhaps?
See my answer above.
>> 1) The second and third models should effectively be the same, and they are, after adding what was missing to the second one (~ 1 | es_id). While the syntax of the third one makes a lot of sense, I'm struggling to understand the syntax of the second one, and ultimately, why are they the same?
second_model <- rma.mv(yi, V,
                       random = list(~ outcome | study,
                                     ~ outcome | interaction(study, group),
                                     ~ 1 | row_id),
                       struct = c("UN", "UN"))

third_model <- rma.mv(yi, V,
                      random = list(~ outcome | study, ~ 1 | row_id),
                      struct = "UN")
I don't know exactly what your data look like, but the equivalence
between the two models would mean that there is not much variation in
group as a level. In practice, therefore, your second model reduces to
the third one; in other words, the study-group combinations coincide
with the row_ids. You can roughly translate the two models,
respectively, to:

~ 1 | study/group/outcome/row_id
~ 1 | study/outcome/row_id

Now you can see that if these two models are the same, it is because
group is effectively not a level.
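
As a rough empirical check of this reduction (a sketch, assuming a
hypothetical data frame dat with columns yi, V, study, group, outcome,
and row_id):

library(metafor)

# do the study-group combinations coincide with the rows?
nrow(unique(dat[, c("study", "group")])) == nrow(dat)

# if group adds nothing, the two nested translations above should give
# near-identical fits (compare log-likelihoods and variance components)
m2 <- rma.mv(yi, V, data = dat, random = ~ 1 | study/group/outcome/row_id)
m3 <- rma.mv(yi, V, data = dat, random = ~ 1 | study/outcome/row_id)
logLik(m2); logLik(m3)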
>> 2) When you say "coded for" and "haven't coded for" the design-related feature(s) you are literally referring to having vs not having all "columns" related to study, groups, and outcomes properly aligned, right? I guess it's hard for me to relate as I always have these three together with es_id (or row_id, as you say) as a fourth one.
Yes. For example, studies can have various groups, outcomes, times of
measurement, independent samples of subjects, and so on. Potentially,
each of these features can be a column in the dataset. In practice,
however, accounting for all these sources of variation may be a
theoretical ideal, because rarely do so many studies simultaneously
possess all these features. If they do, then you would expect a very
large dataset in which each feature accounts for the variation within
the feature that sits above it. Consider the example study shown in
the table at the end of this message, where each of the features
mentioned above has only two observed levels (e.g., just pre and post,
just two outcomes, just two...). You can imagine what happens to the
number of rows in that one study if some of these features have more
than two observed levels!
So all of this means that, without your exact data, RQs, goals,
context, and substantive expertise regarding the current state of the
literature/phenomenon under study, fitting these models wouldn't be
possible. (So please take this answer as just providing some
intuition/heuristics.)
Kind regards,
Reza
The 16-row example study referenced above:

 row study sample group outcome time           comparison
   1     1      1     1       1    0 treatment vs control
   2     1      1     1       1    1 treatment vs control
   3     1      1     1       2    0 treatment vs control
   4     1      1     1       2    1 treatment vs control
   5     1      1     2       1    0 treatment vs control
   6     1      1     2       1    1 treatment vs control
   7     1      1     2       2    0 treatment vs control
   8     1      1     2       2    1 treatment vs control
   9     1      2     1       1    0 treatment vs control
  10     1      2     1       1    1 treatment vs control
  11     1      2     1       2    0 treatment vs control
  12     1      2     1       2    1 treatment vs control
  13     1      2     2       1    0 treatment vs control
  14     1      2     2       1    1 treatment vs control
  15     1      2     2       2    0 treatment vs control
  16     1      2     2       2    1 treatment vs control
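
For intuition, this 16-row layout is just the full crossing of four
two-level features; a minimal sketch that generates it:

g <- expand.grid(time = 0:1, outcome = 1:2, group = 1:2, sample = 1:2)
one_study <- data.frame(study = 1,
                        g[, c("sample", "group", "outcome", "time")],
                        comparison = "treatment vs control")
nrow(one_study)  # 16 = 2^4; every extra observed level multiplies the rows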
On Thu, Oct 21, 2021 at 3:56 AM Ivan Jukic <ivan.jukic using aut.ac.nz> wrote:
>
> Dear Reza,
>
> thank you for responding and providing such a great example (walkthrough). I'm glad that you covered all three scenarios because I was thinking before about aggregating my effect sizes and thereby "reducing" my data structure from your scenario (3) to scenario (1). It seems that I was on the right track, but I don't want to aggregate effect sizes anymore, so I'll stick with the third scenario you described.
>
> Thank you for correcting yourself (and for responding so late in the night). I really appreciate it!
>
> I actually tried out your examples right after you first responded and realised what's missing in the second model, so all good. With regards to the SATcoaching example, how so? Verbal and math tests are repeated in three studies, but I guess the participants providing these scores are independent (I'm not sure about the study by Burke, though). You mean no repetition of the same level of outcome occurs within the same sample, perhaps?
>
> Based on your response, I would like to add two (related) things.
>
> 1) The second and third models should effectively be the same, and they are, after adding what was missing to the second one (~ 1 | es_id). While the syntax of the third one makes a lot of sense, I'm struggling to understand the syntax of the second one, and ultimately, why are they the same?
>
> 2) When you say "coded for" and "haven't coded for" the design-related feature(s) you are literally referring to having vs not having all "columns" related to study, groups, and outcomes properly aligned, right? I guess it's hard for me to relate as I always have these three together with es_id (or row_id, as you say) as a fourth one.
>
> Thank you very much for your time,
> Ivan
>
>
>
> From: Reza Norouzian <rnorouzian using gmail.com>
> Sent: Thursday, 21 October 2021 7:36 PM
> To: Ivan Jukic <ivan.jukic using aut.ac.nz>
> Cc: r-sig-meta-analysis using r-project.org <r-sig-meta-analysis using r-project.org>
> Subject: Re: [R-meta] Multivariate meta regression and predict for robust estimates
>
> I guess I responded too quickly (1:30 am answer effect:). CORRECTION:
>
> First, if your data is just like clubSandwich::SATcoaching, then yes
> your current model works, as no repetition of the same levels of
> outcome occurs.
>
> Second, in my own second model, you can account for repetition of the
> same levels of outcome by adding random row effects:
>
> rma.mv(yi, V, random = list(~ outcome | study,
>                             ~ outcome | interaction(study, group),
>                             ~ 1 | row_id),
>        struct = c("UN", "UN"))
>
> Now, this model will recognize the repetition of the same levels of outcome.
>
> Sorry for the confusion,
> Reza
>
>
> On Thu, Oct 21, 2021 at 12:15 AM Reza Norouzian <rnorouzian using gmail.com> wrote:
> >
> > Dear Ivan,
> >
> > I leave question (B) to James or Wolfgang (or other list members).
> > Regarding question (A), I discuss three situations.
> >
> > First, your current model assumes that in each study, the same levels
> > of outcome don't repeat, something along the lines of:
> >
> > study outcome
> > 1 A
> > 1 B
> > 2 A
> > 2 B
> > 3 B
> > 4 A
> >
> > If your data has the above structure, then your current model seems
> > reasonable. It assumes that levels of outcome are correlated with one
> > another in each study across all studies.
> >
> > Since you have assumed a UN structure and a V matrix, your more
> > frequently occurring levels of outcome lend support to less frequently
> > occurring levels of outcome thereby improving the fixed coefficients
> > (in terms of bias) and the standard errors (in terms of magnitude) of
> > the less frequently occurring levels of outcome.
> >
> > Second, if your data structure is more along the lines of:
> >
> > study group outcome
> > 1 1 A
> > 1 1 B
> > 1 2 A
> > 1 2 B
> > 2 1 A
> > 2 1 B
> > 2 2 A
> > 2 2 B
> > 3 1 B
> > 4 1 A
> >
> > That is, only due to a particular "coded for" design-related feature
> > (e.g., some studies having more than one treatment group), you can
> > have the same levels of outcome (e.g., A) repeated in some studies,
> > then, you can try:
> >
> > rma.mv(yi, V, random = list(~ outcome | study,
> >                             ~ outcome | interaction(study, group)),
> >        struct = c("UN", "UN"))
> >
> > Or simplify the `struct =` (perhaps to "HCS" in case of overparameterization).
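> >
> > For instance, a sketch of that simplification, swapping the
> > group-level structure to heteroscedastic compound symmetry:
> >
> > rma.mv(yi, V, random = list(~ outcome | study,
> >                             ~ outcome | interaction(study, group)),
> >        struct = c("UN", "HCS"))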
> >
> > This second model assumes that in addition to the study-level
> > correlations between the levels of outcome, we can have separate
> > group-level correlations between the levels of outcome. This will then
> > recognize the repetition of the same levels of outcome due to the
> > existence of multi-group studies.
> >
> > A third situation might be that your data structure is exactly like
> > above (i.e., the same levels of outcome repeat in some studies) but
> > that you "haven't coded for" the design-related feature that has
> > caused that repetition, that is:
> >
> > study outcome row_id
> > 1 A 1
> > 1 B 2
> > 1 A 3
> > 1 B 4
> > 2 A 5
> > 2 B 6
> > 2 A 7
> > 2 B 8
> > 3 B 9
> > 4 A 10
> >
> > Then, you can try:
> >
> > rma.mv(yi, V, random = list(~ outcome | study, ~ 1 | row_id), struct = "UN")
> >
> > This last model shares the same study-level assumption as the
> > previous models, but it simply allows each level of outcome to be
> > heterogeneous (to have variation in it), accounting for the
> > repetitions of the same level of outcome.
> >
> > Kind regards,
> > Reza
> >
> >
> >
> > On Wed, Oct 20, 2021 at 10:46 PM Ivan Jukic <ivan.jukic using aut.ac.nz> wrote:
> > >
> > > Dear all,
> > >
> > > Let's say that one wants to perform a multivariate random-effects meta-regression where the data structure can be described as follows: 1) there are 2 outcomes; 2) there is a continuous moderator of interest; 3) all studies reported on both outcomes; and 4) most of the studies reported multiple effect sizes for at least one of the outcomes. This means that some participants, from certain groups and for a given outcome, provided data multiple times.
> > >
> > > Following the examples below (where 1 is extremely relevant)
> > >
> > > 1. https://www.jepusto.com/imputing-covariance-matrices-for-multi-variate-meta-analysis/
> > > 2. http://www.metafor-project.org/doku.php/analyses:berkey1998
> > > 3. https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2017-August/000097.html
> > >
> > > I would specify the model as follows:
> > >
> > > res <- rma.mv(yi = yi,
> > > V = V,
> > > data = dat,
> > > random = ~ outcome | study,
> > > method = "REML",
> > > test = "t",
> > > slab = study,
> > > struct = "UN",
> > > mods = ~ mod1*outcome)
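> > >
> > > Following example 1 above, the V matrix here can be built with,
> > > e.g., clubSandwich (a sketch, assuming dat has sampling variances
> > > vi and an illustrative within-study correlation of r = 0.7):
> > >
> > > library(clubSandwich)
> > > V <- impute_covariance_matrix(vi = dat$vi, cluster = dat$study, r = 0.7)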
> > >
> > > A) I'm wondering if this would account for the fact that there are multiple effect sizes coming from the same study for a given outcome? In a "regular" multilevel model, I would typically have study/es_id.
> > >
> > > B) In addition, is anyone aware of a predict function that could be used with robust estimates (e.g., after using coef_test from the clubSandwich package)? predict.rma.mv works wonderfully in combination with robust() from metafor, but I would like to take advantage of clubSandwich's "CR2", which should in principle lead to more accurate results in small samples.
> > >
> > > There is something similar that apparently works with robu package.
> > > https://rdrr.io/github/zackfisher/robumeta/src/R/predict.robu.R
> > >
> > > Thank you for your time,
> > > Ivan