[R-meta] Multivariate meta regression and predict for robust estimates

Ivan Jukic ivan.jukic at aut.ac.nz
Fri Oct 22 05:51:43 CEST 2021


Dear Reza,

Regarding the SATcoaching data, I see; since these (A), (B) identifiers suggest these are all different studies (coming from the same authors), what you're saying makes a lot of sense. I just didn't realise that these samples are independent. Thank you!

>> I don't know how your data exactly looks, but the equivalence between
the two models would mean that there is not much variation in the
group as a level. Therefore, in practice your second model reduces to
the third model. In other words, study-group combinations = row_ids.

1) I don't have groups; perhaps I misled you with the word "group" in my original post. I always have a continuous moderator of interest, two outcomes, and multiple effect sizes from the same study for at least one of the two outcomes. However, I understand exactly what you mean by this. While I wanted to keep this discussion rather conceptual, I'll provide an example of my dataset just to avoid potential confusion (see the bottom of this message).

2) By "study-group combinations" terminology to describe your row_id you just mean es_id, or number for each entry in the dataset (es_id = 1:n()), isn't it?

3) Maybe I confused you with my message. What I meant by saying the models are the same is that they both return *identical* outputs (all estimates). Actually, the second model *also* returns gamma and phi estimates, which are missing from the third model you described. The second model seems to be overparameterized because "Some combinations of the levels of the inner factor never occurred. Corresponding phi value(s) fixed to 0". This does not happen with the third model, of course. I guess I misspecified it due to confusion about the data structure?
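
(In case it's useful, this is roughly how I looked at those extra estimates; I'm assuming the fitted rma.mv object stores the second ~ inner | outer term's components in $gamma2 and $phi:)

# Sketch: inspect the extra components that only the second model has
second_model$gamma2  # variances for the inner factor of the second term
second_model$phi     # its correlations; never-observed combinations are fixed to 0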

>>... please take this answer as just providing some intuition/heuristics).

Of course! I really appreciate this conceptual level discussion, and you provided much more than I expected, so thank you very much!

study   mod1   outcome   es_id
1       15     1         1
1       15     1         2
1       15     0         3
1       15     0         4
1       30     1         5
1       30     1         6
1       30     0         7
1       30     0         8
...
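
(Just for concreteness, a minimal sketch of how this toy structure could be built in R; the column names are simply the ones from the example above:)

# Toy version of the structure above; es_id is just a running row number
library(dplyr)
dat <- data.frame(
  study   = 1,
  mod1    = rep(c(15, 30), each = 4),
  outcome = rep(c(1, 1, 0, 0), times = 2)
)
dat <- dat %>% mutate(es_id = 1:n())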

Cheers,
Ivan

----

From: Reza Norouzian <rnorouzian using gmail.com>
Sent: Friday, 22 October 2021 6:47 AM
To: Ivan Jukic <ivan.jukic using aut.ac.nz>
Cc: r-sig-meta-analysis using r-project.org <r-sig-meta-analysis using r-project.org>
Subject: Re: [R-meta] Multivariate meta regression and predict for robust estimates 
 
Hi Ivan,

Please see my answers inline.

>> With regards to the SATcoaching example, how so [how come the same level of test doesn't repeat in the same study]?

Note that studies in the SATcoaching data have (A), (B) suffixes; thus,
each of these is a different study. As a result, in the table of
test-level (co-)occurrences below, we can see that the number of
studies (outside parentheses) that contain each level of test (Math or
Verbal), or both levels (Math and Verbal), and the corresponding
number of effect sizes/rows (in parentheses) are the same in each
case:

  test         Math    Verbal
1 Math   29 (29)           -
2 Verbal 20 (20)   38 (38)

This means that we don't have repetition of the same levels of test
within the same study (this answers your next question as well). There
are 20 studies in which Verbal and Math occur together, each only
once. That's why there is no need for a further level beyond study,
and "random = ~ test | study" suffices.

>> You mean no repetition of the same level of outcome occurs within the same sample, perhaps?

See my answer above.

>> 1) The second and third models should effectively be the same, and they are, after adding what was missing to the second one (~ 1 | es_id). While the syntax of the third one makes a lot of sense, I'm struggling to understand the syntax of the second one, and ultimately, why are they the same?

second_model <- rma.mv(yi, V,
                       random = list(~ outcome | study,
                                     ~ outcome | interaction(study, group),
                                     ~ 1 | row_id),
                       struct = c("UN", "UN"))

third_model <- rma.mv(yi, V,
                      random = list(~ outcome | study,
                                    ~ 1 | row_id),
                      struct = "UN")

I don't know how your data exactly looks, but the equivalence between
the two models would mean that there is not much variation in the
group as a level. Therefore, in practice your second model reduces to
the third model. In other words, study-group combinations = row_ids.

You can roughly translate the two models, respectively, to:

~ 1 | study/group/outcome/row_id

~ 1 | study/outcome/row_id

Now, you can see that, if these two models are the same, it's due to
the group not effectively being a level.
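
If you want to see this empirically, you could put the two fits side by side (a sketch, using your own fitted objects from above):

# If group adds nothing as a level, the fit statistics of the two
# models should be (near-)identical
fitstats(second_model, third_model)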

>> 2) When you say "coded for" and "haven't coded for" the design-related feature(s) you are literally referring to having vs not having all "columns" related to study, groups, and outcomes properly aligned, right? I guess it's hard for me to relate as I always have these three together with es_id (or row_id, as you say) as a fourth one.

Yes. For example, studies can have various groups, outcomes, times of
measurement, independent samples of subjects, and so on.

Potentially, each of these features can be a column in the dataset. In
practice, however, accounting for all these sources of variation may
be a theoretical ideal, because rarely do so many studies
simultaneously possess all these features. If they did, you would
expect a very large dataset, where each feature accounts for the
variation within the feature that sits above it. Imagine, for example,
a study where each of the features mentioned above has only two
observed levels (e.g., just pre and post, just two outcomes, just
two...). You can imagine what happens to the number of rows in this
one study if some of these features have more than two observed
levels!
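
(As a purely hypothetical illustration of the row count:)

# Two observed levels per feature already gives 2 * 2 * 2 * 2 = 16
# rows for this single study
nrow(expand.grid(time = 1:2, outcome = 1:2, group = 1:2, sample = 1:2))  # -> 16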

So all of this means that without your exact data, RQs, goals,
context, and substantive expertise regarding the current state of the
literature/phenomenon under study, fitting these models wouldn't be
possible. (So, please take this answer as just providing some
intuition/heuristics.)



