[R-meta] Multivariate meta regression and predict for robust estimates

Reza Norouzian rnorouzian at gmail.com
Fri Oct 22 07:46:33 CEST 2021


With apologies for posting in HTML.

Yes, you seem to be in scenario three. Some systematic features of the
studies have led to the repetition of the same outcome levels in each
study, but these features are not coded in the data. A starting point
might be the following (to take advantage of the possible gain in
precision, etc., from modeling correlated random effects):

random = list(~ outcome | study, ~ 1 | es_id), struct = "UN"
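
Spelled out as a full call, a minimal sketch might look like this (yi, vi,
outcome, study, and es_id are assumed column names in your data):

library(metafor)

# Sketch: UN structure for outcome within study, plus a per-row effect
fit <- rma.mv(yi, vi,
              random = list(~ outcome | study, ~ 1 | es_id),
              struct = "UN", data = dat)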

Yes, es_id encapsulates all the unique characteristics (the combination of
the features I referred to in my previous post) that define a single row.
If you have already coded all of these features along the way, then
following them takes you right to a single row.

For example, imagine you have 40 studies, each with two treatment groups
(which you have coded for) and two outcomes (which you have also coded
for). Thus, you have 40 four-row studies. Here, knowing the study number,
the group number, and the outcome indicator gets you to a single row in
the data.

That is, the combination of study 1, group 1, outcome A gets you to the
first row, 1,1,B gets you to the second one, 1,2,A to the third one, and
1,2,B to the fourth one.
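
To see this concretely, here is a sketch of such a layout (hypothetical
data, no real effect sizes):

# Hypothetical 40-study layout: 2 groups x 2 outcomes = 4 rows per study
dat <- expand.grid(outcome = c("A", "B"), group = 1:2, study = 1:40)

# study + group + outcome already identifies each row uniquely:
nrow(dat) == nrow(unique(dat[c("study", "group", "outcome")]))  # TRUE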

In this case, using row_id is simply pointless, because you already have
everything needed to travel the distance from a study-level true effect
down to a single row's true effect.

If you want to play around with syntax, here is some data:

data <- read.csv("https://raw.githubusercontent.com/rnorouzian/s/main/in.csv")

These combinations are encapsulated in the "study/group/outcome" piece in
the following model:

(m1 <- rma.mv(yi, vi, random = list(~1|study/group/outcome), data = data))

Now, if you force row_id onto the model above, you'll see that row_id
fills no new role here; it just takes over the one already occupied by the
last piece, i.e., .../outcome, which essentially holds the GPS location of
each single row.

(m2 <- rma.mv(yi, vi, random = list(~ 1 | study/group/outcome/row_id),
              data = data))

Thus, this also means that if we remove outcome and instead use row_id, we
exactly reproduce m1:

(m3 <- rma.mv(yi, vi, random = list(~1|study/group/row_id), data = data))
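
One quick (heuristic, not definitive) check of the equivalence is to
compare the fits:

# m1 and m3 should yield identical fits; m2 should add nothing beyond m1:
logLik(m1); logLik(m2); logLik(m3)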

Kind regards,
Reza


On Thu, Oct 21, 2021 at 10:51 PM Ivan Jukic <ivan.jukic using aut.ac.nz> wrote:

> Dear Reza,
>
> Regarding SATcoaching, I see; since these (A), (B) identifiers suggest
> that these are all different studies (coming from the same authors), what
> you're saying makes a lot of sense. I just didn't realise that these
> samples are independent. Thank you!
>
> >> I don't know how your data exactly looks, but the equivalence between
> the two models would mean that there is not much variation in the
> group as a level. Therefore, in practice your second model reduces to
> the third model. In other words, study-group combinations = row_ids.
>
> 1) I don't have groups, perhaps I misled you with the word "group" that I
> used in my original post. I always have a continuous moderator of interest,
> two outcomes, and multiple effect sizes from the same study for at least
> one of the two outcomes. However, I understand exactly what you mean by
> this. While I wanted to keep this discussion rather conceptual, I'll
> provide an example of my dataset just to avoid potential confusion (see at
> the bottom).
>
> 2) By "study-group combinations" terminology to describe your row_id you
> just mean es_id, or number for each entry in the dataset (es_id = 1:n()),
> isn't it?
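>
> For reference, a sketch of how I create it (assuming a data frame named
> dat; base R equivalent: dat$es_id <- seq_len(nrow(dat))):
>
> library(dplyr)
> dat <- dat %>% mutate(es_id = 1:n())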
>
> 3) Maybe I confused you with my message; what I meant by the models being
> the same is that they both return *identical* outputs (all estimates).
> Actually, the second model *also* returns gamma and phi estimates, which
> are missing from the third model you described. The second model seems to
> be overparameterized, because "Some combinations of the levels of the
> inner factor never occurred. Corresponding phi value(s) fixed to 0". This
> does not happen with the third model, of course. I guess I misspecified it
> due to confusion about the data structure?
>
> >>... please take this answer as just providing some intuition/heuristics).
>
> Of course! I really appreciate this conceptual-level discussion, and you
> provided much more than I expected, so thank you very much!
>
> study   mod1   outcome   es_id
>     1     15         1       1
>     1     15         1       2
>     1     15         0       3
>     1     15         0       4
>     1     30         1       5
>     1     30         1       6
>     1     30         0       7
>     1     30         0       8
> .
> .
> .
>
> Cheers,
> Ivan
>
> ----
>
> From: Reza Norouzian <rnorouzian using gmail.com>
> Sent: Friday, 22 October 2021 6:47 AM
> To: Ivan Jukic <ivan.jukic using aut.ac.nz>
> Cc: r-sig-meta-analysis using r-project.org <r-sig-meta-analysis using r-project.org>
> Subject: Re: [R-meta] Multivariate meta regression and predict for robust
> estimates
>
> Hi Ivan,
>
> Please see my answers inline.
>
> >> With regards to the SATcoaching example, how so [how come the same
> level of test doesn't repeat in the same study]?
>
> Note that the studies in the SATcoaching data have (A), (B) suffixes;
> thus, each of these is a different study. As a result, in the table of
> test-level (co-)occurrences below, we can see that the number of studies
> (outside parentheses) containing each level of test (Math or Verbal) or
> both levels (Math and Verbal), and the corresponding number of effect
> sizes/rows (in parentheses), are the same:
>
>   test       Math     Verbal
> 1 Math      29 (29)        -
> 2 Verbal    20 (20)  38 (38)
>
> This means that we don't have repetition of the same levels of test
> within the same study (this answers your next question as well). There
> are 20 studies in which Verbal and Math occur together, each exactly
> once. That's why there is no need for a further level beyond study, and
> "random = ~ test | study" suffices.
>
> >> You mean no repetition of the same level of outcome occurs within the
> same sample, perhaps?
>
> See my answer above.
>
> >> 1) The second and third models should effectively be the same, and they
> are, after adding what was missing to the second one (~ 1 | es_id). While
> the syntax of the third one makes a lot of sense, I'm struggling to
> understand the syntax of the second one, and ultimately, why are they the
> same?
>
> second_model <- rma.mv(yi, V, random = list(~ outcome | study,
>   ~ outcome | interaction(study, group), ~ 1 | row_id),
>   struct = c("UN", "UN"))
>
> third_model <- rma.mv(yi, V, random = list(~ outcome | study,
>   ~ 1 | row_id), struct = "UN")
>
> I don't know exactly how your data look, but the equivalence between the
> two models would mean that there is not much variation in group as a
> level. Therefore, in practice, your second model reduces to the third
> model. In other words, study-group combinations = row_ids.
>
> You can roughly translate the two models, respectively, to:
>
> ~ 1 | study/group/outcome/row_id
>
> ~ 1 | study/outcome/row_id
>
> Now you can see that, if these two models are the same, it's because
> group is not effectively a level.
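>
> In rma.mv syntax, rough counterparts would be (a sketch; yi, V, and the
> ID columns are assumed from your description):
>
> rma.mv(yi, V, random = ~ 1 | study/group/outcome/row_id, data = dat)
> rma.mv(yi, V, random = ~ 1 | study/outcome/row_id, data = dat)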
>
> >> 2) When you say "coded for" and "haven't coded for" the design-related
> feature(s) you are literally referring to having vs not having all
> "columns" related to study, groups, and outcomes properly aligned, right? I
> guess it's hard for me to relate as I always have these three together with
> es_id (or row_id, as you say) as a fourth one.
>
> Yes; e.g., studies can have various groups, outcomes, times of
> measurement, independent samples of subjects, and so on.
>
> Potentially, each of these features can be a column in the dataset. In
> practice, however, accounting for all of these sources of variation may
> be a theoretical ideal, because rarely do so many studies simultaneously
> possess all of these features. If they did, you would expect a very large
> dataset where each feature accounts for the variation within the feature
> that sits above it. Imagine one example study where each of the features
> mentioned above has only two observed levels (e.g., just pre and post,
> just two outcomes, just two...). You can imagine what happens to the
> number of rows in this one study if some of these features have more than
> two observed levels!
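>
> As a quick sketch of the arithmetic, with two observed levels each for
> group, outcome, time, and sample, one study alone contributes
>
> prod(c(groups = 2, outcomes = 2, times = 2, samples = 2))  # 16 rows
>
> and each additional observed level multiplies that count further.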
>
> So all of this means that without your exact data, RQs, goals, context,
> and substantive expertise regarding the current state of the
> literature/phenomenon under study, fitting these models wouldn't be
> possible. (So, please take this answer as just providing some
> intuition/heuristics.)
>
>
