[R-meta] Coding Longitudinal Studies

Sun Sep 12 00:51:20 CEST 2021

Dear Danielle,

The issues you have inquired about have come up multiple times on the
mailing list archived at:
https://stat.ethz.ch/pipermail/r-sig-meta-analysis/. For example, I
found this: https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2021-August/003028.html
to be a very helpful resource.

So I will address your questions (marked by >>>>) in more general
terms.

>>>> 1- I was also wondering how I code that in some studies they have independent cohorts performing different exercise treatment vs some studies the same cohort performs different exercise treatments. Would you have a second random effect nesting the groups within each study?

As a starting point, you would want to create columns that give each
cohort (cohort) and treatment (treat) in each study a distinguishable
id. Then, in each study, you can compute an effect size (yi) comparing
a treatment versus a control for each cohort at each time point
(time).

For two hypothetical single-post-test studies, one with one cohort and
the other with two cohorts, your dataset might look like:

   study cohort treat time           comparison         yi
1      1      1     1    0 treatment vs control  0.7394220
2      1      1     1    1 treatment vs control  0.2249452
3      1      1     2    0 treatment vs control  0.6425390
4      1      1     2    1 treatment vs control  1.2338803
5      2      1     1    0 treatment vs control -1.1074885
6      2      1     1    1 treatment vs control  0.6196865
7      2      1     2    0 treatment vs control  0.3012036
8      2      1     2    1 treatment vs control  0.1582372
9      2      2     1    0 treatment vs control  1.1909753
10     2      2     1    1 treatment vs control -0.5343208
11     2      2     2    0 treatment vs control  0.1612554
12     2      2     2    1 treatment vs control  0.9449014

That's how you code such studies.
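
As a minimal sketch of how such a layout could be assembled in base R
(the column names cohort_id, treat_id, and obs_id are my own
illustrative choices, and the yi values are random placeholders, not
real effect sizes):

```r
# Hypothetical design: study 1 has one cohort, study 2 has two cohorts.
# Every cohort gets two treatments, each measured at two time points.
design <- data.frame(study = c(1, 2, 2), cohort = c(1, 1, 2))

# Cross each study-cohort pair with the treatments and time points
# (expand.grid varies its first argument fastest, so time cycles first).
grid <- expand.grid(time = 0:1, treat = 1:2, row = seq_len(nrow(design)))

dat <- data.frame(
  study  = design$study[grid$row],
  cohort = design$cohort[grid$row],
  treat  = grid$treat,
  time   = grid$time
)

# ids that are distinguishable across the whole dataset (assumed names)
dat$cohort_id <- interaction(dat$study, dat$cohort, drop = TRUE)
dat$treat_id  <- interaction(dat$study, dat$cohort, dat$treat, drop = TRUE)
dat$obs_id    <- seq_len(nrow(dat))

# Placeholder effect sizes for illustration only
set.seed(1)
dat$yi <- rnorm(nrow(dat), mean = 0.4, sd = 0.6)

print(dat)
```

The interaction() calls do the same job as the interaction(study,
cohort, ...) terms used inside random = ...; creating the columns up
front just makes the nesting easier to inspect.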

*IF* you end up with a large dataset in which many studies have
multiple cohorts, each with multiple treatments and multiple time
points, then the variation in effects can arise at several nested
levels. Variation between studies may be due in part to variation
among cohorts within studies. Variation among cohorts within studies
may in turn be due to variation among treatments within those cohorts.
And variation among treatments within cohorts may be due to the timing
of their measurements (or, equivalently, to the unique underlying
differences that define each combination of study, cohort, treatment,
and time). *THEN* all such sources of variation may be modeled as
random effects.

A nearly utopian model for that given the limits of rma.mv() might be:

rma.mv(yi ~ cohort*treat*time, V = Some_V_matrix,
       random = list(~ time | study,
                     ~ time | interaction(study, cohort),
                     ~ 1 | interaction(study, cohort, treat),
                     ~ 1 | interaction(study, cohort, treat, time)),
       struct = c("UN","UN"))

This model can "give" you the average true effect for each
cohort-treatment-time combination. That is, it answers the question:
how does each type of treatment effect in each cohort change over time
across the studies?

This model can also "allow" the studies with a more complete set of
post-tests to "fill in the gaps" for studies with a smaller or
incomplete set of post-tests, thereby improving the estimates of the
average true effects of cohort-treatment-time combinations in general
(and the respective estimates of their heterogeneity).
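
Some_V_matrix in the model above is left unspecified. As one possible
sketch, assuming that sampling errors within a study share a single
constant correlation (rho = 0.6 is an arbitrary value, build_V is a
hypothetical helper, and the vi values are made up), a block-diagonal
V could be built in base R like this:

```r
# Illustrative data: study 1 contributes 3 effect sizes, study 2 contributes 2
dat <- data.frame(
  study = c(1, 1, 1, 2, 2),
  vi    = c(0.04, 0.05, 0.06, 0.03, 0.07)   # made-up sampling variances
)
rho <- 0.6  # ASSUMED within-study correlation of sampling errors

# Hypothetical helper: sampling variances on the diagonal,
# rho * sqrt(vi_i * vi_j) for pairs in the same cluster, 0 otherwise
build_V <- function(vi, cluster, rho) {
  V <- matrix(0, nrow = length(vi), ncol = length(vi))
  for (cl in unique(cluster)) {
    idx   <- which(cluster == cl)
    s     <- sqrt(vi[idx])
    block <- rho * outer(s, s)
    diag(block) <- vi[idx]
    V[idx, idx] <- block
  }
  V
}

Some_V_matrix <- build_V(dat$vi, dat$study, rho)
print(round(Some_V_matrix, 4))
```

Recent versions of metafor also provide vcalc() for this purpose,
which can handle more elaborate dependence patterns. Either way, the
assumed correlation is a guess, so it is worth checking how sensitive
the results are to it.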

With real-world data, you may not have so many sources of variation
(or they may be negligible). Two general simplification strategies
are (1) dropping the lower ends of the hierarchy and/or (2) modifying
the structure of the correlated random effects (if any). These two
strategies lead to the formation of *many* models.

A few examples include:

rma.mv(yi ~ cohort*treat*time, V = Some_V_matrix,
       random = list(~ time | study,
                     ~ time | interaction(study, cohort),
                     ~ 1 | interaction(study, cohort, treat)),
       struct = c("UN","UN"))

rma.mv(yi ~ cohort*treat*time, V = Some_V_matrix,
       random = list(~ time | study,
                     ~ time | interaction(study, cohort),
                     ~ 1 | interaction(study, cohort, treat)),
       struct = c("HAR","HAR"))
.
.
.
The goal is to understand the assumptions, fit all these models, and
compare their fit to the data at hand to choose one (or more) among
them.
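
To make the "fit and compare" step concrete, here is a small sketch on
simulated data. Everything about the data is an assumption made for
illustration: 30 hypothetical studies, three time points each, made-up
variances, and sampling errors treated as independent (so vi is passed
instead of a full V matrix). Only two candidate structures are
compared; with real data the candidate set would be larger.

```r
library(metafor)

set.seed(42)
k   <- 30                                  # hypothetical number of studies
dat <- expand.grid(time = factor(1:3), study = 1:k)
n   <- nrow(dat)

u_study <- rnorm(k, mean = 0, sd = 0.3)    # study-level random effects
dat$vi  <- runif(n, 0.02, 0.08)            # made-up sampling variances
dat$yi  <- 0.4 + u_study[dat$study] +
  rnorm(n, 0, 0.15) +                      # time-within-study deviations
  rnorm(n, 0, sqrt(dat$vi))                # sampling error

# Two candidate structures for the time-within-study random effects
fit_cs  <- rma.mv(yi, vi, mods = ~ time, random = ~ time | study,
                  struct = "CS",  data = dat)
fit_har <- rma.mv(yi, vi, mods = ~ time, random = ~ time | study,
                  struct = "HAR", data = dat)

# Side-by-side fit statistics (REML fits with identical fixed effects)
fits <- list(CS = fit_cs, HAR = fit_har)
comparison <- data.frame(
  logLik = sapply(fits, function(f) as.numeric(logLik(f))),
  AIC    = sapply(fits, AIC)
)
print(comparison)

# Likelihood ratio test: CS is nested within HAR (equal variances)
lrt <- anova(fit_har, fit_cs)
print(lrt)
```

Because both models have the same fixed effects, comparing their REML
fits is reasonable; models that differ in their fixed effects would
need method = "ML" instead.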

>>>> 2- Based on my reading I think I would code the random as Time|Study, struct = "AR".

It depends on the data. Please see my previous answer.

>>>> 3- This would allow observations from different studies to be independent (Study), but observations from within the same studies be dependent (Time). Is this correct?

Yes.

>>>> 4- My last question is regarding the difference in coding the random effect as ~1|Time/Study and ~Time|Study?

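I think Wolfgang has discussed this elsewhere
(https://www.metafor-project.org/doku.php/analyses:konstantopoulos2011).
In short, ~ 1 | Study/Time (note the order: time points nested within
studies, not Time/Study) is a reparametrization of ~ Time | Study,
struct = "CS", as long as the estimated correlation is not negative.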
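
A quick way to see this reparametrization is with the
dat.konstantopoulos2011 dataset that accompanies metafor, where
district and school stand in for Study and Time (that mapping is my
analogy, not part of the data). The sketch below fits both forms and
compares them:

```r
library(metafor)

# Schools nested in districts; the dataset is available once metafor is loaded
dat <- dat.konstantopoulos2011

# Nested form: random effects for district and for school within district
fit_nested <- rma.mv(yi, vi, random = ~ 1 | district/school, data = dat)

# Multivariate form with a compound-symmetric structure
fit_cs <- rma.mv(yi, vi, random = ~ factor(school) | district,
                 struct = "CS", data = dat)

# The log-likelihoods should agree up to optimizer tolerance
print(c(nested = as.numeric(logLik(fit_nested)),
        cs     = as.numeric(logLik(fit_cs))))

# tau^2 and rho implied by the two nested variance components
tau2_implied <- sum(fit_nested$sigma2)
rho_implied  <- fit_nested$sigma2[1] / tau2_implied

print(c(tau2 = fit_cs$tau2, tau2_implied = tau2_implied))
print(c(rho  = fit_cs$rho,  rho_implied  = rho_implied))
```

The nested form cannot represent a negative correlation, since a
variance component cannot be negative, which is why the two forms
coincide only when the estimated rho is at least zero.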

Best,
Reza

On Fri, Sep 10, 2021 at 4:27 PM Danielle Hiam
<danielle.hiam using deakin.edu.au> wrote:
>
> Hello,
>
> I am seeking some clarification on longitudinal studies and coding the random effect using rma.mv.
>
> For context the studies have repeated measures across time and some studies have multiple treatments (exercise in my case). Further, some of the studies have an independent cohort performing the different exercise treatments, others use the same cohort to perform different exercise treatments. I am using the fold change (FC) in expression from baseline for each timepoint and the SEM of the FC. I would like to look at the Fold Change in expression across all cohorts and timepoints and amount of heterogeneity amongst the studies. Then I will investigate with moderators in a meta-regression to investigate sources of this heterogeneity.
>
> I have a couple of basic questions regarding the coding
>
>   1.  Based on my reading I think I would code the random as Time|Study, struct = "AR". This would allow observations from different studies to be independent (Study), but observations from within the same studies be dependent (Time). Is this correct?
>   2.  I was also wondering how I code that in some studies they have independent cohorts performing different exercise treatment vs some studies the same cohort performs different exercise treatments.  Would you have a second random effect nesting the groups within each study?
>   3.  My last question is regarding the difference in coding the random effect as ~1|Time/Study and ~Time|Study?
>
> Any help or guidance would be greatly appreciated
> Kind regards,
> Danielle