[R-meta] Covariance matrix specification for multiple levels + specifying random slopes model

Fri Feb 24 02:37:07 CET 2023

Hi Reza and msg board,

Thanks for your detailed reply. Here is an illustration of the type of data structure we have 264 unique es_ids :

timepoint = continuous measure ranging from 0 - ~120 (i.e. months since surgery in this case)
cohort = 117 unique cohorts, effectively the same as ’study' but some cohorts report follow-ups (i.e. different timepoints) over multiple studies, hence accounting for their non-independence
outcome = 3 level factor. Most cohorts only report one outcome, but some will report 2 or 3 (correlated measures), over time.

My effect measure is the RoM / RR.

Ok so for use of vcalc I have specified the following - confirming this would be accurate?:

vcalc(vi = vi,
           cluster = cohort,
           time1 = timepoint,
           obs = outcome,
           rho = 0.6, # estimated correlation between outcomes (based on previous research)
           phi = 0.85, # estimated correlation between timepoints (based on our own data)
           data = data_quad,
           checkpd = TRUE)

Prior to this I had used impute_covariance_matrix at the ‘highest level of clustering’ as was discussed by James and others in this answer https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2019-March/001485.html, but I realise this is more approximate (less accurate?) than the approach above. Interestingly when I use the approach above, and I check for positive-definite in vcalc it passes fine, however when using it in rma.mv it then flags that the matrix is not pd. I checked it with is.positive.definite from matrixcalc also and it was fine. Any ideas what might be causing this?

Thanks for your discussion on the different set ups, I think for our questions these approaches are less relevant but I appreciate you taking the time to make these suggestions, it’s certainly interesting to think about.

Thanks for your patience with me, I realise these types of questions are very common on this board, and have spent a fair bit of time looking through the archives, however wanted to make sure that this was appropriate for our approach and effect measure. I am very grateful for everyone’s help!

Mick

On 23 Feb 2023, at 19:13, Reza Norouzian <rnorouzian using gmail.com<mailto:rnorouzian using gmail.com>> wrote:

Dear Mick,

As mentioned in your previous post, it might be helpful to provide a bit of data structure (e.g., `subset(data_quad, cohort %in% unique(cohort)[1:2])[c("cohort","timepoint","outcome",...)]` or similar) and/or some additional context. Otherwise, any advice will primarily be based on your syntax.

Regarding the V matrix, Wolfgang has a flexible `vcalc()` function that he can hopefully help you with.

Please see my responses inline regarding your model-related queries:

>>>>> At the moment we landed on a potential model specification looking like this:

rma.mv<http://rma.mv>(yi, V,
                random = list( ~ timepoint | interaction(cohort, outcome), ~ outcome | cohort),
                mods = ~ outcome : log(timepoint),
                data = data_quad,
                struct = c("CAR", "HCS”))

Yes, if you want to borrow strength from other (more frequently co-occurring) levels of outcome in cohorts, it is useful to use a single model and let the outcome levels (co)vary according to some covariational structure within cohorts (ideally "UN" but if not possible, perhaps a simpler structure like "HCS" etc.).

However, regardless, the above model would likely require a bit of modification to improve interpretation. Specifically, since you are using "CAR" to measure the effect of timepoint in cohorts, it would be wiser to distinguish between within-cohort effect of timepoint and between-cohort effect of timepoint by first aggregating the timepoint at the cohort level to enable capturing between-cohort changes in timepoint:

data_quad$timepoint_Bet  <- with(data_quad, ave(timepoint, cohort))

And subtracting those aggregates from the timepoints to enable capturing within-cohort changes in timepoint:

data_quad$timepoint_Wit <- with(data_quad, timepoint - timepoint_Bet)

And then refitting your initial model as:

rma.mv<http://rma.mv>(yi, V,
       random = list( ~  timepoint_Wit | interaction(cohort, outcome), ~ outcome | cohort),
       mods = ~ log(timepoint_Bet) + log(timepoint_Wit):outcome,
       data = data_quad,
       struct = c("CAR", "HCS"))

Depending on your research question/study goals, you may then want to interpret the fixed effect of timepoint_Wit to your audience to show the effect of timepoint within the cohorts (typically of immediate substantive interest) and/or the cohort-aggregated effect of timepoint (timepoint_Bet) which typically is not of immediate substantive interest (but to remain in the model to help reduce the bias in the other model coefficients).

>>>>> Another approach we considered was using a random slopes and intercept model, with a random slope for outcome (3 levels). Would this just be a case of changing the struct to “GEN” for the 2nd parameter? But then that is outcomes nested within cohorts. So it just needs to be 1 | outcome then?

The "GEN" structure for your *outcome* variable is essentially a more general form of "UN". For example, compare the equivalency between the following two models:

(res1 <- rma.mv<http://rma.mv>(yi~ outcome + 0, vi, data =  dat.berkey1998,
                random = ~ outcome | trial, struct = "UN"))

(res2 <- rma.mv<http://rma.mv>(yi~ outcome + 0, vi, data =  dat.berkey1998,
                random = ~ outcome + 0 | trial, struct = "GEN"))

fitstats(res1, res2)

So, fitting a "GEN" structure just like a "UN" structure might be a bit difficult in practice, which I'm assuming was why you opted for a simpler "HCS" structure. So, your current HCS structure which nests the outcomes in cohorts should work as long as you have checked its performance.

That said, you could *in theory* add a GEN structure to capture the variation in the intercepts and slopes of cohort-specific regression lines fit onto the true effects in cohorts, but that likely will be an overfit, plus rma.mv<http://rma.mv>() won't support three *~inner | outer* random terms in a single model:

## DON'T RUN:
rma.mv<http://rma.mv>(yi, V,
       random = list( ~  timepoint_Wit | interaction(cohort, outcome), ~  timepoint_Wit | interaction(cohort, outcome), ~ outcome | cohort),
       mods = ~  log(timepoint_Bet) + log(timepoint_Wit):outcome,
       data = data_quad,
       struct = c("GEN","CAR", "HCS"))

Please think of these discussions as simply suggestions primarily based on your original syntax.

Kind regards,
Reza

On Wed, Feb 22, 2023 at 7:00 PM Mick Girdwood via R-sig-meta-analysis <r-sig-meta-analysis using r-project.org<mailto:r-sig-meta-analysis using r-project.org>> wrote:
Hi everyone,

Thank you in advance for your help, the advice from this board is always friendly and indispensable.

I am conducting a longitudinal meta-analysis, (thank you Reza for your advice relating to a previous question to do with this https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2023-January/004349.html<https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2023-January/004349.html> ). I have two things I was hoping you could help me with. Similar to this previous question here https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2021-April/002794.html<https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2021-April/002794.html>

At the moment I had planned to use the 'longitudinal meta-analysis' per measure (outcome) of interest. However it occurred to our team that we have many similar measures of the same outcome (in this case quadriceps muscle strength). We wondered whether it would be better to use all this information in one analysis (instead of running separately for each, and use each measure to borrow strength). The reason being that the different measures are correlated with each other (as well as the timepoints). At the moment we landed on a potential model specification looking like this:

rma.mv<http://rma.mv>(yi, V,
                random = list( ~ timepoint | interaction(cohort, outcome), ~ outcome | cohort),
                mods = ~ outcome : log(timepoint),
                data = data_quad,
                struct = c("CAR", "HCS”))

We have 3 different levels for outcome and timepoint is a continuous measure (hence CAR)

1. The thing I’m struggling with is how to get an appropriate V covariance matrix. Previously before including outcome I had used impute_covariance_matrix() with ‘cohort’ as the cluster. But now there is another level of correlated effects to consider - outcome. How can I account for this? Or should I just fix phi as the correlation parameter for the 2nd random component relating to outcome?

2. Another approach we considered was using a random slopes and intercept model, with a random slope for outcome (3 levels). Would this just be a case of changing the struct to “GEN” for the 2nd parameter? But then that is outcomes nested within cohorts. So it just needs to be 1 | outcome then?

Thank you so much for your help, I am indebted!

Mick Girdwood
La Trobe University | Australia

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-meta-analysis mailing list @ R-sig-meta-analysis using r-project.org<mailto:R-sig-meta-analysis using r-project.org>
To manage your subscription to this mailing list, go to:
https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis<https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis>

	[[alternative HTML version deleted]]