[R-meta] terminologies of multilevel and multivariate model when accounting for correlated errors

Tue Oct 4 09:54:39 CEST 2022

Dear Yefeng,

Please see below for my comments.

Best,
Wolfgang

>-----Original Message-----
>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org] On
>Behalf Of Yefeng Yang
>Sent: Tuesday, 04 October, 2022 4:41
>To: r-sig-meta-analysis using r-project.org
>Subject: [R-meta] terminologies of multilevel and multivariate model when
>accounting for correlated errors
>
>Hi all (especially Wolfgang & James),
>
>My question: I am confused about whether we should still call a multilevel model
>with a VCV matrix accounting for sampling variances a multilevel model, OR whether
>we should call it a multivariate model.
>
>I elaborate on my questions as follows:
>
>For statistically dependent effect sizes, we usually have two[1] 'typical' models
>to deal with.
>
>  1.  For dependence due to multilevel/nested structure (one study contributes
>more than one effect size estimate), we usually use a multilevel model (with a
>nested random effect structure) to account for the non-independence if there are
>'overlapping individuals' (no correlated sampling errors).

If there are overlapping individuals (i.e., the same individuals are used in computing multiple effect size estimates), then the sampling errors *are* correlated, so I am a bit confused here.

So, let me assume for the moment that there are *no* overlapping individuals, but a study can still yield multiple effect size estimates (e.g., for different subgroups). Examples of this are:

https://www.metafor-project.org/doku.php/analyses:konstantopoulos2011
https://www.metafor-project.org/doku.php/analyses:crede2010

The model typically used is a multilevel model with 'random = ~ 1 | study/obs' as the random effects structure. However, note that we can reformulate this model into a multivariate parameterization with 'random = ~ obs | study', which is identical in fit (as long as the estimate of rho is > 0).

So, already, I would say the terminology is a bit arbitrary, since we could call this a multilevel or a multivariate model.
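
As a rough check of this equivalence, one could fit both parameterizations to the Konstantopoulos (2011) data linked above. This is only a sketch: it assumes the dat.konstantopoulos2011 dataset that is loaded together with metafor, where 'district' plays the role of 'study' and 'school' the role of 'obs', and it relies on the default compound-symmetric structure for the multivariate parameterization.

```r
library(metafor)

# assumption: dat.konstantopoulos2011 is available once metafor is loaded
dat <- dat.konstantopoulos2011

# multilevel parameterization: schools nested within districts
res.ml <- rma.mv(yi, vi, random = ~ 1 | district/school, data=dat)

# multivariate parameterization (default struct="CS"): schools within districts
res.mv <- rma.mv(yi, vi, random = ~ school | district, data=dat)

# the log-likelihoods should agree if the fits are identical
logLik(res.ml)
logLik(res.mv)

# the sum of the two variance components corresponds to tau^2
sum(res.ml$sigma2)
res.mv$tau2

# the implied intraclass correlation corresponds to rho
res.ml$sigma2[1] / sum(res.ml$sigma2)
res.mv$rho
```

If the two sets of values match (up to small numerical differences), the choice between the two labels is purely a matter of parameterization.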

>  2.  For dependence due to multivariate structure (one study contributes more
>than one response variable or outcome), we usually use a multivariate model (with
>a correlated random effect structure) to account for the non-independence. Also,
>we should use a variance-covariance matrix to account for the dependent
>sampling errors (either guessing within-study correlation or using formulas).

An example of this would be:

https://www.metafor-project.org/doku.php/analyses:berkey1998

This would be a 'classical' multivariate meta-analysis and I think most people would call it that.
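
For illustration, a minimal sketch of such a model, assuming the dat.berkey1998 dataset that is loaded together with metafor (with the sampling variances and covariances of the two outcomes stored in the columns 'v1i' and 'v2i'):

```r
library(metafor)

# assumption: dat.berkey1998 is available once metafor is loaded
dat <- dat.berkey1998

# build the block-diagonal V matrix from the 2x2 var-cov blocks of each trial
V <- bldiag(lapply(split(dat[, c("v1i", "v2i")], dat$trial), as.matrix))

# one fixed effect per outcome, plus correlated random effects for the
# two outcomes within trials (unstructured var-cov matrix)
res <- rma.mv(yi, V, mods = ~ outcome - 1,
              random = ~ outcome | trial, struct="UN", data=dat)
res
```

Here, V captures the correlated sampling errors, while 'struct="UN"' allows the true effects for the two outcomes to have different variances and to be correlated.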

>[1] robust variance estimation (RVE) is also a good approach to dealing with
>dependent effect sizes in terms of estimating fixed effects (overall effect
>intercept beta0 or moderator effect slope beta1).  The combination of the RVE
>with either multilevel or multivariate is also an elegant approach. But RVE is
>not the focus of my question.
>
>However, sometimes we want to use the multilevel model to deal with all types of
>dependence. By doing so, we reformulate the multivariate structure of the data
>as multilevel/nested data. I mean we: (1) use dummy codes to denote different
>types of response variables/outcomes, (2) impute or calculate a VCV matrix, and
>(3) fit a multilevel model.  Through steps (1) - (3), I account for all types of
>dependence: the correlations between true outcomes and between sampling errors.
>Not 100% sure, but this approach should work well.

An example along those lines would be (leaving aside the RVE stuff):

https://wviechtb.github.io/metadat/reference/dat.assink2016.html

or briefly:

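library(metafor)
dat <- dat.assink2016
# approximate V matrix, assuming a correlation of 0.6 among sampling errors within studies
V <- vcalc(vi, cluster=study, obs=esid, data=dat, rho=0.6)
# multilevel parameterization: estimates nested within studies
rma.mv(yi, V, mods = ~ deltype, random = ~ 1 | study/esid, data=dat)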

Again, this can be reformulated into:

rma.mv(yi, V, mods = ~ deltype, random = ~ esid | study, data=dat)

with identical fit. So, is this now a multilevel or multivariate model? I would say either term is fine. But the terms are so broad anyway that they communicate very little about what was actually done, so either way, one should provide further details (with respect to V and the random-effects structure).

>So here is my question: I use a multilevel model but I also use a VCV matrix. What
>should a multilevel model with a VCV matrix be called? Still a multilevel model, even
>though a multilevel model assumes independent sampling errors (but we have a VCV in
>the model)? Or should it be called a multivariate model, even though we only account
>for the correlated sampling errors and not the correlated random effects? Hope
>my question is clear.
>
>Best,
>
>Yefeng Yang PhD
>Research Associate
>UNSW, Sydney