[R-meta] Difference between univariate and multivariate parameterization

Fri Aug 20 16:06:08 CEST 2021

Ah!! "sampleinstudy" just so happens to be equivalent to a "row_id" **in
this particular dataset**. And in this particular case, the second random
term (~ multsample | sampleinstudy) in reality is modeling the row_id
(within-study heterogeneity).

To help people reading this (n = 0;-), when you say, "just like in the
standard multilevel structure", you mean a standard 3-level model of the
form (~1|study_id/row_id).
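
For anyone who wants to check this on their own data, here is a minimal
sketch in base R. It is not from Wolfgang's message: the data frame simply
retypes the toy table quoted below, and the names rows_per_sample,
is_row_id, n_rows and flag_ok are my own. It only confirms that there is one
row per sampleinstudy value, so the inner grouping variable behaves like a
row_id, and that multsample flags exactly the studies with more than one
row.

```r
# Toy table retyped from the quoted message below (values as posted there)
dat <- data.frame(
  studyid       = c(1, 1, 2, 3, 3, 3, 4, 5, 5),
  sampleinstudy = 1:9,
  multsample    = c(1, 1, 0, 1, 1, 1, 0, 1, 1)
)

# If every sampleinstudy value occurs exactly once, the term
# ~ multsample | sampleinstudy gives each row its own random effect
rows_per_sample <- table(dat$sampleinstudy)
is_row_id <- all(rows_per_sample == 1)
is_row_id

# multsample should be 1 exactly for studies that contribute > 1 row
n_rows  <- ave(dat$studyid, dat$studyid, FUN = length)
flag_ok <- all((n_rows > 1) == (dat$multsample == 1))
flag_ok
```

If is_row_id came back FALSE on a real dataset, the inner term would no
longer reduce to a row-level effect and the interpretation above would not
carry over as is.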

I guess the one other time that this confusion happened to me was when I
was looking at "dat.konstantopoulos2011", demonstrating that the data
structure should always take precedence over the syntax!

Super clear and helpful as always,
Luke

On Fri, Aug 20, 2021 at 8:18 AM Viechtbauer, Wolfgang (SP) <
wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:

> For reference, we are discussing this:
>
> list(~ 1 | studyid, ~ multsample | sampleinstudy), struct="DIAG"
>
> where the data structure is like this:
>
> studyid  sampleinstudy  multsample
> 1        1              1
> 1        2              1
> 2        3              0
> 3        4              1
> 3        5              1
> 3        6              1
> 4        7              0
> 5        8              1
> 5        9              1
>
> ~ 1 | studyid adds a random effect corresponding to the study level. This
> is to account for 'between-study heterogeneity'.
>
> ~ multsample | sampleinstudy adds a random effect to the sampleinstudy
> level. For rows where sampleinstudy is the same, rows where multsample = 0
> versus 1 would get different but correlated random effects. However, since
> there is just one row per sampleinstudy, this never happens. So, each row
> is getting its own random effect (just like in the standard multilevel
> structure). With struct="DIAG", we allow for a different tau^2 for
> multsample = 0 versus 1. So this models 'within-study heterogeneity' and
> allows this variance component to differ for single versus multisample
> studies (and one can then constrain the former to 0 if one likes).
>
> Best,
> Wolfgang
>
> >-----Original Message-----
> >From: Luke Martinez [mailto:martinezlukerm using gmail.com]
> >Sent: Friday, 20 August, 2021 14:37
> >To: Viechtbauer, Wolfgang (SP)
> >Cc: Farzad Keyhan; R meta
> >Subject: Re: [R-meta] Difference between univariate and multivariate
> >parameterization
> >
> >Dear Wolfgang,
> >
> >Many thanks.
> >
> >>>>> "In res5, the two tau^2 values can be thought of as sigma^2_within
> >for single vs multi sample studies."
> >
> >I believe my question was why/how in res5 (and res4) models, tau^2 values
> >represent only sigma^2_within?
> >
> >Is it because we have eliminated the off-diagonal elements (by
> >struct="DIAG") in "~ multsample | sampleinstudy" or because we have
> >previously defined the sigma^2_between studies using "~ 1 | studyid" and
> >thus tau^2 values in "~ multsample | sampleinstudy" can't represent
> >anything other than sigma^2_within samples nested in studies?
> >
> >I appreciate your clarification,
> >Luke
> >
> >PS. On the other hand, my understanding is that "sigma^2_between" and
> >"sigma^2_within" are unique to each grouping variable so we can have
> >"sigma^2_between_studies" and "sigma^2_between_study_sample_combinations"
> >and the same is true for "sigma^2_withins".
> >
> >On Fri, Aug 20, 2021 at 6:31 AM Viechtbauer, Wolfgang (SP)
> ><wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
> >Dear Luke,
> >
> >tau^2 doesn't mean the same thing across different models. In res5, the
> >two tau^2 values can be thought of as sigma^2_within for single vs multi
> >sample studies. Whether we call something tau^2, sigma^2, or chicken^2
> >doesn't carry any inherent meaning.
> >
> >For example:
> >
> >dat <- dat.crede2010
> >dat <- escalc(measure="ZCOR", ri=ri, ni=ni, data=dat,
> >              subset=criterion=="grade")
> >
> >dat$studyid.copy <- dat$studyid
> >dat$sampleid.copy <- paste0(dat$studyid, ".", dat$sampleid)
> >rma.mv(yi, vi, random = ~ 1 | studyid/sampleid, data=dat)
> >rma.mv(yi, vi, random = list(~ studyid | studyid.copy, ~ sampleid |
> >sampleid.copy), struct=c("ID","ID"), data=dat)
> >
> >are identical models, but in the first we have two sigma^2 values and in
> >the other we have tau^2 and gamma^2 (a bit of a silly example, but just to
> >illustrate the point).
> >
> >Best,
> >Wolfgang
> >
> >>-----Original Message-----
> >>From: Luke Martinez [mailto:martinezlukerm using gmail.com]
> >>Sent: Thursday, 19 August, 2021 5:05
> >>To: Viechtbauer, Wolfgang (SP)
> >>Cc: Farzad Keyhan; R meta
> >>Subject: Re: [R-meta] Difference between univariate and multivariate
> >>parameterization
> >>
> >>Dear Wolfgang,
> >>
> >>Thanks for your reply. But, if in the multivariate specification: tau^2 =
> >>sigma^2_between + sigma^2_within, then in your suggested "res5" model
> >>where you fixed tau2 = 0 for single sample studies, you have killed both
> >>sigma^2_between and sigma^2_within, and not just sigma^2_within?
> >>
> >>Am I missing something?
> >>
> >>Thank you very much,
> >>Luke
> >>
> >>On Wed, Aug 18, 2021 at 3:01 PM Viechtbauer, Wolfgang (SP)
> >><wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
> >>It is also possible to formulate a model where sigma^2_within is *not*
> >>added for 'single sample/estimate studies'. Let's consider this example:
> >>
> >>library(metafor)
> >>
> >>dat <- dat.crede2010
> >>dat <- escalc(measure="ZCOR", ri=ri, ni=ni, data=dat,
> >>              subset=criterion=="grade")
> >>
> >>table(dat$studyid) # most studies are single sample studies
> >>
> >># multilevel model
> >>res1 <- rma.mv(yi, vi, random = ~ 1 | studyid/sampleid, data=dat)
> >>res1
> >>
> >># multivariate parameterization
> >>res2 <- rma.mv(yi, vi, random = ~ factor(sampleid) | studyid, data=dat)
> >>res2
> >>
> >># as a reminder, the multilevel model is identical to this formulation
> >>dat$sampleinstudy <- paste0(dat$studyid, ".", dat$sampleid)
> >>res3 <- rma.mv(yi, vi, random = list(~ 1 | studyid, ~ 1 | sampleinstudy),
> >>               data=dat)
> >>res3
> >>
> >># logical to indicate for each study whether it is a multi sample study
> >>dat$multsample <- ave(dat$studyid, dat$studyid, FUN=length) > 1
> >>
> >># fit model that allows for a different sigma^2_within for single vs
> >># multi sample studies
> >>res4 <- rma.mv(yi, vi,
> >>               random = list(~ 1 | studyid, ~ multsample | sampleinstudy),
> >>               struct="DIAG", data=dat)
> >>res4
> >>
> >># fit model that forces sigma^2_within = 0 for single sample studies
> >>res5 <- rma.mv(yi, vi,
> >>               random = list(~ 1 | studyid, ~ multsample | sampleinstudy),
> >>               struct="DIAG", tau2=c(0,NA), data=dat)
> >>res5
> >>
> >>So this is all possible if you like.
> >>
> >>Best,
> >>Wolfgang
> >>
> >>>-----Original Message-----
> >>>From: R-sig-meta-analysis
> >>>[mailto:r-sig-meta-analysis-bounces using r-project.org] On Behalf Of
> >>>Farzad Keyhan
> >>>Sent: Wednesday, 18 August, 2021 21:32
> >>>To: Luke Martinez
> >>>Cc: R meta
> >>>Subject: Re: [R-meta] Difference between univariate and multivariate
> >>>parameterization
> >>>
> >>>Dear Luke,
> >>>
> >>>In the multivariate specification (model 2), tau^2 = sigma^2_between +
> >>>sigma^2_within. You can confirm that by your two models' output as well.
> >>>Also, because rho = sigma^2_between / (sigma^2_between + sigma^2_within),
> >>>the off-diagonal elements of the matrix can be shown to be rho*tau^2,
> >>>which again is equivalent to sigma^2_between in model 1's matrix.
> >>>
> >>>Note that sampling errors in a two-estimate study could be different,
> >>>hence appropriate subscripts will be needed to distinguish between them.
> >>>
> >>>Finally, note that even a study with a single effect size estimate gets
> >>>the sigma^2_within, either directly (model 1) or indirectly (model 2),
> >>>which would mean that this one-estimate study **could** have had more
> >>>estimates but it just so happens that it doesn't as a result of some form
> >>>of multi-stage sampling: first studies, and then effect sizes from within
> >>>those studies.
> >>>
> >>>I actually raised this last point a while back on the list
> >>>(https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2021-July/002994.html)
> >>>as I found this framework potentially unrealistic, but in the end it's
> >>>the best approach we have.
> >>>
> >>>Cheers,
> >>>Fred
> >>>
> >>>On Wed, Aug 18, 2021 at 1:30 PM Luke Martinez
> >>><martinezlukerm using gmail.com> wrote:
> >>>
> >>>> Dear Colleagues,
> >>>>
> >>>> Imagine I have two models.
> >>>>
> >>>> Model 1:
> >>>>
> >>>> random = ~1 | study / row_id
> >>>>
> >>>> Model 2:
> >>>>
> >>>> random = ~ row_id | study,  struct = "CS"
> >>>>
> >>>> I understand that the diagonal elements of the variance-covariance
> >>>> matrix of a study with two effect size estimates for each model will be:
> >>>>
> >>>> Model 1:
> >>>>
> >>>> VAR(y_ij) = sigma^2_between  +  sigma^2_within + e_ij
> >>>>
> >>>> Model 2:
> >>>>
> >>>> VAR(y_ij) = tau^2 + e_ij
> >>>>
> >>>> Question: In model 2's variance-covariance matrix, what fills the role
> >>>> of sigma^2_within (within-study heterogeneity) that exists in model 1's
> >>>> matrix?
> >>>>
> >>>> Thank you very much for your assistance,
> >>>> Luke Martinez
>
