[R-meta] Difference between univariate and multivariate parameterization

Fri Aug 20 17:11:50 CEST 2021

Got it. Also, this strategy, even though Fred described it as a
"theoretical" possibility that shouldn't be used, seems like it could
reduce the burden of model fitting, especially when convergence issues
arise because single-sample, single-group, ... studies get
within-heterogeneity components that they don't need. But I'm sure
experts/reviewers wouldn't agree with fitting such models, as these models
have been built around a theory of multi-stage sampling.
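
For concreteness, here is a minimal sketch of how the constraint Wolfgang
describes below might be set up for a study/sample/grp structure. Several
things in it are assumptions rather than anything stated in the thread: the
toy data extend the small table below with two invented studies (3 and 4),
the names multsample, multgrp, and fit_constrained are hypothetical, "single
group" is defined per sample, and the data passed to the function are assumed
to contain yi and vi columns. The model itself is only defined here, not fitted.

```r
# Toy layout: studies 1-2 follow the table in the thread; studies 3 and 4
# are invented here to get single-sample and single-group cases.
dat <- data.frame(
  study  = c(1, 1, 1, 1, 2, 2, 3, 4, 4),
  sample = c(1, 1, 2, 2, 1, 1, 1, 1, 2),
  grp    = c(1, 2, 1, 2, 1, 2, 1, 1, 1)
)

# Nested identifiers; the "." separators guard against collisions such as
# study 1 / sample 11 versus study 11 / sample 1.
dat$studysample    <- paste0(dat$study, ".", dat$sample)
dat$studysamplegrp <- paste0(dat$study, ".", dat$sample, ".", dat$grp)

# Count distinct lower-level units within each higher-level unit.
n_distinct_within <- function(lower, upper) {
  ave(as.integer(factor(lower)), upper,
      FUN = function(x) length(unique(x)))
}

# TRUE if the study has more than one sample.
dat$multsample <- n_distinct_within(dat$studysample, dat$study) > 1
# TRUE if the sample has more than one group (assumed definition).
dat$multgrp <- n_distinct_within(dat$studysamplegrp, dat$studysample) > 1

# Hypothetical wrapper: requires the metafor package and yi/vi columns.
# With struct="DIAG", each level of the inner factor gets its own variance;
# for a logical inner variable the levels are ordered FALSE, TRUE, so
# c(0, NA) fixes the FALSE (single-sample / single-group) variance at 0.
# This ordering assumes both FALSE and TRUE actually occur in the data.
fit_constrained <- function(dat) {
  metafor::rma.mv(yi, vi,
                  random = list(~ 1 | study,
                                ~ multsample | studysample,
                                ~ multgrp | studysamplegrp),
                  struct = c("DIAG", "DIAG"),
                  tau2   = c(0, NA),
                  gamma2 = c(0, NA),
                  data   = dat)
}
```

With real data one would call fit_constrained(dat) and compare the result
against the unconstrained ~ 1 | study/sample/grp fit; whether the constrained
model is defensible is a separate question.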

Thanks again,
Luke

On Fri, Aug 20, 2021 at 9:53 AM Viechtbauer, Wolfgang (SP) <
wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:

> Please see below.
>
> Best,
> Wolfgang
>
> >-----Original Message-----
> >From: Luke Martinez [mailto:martinezlukerm using gmail.com]
> >Sent: Friday, 20 August, 2021 16:41
> >To: Viechtbauer, Wolfgang (SP)
> >Cc: Farzad Keyhan; R meta
> >Subject: Re: [R-meta] Difference between univariate and multivariate
> >parameterization
> >
> >Sure. So, for other data structures (below) where paste0(dat$study, ".",
> >dat$sample) alone can't produce unique rows (perhaps due to another nested
> >grouping variable like "grp"), the "within study heterogeneity" is given by
> >paste0(dat$study, ".", dat$sample) and "within sample heterogeneity" is given by
> >paste0(dat$study, ".", dat$sample, dat$grp), correct?
>
> Correct, that is,
>
> rma.mv(yi, vi, random = ~ 1 | study/sample/grp, data=dat)
>
> would be identical to
>
> dat$studysample    <- paste0(dat$study, ".", dat$sample)
> dat$studysamplegrp <- paste0(dat$study, ".", dat$sample, ".", dat$grp)
>
> rma.mv(yi, vi, random = list(~ 1 | study, ~ 1 | studysample, ~ 1 | studysamplegrp),
>        data=dat)
>
> >Which then would mean that Fred's comment about studies with a single estimate
> >getting the same sigma^2_within that they shouldn't becomes studies with a single
> >sample and a single group getting two sigma^2_withins (one for study, one for
> >sample) that they shouldn't, correct?
>
> I am not able to fully parse your question. But yes, in the models above,
> the diagonal of the marginal (model-implied) var-cov matrix is
>
> sigma^2_study + sigma^2_studysample + sigma^2_studysamplegrp + v_ijk
>
> (three subscripts on v for studies, samples, and groups), so irrespective
> of the type of study (whether it has one or multiple samples) and
> irrespective of the type of sample (whether it contains one or multiple
> groups), all three variance-components (plus the sampling variance) are
> being added. If one deems this not to be sensible, one could extend the
> idea I described previously to constrain the studysample variance component
> to 0 for single-sample studies and the studysamplegrp variance component to
> 0 for single-group studies.
>
> >Thank you, Luke
> >
> >study sample grp
> >1     1      1
> >1     1      2
> >1     2      1
> >1     2      2
> >2     1      1
> >2     1      2
> >
> >On Fri, Aug 20, 2021 at 9:14 AM Viechtbauer, Wolfgang (SP)
> ><wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
> >Note that:
> >
> >dat$sampleinstudy <- paste0(dat$studyid, ".", dat$sampleid)
> >
> >was used to create this variable (and indeed, it takes on a unique value per row).
> >
> >This is in fact in essence what happens when you fit the model:
> >
> >rma.mv(yi, vi, random = ~ 1 | studyid/sampleid, data=dat)
> >
> >and hence this is the same model:
> >
> >rma.mv(yi, vi, random = list(~ 1 | studyid, ~ 1 | sampleinstudy), data=dat)
> >
> >And the latter can then be extended with:
> >
> >rma.mv(yi, vi, random = list(~ 1 | studyid, ~ multsample | sampleinstudy),
> >       struct="DIAG", data=dat)
> >
> >as discussed.
> >
> >Best,
> >Wolfgang
> >
> >>-----Original Message-----
> >>From: Luke Martinez [mailto:martinezlukerm using gmail.com]
> >>Sent: Friday, 20 August, 2021 16:06
> >>To: Viechtbauer, Wolfgang (SP)
> >>Cc: Farzad Keyhan; R meta
> >>Subject: Re: [R-meta] Difference between univariate and multivariate
> >>parameterization
> >>
> >>Ah!! "sampleinstudy" just so happens to be equivalent to a "row_id" **in this
> >>particular dataset**. And in this particular case, the second random term (~
> >>multsample | sampleinstudy) in reality is modeling the row_id (within-study
> >>heterogeneity).
> >>
> >>To help people reading this (n = 0;-), when you say, "just like in the standard
> >>multilevel structure", you mean a standard 3-level model of the form
> >>(~1|study_id/row_id).
> >>
> >>I guess the one other time that this confusion happened to me was when I was
> >>looking at "dat.konstantopoulos2011", demonstrating that the data structure should
> >>always take precedence over the syntax!
> >>
> >>Super clear and helpful as always,
> >>Luke
> >>
> >>On Fri, Aug 20, 2021 at 8:18 AM Viechtbauer, Wolfgang (SP)
> >><wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
> >>For reference, we are discussing this:
> >>
> >>list(~ 1 | studyid, ~ multsample | sampleinstudy), struct="DIAG"
> >>
> >>where the data structure is like this:
> >>
> >>studyid  sampleinstudy  multsample
> >>1        1              1
> >>1        2              1
> >>2        3              0
> >>3        4              1
> >>3        5              1
> >>3        6              1
> >>4        7              0
> >>5        8              1
> >>5        9              1
> >>
> >>~ 1 | studyid adds a random effect corresponding to the study level. This is to
> >>account for 'between-study heterogeneity'.
> >>
> >>~ multsample | sampleinstudy adds a random effect to the sampleinstudy level. For
> >>rows where sampleinstudy is the same, rows where multsample = 0 versus 1 would get
> >>different but correlated random effects. However, since there is just one row per
> >>sampleinstudy, this never happens. So, each row is getting its own random effect
> >>(just like in the standard multilevel structure). With struct="DIAG", we allow for
> >>a different tau^2 for multsample = 0 versus 1. So this models 'within-study
> >>heterogeneity' and allows this variance component to differ for single versus
> >>multisample studies (and one can then constrain the former to 0 if one likes).
> >>
> >>Best,
> >>Wolfgang
> >>
> >>>-----Original Message-----
> >>>From: Luke Martinez [mailto:martinezlukerm using gmail.com]
> >>>Sent: Friday, 20 August, 2021 14:37
> >>>To: Viechtbauer, Wolfgang (SP)
> >>>Cc: Farzad Keyhan; R meta
> >>>Subject: Re: [R-meta] Difference between univariate and multivariate
> >>>parameterization
> >>>
> >>>Dear Wolfgang,
> >>>
> >>>Many thanks.
> >>>
> >>>>>>> "In res5, the two tau^2 values can be thought of as sigma^2_within for single
> >>>vs multi sample studies."
> >>>
> >>>I believe my question was why/how in res5 (and res4) models, tau^2 values
> >>>represent only sigma^2_within?
> >>>
> >>>Is it because we have eliminated the off-diagonal elements (by struct="DIAG") in
> >>>"~ multsample | sampleinstudy" or because we have previously defined the
> >>>sigma^2_between studies using "~ 1 | studyid" and thus tau^2 values in "~
> >>>multsample | sampleinstudy" can't represent anything other than sigma^2_within
> >>>samples nested in studies?
> >>>
> >>>I appreciate your clarification,
> >>>Luke
> >>>
> >>>PS. On the other hand, my understanding is that "sigma^2_between" and
> >>>"sigma^2_within" are unique to each grouping variable so we can have
> >>>"sigma^2_between_studies" and "sigma^2_between_study_sample_combinations" and the
> >>>same is true for "sigma^2_withins".
> >>>
> >>>On Fri, Aug 20, 2021 at 6:31 AM Viechtbauer, Wolfgang (SP)
> >>><wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
> >>>Dear Luke,
> >>>
> >>>tau^2 doesn't mean the same thing across different models. In res5, the two tau^2
> >>>values can be thought of as sigma^2_within for single vs multi sample studies.
> >>>Whether we call something tau^2, sigma^2, or chicken^2 doesn't carry any inherent
> >>>meaning.
> >>>
> >>>For example:
> >>>
> >>>dat <- dat.crede2010
> >>>dat <- escalc(measure="ZCOR", ri=ri, ni=ni, data=dat, subset=criterion=="grade")
> >>>
> >>>dat$studyid.copy <- dat$studyid
> >>>dat$sampleid.copy <- paste0(dat$studyid, ".", dat$sampleid)
> >>>rma.mv(yi, vi, random = ~ 1 | studyid/sampleid, data=dat)
> >>>rma.mv(yi, vi, random = list(~ studyid | studyid.copy, ~ sampleid | sampleid.copy),
> >>>       struct=c("ID","ID"), data=dat)
> >>>
> >>>are identical models, but in the first we have two sigma^2 values and in the other
> >>>we have tau^2 and gamma^2 (a bit of a silly example, but just to illustrate the
> >>>point).
> >>>
> >>>Best,
> >>>Wolfgang
> >>>
> >>>>-----Original Message-----
> >>>>From: Luke Martinez [mailto:martinezlukerm using gmail.com]
> >>>>Sent: Thursday, 19 August, 2021 5:05
> >>>>To: Viechtbauer, Wolfgang (SP)
> >>>>Cc: Farzad Keyhan; R meta
> >>>>Subject: Re: [R-meta] Difference between univariate and multivariate
> >>>>parameterization
> >>>>
> >>>>Dear Wolfgang,
> >>>>
> >>>>Thanks for your reply. But, if in the multivariate specification: tau^2 =
> >>>>sigma^2_between + sigma^2_within, then in your suggested "res5" model where you
> >>>>fixed tau2 = 0 for single sample studies, you have killed both sigma^2_between +
> >>>>sigma^2_within, and not just sigma^2_within?
> >>>>
> >>>>Am I missing something?
> >>>>
> >>>>Thank you very much,
> >>>>Luke
> >>>>
> >>>>On Wed, Aug 18, 2021 at 3:01 PM Viechtbauer, Wolfgang (SP)
> >>>><wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
> >>>>It is also possible to formulate a model where sigma^2_within is *not* added for
> >>>>'single sample/estimate studies'. Let's consider this example:
> >>>>
> >>>>library(metafor)
> >>>>
> >>>>dat <- dat.crede2010
> >>>>dat <- escalc(measure="ZCOR", ri=ri, ni=ni, data=dat, subset=criterion=="grade")
> >>>>
> >>>>table(dat$studyid) # most studies are single sample studies
> >>>>
> >>>># multilevel model
> >>>>res1 <- rma.mv(yi, vi, random = ~ 1 | studyid/sampleid, data=dat)
> >>>>res1
> >>>>
> >>>># multivariate parameterization
> >>>>res2 <- rma.mv(yi, vi, random = ~ factor(sampleid) | studyid, data=dat)
> >>>>res2
> >>>>
> >>>># as a reminder, the multilevel model is identical to this formulation
> >>>>dat$sampleinstudy <- paste0(dat$studyid, ".", dat$sampleid)
> >>>>res3 <- rma.mv(yi, vi, random = list(~ 1 | studyid, ~ 1 | sampleinstudy),
> >>>>               data=dat)
> >>>>res3
> >>>>
> >>>># logical to indicate for each study whether it is a multi sample study
> >>>>dat$multsample <- ave(dat$studyid, dat$studyid, FUN=length) > 1
> >>>>
> >>>># fit model that allows for a different sigma^2_within for single vs multi
> >>>># sample studies
> >>>>res4 <- rma.mv(yi, vi, random = list(~ 1 | studyid, ~ multsample | sampleinstudy),
> >>>>               struct="DIAG", data=dat)
> >>>>res4
> >>>>
> >>>># fit model that forces sigma^2_within = 0 for single sample studies
> >>>>res5 <- rma.mv(yi, vi, random = list(~ 1 | studyid, ~ multsample | sampleinstudy),
> >>>>               struct="DIAG", tau2=c(0,NA), data=dat)
> >>>>res5
> >>>>
> >>>>So this is all possible if you like.
> >>>>
> >>>>Best,
> >>>>Wolfgang
>
