[R-meta] Clarification on ranef.rma.mv()

Tue Sep 14 19:24:33 CEST 2021

Hi Wolfgang,

Thank you. And since in rma.mv() we can have up to two ~ inner | outer
random terms, then, I'm assuming to get the proportion of variation
for the second ~ inner | outer random term, I can do:

sds <- svd(chol(rma.mv_model4$H))$d
sds^2 / sum(sds^2)

I guess one potential problem I'm running into is that what should we
do if we see that the proportion of explained between-studies variance
by only one or two levels of a categorical variable is almost zero
while rest of the levels of that categorical variable make significant
contributions?

The reason I ask this is that with continuous variables (using struct
= "GEN"), if a variable's contribution is almost zero, then, you can
decide not to use that continuous variable in the random part at all
(that variable altogether is overfitted).

But with categorical variables, when several levels make good
contributions to the between-studies variance except just one or two
levels, then, you can't easily decide not to use that whole
categorical variable in the random part at all.

Do you have any opinion on this dilemma?

Many thanks,
Luke

On Tue, Sep 14, 2021 at 1:12 AM Viechtbauer, Wolfgang (SP)
<wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>
> You can just do:
>
> sds <- svd(chol(fm3$G))$d / sqrt(fm3$sigma2)
> sds^2 / sum(sds^2)
>
> to get those 'Proportion of Variance' values. In fact, the scaling by sqrt(fm3$sigma2) is irrelevant then, so this is the same as:
>
> sds <- svd(chol(fm3$G))$d
> sds^2 / sum(sds^2)
>
> Or you can do the same as you did in those helper functions you wrote.
>
> I find it a bit strange to talk of the contribution of the intercept and slope variances to their sum (since they are on different scales), but maybe this makes sense when using a singular value decomposition of the Cholesky factorization of G instead of just doing summary(princomp(fm$G)) directly.
>
> Best,
> Wolfgang
>
> >-----Original Message-----
> >From: Luke Martinez [mailto:martinezlukerm using gmail.com]
> >Sent: Tuesday, 14 September, 2021 2:12
> >To: Viechtbauer, Wolfgang (SP)
> >Cc: R meta
> >Subject: Re: [R-meta] Clarification on ranef.rma.mv()
> >
> >Hi Wolfgang,
> >
> >You'll see the meaningful results when you run the following:
> >
> >lapply(rePCA_lme(fm2), summary)
> >
> >The first column in the output (below) is the amount of total
> >between-subjects variation explained by the random intercepts. The
> >second column is the amount of total between-subjects variation
> >cumulatively explained by the random intercepts AND random slopes.
> >
> >The idea is to find out whether some overfitting of random-effects
> >structure has taken place or not. For example, in this case, it seems
> >that random slopes are only responsible for a tiny amount of total
> >between-subjects variation as 99.41% of such variation is explained by
> >the random intercepts alone. So, maybe the model's random structure
> >can be simplified as:
> >
> >lme(distance ~ age, random = ~1 | Subject , data = Orthodont)
> >
> >This is a useful feature possibly for metafor as well, so I wonder how
> >to add some names and "prcomplist" class to do the same thing to
> >rma.mv just like I extended that from lmer to lme?
> >
> >I don't think the fact that yi values have their own variance would
> >interfere with the central concept here.
> >
> >Thanks,
> >Luke
> >
> >$Subject
> >Importance of components:
> >                                         [,1]    [,2]
> >Standard deviation     1.7794 0.13681
> >Proportion of Variance 0.9941 0.00588
> >Cumulative Proportion  0.9941 1.00000
> >
> >On Mon, Sep 13, 2021 at 1:17 PM Viechtbauer, Wolfgang (SP)
> ><wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
> >>
> >> Going back to the earlier example:
> >>
> >> library(nlme)
> >> fm2 <- lme(distance ~ age, random = ~ age | Subject, data = Orthodont)
> >> rePCA_lme(fm2)
> >>
> >> This is essentially:
> >>
> >> svd(chol(as.matrix(getVarCov(fm2))))$d / sigma(fm2)
> >> svd(chol(as.matrix(getVarCov(fm2))))$u
> >>
> >> Doing the same with rma.mv():
> >>
> >> library(metafor)
> >> Orthodont$id <- 1:nrow(Orthodont)
> >> fm3 <- rma.mv(distance ~ age, 0, random = list(~ age | Subject, ~ 1 | id),
> >struct="GEN", data = Orthodont)
> >> svd(chol(fm3$G))$d / sqrt(fm3$sigma2)
> >> svd(chol(fm3$G))$u
> >>
> >> No idea how to interpret this and whether the idea underlying this generalizes
> >to the case where each yi value has its own variance (instead of estimating a
> >single residual variance).
> >>
> >> Best,
> >> Wolfgang
> >>
> >> >-----Original Message-----
> >> >From: Luke Martinez [mailto:martinezlukerm using gmail.com]
> >> >Sent: Monday, 13 September, 2021 19:49
> >> >To: Viechtbauer, Wolfgang (SP)
> >> >Cc: R meta
> >> >Subject: Re: [R-meta] Clarification on ranef.rma.mv()
> >> >
> >> >Hi Wolfgang,
> >> >
> >> >My goal was to follow lme4:::rePCA.merMod assuming that there is an
> >> >underlying reason that lme4 developers chose not to directly apply PCA
> >> >to the G matrix.
> >> >
> >> >But if you think for rma.mv() models applying the PCA directly to the
> >> >G matrix gives the same result, I will very much appreciate a quick
> >> >demonstration?
> >> >
> >> >Thank you again,
> >> >Luke
> >> >
> >> >On Mon, Sep 13, 2021 at 11:42 AM Viechtbauer, Wolfgang (SP)
> >> ><wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
> >> >>
> >> >> I am not familiar with lme4::rePCA, but if you want to do PCA on the var-cov
> >> >matrix of the random effects, why not just apply PCA directly to G?
> >> >>
> >> >> Best,
> >> >> Wolfgang
> >> >>
> >> >> >-----Original Message-----
> >> >> >From: Luke Martinez [mailto:martinezlukerm using gmail.com]
> >> >> >Sent: Monday, 13 September, 2021 18:25
> >> >> >To: Viechtbauer, Wolfgang (SP)
> >> >> >Cc: R meta
> >> >> >Subject: Re: [R-meta] Clarification on ranef.rma.mv()
> >> >> >
> >> >> >Sure, I recently wrote a function similar to ?lme4::rePCA for lme
> >> >> >models. I keep benefiting from using it. So, I was wondering if I
> >> >> >could extend that to rma.mv() models?
> >> >> >
> >> >> >Here is how it works for lme models:
> >> >> >
> >> >> >#-- Some helper functions:
> >> >> >
> >> >> >Tlist_lme <- function(fit) rev(pdMatrix(fit$modelStruct$reStruct,
> >> >> >factor = TRUE))
> >> >> >theta_lme <- function(fit) sapply(Tlist_lme(fit), function(i)
> >> >> >i[lower.tri(i, diag = TRUE)])
> >> >> >
> >> >> >lowerbd <- function(x){
> >> >> >dd <- diag(0, nrow=nrow(x))
> >> >> >dd[lower.tri(dd)] <- -Inf
> >> >> >dd[lower.tri(dd, diag=TRUE)]
> >> >> >}
> >> >> >
> >> >> >lwr_lme <- function(fit) sapply(Tlist_lme(fit), lowerbd)
> >> >> >
> >> >> >#-- main function:
> >> >> >
> >> >> >rePCA_lme <- function(x){
> >> >> >
> >> >> >  chfs <- Tlist_lme(x)
> >> >> >  nms <- names(chfs)
> >> >> >  unms <- unique(nms)
> >> >> >  names(unms) <- unms
> >> >> >
> >> >> >  svals <- function(m) { # this is applied each of the RE matrices
> >> >> >    vv <- svd(m, nv = 0L)
> >> >> >    names(vv) <- c("sdev", "rotation")
> >> >> >    vv$center <- FALSE
> >> >> >    vv$scale <- FALSE
> >> >> >    class(vv) <- "prcomp"
> >> >> >    vv
> >> >> >  }
> >> >> >
> >> >> >  structure(lapply(unms, function(m)
> >> >> >    svals(Matrix::bdiag(chfs[which(nms ==
> >> >> >                                     m)]))), class = "prcomplist")
> >> >> >}
> >> >> >
> >> >> >#-- Testing:
> >> >> >
> >> >> >fm2 <- lme(distance ~ age,random=~age | Subject , data = Orthodont)
> >> >> >
> >> >> >rePCA_lme(fm2)
> >> >> >
> >> >> >On Mon, Sep 13, 2021 at 11:10 AM Viechtbauer, Wolfgang (SP)
> >> >> ><wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
> >> >> >>
> >> >> >> This gives the scaled (by the residual variance) Cholesky decomposition of
> >> >the
> >> >> >var-cov matrix of the random effects.
> >> >> >>
> >> >> >> We can see this if we compare:
> >> >> >>
> >> >> >> t(Tlist_lme(fm1)$Subject) %*% Tlist_lme(fm1)$Subject * sigma(fm1)^2
> >> >> >>
> >> >> >> with:
> >> >> >>
> >> >> >> VarCorr(fm1)
> >> >> >>
> >> >> >> rma.mv() also uses a Cholesky decomposition to ensure that the var-cov
> >matrix
> >> >of
> >> >> >the random effects is (semi)positive definite. However, it does not scale
> >> >things
> >> >> >by the residual variance, since those variances are heteroscedastic.
> >> >> >>
> >> >> >> I don't know why you would want to obtain just Tlist_lme(fm1), but if you
> >are
> >> >> >actually after the var-cov matrix itself, it is stored as "G" in the model
> >> >object.
> >> >> >To use your earlier example:
> >> >> >>
> >> >> >> library(metafor)
> >> >> >> dat <- dat.berkey1998
> >> >> >> res.mv <- rma.mv(yi~ outcome - 1, vi, data = dat, random = ~ outcome |
> >trial,
> >> >> >struct = "UN")
> >> >> >> res.mv$G
> >> >> >>
> >> >> >> If you really want the Cholesky decomposition, then just do:
> >> >> >>
> >> >> >> C <- chol(res.mv$G)
> >> >> >> C
> >> >> >>
> >> >> >> and, as expected, t(C) %*% C is the same as res.mv$G.
> >> >> >>
> >> >> >> Best,
> >> >> >> Wolfgang
> >> >> >>
> >> >> >> >-----Original Message-----
> >> >> >> >From: Luke Martinez [mailto:martinezlukerm using gmail.com]
> >> >> >> >Sent: Monday, 13 September, 2021 17:27
> >> >> >> >To: Viechtbauer, Wolfgang (SP)
> >> >> >> >Cc: R meta
> >> >> >> >Subject: Re: [R-meta] Clarification on ranef.rma.mv()
> >> >> >> >
> >> >> >> >Dear Wolfgang,
> >> >> >> >
> >> >> >> >Thank you very much. For correlated random-effect structures in
> >> >> >> >rma.mv(), is it possible to extract the equivalent of the following
> >> >> >> >lme() information:
> >> >> >> >
> >> >> >> >Tlist_lme <- function(fit) rev(pdMatrix(fit$modelStruct$reStruct,
> >> >> >> >factor = TRUE))
> >> >> >> >
> >> >> >> >#----- Testing:
> >> >> >> >library(nlme)
> >> >> >> >fm1 <- lme(distance ~ age, data = Orthodont)
> >> >> >> >
> >> >> >> >Tlist_lme(fm1)
> >> >> >> >
> >> >> >> >On Mon, Sep 13, 2021 at 2:38 AM Viechtbauer, Wolfgang (SP)
> >> >> >> ><wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
> >> >> >> >>
> >> >> >> >> Dear Luke,
> >> >> >> >>
> >> >> >> >> Sure:
> >> >> >> >>
> >> >> >> >> library(metafor)
> >> >> >> >>
> >> >> >> >> dat <- dat.berkey1998
> >> >> >> >> res.mv <- rma.mv(yi~ outcome - 1, vi, data = dat, random = ~ outcome |
> >> >trial,
> >> >> >> >struct = "UN")
> >> >> >> >> res.mv$rho
> >> >> >> >>
> >> >> >> >> ran.mv <- ranef.rma.mv(res.mv)
> >> >> >> >> cor(matrix(ran.mv[[1]]$intrcpt, ncol=2, byrow=FALSE))[1,2]
> >> >> >> >>
> >> >> >> >> The two correlations are not the same for the reasons explained at the
> >> >link
> >> >> >you
> >> >> >> >provided.
> >> >> >> >>
> >> >> >> >> If the dataset is not so nicely balanced, one can do something similar,
> >> >but
> >> >> >the
> >> >> >> >restructuring of the output from ranef() into a wide format gets a bit
> >more
> >> >> >> >tedious.
> >> >> >> >>
> >> >> >> >> For example:
> >> >> >> >>
> >> >> >> >> dat <- dat[-4,]
> >> >> >> >> res.mv <- rma.mv(yi~ outcome - 1, vi, data = dat, random = ~ outcome |
> >> >trial,
> >> >> >> >struct = "UN")
> >> >> >> >> res.mv$rho
> >> >> >> >>
> >> >> >> >> ran.mv <- ranef.rma.mv(res.mv)
> >> >> >> >> ran.mv
> >> >> >> >>
> >> >> >> >> ran.mv <- ran.mv[[1]][1]
> >> >> >> >> ran.mv$study <- sapply(strsplit(rownames(ran.mv), "|", fixed=TRUE),
> >tail,
> >> >1)
> >> >> >> >> ran.mv$arm   <- sapply(strsplit(rownames(ran.mv), "|", fixed=TRUE),
> >head,
> >> >1)
> >> >> >> >> ran.mv
> >> >> >> >>
> >> >> >> >> wide <- reshape(ran.mv, direction="wide", idvar="study",
> >> >v.names="intrcpt",
> >> >> >> >timevar="arm")
> >> >> >> >> rownames(wide) <- NULL
> >> >> >> >> wide
> >> >> >> >> cor(wide[2:3], use="pairwise.complete.obs")[1,2]
> >> >> >> >>
> >> >> >> >> There might be a more elegant way to do this, but this gets the job
> >done.
> >> >> >> >>
> >> >> >> >> Best,
> >> >> >> >> Wolfgang
> >> >> >> >>
> >> >> >> >> >-----Original Message-----
> >> >> >> >> >From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-
> >> >project.org]
> >> >> >On
> >> >> >> >> >Behalf Of Luke Martinez
> >> >> >> >> >Sent: Monday, 13 September, 2021 8:38
> >> >> >> >> >To: R meta
> >> >> >> >> >Subject: [R-meta] Clarification on ranef.rma.mv()
> >> >> >> >> >
> >> >> >> >> >Dear Wolfgang and List Members,
> >> >> >> >> >
> >> >> >> >> >In ordinary multilevel models (lmer), one can use ranef() of a model
> >> >> >> >> >to get the correlations from the conditional modes of the random
> >> >> >> >> >effects (https://stats.stackexchange.com/q/153253/124093) which may
> >> >> >> >> >reveal how much random-effects for slopes and intercepts are roughly
> >> >> >> >> >correlated.
> >> >> >> >> >
> >> >> >> >> >Similarly, for a struct = "UN" model, I was wondering if
> >> >> >> >> >"ranef.rma.mv(fitted_model)" could reveal how much random-effects for
> >> >> >> >> >outcome levels are roughly correlated?
> >> >> >> >> >
> >> >> >> >> >For example, for the "res.mv" model below where the REML rho estimate
> >> >> >> >> >is "0.775", can ranef.rma.mv(res.mv) values give a rough estimate of
> >> >> >> >> >this correlation conditional on the observed data?
> >> >> >> >> >
> >> >> >> >> >dat <- dat.berkey1998
> >> >> >> >> >res.mv <- rma.mv(yi~ outcome - 1, vi, data = dat, random = ~ outcome |
> >> >> >> >> >trial, struct = "UN")
> >> >> >> >> >
> >> >> >> >> >ran.mv <- ranef.rma.mv(res.mv)
> >> >> >> >> >
> >> >> >> >> >Thank you very much for your help,
> >> >> >> >> >Luke