[R-meta] Clarification on ranef.rma.mv()

Luke Martinez martinezlukerm at gmail.com
Thu Sep 30 23:17:57 CEST 2021


Hi Wolfgang,

Sure, thank you very much.

Best regards,
Luke

On Thu, Sep 30, 2021 at 1:56 AM Viechtbauer, Wolfgang (SP)
<wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>
> Hi Luke,
>
> Just to conclude (as far as I am concerned) this thread:
>
> I don't have a good understanding of what taking the SVD of a Cholesky decomposition is doing in the first place. I helped to show how you can obtain this for rma.mv models, but I can't help you with making sense of this.
>
> Best,
> Wolfgang
>
> >-----Original Message-----
> >From: Luke Martinez [mailto:martinezlukerm using gmail.com]
> >Sent: Friday, 17 September, 2021 0:04
> >To: Viechtbauer, Wolfgang (SP)
> >Cc: R meta
> >Subject: Re: [R-meta] Clarification on ranef.rma.mv()
> >
> >Hi Wolfgang,
> >
> >A quick follow-up on estimating the proportion of between-study
> >variance in metafor, as in lme4:::rePCA.merMod(). We discussed using
> >it with correlated random effects, but can we use it for
> >non-correlated (varying-intercept) models as well?
> >
> >Assuming yes, I may have a bug in the code below, but even though
> >the "sigma2" for "paper/study" is 0.0000 and the sigma2 for
> >"paper/study/obs" is 0.0037, at the end the POV_S for "paper/study"
> >is larger than that for "paper/study/obs". I'm not sure why.
> >
> >library(metafor)
> >dat <- dat.bornmann2007
> >dat <- escalc(measure="OR", ai=waward, n1i=wtotal, ci=maward,
> >              n2i=mtotal, data=dat)
> >dat$paper <- as.numeric(factor(dat$study))
> >dat$paper[dat$paper <= 2] <- 1
> >fit <- rma.mv(yi, vi, random = ~ 1 | paper/study/obs, data=dat)
> >
> ># Apply the pca approach:
> >
> >    round(S <- fit$sigma2, 4)
> >   #[1]  0.0157  0.0000  0.0037
> >    S <- diag(S)
> >    colnames(S) <- rownames(S) <- fit$s.names
> >    sds <- setNames(svd(chol(S))$d, colnames(S))
> >    (pov_S <- round(sds^2 / sum(sds^2), digits = 4))
> >
> >  paper  paper/study  paper/study/obs   ## Do these results make sense?
> > 0.8077       0.1923           0.0000
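A possible explanation, sketched in base R with hypothetical variance components (not code from the thread): svd() returns its singular values in decreasing order, so pairing them with the original component names via setNames() can mislabel them whenever the components are not already sorted.

```r
# Sketch with hypothetical variance components (the second is made small but
# positive so that chol() succeeds): for a diagonal S, chol(S) is
# diag(sqrt(sigma2)), and svd() returns the singular values in DECREASING
# order -- so the proportions line up with sort(sigma2, decreasing = TRUE),
# not with the original ordering of the components.
sigma2 <- c(0.0157, 1e-6, 0.0037)  # paper, paper/study, paper/study/obs
S <- diag(sigma2)
sds <- svd(chol(S))$d              # = sort(sqrt(sigma2), decreasing = TRUE)
pov <- sds^2 / sum(sds^2)
stopifnot(isTRUE(all.equal(pov, sort(sigma2, decreasing = TRUE) / sum(sigma2))))
```

Under this reading, the value printed under "paper/study" above would actually be the proportion for the second-largest component, whichever that is.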
> >
> >On Wed, Sep 15, 2021 at 5:04 PM Luke Martinez <martinezlukerm using gmail.com> wrote:
> >>
> >> Dear Wolfgang,
> >>
> >> Thank you so very much! Especially thank you for reminding me
> >> regarding the effect of data dependent modifications!
> >>
> >> All the best,
> >> Luke
> >>
> >> On Wed, Sep 15, 2021 at 1:21 PM Viechtbauer, Wolfgang (SP)
> >> <wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
> >> >
> >> > Please see below for my comments.
> >> >
> >> > >-----Original Message-----
> >> > >From: Luke Martinez [mailto:martinezlukerm using gmail.com]
> >> > >Sent: Tuesday, 14 September, 2021 19:25
> >> > >To: Viechtbauer, Wolfgang (SP)
> >> > >Cc: R meta
> >> > >Subject: Re: [R-meta] Clarification on ranef.rma.mv()
> >> > >
> >> > >Hi Wolfgang,
> >> > >
> >> > >Thank you. And since rma.mv() allows up to two ~ inner | outer
> >> > >random terms, I'm assuming that to get the proportion of
> >> > >variation for the second ~ inner | outer random term, I can do:
> >> > >
> >> > >sds <- svd(chol(rma.mv_model4$H))$d
> >> > >sds^2 / sum(sds^2)
> >> >
> >> > Correct.
> >> >
> >> > >I guess one potential problem I'm running into is: what should
> >> > >we do if the proportion of between-study variance explained by
> >> > >one or two levels of a categorical variable is almost zero,
> >> > >while the rest of the levels make substantial contributions?
> >> > >
> >> > >The reason I ask is that with continuous variables (using
> >> > >struct = "GEN"), if a variable's contribution is almost zero,
> >> > >you can decide not to use that continuous variable in the random
> >> > >part at all (that variable is overfitted altogether).
> >> > >
> >> > >But with categorical variables, when several levels contribute
> >> > >substantially to the between-study variance and only one or two
> >> > >levels do not, you can't easily decide to drop the whole
> >> > >categorical variable from the random part.
> >> > >
> >> > >Do you have any opinion on this dilemma?
> >> >
> >> > I would choose a random-effects structure that is motivated by
> >> > the structure of the data and the possible sources of
> >> > heterogeneity/variability/dependencies that I think may exist in
> >> > the data. For example, if a slope may vary across units, then I
> >> > would add a random effect for that slope to the model. If it
> >> > turns out that the estimate of the slope variability is very low,
> >> > then dropping that random effect (which is the same as assuming
> >> > that the variance is 0) or keeping it will do pretty much the
> >> > same thing. My preference would be not to change the model, since
> >> > the consequences of such data-dependent modifications are hard to
> >> > predict.
> >> >
> >> > The same applies to structures like "UN". If a particular tau^2
> >> > value is low, then no, I would not drop that random effect. You
> >> > *can*, however, set particular tau^2 values to 0 (the 'tau2'
> >> > argument of rma.mv() allows you to do that), but again, I would
> >> > personally avoid doing that.
> >> >
> >> > Best,
> >> > Wolfgang
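The mechanism Wolfgang mentions can be sketched as follows; this is an illustration assuming the metafor package is installed, not code from the thread. The 'sigma2' argument of rma.mv() fixes components of a nested ~ 1 | inner/outer structure (NA means "estimate"), analogous to the 'tau2' argument for ~ inner | outer structures:

```r
# Sketch (assumes the metafor package): fix one variance component to 0
# while estimating the rest, via the 'sigma2' argument of rma.mv().
# NA = estimate the component; a number = fix it at that value.
library(metafor)
dat <- dat.bornmann2007
dat <- escalc(measure = "OR", ai = waward, n1i = wtotal,
              ci = maward, n2i = mtotal, data = dat)
# Fix the study-level component to 0, estimate the observation-level one:
fit0 <- rma.mv(yi, vi, random = ~ 1 | study/obs,
               sigma2 = c(0, NA), data = dat)
fit0$sigma2  # first component fixed at 0
```

As Wolfgang notes, this is possible but generally not advisable as a data-dependent modification.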



More information about the R-sig-meta-analysis mailing list