[R-meta] test different levels of a random effect

Viechtbauer, Wolfgang (NP) wolfgang.viechtbauer at maastrichtuniversity.nl
Wed Nov 20 18:59:05 CET 2024


Dear Yefeng,

Trying to estimate 345 different variance components, as in approach 2, seems quite ambitious to me.

As for 1., it is worth pointing out that using fixed versus random effects for species yields rather similar results, except that the estimates based on random effects exhibit a natural shrinkage effect. This can be nicely visualized as follows:

### species as a random effect
res1 <- rma.mv(yi, V, random = list(~ 1 | study.id, ~ 1 | effect.size.id, ~ 1 | species.id), data=dat, sparse = TRUE)
### species as a fixed effect (intercept removed, so there is one coefficient per species)
res2 <- rma.mv(yi, V, mods = ~ species.id - 1, random = list(~ 1 | study.id, ~ 1 | effect.size.id), data=dat, sparse = TRUE)

### plot the fixed species effects against the 'intercept + BLUP' values from res1;
### the points are pulled toward the overall estimate (dotted lines) relative to the diagonal
plot(coef(res2), coef(res1) + ranef(res1)$species.id$intrcpt, pch=21, bg="gray",
     xlab="Fixed Species Effects", ylab="Random Species Effects",
     xlim=c(-1,3), ylim=c(-1,3),
     panel.first={abline(0,1);
                  abline(h=coef(res1), lty="dotted");
                  abline(v=coef(res1), lty="dotted")})

If you install the latest 'devel' version of metafor, you can use:

ranef(res1, vcov=TRUE)

to also obtain the full var-cov matrices of the BLUPs of the random effects. Then one could test pairs of species against each other while accounting for the covariance between them:

### extract the BLUPs of the random effects together with their var-cov matrices
sub <- ranef(res1, vcov=TRUE)
### keep only the first two species (as an example)
sub$pred <- sub$pred$species.id[1:2,]
sub$vcov <- sub$vcov$species.id[1:2,1:2]
sub

### Wald-type test of the difference between these two species
rma.mv(intrcpt, V=sub$vcov, mods = c(0,1), data=sub$pred)
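
For what it's worth, essentially the same test can also be computed by hand from the two BLUPs and their var-cov matrix; just a sketch of the underlying arithmetic, using the 'sub' object created above:

### difference between the two BLUPs and its standard error,
### accounting for the covariance between them
dif  <- sub$pred$intrcpt[2] - sub$pred$intrcpt[1]
sed  <- sqrt(sub$vcov[1,1] + sub$vcov[2,2] - 2*sub$vcov[1,2])
zval <- dif / sed
pval <- 2 * pnorm(abs(zval), lower.tail=FALSE)
round(c(diff=dif, se=sed, zval=zval, pval=pval), 4)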

Best,
Wolfgang

> -----Original Message-----
> From: R-sig-meta-analysis <r-sig-meta-analysis-bounces at r-project.org> On Behalf
> Of Yefeng Yang via R-sig-meta-analysis
> Sent: Friday, November 15, 2024 05:16
> To: r-sig-meta-analysis at r-project.org
> Cc: Yefeng Yang <yefeng.yang1 at unsw.edu.au>
> Subject: [R-meta] test different levels of a random effect
>
> Dear community,
>
> My question might look a bit weird. Let me explain it with a concrete example.
>
> The dat.moura2021 dataset in the metadat package contains 457 studies on
> assortative mating in 345 species.
>
> library(metafor)
> dat <- dat.moura2021$dat
>
> ### calculate r-to-z transformed correlations and corresponding sampling variances
> dat <- escalc(measure="ZCOR", ri=ri, ni=ni, data=dat)
> ### assume that the effect sizes within studies are correlated with rho=0.6
> V <- vcalc(vi, cluster=study.id, obs=effect.size.id, data=dat, rho=0.6)
> ### fit a model with study identity, observation, and species as random effects
> res <- rma.mv(yi, V, random = list(~ 1 | study.id, ~ 1 | effect.size.id, ~ 1 | species.id), data=dat, sparse = TRUE)
>
> In this specific dataset, I want to know (1) whether the estimate associated
> with a certain species is significantly different from 0 (in the sense of a
> p-value) and (2) whether any two such species-specific estimates of interest
> differ from each other.
>
> 1. A natural solution that occurs to me is to add the species variable (in
> our case, species.id) as a predictor rather than as a random effect, so that
> we can do subsequent tests of factors and of linear combinations of parameters.
>
> ### add species.id as a predictor
> res2 <- rma.mv(yi, V, mods = ~ species.id - 1, random = list(~ 1 | study.id, ~ 1 | effect.size.id), data=dat, sparse = TRUE)
>
> A potential issue is that model res2 is likely to be overparameterized. Of
> course, we can check the likelihood profile of each parameter estimate to
> confirm that it is identifiable, but this takes a long time. More importantly,
> species.id should arguably be a random effect rather than a fixed effect.
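>
> A minimal sketch of the likelihood-profile check mentioned above, assuming
> that profiling the variance components of res2 with metafor's profile()
> method is what is meant (this can indeed take quite a while for a model of
> this size):
>
> ### profile the two variance components of res2 to check that the restricted
> ### log-likelihood peaks at the corresponding REML estimates
> par(mfrow=c(2,1))
> profile(res2, sigma2=1)
> profile(res2, sigma2=2)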
>
> 2. A second approach is similar to the first one but makes a slightly
> different assumption. A standard meta-regression model assumes homogeneous
> variance (tau^2) across the different levels of a predictor. But species.id
> has 345 levels, and it is not very reasonable to assume that all 345 species
> share a common heterogeneity. To relax this assumption, we can use a subgroup
> random-effects model, as proposed in Pustejovsky, J. E., & Tipton, E. (2022).
>
> ### fit a subgroup random effects model
> V_subgroup <- vcalc(vi, cluster=study.id, subgroup=species.id, data=dat, rho=0.6)
> res3 <- rma.mv(yi, V_subgroup, mods = ~ species.id - 1, random = list(~ species.id | study.id), struct = "DIAG", data=dat, sparse = TRUE)
> ### add effect size id as an additional random effect
> res4 <- rma.mv(yi, V_subgroup, mods = ~ species.id - 1, random = list(~ species.id | study.id, ~ 1 | effect.size.id), struct = "DIAG", data=dat, sparse = TRUE)
>
> I do not know which of the two models (res3 vs. res4) is better.
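>
> A minimal sketch of one way to compare them, assuming both models are fitted
> with REML (the default) and keeping in mind that the test of a variance
> component lies on the boundary of the parameter space, so the LRT is
> conservative:
>
> ### res3 is nested within res4 (it drops the effect.size.id random effect),
> ### so anova() provides a likelihood ratio test plus fit statistics
> anova(res4, res3)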
>
> 3. We can sum the intercept (i.e., the fixed effect) and the random-effect
> estimates for species.id from the model object res to get the species-specific
> estimates. Then we can use Wald-type methods to test the species-specific
> estimates and the differences between them (a sketch of such a test follows
> the code below).
>
> ### get the species-level random-effect estimates (BLUPs)
> res_ranef <- ranef(res)
> ### combine with the fixed effect estimate
> est <- data.frame(species = rownames(res_ranef$species.id),
>                   beta = res$beta[1] + res_ranef$species.id$intrcpt,
>                   se = sqrt(res$se[1]^2 + res_ranef$species.id$se^2))
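>
> A minimal sketch of the corresponding Wald-type tests against 0, based on the
> est data frame above (treating the two variance components in the combined SE
> as if they were independent):
>
> ### Wald-type z-tests of the species-specific estimates against 0
> est$zval <- est$beta / est$se
> est$pval <- 2 * pnorm(abs(est$zval), lower.tail=FALSE)
> head(est)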
>
> But this approach ignores the covariances between the estimates for different
> species.
>
> I would be grateful if anyone could provide insights into (1) which of the
> approaches above is preferable and (2), if none of them is valid, whether you
> can help me find a solution.
>
> Best,
> Yefeng


