[R-meta] multilevel bivariate metaanalysis in the presence of phylogenetic correlation

Thu Sep 12 12:53:18 CEST 2024

Dear Sigurd,

The syntax for allowing the random effects for a particular variable to be correlated according to a matrix specified via 'R' is:

rma.mv(..., random = ~ 1 | variable, R=list(variable=A), ...)

but since you used '~ trait | phylo' in 'random', rma.mv() does not pick up the A matrix (and gives you the warning that something appears to be off). Instead, '~ trait | phylo' will add random effects for each level of 'trait' within 'phylo' (i.e., species) using by default a compound symmetry structure (struct="CS"), but given the output, you seem to have changed this to struct="UN" for an unstructured var-cov matrix (which, for two levels of 'trait' is identical to struct="HCS" for a heteroscedastic CS structure).

Going back to the case of a single trait (and at the risk of explaining things you are more familiar with than I am), the amount to which phylogeny plays a role is typically quantified in such a model in terms of something like Pagel's lambda, which we can estimate with:

rma.mv(..., random = list(~ 1 | species, ~ 1 | phylo), R=list(phylo=A), ...)

where 'species' and 'phylo' are the same variable, just with different names, and then computing:

lambda = sigma^2_phylo / (sigma^2_species + sigma^2_phylo).

See Cinar et al. (2021), https://doi.org/10.1111/2041-210X.13760.

Now with two traits, we can easily let sigma^2_species differ for the two traits with:

rma.mv(..., random = list(~ trait | species, ~ 1 | phylo), R=list(phylo=A), struct="DIAG", ...)

which will give you tau^2_trait1 and tau^2_trait2, based on which we compute:

lambda_trait1 = sigma^2_phylo / (tau^2_trait1 + sigma^2_phylo)

and

lambda_trait2 = sigma^2_phylo / (tau^2_trait2 + sigma^2_phylo).

Note that for now I used struct="DIAG" above, just letting the two variances to differ but assuming no correlation. One could also use struct="UN"/"HCS", which allows for correlation between the random effects for trait 1 and trait 2 within species. Not sure if I would call this the 'phylogenetic correlation between the two traits' (because it is not sigma^2_species or tau^2_trait1 and tau^2_trait2 that indicate the degree to which phylogeny matters, but the ratios given above).

Leaving terminology aside, we also might want to let sigma^2_phylo differ for the two traits. Your attempt with 'random = ~ trait | phylo, R = list(phylo = A)' seems logical, but as mentioned above, rma.mv() does not work that way. The question is also what we really mean by this and how this translates into the correlation matrix for the full dataset.

Say we have these data:

study trait species
-------------------
1     1     s1
1     2     s1
2     1     s2
3     1     s3
4     1     s2
4     2     s2

and the phylogeny is like this (use a fixed-width font if things don't line up):

       +-- s1
       |
    +--+
    |  |
+---+  +-- s2
|   |
|   +----- s3
|
+--------- s4

so A is something like:

   s1 s2 s3 s4
s1  1 .7 .4  0
s2     1 .4  0
s3        1  0
s4           1

Let's create the dataset and A matrix for this:

dat <- data.frame(study=c(1,1,2,3,4,4),
                  trait=c(1,2,1,1,1,2),
                  species=c("s1","s1","s2","s3","s2","s2"))
dat$species.trait <- paste0(dat$species, ".", dat$trait)
dat

A <- matrix(c(1,.7,.4,.7,1,.4,.4,.4,1), nrow=3)
rownames(A) <- colnames(A) <- c("s1", "s2", "s3")
A

Then the correlation matrix for the six rows in the dataset is:

A[dat$species, dat$species]

Let me add the species.trait labels to this:

M <- A[dat$species, dat$species]
rownames(M) <- colnames(M) <- dat$species.trait
M

which looks like this:

     s1.1 s1.2 s2.1 s3.1 s2.1 s2.2
s1.1  1.0  1.0  0.7  0.4  0.7  0.7
s1.2  1.0  1.0  0.7  0.4  0.7  0.7
s2.1  0.7  0.7  1.0  0.4  1.0  1.0
s3.1  0.4  0.4  0.4  1.0  0.4  0.4
s2.1  0.7  0.7  1.0  0.4  1.0  1.0
s2.2  0.7  0.7  1.0  0.4  1.0  1.0

Now if we say that the variance component for this matrix differs for trait 1 and trait 2, then I get how this would affect the diagonal (it would alternately be tau2^1_trait1 and tau^2_trait2) and I get how this would affect element M[1,3] (= M[3,1]) (it would be multiplied by tau2^1_trait1) or M[2,6] (= M[6,2]) (it would be multiplied by tau2^1_trait2), but what about element M[1,2], a cross-trait correlation for the same species, or element M[1,5], a cross-trait correlation for different species? One might have to start adding more variance components as multipliers for these types, but this starts to get tricky to ensure that the sum of these matrices (times their variance component) stays positive definite. I am sure that some people much cleverer than I am have thought about this stuff, but I am not sure what is typically done in phylogenetic comparative methods to do this kind of analysis.

I am stopping here because this is starting to get quite long, although this is an intriguing problem.

Best,
Wolfgang

> -----Original Message-----
> From: R-sig-meta-analysis <r-sig-meta-analysis-bounces using r-project.org> On Behalf
> Of Sigurd Einum via R-sig-meta-analysis
> Sent: Thursday, September 12, 2024 10:42
> To: R-sig-meta-analysis using r-project.org
> Cc: Sigurd Einum <sigurd.einum using ntnu.no>
> Subject: [R-meta] multilevel bivariate metaanalysis in the presence of
> phylogenetic correlation
>
> Dear all, I have a question regarding multilevel bivariate metaanalysis in the
> presence of a known correlation structure (phylogenetic relatedness) using
> metafor.
>
> Each study has one or more experiments ( "id"), where each experiment has
> estimates of two different parameters ("trait", ARR and lambda), together with
> their variances and covariances (providing variance-covariance matrix for the
> errors in the parameter estimates, V). Experiments are done on different
> species, and I have a phylogeny which is used to create the phylogenetic
> variance-covariance matrix.
>
> One potential model I thought could be possible to use, which I thought would
> allow for different phylogenetic correlations for the different outcomes, as
> well as different species-level random effects (independent of phylogeny) for
> the two traits:
> mod1 <- rma.mv(yi = yi, V = V, mods = ~trait,  data = new_data, random = list(~
> 1 | study, ~1|id, ~ trait | species,~ trait | phylo), R = list(phylo = A))
>
> However, this gives the warning: Argument 'R' specified, but list name(s) not in
> 'random'.
> So, on the one hand this suggests to me that the phylogenetic variance-
> covariance matrix is ignored, and that phylo here is treated as a species random
> effect independent of phylogenetic correlation ("species" and "phylo" in
> new_data are identical species names). However, on the other hand the model
> output does give separate estimates for the two levels:
>
>
> Multivariate Meta-Analysis Model (k = 394; method: REML)
>
>      logLik     Deviance          AIC          BIC         AICc
> -37620.5596   75241.1192   75269.1192   75324.5733   75270.2452
>
> Variance Components:
>
>             estim    sqrt  nlvls  fixed  factor
> sigma^2.1  0.0001  0.0079     37     no   study
> sigma^2.2  0.0007  0.0265    197     no      id
>
> outer factor: species (nlvls = 77)
> inner factor: trait   (nlvls = 2)
>
>             estim    sqrt  k.lvl  fixed   level
> tau^2.1    0.0121  0.1100    197     no     ARR
> tau^2.2    0.0001  0.0117    197     no  lambda
>
>         rho.ARR  rho.lmbd    ARR  lmbd
> ARR           1                -    77
> lambda  -0.9852         1     no     -
>
> outer factor: phylo (nlvls = 77)
> inner factor: trait (nlvls = 2)
>
>               estim    sqrt  k.lvl  fixed   level
> gamma^2.1    0.0099  0.0994    197     no     ARR
> gamma^2.2    0.0002  0.0155    197     no  lambda
>
>         phi.ARR  phi.lmbd    ARR  lmbd
> ARR           1   -0.4692      -    77
> lambda  -0.4692         1     no     -
>
> Test for Residual Heterogeneity:
> QE(df = 388) = 8590750.5336, p-val < .0001
>
> Test of Moderators (coefficients 1:6):
> QM(df = 6) = 34669.2288, p-val < .0001
>
> So, my questions are
> (1) which of these interpretations are correct? If the model does include the
> phylogenetic correlation structure, (2) is the value -0.4692 the phylogenetic
> correlation between the two traits? (3) What does the value -0.9852 from the
> trait|species term represent? And (4) how do one calculate I2 for such a model
> (I could only find examples from models that are either multilevel or bivariate,
> and not both at the same time?)
>
> Any insights will be appreciated!
> Best wishes,
> Sigurd
>
> Prof. Sigurd Einum
> Dept. Biology, NTNU, Norwayhttps://www.ntnu.no/ansatte/sigurd.einum