[R-meta] Meta-analysis questions
James Pustejovsky
jepusto at gmail.com
Mon Sep 13 21:32:23 CEST 2021
Hi Bethan,
I'll respond to your first questions and add a query of my own.
First, non-normality and high kurtosis are indeed problems for the modeling
approach you've taken. One consequence is that you will likely have very
high levels of estimated heterogeneity. Another consequence, and potential
concern, is that the standard errors and confidence intervals that come
from rma.mv() should probably not be trusted because they are predicated on
assumptions about normality of the random effects in the specified model.
There are (at least) two ways that you could deal with this issue. First,
you could ignore the standard errors from rma.mv() and instead use robust
variance estimation methods, which are asymptotically robust to
non-normality as well as mis-specification of the model covariance
structure. For example, the conf_int() function from the clubSandwich
package will give you robust confidence intervals for your model:
library(clubSandwich)
conf_int(Traitcat_model, vcov = "CR2")
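If you also want robust standard errors and t-tests for the individual
coefficients, clubSandwich's coef_test() takes the same arguments:
coef_test(Traitcat_model, vcov = "CR2")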
An alternative to RVE is to use bootstrap confidence intervals, as you've
attempted. However, the usual implementation of bootstrapping will not work
here. Because you've got dependent effect sizes with a multi-level
structure, you'll need to bootstrap re-sample *at the study level* instead
of re-sampling individual observations. Here's a rough sketch of how to
cluster bootstrap:
library(boot)

# Re-sample intact studies (clusters), not individual effect sizes
study_ids <- unique(Traitcatdata$Study_number)

# Length of the statistic vector, so that failed fits can return
# NAs of matching length (boot needs the same length every replicate)
n_stats <- length(coef(Traitcat_model)) + length(Traitcat_model$sigma2)

boot.func <- function(study_ids, indices) {
  # Keep all rows belonging to the re-sampled studies. (Note that %in%
  # includes a study's rows only once even if that study is drawn more
  # than once, so this is an approximation--hence "rough sketch".)
  row_indices <- Traitcatdata$Study_number %in% study_ids[indices]
  Traitcat_model2 <- try(suppressWarnings(
    rma.mv(yi = Ln_response_corrected, V = Variance,
           data = Traitcatdata, mods = ~ LnSR:Trait_cat - 1, test = "t",
           random = list(~ 1 | Study_number/Response_id/Effect_size_id),
           method = "REML", subset = row_indices)
  ), silent = TRUE)
  if (inherits(Traitcat_model2, "try-error")) {
    rep(NA_real_, n_stats)
  } else {
    c(coef(Traitcat_model2), Traitcat_model2$sigma2)
  }
}

res.boot2 <- boot(study_ids, boot.func, R = 5000)
Cluster-bootstrapped confidence intervals will usually give you results
that are quite similar to the RVE approach.
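To extract percentile intervals from the boot object, you can use
boot.ci(), where index picks out one component of the statistic vector
(here, the coefficients in the order returned by coef(), followed by the
sigma2 components):
# Percentile CI for the first trait-category slope;
# change index to get the intervals for the other categories
boot.ci(res.boot2, type = "perc", index = 1)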
To your question about reporting the bootstrap CIs with the original model
estimates versus with "bootstrapped estimates", I assume that "bootstrapped
estimates" means the average (arithmetic mean) of the bootstrap
distribution. Usually these should be quite close to the point estimates
from the original model, particularly with a linear model such as yours. If
they're discrepant, then something weird is going on that probably warrants
seeking help from a statistician.
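A quick way to check, using the boot object from above (t0 holds the
estimates from the original sample and t holds the bootstrap replicates):
# Original estimates next to the bootstrap means
cbind(original = res.boot2$t0,
      boot_mean = colMeans(res.boot2$t, na.rm = TRUE))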
And an additional question for you: I see in your model specification that
you've dropped the intercepts, so for each trait category, you're modeling
the slope of the relationship between the log response ratio and the log of
the temperature differential. By dropping the intercept, you are assuming
that the magnitude of the effect size is multiplicatively related to the
magnitude of the temperature differential (i.e., if you double the
log-temperature difference, you should expect to get twice as large a log
response ratio). By interacting the slope with trait category, you're
allowing this multiplicative relationship to differ for each trait
category. But then you're also including random effects in the model, which
are assumed to have a constant variance on the scale of the log response
ratio and across trait categories. Let's ignore the hierarchical structure
for the moment and just think about one effect per study. For a given trait
category, the model would be
LRR_i = beta * log( temp diff )_i + v_i + e_i
where e_i is the sampling error with known variance Var(e_i) = V_i and v_i
is a random effect with variance Var(v_i) = tau^2. Note that this model
assumes that the degree of heterogeneity is constant across temperature
differentials, so the degree of heterogeneity in a set of studies that all
looked at very small temperature differentials is the same as the degree of
heterogeneity in a set of studies that all examined very large temperature
differentials. Does that make theoretical sense in your scientific context?
(This is an honest question--I don't know anything about your research
area!)
Alternatively, I wonder whether it might be plausible to assume that the
degree of heterogeneity is *also* multiplicatively related to the magnitude
of the temperature differential. Under that assumption, you would divide
the effect sizes and their standard errors by the log of the temperature
differential, so the new model for a given category would become
[ LRR_i / log( temp diff )_i ] = beta + v_i + e_i,
where now Var(e_i) = V_i / [log( temp diff )_i]^2 and Var(v_i) = tau^2.
Here, the variance parameter tau^2 represents heterogeneity in the response
*per unit change* in the log temperature differential. This implies that
studies with larger log temperature differentials would have
proportionately larger degrees of heterogeneity in their responses. To take
into account multiple trait categories, you could fit a model that has
indicators for each category (dropping the intercept) but you would no
longer necessarily need to include the interaction with LnSR.
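In code, the rescaling might look something like the following (just a
sketch, assuming the variable names from your model; the yi_scaled and
V_scaled columns are hypothetical names, and you would want to check
that LnSR is strictly positive before dividing):
# Rescale the effect sizes and their sampling variances by LnSR
Traitcatdata$yi_scaled <- Traitcatdata$Ln_response_corrected / Traitcatdata$LnSR
Traitcatdata$V_scaled <- Traitcatdata$Variance / Traitcatdata$LnSR^2
# Per-category average response *per unit change* in LnSR
Scaled_model <- rma.mv(yi = yi_scaled, V = V_scaled,
                       mods = ~ Trait_cat - 1, test = "t",
                       random = list(~ 1 | Study_number/Response_id/Effect_size_id),
                       method = "REML", data = Traitcatdata)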
James
On Sun, Sep 12, 2021 at 8:42 PM Bethan Lang <bethan.lang at my.jcu.edu.au>
wrote:
> Hi there,
>
>
>
> I am working on a meta-analysis looking at the effect of ocean warming on
> various traits of marine animals. My model is as follows:
>
>
>
> Traitcat_model <- rma.mv(yi = Ln_response_corrected, V = Variance, mods =
> ~LnSR:Trait_cat - 1, test="t", random =
> list(~1|Study_number/Response_id/Effect_size_id), method = "REML", data =
> Traitcatdata)
>
>
>
> Ln_response_corrected is the lnRR, i.e. the natural logarithm of the
> ratio of the experimental (high temperature) mean to the control mean.
>
> Trait_cat is the measure, e.g. survival, metabolic rate, growth. I have
> added in an interaction with LnSR (the natural logarithm of the
> difference between the control and experimental temperature). I included
> this to account for the fact that some studies might test say a control and
> +10C, while others may only test at +2C. The effect size ID is the number
> of the observation, which is nested within Response_id. All observations of
> one measure (e.g. metabolic rate) at the different experimental temperatures
> tested in a study will have the same Response_id. This is nested within the
> study number, which is the number assigned to each paper included.
>
>
>
> Firstly, when I test the normality and heteroskedasticity of the model as
> a whole, as well as of the response variable, there is a high level of
> non-normality and heteroskedasticity (see attached screenshots for the
> diagnostic plots of the model). I am not sure if this means that the model
> output cannot be trusted? Or if this level of non-normality and
> heteroskedasticity is ok?
>
>
>
> If not, I was thinking about bootstrapping the confidence intervals to
> account for the non-normality. However, I cannot seem to figure out the
> code to do this. This is the code that I tried; at the moment I am getting
> the error: <text>:1:6: unexpected symbol 1: .arg show_col_types ^
>
> But I was getting a different error when I tried to run the code
> yesterday; I can't remember what that one was.
>
>
>
> library(boot)
>
> boot.func <- function(dat, indices) {
>
> Traitcat_model2 <- try(suppressWarnings(rma.mv(yi=Ln_response_corrected,V=Variance,
> data=Traitcatdata, mods=~LnSR:Trait_cat - 1, test="t", random =
> list(~1|Study_number/Response_id/Effect_size_id), method = "REML",
> subset=indices)))
>
> if (inherits(Traitcat_model2, "try-error")) {
>
> NA
>
> } else {
>
> c(coef(Traitcat_model2), vcov(Traitcat_model2), Traitcat_model2$sigma2)
>
> }
>
> }
>
> res.boot2 <- boot(Traitcatdata, boot.func, R=5000)
>
> res.boot2
>
>
>
> Please let me know how I need to adjust this code to get the bootstrapped
> CIs for each Trait_cat group.
>
>
>
> Also, would it be ok to use bootstrapped CIs along with the estimates
> from the original model, or should I use bootstrapped estimates too?
>
>
>
> Secondly, I am trying to figure out how to do an Orchard plot, which I
> know you cannot create from a model with multiple moderators. I was
> wondering whether there is any way to back-calculate from the model to get
> data points that have been adjusted for LnSR, and then use them for the
> Orchard plots?
>
>
>
> Please let me know if you need any further details in order to help me
> with these queries.
>
>
>
> Best wishes,
>
>
>
> Bethan
>
>
>
> *________________________________________*
>
> *Bethan Lang*
>
> *PhD Candidate*
>
> ARC Centre of Excellence for Coral Reef Studies
> James Cook University
> Townsville, QLD 4811, Australia