[R-meta] large increase of the QM value after applying robust on rma.mv

Tue Sep 3 14:21:26 CEST 2019

Hi James,

Thank you for your answer.
I have tried running the models with species as the cluster variable. It does help: QM is now 16 (the other univariate models have a QM between 0 and 2; the QM for the intercept-only model is 7).

For the structure of the dataset:
I only have a few species with multiple studyIDs (6 species with 2 or 3 studyIDs). Most of the time, 1 species = 1 specific studyID.

For this particular (troublemaking) moderator that has 5 levels, I have the following distribution. 

Per species

Level 1: 3 species
Level 2: 1 species
Level 3: 9 species
Level 4: 9 species
Level 5: 2 species

10 species with one given moderator value
4 species with 2 different moderator values
2 species with 3 different moderator values

Per study

Level 1: 5 studies
Level 2: 1 study
Level 3: 11 studies
Level 4: 10 studies
Level 5: 5 studies

16 studies with one moderator value
5 studies with 2 different moderator values
2 studies with 3 different moderator values
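
Counts like these can be tabulated in base R roughly as follows. This is only a minimal sketch on a made-up toy data frame: the column names `species`, `studyID`, and `moderator`, the helper `count_clusters()`, and all values are hypothetical, not my real data.

```r
# Toy data (hypothetical values): one row per effect size
dat <- data.frame(
  species   = c("sp1", "sp1", "sp2", "sp2", "sp3", "sp3", "sp4"),
  studyID   = c("s1",  "s1",  "s2",  "s3",  "s4",  "s4",  "s5"),
  moderator = c("A",   "A",   "A",   "B",   "B",   "C",   "C"),
  stringsAsFactors = FALSE
)

# For a given clustering variable, count (a) the number of distinct
# clusters observed at each moderator level and (b) how many clusters
# have 1, 2, 3, ... distinct moderator values.
count_clusters <- function(data, cluster, moderator) {
  # Keep one row per unique cluster-by-moderator combination
  pairs <- unique(data[, c(cluster, moderator)])
  list(
    clusters_per_level = table(pairs[[moderator]]),
    levels_per_cluster = table(table(pairs[[cluster]]))
  )
}

by_species <- count_clusters(dat, "species", "moderator")
by_study   <- count_clusters(dat, "studyID", "moderator")

print(by_species)
print(by_study)
```

Any moderator level that is represented by only one or two clusters shows up directly in `clusters_per_level`.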

Kind regards
Léa

> ----------------------------------------
> From: James Pustejovsky <jepusto using gmail.com>
> Sent: Tue Sep 03 04:31:15 CEST 2019
> To: Lea BRIARD <lea.briard using univ-tlse3.fr>
> Cc: <r-sig-meta-analysis using r-project.org>
> Subject: Re: [R-meta] large increase of the QM value after applying robust on rma.mv
> 
> 
> Lea,
> 
> Thanks for posting such a clear and detailed description of your analysis. I think the issue may be that you are fitting a model with three hierarchical levels of random effects, but then you are clustering the standard errors at an intermediate level (level 2 of 3). This will not generally work (in the sense of producing valid estimates of the SEs for average effect sizes or meta-regression coefficients). The reason is that cluster-robust standard errors (as implemented in metafor and in the clubSandwich package) are based on the assumption that each cluster is independent. This assumption will not hold if you are clustering by study but then using a model that posits dependence between different studies conducted on the same species. Using cluster-robust SEs in this situation is akin to ignoring the dependence at the top level.
> 
> I would speculate that this issue might be why you are getting the implausibly large QM values when you use cluster-robust SEs. If the moderator varies at the species level, the species-level variance component will matter a lot in how much variance can be explained by the moderator. But then if the SEs ignore the species-level variation, it might appear as if the moderator has explained it all away. 
> 
> To figure out what to do from here, I think the first thing is to determine the level at which each of the moderators varies and count the number of highest-level units for each unique value of the moderator. For example, a study-level moderator with two levels might have 4 species with level A only, 9 species with level B only, and 3 with both A and B (i.e., there are studies of each type for each of these 3 species). Cluster-robust SEs are based on only between-cluster variation, so there have to be an adequate number of clusters at each level of the moderator. If there’s just one species with a given moderator value, then there’s no way to cluster by species (short of digging up some more studies).
> 
> James
> 
> > On Aug 29, 2019, at 11:32 AM, Lea BRIARD <lea.briard using univ-tlse3.fr> wrote:
> > 
> > Hi all,
> > 
> > I first want to apologize if my email is too long or if I don't provide the right information. This is my first email to an R community.
> > 
> > I am currently running a meta-analysis using the rma.mv function.
> > 
> > Here is some information that could be useful:
> > 
> > I have a dataset with 210 observations, 23 studies and 16 species. I study the link between parasite and host behaviour. 
> > Each study is on one particular group of individuals, but the different observations are different behaviour and infection measures calculated for the same individuals (sometimes the same study includes more than one species of parasite, and for each parasite multiple measures of host behaviour).
> > 
> > 
> > After reading these two threads
> > https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2017-August/000130.html
> > https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2017-November/000351.html
> > 
> > I followed these steps:
> > 1° Calculation of effect sizes with escalc (r converted to Fisher's Z)
> > 2° I have a three-level random factor (1|species/studyID/observation) (I compared different versions with different random factors using anova() to choose the best version).
> > I use REML, with optimizer = optim and optmethod = Nelder-Mead
> > 
> > 3° Imputation of the variance-covariance matrix using impute_covariance_matrix(data$vi, cluster = studyID, r = 0 or r = 0.5)
> > 
> > 
> > 4° Use the robust function with cluster = data$studyID. I'm not using the factor with the smallest number of levels (species) because the risk of interdependence between observations is within studyID (one studyID = same group of individuals). I use robust because the residuals are not normally distributed (the profile likelihood plots of the sigma components indicate strong parameter identifiability).
> > 
> > 5° run intercept-only model
> > 
> > 6° run 5 univariate models with different moderators (host characteristics, parasite characteristics) and rank the 6 models using AICc
> > 
> > I'm running two versions of these models, a non-phylogenetically controlled one and a phylogenetically controlled one (using a correlation matrix computed with vcv.phylo), with R = list(species = matrix) and Rscale = cor0.
> > 
> > After struggling for months (it's my first meta-analysis) I thought I was finally out of the woods thanks to all the amazing material that I could find online (thank you so much Wolfgang and James) BUT I have one model with one particular moderator where the QM value goes through the roof after using the robust function, and this happens ONLY in the non-phylogenetically controlled version (without the R component).
> > My QE is between 616 (intercept only) and 523 (best model)
> > My QM after using robust is between 2.9 and 0.39, except for that particular moderator where QM = 673. For that troublemaking moderator, I have three levels with the following sample sizes: level 1 = 179, level 2 = 10, level 3 = 21.
> > If I use another version of that moderator (more levels, but more evenly distributed) it's even worse: I have a QM > 100000000
> > 
> > 
> > I tried to read as extensively as possible the metafor project website, as well as James Pustejovsky's website and the threads in the R-sig-meta-analysis mailing list, but I can't find an explanation for this. I especially want to know if it means that this moderator is not usable (because of the sample size, with one particular level being overrepresented) or if I'm doing something wrong in general and this moderator is just a symptom.
> > 
> > Kind regards
> > Léa Briard