[R-meta] best subset of moderators for `robumeta` package in R

Fri Nov 8 10:49:55 CET 2019

As James showed, you can get very similar results from robu() and from metafor combined with clubSandwich.

However, there is a conceptual issue in this entire endeavor. Using cluster-robust inference methods doesn't change the likelihood; it just changes how the var-cov matrix of the fixed effects is calculated. So, when doing model selection based on information criteria such as AIC and BIC - which are based on the log-likelihood (plus a penalty term) - you are not doing model selection based on 'cluster-robust inference results' but on the fit of the original model as given by the likelihood. Also, I^2 isn't going to be changed by the way the var-cov matrix of the fixed effects is computed, so that's not helping either.
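
To make this point concrete, here is a minimal base-R sketch. It is only a stand-in: it uses lm() on simulated clustered data rather than a meta-analytic model, and the sandwich estimator is a hand-rolled CR0 version (all variable names are made up for illustration). The AIC is computed from the log-likelihood of the fitted model, and the cluster-robust var-cov matrix never enters that calculation.

```r
# Simulated clustered data (illustrative only): 20 clusters of 5 observations
set.seed(42)
g <- rep(1:20, each = 5)
u <- rnorm(20)[g]                      # shared cluster effect induces dependence
x <- rnorm(100)
y <- 0.3 * x + u + rnorm(100)

fit <- lm(y ~ x)
X <- model.matrix(fit)
e <- residuals(fit)

# Hand-rolled CR0 cluster-robust sandwich: (X'X)^-1 [sum_g X_g' e_g e_g' X_g] (X'X)^-1
bread <- solve(crossprod(X))
meat <- Reduce(`+`, lapply(split(seq_along(e), g), function(i) {
  s <- crossprod(X[i, , drop = FALSE], e[i])   # X_g' e_g (p x 1)
  tcrossprod(s)                                # p x p contribution of cluster g
}))
V_robust <- bread %*% meat %*% bread

se_model  <- sqrt(diag(vcov(fit)))     # model-based standard errors
se_robust <- sqrt(diag(V_robust))      # cluster-robust standard errors

# The information criterion depends only on the likelihood of 'fit'
aic <- AIC(fit)
print(rbind(se_model, se_robust))
print(aic)
```

The two sets of standard errors differ, but there is only one log-likelihood and hence only one AIC for the model, whichever var-cov matrix is used for inference.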

Even though it is often frowned upon, you might have to go back to using some form of stepwise model selection based on actually testing coefficients. So, start with the empty model and add each predictor in turn and test it using cluster-robust inference methods. Add the one that is most significant and that passes some threshold of significance (e.g., .10). Then add each remaining predictor in turn to this model and so on until no predictor passes the threshold. That would be forward selection. One could refine this by allowing variables that become non-significant to be removed from the model.
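
The forward selection steps described above can be sketched roughly as follows. This is not code from any package: forward_select() and lm_pval() are hypothetical helpers, and the test plugged in here is an ordinary F-test from lm() on simulated data, used purely as a stand-in so that the example runs in base R. For the actual problem, one would replace lm_pval() with a function that fits the meta-regression model and returns the p-value of a cluster-robust test of the added predictor.

```r
# Forward selection driven by a user-supplied test (hypothetical helper).
# pval_fun(current, candidate) must return the p-value for adding
# 'candidate' to a model that already contains the terms in 'current'.
forward_select <- function(candidates, pval_fun, alpha = 0.10) {
  selected <- character(0)
  repeat {
    remaining <- setdiff(candidates, selected)
    if (length(remaining) == 0) break
    pvals <- vapply(remaining, function(z) pval_fun(selected, z), numeric(1))
    if (min(pvals) >= alpha) break            # nobody passes the threshold
    selected <- c(selected, remaining[which.min(pvals)])
  }
  selected
}

# Stand-in test for demonstration only (NOT cluster-robust):
# F-test comparing the model with and without the candidate term.
set.seed(1)
dat <- data.frame(x1 = rnorm(200), x2 = rnorm(200), x3 = rnorm(200))
dat$y <- 0.8 * dat$x1 + 0.5 * dat$x2 + rnorm(200)

lm_pval <- function(current, candidate) {
  small <- lm(reformulate(c("1", current), response = "y"), data = dat)
  big   <- lm(reformulate(c("1", current, candidate), response = "y"), data = dat)
  anova(small, big)[2, "Pr(>F)"]
}

final <- forward_select(c("x1", "x2", "x3"), lm_pval, alpha = 0.10)
print(final)
```

Passing the test in as a function keeps the selection loop separate from the inference method, so the same loop could be reused with whatever cluster-robust test is preferred. The refinement of dropping predictors that become non-significant is not implemented in this sketch.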

In many cases, such stepwise methods actually end up giving you the same final model as doing all-subsets regression.

Best,
Wolfgang

-----Original Message-----
From: Reza Norouzian [mailto:rnorouzian using gmail.com] 
Sent: Thursday, 07 November, 2019 19:25
To: Viechtbauer, Wolfgang (SP)
Cc: R meta
Subject: Re: [R-meta] best subset of moderators for `robumeta` package in R

Thank you. What I'm after is to possibly get the best subsets for `robu()`, which unfortunately doesn't provide logLik or AIC to connect it to the packages you suggested. Using `robust()` in metafor with a large number of studies unfortunately doesn't change the results compared to its `rma()` counterpart, and the results are all significantly different from those of `robu()`.

My concern is that running a best subset analysis using metafor may not translate into finding the best model for `robu()`. As a result, I wonder if there might be a way to either obtain AIC etc. from `robu()` to connect it to the packages you mentioned OR to make the packages you mentioned take "I2" as the criterion instead of AIC etc.?

Thanks very much,
Reza

On Thu, Nov 7, 2019 at 12:12 PM Viechtbauer, Wolfgang (SP) <wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
I am not entirely sure what you are after with using I^2 in this context, but using the same example, this is how you would find the model with the lowest I^2 value:

I2s <- sapply(res@objects, function(x) x$I2)
res@objects[which.min(I2s)]

Best,
Wolfgang

-----Original Message-----
From: Reza Norouzian [mailto:rnorouzian using gmail.com] 
Sent: Thursday, 07 November, 2019 18:42
To: Viechtbauer, Wolfgang (SP)
Cc: R meta
Subject: Re: [R-meta] best subset of moderators for `robumeta` package in R

Dear Wolfgang,

Thank you so much for this truly awe-inspiring response (I really can't stop reading your post)! At the risk of sounding ignorant, is there any way to focus on the "I2" index instead of information-theoretic criteria in these model-finding quests using the packages you mentioned?

Once again, I truly appreciate your expertise and time on this,
Reza

On Thu, Nov 7, 2019 at 5:44 AM Viechtbauer, Wolfgang (SP) <wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
Hi Reza,

I haven't played around with the leaps package, but you could do this with glmulti or MuMIn. An example of how to do this in combination with metafor is given here:

http://www.metafor-project.org/doku.php/tips:model_selection_with_glmulti_and_mumin

One could add additional steps to the rma.glmulti() function shown there, such as robust() from metafor or using coef_test() from clubSandwich.

But note that with 35 moderators, you are looking at 2^35 = 34,359,738,368 possible models. Even if fitting a single model only takes 0.01 seconds (which is rather optimistic), you will wait about 11 years for this to finish. If you have a cluster and parallelize this, you might be able to get this down to weeks or months. But one could also wonder if this is a useful exercise in the first place.
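
The arithmetic behind this estimate can be checked in a few lines of R. The 0.01 seconds per model is the figure assumed above; the 100 parallel workers with perfect scaling are a purely hypothetical illustration, not a benchmark.

```r
n_mods   <- 35
n_models <- 2^n_mods                   # every moderator is either in or out
sec_each <- 0.01                       # assumed time per model fit (optimistic)

total_sec <- n_models * sec_each
years     <- total_sec / (60 * 60 * 24 * 365.25)

# Hypothetical: 100 workers with perfect parallel scaling
workers       <- 100
days_parallel <- total_sec / (60 * 60 * 24) / workers

print(format(n_models, big.mark = ",", scientific = FALSE))
print(round(years, 1))                 # about 10.9 years on a single core
print(round(days_parallel, 1))
```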

You could restrict your search to models with at most 'm' predictors. For m = 8, there are choose(35,8) = 23,535,820 models with exactly 8 predictors and sum(choose(35,0:8)) = 32,267,668 models with at most 8, which is still a lot but more feasible. glmulti() has a 'maxsize' argument for this purpose, and dredge() from MuMIn has the 'm.lim' argument.
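
As a quick check of how much the size restriction shrinks the search, the counts can be computed directly in base R (the variable names are only for illustration):

```r
p <- 35                                 # number of candidate moderators
m <- 8                                  # maximum model size considered

exactly_m <- choose(p, m)               # models with exactly m predictors
at_most_m <- sum(choose(p, 0:m))        # models with 0, 1, ..., m predictors
share     <- at_most_m / 2^p            # fraction of the full model space

print(format(c(exactly_m, at_most_m), big.mark = ","))
print(signif(share, 2))
```

The restricted search covers only a tiny fraction of all 2^35 models, which is what makes it computationally feasible.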

Best,
Wolfgang

-----Original Message-----
From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org] On Behalf Of Reza Norouzian
Sent: Thursday, 07 November, 2019 3:24
To: R meta
Subject: [R-meta] best subset of moderators for `robumeta` package in R

I have a large number of "categorical" moderators (35 moderators). I am
planning to use the best subset of these moderators that can maximally
explain the variation in my 257 correlated effect sizes from 51 studies.

The R package `leaps` does perform best possible subset analysis via
the function `regsubsets()`, but to make that suited to `robu()` I think
I need to define the `weights` argument in `regsubsets()` so I can basically
make this suited for RVE purposes, not simply OLS regression.

Any idea regarding how I can execute my plan in R or generally how I can
choose the best subset of moderators for `robu()` in `robumeta` in R?

Many thanks,
Reza
-- 
*Reza Norouzian*