[R-meta] Questions about model averaging with complex multilevel meta-analytic model

Sun Sep 25 12:29:49 CEST 2022

One thing I meant to add to my previous mail: I hope you have a large dataset. If I didn't miscount, you have around 22+ model terms (counting main effects and interactions), which is a quite sizable model.

As for your new question:

Using some measure of precision (such as sqrt(vi)) as a predictor doesn't really quantify publication bias. It only provides an indication whether there is a relationship between the measure of precision and the effect sizes (possibly after accounting for the influence of other moderators included in the model). That's all. If there is such a relationship, it *may* suggest the presence of publication bias, but such a relationship can exist for all kinds of other reasons. Also, this method doesn't really account for publication bias.

The issue of publication bias (especially in the context of more complex models) has come up several times on this mailing list. See for example this thread:

https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2022-September/004191.html

You might want to search the archives for some further discussions. You can search the archives here:

https://cse.google.com/cse?cx=ee4b2e6c93b6a9667

However, note that recent posts may not have been indexed by the search engine yet and therefore will not show up.

Best,
Wolfgang

>-----Original Message-----
>From: Margaret Slein [mailto:maslein using zoology.ubc.ca]
>Sent: Saturday, 24 September, 2022 23:43
>To: Viechtbauer, Wolfgang (NP)
>Cc: r-sig-meta-analysis using r-project.org
>Subject: Re: [R-meta] Questions about model averaging with complex multilevel
>meta-analytic model
>
>Hi Wolfgang,
>
>Thank you, this has been most helpful. Another lingering question I had was in
>regards to quantifying publication bias. I included the square root of the
>sampling variance to assess this and in the best model, it has a very large
>statistically significant effect. Is this approach, of including the square root
>of the sampling variance in the global model with all other moderators,
>appropriate and if so, how would I account for this publication bias in the rest
>of my results?
>
>I cannot conduct a trim and fill analysis with the function in metafor because of
>the model's complex structure, but if there are other ways of trying to address
>potential strong bias, I am interested in doing so.
>
>Cheers,
>Maggie
>
><*)))><>  <*)))><>   <*)))><>  <*)))><>
>
>Maggie Slein (she/her/hers)
>PhD Student, O’Connor Lab
>Department of Zoology
>Unceded xʷməθkʷəy̓əm (Musqueam) territory
>University of British Columbia
>
>On Sep 24, 2022, at 11:41 AM, Viechtbauer, Wolfgang (NP)
><wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>
>Dear Maggie,
>
>Some notes on this:
>
>1) dredge() (and I assume the other functions from MuMIn as well) examines the
>'nobs' attribute that logLik() returns to determine the sample size of the
>various models. However, when using REML estimation, nobs = k - p, where p is the
>number of model coefficients (for some technical reasons that are not important
>right now). However, this leads dredge() to think that the sample size differs
>across models where p differs.
>
>In general, you should use method="ML" when comparing models that differ in terms
>of their fixed effects.[1] In that case, nobs = k and this issue won't arise.
>
>2) I would recommend to do all transformations (like mean centering or things
>like sqrt(vi)) outside of the model call (so, beforehand).
>
>3) You have *a lot* of fixed effects and even interactions. This will lead to
>many models that dredge() needs to fit. This could take a looooong time. dredge()
>has a 'cluster' argument for doing parallel processing, which you might consider
>using if you have powerful enough hardware. Still, even then this could be a
>rather daunting task.
>
>4) I can confirm that dredge() works just fine with rma.mv() models. An example
>with a similar model as you are fitting can be found here:
>
>https://gist.github.com/wviechtb/891483eea79da21d057e60fd1e28856b
>
>Best,
>Wolfgang
>
>[1] Actually, based on some research we did, REML might actually work:
>
>Cinar, O., Umbanhowar, J., Hoeksema, J. D., & Viechtbauer, W. (2021). Using
>information-theoretic approaches for model selection in meta-analysis. Research
>Synthesis Methods, 12(4), 537–556. https://doi.org/10.1002/jrsm.1489
>
>But we didn't examine complex models like you are using and I would still be very
>cautious with using REML when doing so.
>
>
>-----Original Message-----
>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org] On
>Behalf Of Margaret Slein
>Sent: Saturday, 24 September, 2022 18:58
>To: r-sig-meta-analysis using r-project.org
>Subject: [R-meta] Questions about model averaging with complex multilevel meta-
>analytic model
>
>Hi Wolfgang,
>
>I am PhD student at UBC in Vancouver, Canada, currently working on a meta-
>analysis. I have been trying to do model selection and model averaging using the
>model.sel() and model.avg() functions from the MuMIn package with an rma.mv model
>object, while also following your metafor help page using MuMIn and glmulti. I
>have been unable to get any of my models to perform model selection or model
>averaging because each model is being fit to different data. I have ensured there
>are no missing values or NAs across the data frame.
>
>Is it possible to do model averaging with the rma.mv function with 3 level random
>effects, a phylogenetic correlation, and several moderators? There are currently
>no examples I could find using model selection or averaging with this model
>structure and I have had no luck on stack overflow.
>
>Here is the full model I am trying to run:
>
>full_mod<-rma.mv(yi=yi, V=V,
>                mods = ~ I(flux_range-mean(flux_range))*I(mean_temp_constant-
>mean(mean_temp_constant))
>                +I(flux_range-mean(flux_range))*experiment_type
>                +I(mean_temp_constant-mean(mean_temp_constant))*experiment_type
>                +I(secondary_temp - mean(secondary_temp))*experiment_type
>                +duration_standard*experiment_type
>                +experiment_type*experiment_type
>                +exp_age*experiment_type
>                +size*experiment_type
>                +ecosystem*experiment_type
>                +trait_directionality*experiment_type
>                +sqrt(vi)*experiment_type, dfs="contain",
>                random = list( ~1 | phylo, ~1 | study_id, ~1 | response_id),
>                R = list(phylo=cor), test="t",
>                method="REML", data=dat_ES_final_2)
>
>I have also tried using dredge() in MuMIn in addition to trying my own subset of
>models with no luck but it takes days for dredge() to iterate and still yields an
>error when trying to perform model averaging that none of the models are being
>fit to the same data. Any suggestions or assistance you can provide on this would
>be greatly appreciated.
>
>Cheers,
>Maggie
>
><*)))><>  <*)))><>   <*)))><>  <*)))><>
>
>Maggie Slein (she/her/hers)
>PhD Student, O’Connor Lab
>Department of Zoology
>Unceded xʷməθkʷəy̓əm (Musqueam) territory
>University of British Columbia