[R-sig-ME] R package buildmer for automatic stepwise selection of mixed-effects models and GAMMs

Sun May 26 17:55:59 CEST 2019

I’ve recently published version 1.1 of my package 'buildmer' on CRAN (version 1.0 was a beta release to solicit feedback from direct colleagues). I thought that the folks here on R-sig-mixed-models might also be interested, in case they have problems similar to my original use case: term selection in models with many random effects where the 'maximal' model does not necessarily converge.

'buildmer' automates term ordering and stepwise selection/elimination, primarily in mixed-effects models. 'Term ordering' means that the user specifies a desired maximal model and the package determines the largest random-effects structure that still achieves convergence. It does this by starting from the fixed-effects model and adding random-effect terms one by one in order of importance (by default, 'importance' is the value of the LRT statistic) until convergence is no longer possible. Next, the package can perform automatic forward selection or backward elimination of both fixed and random effects. The package takes care of fitting with ML versus REML and of dividing LRT p-values by 2 for random-effect tests. Thus, the user only needs to provide an intended maximal model and wait for the results: just call the function 'buildmer' with an lme4 formula for your maximal model, plus any data= and family= arguments. Possible criteria for term ordering and elimination are LRT (default), AIC, BIC, or change in log-likelihood; user-specified criteria are also possible.
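For illustration, a minimal sketch of the call described above. It assumes that the buildmer and lme4 packages are installed. The sleepstudy data set (shipped with lme4) and the formula are stand-ins chosen for this example, not something taken from the package itself. Only the formula and data= arguments are used, since other argument names may differ between buildmer versions.

```r
# Sketch only: requires the buildmer and lme4 packages (skipped if either is missing).
# The sleepstudy data set ships with lme4; the formula is an illustrative maximal model.
if (requireNamespace("buildmer", quietly = TRUE) &&
    requireNamespace("lme4", quietly = TRUE)) {
  library(buildmer)
  data("sleepstudy", package = "lme4")

  # Maximal model in lme4 syntax: fixed effect of Days, plus by-subject
  # random intercepts and random slopes for Days.
  fit <- buildmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

  # Inspect the model that was retained.
  print(summary(fit))
}
```

For a non-Gaussian response, a family= argument would be added to the same call, as with glmer.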

I started work on this package as a small script two years ago, when lmerTest's step() function was not yet fully capable of handling convergence failures (this has been much improved in lmerTest version 3.0), and when I wanted to use stepwise elimination based on likelihood-ratio tests (for fixed effects, step() only offers F tests). If you use lmer or glmer models, require only backward elimination, and are okay with using F tests, there is not much reason for you to try buildmer over lmerTest. If you require more flexibility, you may be interested in giving buildmer a try. There is no statistical model-fitting code in buildmer; it is simply a glorified formula parser and wrapper around logLik() and friends. This also made it easy to extend to other model types, such as GAMMs and glmmTMB models, which are also supported. For the same reason, I unfortunately cannot fold buildmer into a pull request to lmerTest, as it takes a fundamentally different approach.
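To make the "wrapper around logLik()" idea concrete, here is a rough sketch of a likelihood-ratio test computed from nothing but the logLik() values of two nested fits, with the optional halving of the p-value used for random-effect tests. This is not buildmer's internal code. The helper name lrt() and the mtcars example are hypothetical, and plain lm() fits stand in for mixed models so that the example runs in base R.

```r
# Likelihood-ratio test from two nested fits, using only logLik().
# halve = TRUE divides the p-value by 2, as is done for random-effect tests.
# Hypothetical helper for illustration; not part of buildmer.
lrt <- function(reduced, full, halve = FALSE) {
  ll0 <- logLik(reduced)
  ll1 <- logLik(full)
  stat <- 2 * (as.numeric(ll1) - as.numeric(ll0))  # LRT statistic
  df <- attr(ll1, "df") - attr(ll0, "df")          # difference in parameter count
  p <- pchisq(stat, df = df, lower.tail = FALSE)
  if (halve) p <- p / 2
  c(statistic = stat, df = df, p.value = p)
}

# Base-R stand-in: two nested linear models fitted to the same data.
fit0 <- lm(mpg ~ wt, data = mtcars)
fit1 <- lm(mpg ~ wt + hp, data = mtcars)
res <- lrt(fit0, fit1)
print(res)
```

Because the helper relies only on logLik() and its "df" attribute, the same comparison should apply to any model class that provides a logLik() method, provided both fits use the same data and a comparable likelihood (for example, ML rather than REML when the fixed effects differ).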

I have now come to rely on this package for most of my models where variable selection is required. I hope others may find it useful too, and if not, sorry for the noise! Any issue reports and points of discussion are always welcome, of course, especially since I also rely on this package for my own research.

Best,
Cesko Voeten