[R-sig-ME] Julia vs nlme vs lme4 implementation of fitting linear mixed models
W Robert Long
longrob604 at gmail.com
Fri Oct 17 14:34:10 CEST 2014
Dear Professor Bates
This is a great help - thank you. I may have some follow-on questions,
in which case I shall reply to the list, but your response below gives
me plenty to occupy myself with for now.
Regards
Robert Long
On 16/10/2014 17:57, Douglas Bates wrote:
> Thanks for the question. It is a good exercise for me to answer it so
> that I can get things straight in my mind. I am doing this off the top
> of my head, relying on my all-too-faulty memory. Corrections and
> clarifications are welcome. Ben, should we archive such information
> somewhere?
>
> The basics:
> nlme:
> - developed in the mid-1990s by Pinheiro and Bates
> - documented in P & B (2000, Springer) and several papers
> - implemented in R and C using the .Call interface and R's C API.
> - fits linear mixed-effects models and nonlinear mixed-effects models
> - allows for additional variance and/or correlation structures in the
> conditional distribution of the response, given the random effects.
> - multiple-formula model specification with pdMat, varFunc, corStruct,
> ... classes (a short R sketch follows this list).
> - allows for multiple nested grouping factors. There is a kludgy method
> for fitting models with crossed grouping factors, but I wouldn't push
> it too hard.
> - uses S3 methods and class tags (IMO "S3 classes" don't exist)
> - optimization of parameter estimates via EM and Newton-Raphson
> algorithms, profiled w.r.t. residual variance only
> - internal representation uses the relative precision factor and the
> log-Cholesky parameterization. Singular covariance matrices correspond
> to infinite parameter values.
> - no longer actively developed. Maintained by R-Core primarily to
> adjust to changing requirements on R packages
> - no explicit methods for fitting GLMMs.
> - lme with iterative reweighting is used in the glmmPQL function in the
> MASS package.
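>
> As a concrete (if minimal) R sketch of that specification style, using
> the Orthodont and Oats datasets that ship with nlme; the varPower and
> corAR1 choices below are purely illustrative, not recommendations:
>
> library(nlme)
> ## single grouping factor: random intercept and slope for each Subject
> fm1 <- lme(distance ~ age, random = ~ age | Subject, data = Orthodont)
> ## nested grouping factors are written with `/`
> fm2 <- lme(yield ~ nitro, random = ~ 1 | Block/Variety, data = Oats)
> ## separate arguments attach variance and correlation structures to the
> ## conditional distribution of the response
> fm3 <- update(fm1, weights = varPower(),
>               correlation = corAR1(form = ~ 1 | Subject))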
>
> lme4:
> - development began in the late 1990s by Bates, Maechler, DebRoy,
> Sarkar and others. Current development is primarily by Bolker and
> Walker.
> - documented in papers and vignettes. Bates started writing a book but
> didn't complete the project - some draft chapters are still online.
> - implemented in R, C and C++ using facilities provided by Rcpp and Eigen
> - fits linear mixed-effects models and GLMMs. Nominally has NLMM
> capabilities, but the implementation is not solid
> - single-formula model specification (a short R sketch follows this
> list). Grouping factors can be nested, partially crossed, or fully
> crossed.
> - uses S4 classes and methods, S3 methods, reference classes, and the
> Matrix package.
> - internal representation uses the relative covariance factor and
> sparse Cholesky factorization. For LMMs, optimization of the profiled
> log-likelihood uses general nonlinear optimizers that allow for box
> constraints (in practice only non-negativity of some of the parameters
> is required). GLMMs use PIRLS (penalized iteratively reweighted least
> squares) to determine the conditional modes of the random effects, with
> a Laplace or adaptive Gauss-Hermite approximation to the
> log-likelihood. Settling on a single robust and reliable optimizer has
> been difficult.
> - all models use a sparse matrix representation to solve the penalized
> least squares problem
> - actively maintained and developed
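>
> A minimal R sketch of the single-formula style, using the Penicillin
> and cbpp datasets that ship with lme4 (the particular models are
> illustrative):
>
> library(lme4)
> ## fully crossed grouping factors, plate and sample, in one formula
> fm1 <- lmer(diameter ~ 1 + (1 | plate) + (1 | sample), data = Penicillin)
> ## binomial GLMM; the default nAGQ = 1 is the Laplace approximation,
> ## and nAGQ > 1 requests adaptive Gauss-Hermite quadrature
> gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
>              data = cbpp, family = binomial)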
>
> MixedModels:
> - development started in 2012 by Bates
> - little or no documentation outside of examples
> - implemented exclusively in Julia (about 1600 lines of code)
> - fits LMMs. Development of GLMM capabilities is planned.
> - single-formula specification similar to lme4 (a Julia sketch follows
> this list). Because Julia is still a young language, not all lme4
> formulas will work with MixedModels. (Building the formula-language
> interpretation code from scratch is not easy.) Grouping factors can be
> nested or crossed (partially or completely).
> - uses Julia methods and user-defined types. Julia methods provide for
> multiple dispatch like S4.
> - internal representation (all at the Julia level) provides for several
> different PLS (penalized least squares) solvers according to the
> configuration of the grouping factors.
> - Models with a nested sequence of grouping factors, including a single
> grouping factor as a trivial special case, use dense matrix methods and
> provide an analytic gradient of the profiled log-likelihood. The same
> applies to models with two crossed or nearly crossed grouping factors
> (think "subject" and "item").
> - Optimizers come from Steven Johnson's NLopt Julia package, which
> interfaces to his C library (also used in the R packages nloptr and
> nloptwrap). At present LN_BOBYQA is used for derivative-free
> optimization and LD_MMA for models that provide a gradient, but
> switching optimizers within the NLopt library is trivial.
> - Considerable effort has been made to reuse storage and cut down on
> allocation/garbage collection.
> - Actively developed. Maintenance hasn't been too much of an issue
> because I don't think there are many users at present.
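>
> For comparison with the lme4 sketch above, a minimal Julia sketch of
> the single-formula style. The MixedModels API has changed over time,
> so treat the exact calls as illustrative rather than definitive:
>
> using MixedModels
> # sleepstudy ships with the package; (1 + days | subj) requests
> # correlated random intercepts and slopes for each subject
> df = MixedModels.dataset(:sleepstudy)
> m1 = fit(MixedModel, @formula(reaction ~ 1 + days + (1 + days | subj)), df)
> # crossed grouping factors use the same syntax
> m2 = fit(MixedModel, @formula(diameter ~ 1 + (1 | plate) + (1 | sample)),
>          MixedModels.dataset(:penicillin))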
>
> Examples of MixedModels fits are at
>
> http://nbviewer.ipython.org/github/dmbates/slides/blob/master/2014-10-01-Ithaca/lmm%20Examples.ipynb
>
> Because I was involved in the development of all three packages, I will
> take the liberty of commenting on the philosophy.
>
> The purpose of developing nlme was to fit nonlinear mixed-effects
> models. Linear mixed-effects models were incorporated mainly because
> fitting one is an iterative step in the algorithm for fitting an NLMM.
> The numerical methods used are not nearly as sophisticated as those in
> lme4 and MixedModels.
>
> lme4 was developed to provide a use-case for S4 classes and methods. The
> Matrix package was initially part of lme4 and later split off into a
> package. The numerical methods implemented in lme4 are, in my opinion,
> superior to those in nlme, mainly through the use of the relative
> covariance factor and the profiled log-likelihood. These may seem like
> details but to me they are very important. The motivation for
> incorporating sparse matrix classes in the Matrix package and accessing
> the CHOLMOD code was to provide a general method for fitting such
> models. Using C++, Rcpp and RcppEigen was motivated by trying to
> provide generality and speed. The end result is confusing (my fault
> entirely) and fragile.
>
> MixedModels was developed because I became interested in the Julia
> language at the same time that I was disillusioned with at least some
> aspects of R. As a new language Julia doesn't have the infrastructure
> of R (dataframes, formula language, graphics, ...) but does have a clean
> implementation of the parts that are available. The most important
> aspect of Julia is "one language". You develop in the same language in
> which you optimize. The type system in Julia allows me to incorporate
> the different kinds of penalized least squares solvers in what to me is
> a clean way, thereby taking advantage of structural simplifications in
> simple, but common, cases. It is possible to do this in
> R/C++/Rcpp/Eigen, but it would be a massive headache and perhaps beyond
> my abilities to do it well.
>
> Optimizing Julia code is often done at the expense of transparency. It
> is obvious what
>
> C = C - A*B
>
> means when C, A and B are matrices (* means matrix multiplication in
> Julia). It is less obvious what
>
> BLAS.gemm!('N','N',-1.,A,B,1.,C)  # C := -1*A*B + 1*C, updating C in place
>
> means, but the second form avoids taking two unnecessary copies of the
> matrix C and running a couple of extra loops. This isn't important if
> you only do it once. If you do it several dozen times in each function
> evaluation in an optimization that requires tens of thousands of
> function evaluations, it is important.
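>
> A quick way to see the difference in a REPL (a sketch; in current Julia
> the BLAS wrappers live in the LinearAlgebra standard library, while in
> 2014 they were in Base):
>
> using LinearAlgebra
> A = rand(100, 100); B = rand(100, 100); C = rand(100, 100);
> @allocated C - A*B                            # allocates two temporary matrices
> @allocated BLAS.gemm!('N', 'N', -1.0, A, B, 1.0, C)  # updates C in place, no allocation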
>
> Please let me know if this answers your question.