[R-sig-ME] What is the state of the "Development version"?

Fri Jul 22 17:09:15 CEST 2011

On Fri, Jul 22, 2011 at 3:14 AM, Dieter Menne
<dieter.menne at menne-biomed.de> wrote:
> There is a message from May 31, 2009 on
>
> http://markmail.org/message/b3nssfujudrlczab
>
> where Douglas Bates confirmed that the output
>
> Fixed effects:
>     Estimate Std. Error t value
> Asym   192.04     104.09!!!!   1.845  <<<
> xmid   727.89      31.97  22.771
> scal   347.97      24.42  14.252
>
> is wrong, and that the development version gives the following correct output:

> Fixed effects:
>      Estimate Std. Error       t value
> Asym   191.06      15.51 !!!!    12.32
> xmid   722.61      33.59         21.51
> scal   344.20      25.94         13.27

> Two years later, the current CRAN version still gives the same incorrect output.
> I had installed the development version on my old computer, but lost it after I
> had a new setup, and I am trying to get it back now.

> What is the currently recommended development version?

Sorry for the delay.  I had forgotten about that error which should
have been corrected long ago.  My problem is that I keep reinventing
the underlying code to perform the deviance evaluation and there are
many, many details to try to get straight.  The combination of
retaining R's functional programming semantics (don't modify any
arguments) and getting reasonable performance on large problems makes
this particularly challenging.  You don't want to copy large
structures unnecessarily but, at the same time, you can't accidentally
modify something that shouldn't be modified.
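
To make that trade-off concrete, here is a minimal C++ sketch of the
two styles. It is not code from lme4; the names updated and
update_in_place are hypothetical, and the vectors stand in for much
larger structures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Functional style: both arguments are taken by const reference, so they
// are not copied on the way in and cannot be modified inside the function.
// The result is a freshly allocated vector; the caller's data is untouched.
std::vector<double> updated(const std::vector<double>& beta,
                            const std::vector<double>& delta,
                            double step) {
    assert(beta.size() == delta.size());
    std::vector<double> out(beta.size());
    for (std::size_t i = 0; i < beta.size(); ++i)
        out[i] = beta[i] + step * delta[i];
    return out;
}

// In-place style: no new allocation, but the caller's object changes.
// Cheaper for large vectors, and exactly the kind of side effect that
// must not reach an object the caller still treats as unchanged.
void update_in_place(std::vector<double>& beta,
                     const std::vector<double>& delta,
                     double step) {
    assert(beta.size() == delta.size());
    for (std::size_t i = 0; i < beta.size(); ++i)
        beta[i] += step * delta[i];
}
```

The first form matches R's semantics at the cost of one allocation per
call; the second avoids the allocation but is only safe on storage that
nothing else refers to.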

The core computation is the sparse Cholesky factorization, for which
we had used code from Tim Davis's SuiteSparse, especially the CHOLMOD
library of C functions.  There are certain design aspects of that
library that are less than optimal.  Tim provides a template mechanism
to allow for different integer types for indices (32-bit integers or
64-bit integers) but does so in C, which doesn't have a template
mechanism, so he created one using various configuration files and
gmake-specific Makefiles.  You can use this in a stand-alone library
but it doesn't play well with the R package mechanism.  We long ago
split lme4 from the Matrix package but, again for performance, need to
call C functions in the Matrix package from C functions in the lme4
package.  There is a way of doing this but it is a purpose-built
extension to R packages and not at all easy to extend (it can only be
used with certain types of C functions and it requires coordinating
the two packages tightly).  Between the peculiarities of the CHOLMOD
templating and the C function registration mechanism we got the
problem with lme4a expecting cholmod_l_* and some versions of the
Matrix package providing cholmod_*.
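
For comparison, this is roughly what "templating on the index type"
looks like when the language supports it directly. The sketch below is
an illustration only, not CHOLMOD or Matrix code: CscMatrix and
multiply are hypothetical names, and the operation is a plain
compressed-sparse-column matrix-vector product rather than a
factorization.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Compressed sparse column (CSC) matrix, templated on the index type.
// One definition serves both 32-bit and 64-bit indices.
template <typename Index>
struct CscMatrix {
    Index nrow;
    Index ncol;
    std::vector<Index>  colptr;  // length ncol + 1, start of each column
    std::vector<Index>  rowind;  // length nnz, row index of each entry
    std::vector<double> values;  // length nnz, value of each entry
};

// y = A * x for a CSC matrix with any integer index type.
template <typename Index>
std::vector<double> multiply(const CscMatrix<Index>& A,
                             const std::vector<double>& x) {
    assert(x.size() == static_cast<std::size_t>(A.ncol));
    std::vector<double> y(static_cast<std::size_t>(A.nrow), 0.0);
    for (Index j = 0; j < A.ncol; ++j) {
        const std::size_t col = static_cast<std::size_t>(j);
        for (Index p = A.colptr[col]; p < A.colptr[col + 1]; ++p) {
            const std::size_t k = static_cast<std::size_t>(p);
            y[static_cast<std::size_t>(A.rowind[k])] += A.values[k] * x[col];
        }
    }
    return y;
}
```

With this, CscMatrix<std::int32_t> and CscMatrix<std::int64_t> are
generated by the compiler from the same source, so there is no pair of
differently prefixed function families to keep in step across packages.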

As I started to look into the Rcpp package that provides interfaces
between R and C++ I came more and more to like that approach.  If you
have followed Rcpp at all, including the recent JSS article by Dirk
and Romain, and you have done any programming with the .Call
interface, you will realize that using Rcpp is much, much cleaner.
Also, C++ is a more natural language for numerical linear algebra than
is C because you really want to think in terms of objects and you can
handle all the messy details of intermediates cleanly.  I looked at
several different C++ numerical linear algebra systems including
Armadillo and Boost's uBLAS and even toyed with doing some of it
myself (the lme4a/src/MatrixNS.cpp file) but eventually discovered
Eigen (http://eigen.tuxfamily.org).  This prompted me to write the
RcppEigen package in cooperation with Romain and Dirk.  It has a lot
going for it - flexibility, expressivity and performance.  I recently
posted on R-devel showing a simple example in response to a question
about the .Call interface.
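
As a small illustration of "thinking in terms of objects", here is a
toy sketch in plain C++. It uses neither Eigen nor Rcpp; the Vec class
and the combine function are hypothetical and exist only to show how
intermediates are handled without explicit bookkeeping.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// A toy vector type (hypothetical; not part of Eigen or Rcpp).
class Vec {
public:
    explicit Vec(std::vector<double> data) : data_(std::move(data)) {}

    std::size_t size() const { return data_.size(); }
    double operator[](std::size_t i) const { return data_[i]; }

    // Elementwise sum; returns a new object and leaves both inputs alone.
    friend Vec operator+(const Vec& a, const Vec& b) {
        assert(a.size() == b.size());
        std::vector<double> out(a.size());
        for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] + b[i];
        return Vec(std::move(out));
    }

    // Scalar multiple; also returns a new object.
    friend Vec operator*(double s, const Vec& a) {
        std::vector<double> out(a.size());
        for (std::size_t i = 0; i < a.size(); ++i) out[i] = s * a[i];
        return Vec(std::move(out));
    }

private:
    std::vector<double> data_;
};

// The temporary holding 2.0 * x is created and released automatically
// when the expression finishes; no manual allocation or cleanup appears.
Vec combine(const Vec& x, const Vec& y) {
    return 2.0 * x + y;
}
```

In C code written against the .Call interface, the equivalent
intermediate would have to be allocated, protected and released by
hand. A library such as Eigen goes further than this toy and can avoid
materializing many such temporaries at all.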

I have a version of lme4 based on Eigen running and fitting LMMs.  I'm
currently working on the profiling and on GLMMs.  The good news is
that it performs well on small to medium sized problems.  The bad news
is that it doesn't perform well on large problems because Eigen does
not yet implement all the algorithms available in CHOLMOD.  Things
like SuperNodal sparse Cholesky decompositions may seem rather
esoteric but they are important in getting performance on large
problems.  I plan to upload lme4Eigen to the R-forge repository next
week.  Right now it will not compile on R-forge because it depends on
an unreleased RcppEigen which depends on an unreleased Rcpp and very
few packages are moving into CRAN as Kurt is on vacation.  When he
returns next week he will have over 200 packages already in the queue.
After the flood of CRAN updates Dirk will upload Rcpp_0.9.5 then
RcppEigen_0.1.2 and finally I will be able to make lme4Eigen
available.

> Dieter
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>