[R-sig-ME] multicore lmer: any changes since 2007?
Liaw, Andy
andy_liaw at merck.com
Wed Jul 6 19:51:55 CEST 2011
> From: r-sig-mixed-models-bounces at r-project.org
> [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Ben Bolker
> Sent: Wednesday, July 06, 2011 1:41 PM
> To: r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] multicore lmer: any changes since 2007?
>
> On 07/06/2011 11:40 AM, Mike Lawrence wrote:
> > Thanks for the detailed reply.
> >
> > According to the comment by Joshua Ulrich (one of the DEoptim
> > developers) on this stackoverflow post
> > (http://stackoverflow.com/questions/3759878/parallel-optimization-in-r),
> > it seems that DEoptim might be a parallelizable optimizer, and as I
> > recall you can put box constraints on parameters with DEoptim. I
> > just sent off an email to the DEoptim developers to see if there's
> > been any progress on the parallel front.
> >
> > Mike
>
> I have used differential evolution in the past (although not in
> decades (!!)), though not via the DEoptim package, but I don't think
> DEoptim() will be appropriate for this purpose. I'm actually not
> entirely clear on what DB means by "parallel evaluation of the
> objective function". In the simplest derivative-free case, for
> example the Nelder-Mead simplex, it's hard to see how one could
> evaluate the objective function in parallel, because each evaluation
> changes the structure of the simplex and determines where the next
> evaluation should be.
My recollection of how the NM simplex works is fuzzy, but isn't it the case that at each iteration you try to "flip" the simplex in all p directions and choose the one with the steepest descent? If so, that's p points at which you need to evaluate the objective function. Or is that only needed at the beginning?
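If so, a minimal sketch of that kind of step, using mclapply() from the parallel/multicore packages (the objective f, the step size, and the starting point below are all made up for illustration):

library(parallel)   # R >= 2.14; library(multicore) also provides mclapply

## toy objective: any function from R^p to R
f <- function(x) sum((x - 1)^2)

## p trial "flips" of the current point x0, one per coordinate direction
x0 <- c(0, 0, 0)
trials <- lapply(seq_along(x0), function(i) {
    x <- x0
    x[i] <- x[i] + 0.5   # hypothetical step in direction i
    x
})

## evaluate all p trial points in parallel, keep the best
## (mclapply needs mc.cores = 1 on Windows)
vals <- unlist(mclapply(trials, f, mc.cores = 2))
best <- trials[[which.min(vals)]]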
Andy
> A very quick look at BOBYQA (source code in the minqa package, or the
> formal description at
> <http://www.damtp.cam.ac.uk/user/na/NA_papers/NA2009_06.pdf>) suggests
> the same one-point-at-a-time updating scheme.
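> For reference, a box-constrained call through minqa looks something
> like the following (toy objective; the bounds are only illustrative):
>
> library(minqa)
> f <- function(x) sum((x - 1)^2)            # toy objective
> fit <- bobyqa(par = c(0.5, 0.5), fn = f,
>               lower = c(0, 0), upper = c(10, 10))
> fit$par
>
> Each evaluation of fn happens one at a time inside the algorithm, so
> there is no obvious place to fan evaluations out to multiple cores.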
>
> But DB says
>
> >> In many cases you know several points where you will be evaluating
> >> the objective so you could split those off into different threads.
>
> Since he has (probably literally) forgotten more about numerical
> computation than I ever knew, he's probably right, but I don't know
> of such examples.
>
> Interesting discussion ...
>
>
> >
> > On Wed, Jul 6, 2011 at 11:59 AM, Douglas Bates <bates at stat.wisc.edu> wrote:
> >> On Tue, Jul 5, 2011 at 4:52 PM, Mike Lawrence <Mike.Lawrence at dal.ca> wrote:
> >>> Back in 2007
> >>> (http://tolstoy.newcastle.edu.au/R/e2/help/07/05/17777.html)
> >>> Dr. Bates suggested that using a multithreaded BLAS was the only
> >>> option for speeding up lmer computations on multicore machines
> >>> (and even then, it might cause a slowdown under some
> >>> circumstances).
> >>>
> >>> Is this advice still current, or have other means of speeding up
> >>> lmer computations on multicore machines arisen in more recent years?
> >>
> >> As always, the problem with trying to parallelize a particular
> >> calculation is to determine how and when to start more than one
> >> thread.
> >>
> >> After the setup stage, the calculations in fitting an lmer model
> >> involve optimizing the profiled deviance or profiled REML criterion.
> >> Each evaluation of the criterion involves updating the components
> >> of the relative covariance factor, updating the sparse Cholesky
> >> decomposition, and solving a couple of systems of equations
> >> involving the sparse Cholesky factor.
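> >> To make that concrete, here is a minimal sketch of that
> >> per-evaluation pattern using the Matrix package directly (the
> >> matrix A below is a stand-in, not lme4's actual internal
> >> representation):
> >>
> >> library(Matrix)
> >> ## a sparse, symmetric, positive-definite stand-in matrix
> >> A <- crossprod(rsparsematrix(100, 50, density = 0.1)) + Diagonal(50)
> >> L <- Cholesky(A, super = FALSE)   # simplicial sparse factorization
> >> ## at each new parameter value: refactor with the same sparsity
> >> ## pattern (here A + 1*I stands in for the updated matrix) ...
> >> L <- update(L, A, mult = 1)
> >> ## ... then solve linear systems with the updated factor
> >> x <- solve(L, rnorm(50))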
> >>
> >> There are a couple of calculations involving dense matrices but in
> >> most cases the time spent on them is negligible relative to the
> >> calculations involving the sparse matrices.
> >>
> >> A multithreaded BLAS will only help speed up the calculations on
> >> dense matrices. The "supernodal" form of the Cholesky factorization
> >> can use the BLAS for some calculations, but usually on small
> >> blocks. Most of the time the software chooses the "simplicial" form
> >> of the factorization, because the supernodal form would not be
> >> efficient, and the simplicial form doesn't use the BLAS at all.
> >> Even if the supernodal form is chosen, the block sizes are usually
> >> small, and a multithreaded BLAS can actually slow down operations
> >> on small blocks because the communication and synchronization
> >> overhead cancels out any gain from using multiple cores.
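> >> One way to check this empirically is to pin the BLAS thread count
> >> and time the same fit twice. This sketch assumes the RhpcBLASctl
> >> package is available; with other BLAS builds, an environment
> >> variable such as OMP_NUM_THREADS plays the same role:
> >>
> >> library(RhpcBLASctl)
> >> library(lme4)
> >> blas_set_num_threads(1)
> >> t1 <- system.time(lmer(Reaction ~ Days + (Days | Subject),
> >>                        data = sleepstudy))
> >> blas_set_num_threads(4)
> >> t4 <- system.time(lmer(Reaction ~ Days + (Days | Subject),
> >>                        data = sleepstudy))
> >> rbind(one_thread = t1, four_threads = t4)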
> >>
> >> Of course, your mileage may vary, and only by profiling both the R
> >> code and the compiled code will you be able to determine how things
> >> could be sped up.
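> >> For the R side, the base profiler gives a first look (the fit below
> >> is just an example); the compiled C code needs an external profiler
> >> such as oprofile or gprof, which Rprof cannot see into:
> >>
> >> Rprof("lmer-prof.out")
> >> fit <- lme4::lmer(Reaction ~ Days + (Days | Subject),
> >>                   data = lme4::sleepstudy)
> >> Rprof(NULL)
> >> head(summaryRprof("lmer-prof.out")$by.self)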
> >>
> >> If I had to guess, I would say that the best hope for parallelizing
> >> the computation would be to find an optimizer that allows for
> >> parallel evaluation of the objective function. The lme4 package
> >> requires optimization of a nonlinear objective subject to "box
> >> constraints" (meaning that some of the parameters can have upper
> >> and/or lower bounds). Actually it is simpler than that: some of the
> >> parameters must be positive. We do not provide gradient
> >> evaluations. I once* worked out the gradient of the criterion (I
> >> think it was the best mathematics I ever did) and then found that
> >> it ended up slowing the optimization to a crawl in the difficult
> >> cases. A bit of reflection showed that each evaluation of the
> >> gradient could be hundreds or thousands of times more costly than
> >> an evaluation of the objective itself, so you might as well use a
> >> gradient-free method and just do more function evaluations. In many
> >> cases you know several points where you will be evaluating the
> >> objective, so you could split those off into different threads.
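> >> As a sketch of what a population-based, box-constrained optimizer
> >> looks like from the user's side (DEoptim here, with a stand-in
> >> dev() in place of the real profiled deviance, and no claim that
> >> DEoptim evaluates its population in parallel):
> >>
> >> library(DEoptim)
> >> ## stand-in for the profiled deviance as a function of the
> >> ## variance parameters theta, with the box constraint theta >= 0
> >> dev <- function(theta) sum((theta - c(0.5, 1.5))^2)
> >> fit <- DEoptim(dev, lower = c(0, 0), upper = c(5, 5),
> >>                control = DEoptim.control(itermax = 50, trace = FALSE))
> >> fit$optim$bestmem
> >>
> >> Each DE generation produces a whole population of candidate points
> >> whose objective values are independent of one another, which is
> >> exactly the "several points known in advance" situation described
> >> above.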
> >>
> >> I don't know of such a multithreaded optimizer (many of the
> >> optimizers that I find are still being written in Fortran 77, God
> >> help us), but that would be my best bet if one could be found.
> >> However, I am saying this without having done the profiling of the
> >> calculation myself, so that is still a guess.
> >>
> >> * Bates and DebRoy, "Linear mixed models and penalized least
> >> squares", Journal of Multivariate Analysis 91 (2004), 1-17