[R-sig-ME] nlme and NONMEM

Fri Nov 2 00:34:12 CET 2007

Hi All,

This thread reminds me of an experience using nlme about 10 years ago. I was
remodeling a previously analyzed (and published) pharmacokinetic data set on
the drug remifentanil; NONMEM had been used to estimate a 3 compartment (6
parameter) model. The data included multiple plasma concentration values for
each subject.

Using nlme, no convergence was possible for a three compartment model
despite various choices of the control language and covariance structure. A
two compartment model converged.

Doug provided very useful tips to me at the time. For example, a visual
inspection of the raw data (time course of remifentanil concentration decay)
revealed only one inflection point in the decay curves for most subjects,
whereas two inflection points would be consistent with a three compartment
model. The data was not sufficient to fit a three compartment model.

I have never used NONMEM. Speaking to associates using NONMEM in the 90s,
they assured me that NONMEM could always be tweaked to converge. This was
considered a virtue.

This is an example NONMEM allowing overparametrized models.

Nathan
-- 
Nathan Leon Pace, MD, MStat
University of Utah
n.l.pace at utah.edu
W: 801.581.6393
F: 801.581.4367
M: 801.205.1019

> From: Douglas Bates <bates at stat.wisc.edu>
> Date: Thu, 1 Nov 2007 16:42:29 -0500
> To: Rob Forsyth <r.j.forsyth at newcastle.ac.uk>
> Cc: <r-sig-mixed-models at r-project.org>
> Subject: Re: [R-sig-ME] nlme and NONMEM
> 
> On 11/1/07, Rob Forsyth <r.j.forsyth at newcastle.ac.uk> wrote:
>> I'd appreciate hearing from anyone (off list if you think it more
>> appropriate) who can share their comparative experiences of non-
>> linear mixed effects modelling with both nlme and NONMEM. The latter
>> appears the traditional tool of choice particularly in pharmacology.
>> Having built up some familiarity with nlme I am now collaborating (on
>> a non-pharmacological project) with someone strongly encouraging me
>> to move to NONMEM, although that clearly represents another
>> considerable learning curve. The main argument in favour is the
>> relative difficulty I have had in getting convergence with nlme
>> models in my relatively sparse datasets particularly when (as in my
>> case) I am interested in the random effects covariance matrix and
>> wish to avoid having to coerce it using pdDiag().
> 
>> I note the following comment from Douglas Bates on the R-help archive
> 
>>> The nonlinear optimization codes used by S-PLUS and R are different.
>>> There are advantages to the code used in R relative to the code used
>>> in S-PLUS but there are also disadvantages. One of the disadvantages
>>> is that the code in R will try very large steps during its initial
>>> exploration phase then it gets trapped in remote regions of the
>>> parameter space. For nlme this means that the estimate of the
>>> variance-covariance matrix of the random effects becomes singular.
> 
>>> Recent versions of the nlme library for R have a subdirectory called
>>> scripts that contains R scripts for the examples from each of the
>>> chapters in our book. If you check them you will see that not all of
>>> the nonlinear examples work in the R version of nlme. We plan to
>>> modify the choice of starting estimates and the internal algorithms to
>>> improve this but it is a long and laborious process. I ask for your
>>> patience.
> 
>> Can Doug or anyone comment on whether the development work on
>> lme4:::nlmer has included any steps in this direction or not?
> 
> Yes.
> 
> The algorithm in nlme alternates between solving a linear
> mixed-effects problem to update estimates of the variance components
> and solving a penalized nonlinear least squares problem to update
> estimates of the fixed-effects parameters and our approximation to the
> conditional distribution of the random effects.  This type of
> algorithm that alternates between two conditional optimizations is
> appealing because each of the sub-problems is much simpler than the
> general problem.  However it may have poor convergence properties.  In
> particular it may end up bouncing back and forth between two different
> conditional optima.
> 
> Also, at the time we wrote nlme we tried to remove the constraints on
> the variance components by transforming them away (In simple
> situations we iterate on the logarithm of the relative variances of
> the random effects.)  This works well except when the estimate of the
> variance component is zero.  Trying to reach zero when iterating on
> the logarithm scale can lead to very flat likelihood surfaces.
> 
> In the nlmer function I use the same parameterization of the
> variance-covariance of the random effects as in lmer and use the
> Laplace approximation to the log-likelihood.  Both of these changes
> should provide more reliable convergence, although the nlmer code has
> not been vetted to nearly the same extent as has the nlme code.  In
> other words, I am confident that the algorithm is superior but the
> implementation may still need some work.
> 
> Regarding NONMEM, I think the work Jose Pinheiro and I did on nlme and
> my current work on lme4 is based on a different philosophy than is the
> basis of NONMEM.  As I have mentioned on this and other forums (fora?)
> I want to be confident that the results from the code that I write
> actually do represent an optimum of the objective function (such as
> the likelihood or log-likelihood).  Nonlinear mixed-effects models for
> sparse data frequently end up being over-parameterized. In such cases
> I view it as a feature and not a bug that nlme or nlmer will indicate
> failure to converge.  They may also fail to converge when there is a
> well-defined optimum.  That behavior is not a feature.
> 
> As I understand it from people who have used NONMEM (I once had access
> to a copy of NONMEM but was never successful in getting it to run and
> haven't tried since then) it will produce estimates just about every
> time it is run.  Considering how ill-defined the parameter estimates
> in some nonlinear mixed-effects model fits can be, I don't view this
> as a feature.
> 
> Many people feel that statistical techniques and statistical software
> are some sort of magic that can extract information from data, even
> when the information is not there.  As I understand it from
> conversations many years ago with Lewis Sheiner, his motivation in
> developing NONMEM (with Stu Beal) was to be able to use routine
> clinical data (such as the Quinidine data in the nlme package) to
> estimate population pharmacokinetic parameters.
> 
> Routine clinical data like these are very sparse. In the Quinidine
> example the majority of subjects have 1, 2 or 3 concentration
> measurements
> 
>> table(table(subset(Quinidine, !is.na(Quinidine$conc))$Subject))
> 
>  1  2  3  4  5  6  7 10 11
> 46 33 31  9  3  8  2  1  3
> 
> and frequently these measurements are at widely spaced time points
> relative to the dosing schedule.  Such cases contribute almost no
> information to the parameter estimates, yet I have had pharmacologists
> suggest to me that it would be wonderful to use study designs in which
> each patient has only one concentration measurement and somehow the
> magic of nonlinear mixed effects will conjure estimates from such
> data.
> 
> The real world doesn't work like that.  If you have only one
> observation per person it should make sense that no amount of
> statistical magic will be able to separate the per-observation noise
> from the per-person variability.
> 
> So when I am told that NONMEM converged to parameter estimates on a
> problem where nlme or nlmer failed to converge I think (and sometimes
> say) "You mean NONMEM *declared* convergence to a set of estimates".
> Declaring convergence and converging can be different.
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models