# [R] question about model formula

Peter Dalgaard BSA p.dalgaard at biostat.ku.dk
Thu Mar 6 23:16:25 CET 2003

Setzer.Woodrow at epamail.epa.gov writes:

> I did not know about stepsize being an issue.  I had thought that
> problems with convergence in this case were due to bad approximations of
> the finite difference gradient.  I guessed that around the optimum,
> numerical errors would come to dominate the gradient calculations,
> causing convergence to fail.

In the case we looked at, it was quite obvious: nls simply ground
to a halt quite far from the optimum, and plotting the SSD as a
function of the parameters showed a nice overall parabola, but with
points of discontinuity. That was without supplying gradients, though,
and as you point out, supplying them probably improves the behaviour,
at least until the "endgame", where you might find that the computed
gradient is not zero at the optimum, or is zero away from it.
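[Editor's note: a minimal sketch of the step-size trade-off discussed above, using a made-up smooth objective in Python for self-containment; this is not the nls internals, just the general forward-difference behaviour.]

```python
# Assumption: a toy smooth objective, only to show the forward-difference
# trade-off: truncation error shrinks like h, rounding error grows like eps/h.
def f(x):
    return x * x                      # exact derivative at x is 2*x

def fd_grad(x, h):
    return (f(x + h) - f(x)) / h      # forward-difference approximation

x0 = 1.0
err = {h: abs(fd_grad(x0, h) - 2.0 * x0) for h in (1e-4, 1e-8, 1e-12)}
# neither the largest nor the smallest step wins: truncation dominates
# at h = 1e-4, cancellation/rounding dominates at h = 1e-12
```

Near an optimum the true gradient is small, so the rounding-dominated regime is exactly where the optimizer ends up working.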

> I've found that the problem of confusing gradient-based optimizers can
> sometimes be fixed by augmenting the original system of ODEs with
> what I believe engineers call the sensitivity equations.  If your
> original equation is
>
> dg/dt = f(t, g, b), where b is a parameter to be estimated,
>
> then include
>
> d/dt (dg/db) = (df/dg)(dg/db) + df/db (be sure to remember the chain rule when doing this!)
>
> with the original equation.  With some regularity assumptions, this
> integrates to dg/db, which can be used to give nls a gradient to work
> with.

Yep, I know the implicit differentiation trick - even got an old paper in
Biometrics using essentially that in a PDE setting. We should probably
try it with our odesolve example (and in general there were a lot of
potential tuning knobs that we didn't touch).

--
```
O__  ---- Peter Dalgaard             Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
(*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
```