[R] another optimization question

Prof Brian D Ripley ripley at stats.ox.ac.uk
Sun Nov 25 09:26:23 CET 2001


On Sat, 24 Nov 2001, John Fox wrote:

> Dear R list members,
>
> Since today seems to be the day for optimization questions, I have one that
> has been puzzling me:
>
> I've been doing some work on sem, my structural-equation modelling package.
> The models that the sem function in this package fits are essentially
> parametrizations of the multinormal distribution.  The function uses optim
> and nlm sequentially to maximize a multinormal likelihood. One of the
> changes I've introduced is to use an analytic gradient rather than rely on
> numerical derivatives. (If I can figure it out, I'd like to use an analytic
> Hessian as well.)
>
> I could provide additional details, but the question that I have is
> straightforward. I expected that using an analytic gradient would make the
> program faster and more stable. It *is* substantially faster, by up to an
> order of magnitude on the problems that I've tried. In one case, however, a
> model that converged (to the published solution) with numerical derivatives
> failed to converge with analytic derivatives. I can program around the
> problem, by having the program fall back to numerical derivatives when
> convergence fails, but I was surprised by this result, and I'm concerned
> that it reflects a programming problem or an error in my math. I suspect
> that if I had made such an error, however, the other examples I tried would
> not have worked so well.
>
> So, my question is, is it possible in principle for an optimization to fail
> using a correct analytic gradient but to converge with a numerical
> gradient? If this is possible, is it a common occurrence?

It's possible but rare.  You don't have a `correct analytic gradient', but
a numerical computation of it.  Inaccurately computed gradients are a
common cause of convergence problems. You may need to adjust the
tolerances.
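
One way to act on this is to compare the coded gradient with a central finite difference at the starting values, then rerun with a tighter tolerance. The following is only a minimal sketch: the normal likelihood, the names negll, negll_grad and num_grad, the step h and the reltol value are all illustrative assumptions, not taken from the sem package.

```r
# Hypothetical objective: negative log-likelihood of N(mu, sigma^2),
# parametrized as par = c(mu, log(sigma)). Not the sem likelihood.
set.seed(1)
y <- rnorm(50, mean = 2, sd = 1.5)

negll <- function(par) {
  mu <- par[1]
  sigma <- exp(par[2])
  -sum(dnorm(y, mean = mu, sd = sigma, log = TRUE))
}

# Analytic gradient with respect to mu and log(sigma)
negll_grad <- function(par) {
  mu <- par[1]
  sigma <- exp(par[2])
  r <- y - mu
  c(-sum(r) / sigma^2,
    length(y) - sum(r^2) / sigma^2)
}

# Central finite-difference gradient; the step h is an assumed value
num_grad <- function(f, par, h = 1e-6) {
  vapply(seq_along(par), function(i) {
    step <- replace(numeric(length(par)), i, h)
    (f(par + step) - f(par - step)) / (2 * h)
  }, numeric(1))
}

p0 <- c(0, 0)
max_abs_diff <- max(abs(negll_grad(p0) - num_grad(negll, p0)))

# Rerun with a tighter relative tolerance than the optim default
fit <- optim(p0, negll, gr = negll_grad, method = "BFGS",
             control = list(reltol = 1e-10, maxit = 500))
```

A large max_abs_diff relative to the size of the gradient would point to an error or a loss of accuracy in the coded derivative. A small one suggests looking at the tolerances or the path taken instead.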

It's also possible in principle that the optimizer takes a completely
different path from the starting point due to small differences in
calculated derivatives.  It's worth trying starting near the expected
answer to rule this out.
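
A rough sketch of that check, using the Rosenbrock function as a stand-in for the real likelihood (the function, the offset from the known minimum and the names below are assumptions for illustration only): start both variants close to the expected answer and see whether they agree.

```r
# Stand-in objective with a known minimum at c(1, 1)
fr <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2

# Its analytic gradient
grr <- function(x) {
  c(-400 * x[1] * (x[2] - x[1]^2) - 2 * (1 - x[1]),
    200 * (x[2] - x[1]^2))
}

expected <- c(1, 1)                 # e.g. the published solution
start <- expected + c(-0.05, 0.05)  # assumed small perturbation

# Same method and start; only the source of the gradient differs
fit_analytic <- optim(start, fr, gr = grr, method = "BFGS")
fit_numeric <- optim(start, fr, method = "BFGS")  # finite differences

err_analytic <- max(abs(fit_analytic$par - expected))
err_numeric <- max(abs(fit_numeric$par - expected))
```

If both runs converge from a nearby start but the analytic run fails from the original start, the difference is more likely due to the path taken than to an error in the derivative.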

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



