[Rd] on the optim function
Christophe Dutang
dutangc at gmail.com
Fri Aug 6 17:25:02 CEST 2010
Thanks, Ravi, for your answer. You've convinced me!
Regards
Christophe
On 6 Aug 2010, at 16:48, Ravi Varadhan wrote:
> Christophe,
>
> This is a good question. By convention, most classical optimization methods
> (e.g. gradient-descent, quasi-Newton, and Newton-type) do not return an
> "iteration" counter, but they do return the number of times the gradient is
> computed. The two are equivalent in the sense that the gradient is evaluated
> exactly once per iteration. The optimization codes do, however, always return
> the number of times the objective function is evaluated, which is widely
> considered the most appropriate measure of the algorithm's computational
> effort. The reason is this: the actual step length taken at each iteration
> depends on a "safe-guard" rule such as a line search or a trust-region
> approach. These safe-guard rules may evaluate the function several times to
> ensure that the step is "safe", for example, that it actually reduces the
> function value. Therefore, monitoring the number of iterations alone does
> not give an accurate picture of the computational effort expended by the
> algorithm.
>
> Another point worth noting: when the user does not supply an analytic
> gradient function, most (if not all) gradient-based algorithms approximate
> the gradient numerically, typically with a first-order forward-difference
> scheme. As sketched below, this requires `p' additional evaluations of the
> objective function per gradient when the parameter vector is p-dimensional,
> which increases the computational effort far more than the safe-guarding
> does. It is therefore always a good idea to supply an analytic gradient
> whenever possible.
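>
> As a rough illustration (this is just a sketch of the idea, not optim's
> internal code), a forward-difference gradient costs one extra evaluation of
> the objective per coordinate:
>
> ## forward-difference approximation of the gradient of f at x:
> ## p extra calls to f for a p-dimensional x, on top of f(x) itself
> num.grad <- function(f, x, eps = 1e-7) {
>     f0 <- f(x)
>     sapply(seq_along(x), function(i) {
>         xi <- x
>         xi[i] <- xi[i] + eps
>         (f(xi) - f0) / eps
>     })
> }
> num.grad(function(x) sum(x^2), c(1, 2))   ## approximately c(2, 4)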
>
> Here is an example (based on the one from optim's help page) that
> illustrates the equivalence of the iteration count and the gradient count:
>
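> For reference, fr and grr below are the Rosenbrock "banana" function and its
> gradient, as defined on optim's help page:
>
> fr <- function(x) {   ## Rosenbrock Banana function
>     x1 <- x[1]
>     x2 <- x[2]
>     100 * (x2 - x1 * x1)^2 + (1 - x1)^2
> }
> grr <- function(x) {  ## Gradient of 'fr'
>     x1 <- x[1]
>     x2 <- x[2]
>     c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
>        200 *      (x2 - x1 * x1))
> }
>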
>> optim(c(-1.2,1), fr, grr, method="CG", control=list(maxit=50))
> $par
> [1] -0.9083598 0.8345932
>
> $value
> [1] 3.636803
>
> $counts
> function gradient 
>      202       51 
>
> $convergence
> [1] 1
>
> $message
> NULL
>
>
>> optim(c(-1.2,1), fr, method="CG", control=list(maxit=75))$counts
> function gradient 
>      304       76 
>>
> Note that when I specified `maxit = 50', I got a gradient count of 51, and
> when I specified `maxit = 75', I got a gradient count of 76.
>
> But look at the difference in function counts: there are 102 more function
> evaluations (304 versus 202), even though I ran only 25 more iterations.
> This tells you that the function is evaluated more than once per iteration.
>
> The same holds for BFGS and L-BFGS-B; you can check it for yourself.
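>
> So, as a practical answer to your question: for these gradient-based
> methods, the gradient count in $counts effectively plays the role of the
> missing iteration counter. A minimal sketch (assuming, as the CG runs above
> suggest, one gradient evaluation per iteration plus the initial one):
>
> res <- optim(c(-1.2, 1), fr, grr, method = "BFGS")
> res$counts[["gradient"]] - 1   ## rough iteration count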
>
>
> Hope this helps,
> Ravi.
>
>
> -----Original Message-----
> From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org]
> On Behalf Of Christophe Dutang
> Sent: Friday, August 06, 2010 5:15 AM
> To: r-devel at r-project.org
> Subject: [Rd] on the optim function
>
> Dear useRs,
>
> I have just discovered that the R optim function does not return the number
> of iterations.
>
> I wonder why, at lines 632-634 of the optim C source, the iter variable is
> not returned (for the BFGS method, for example)?
>
> Is there any trick to compute the iteration count from the number of
> function calls?
>
> Kind regards
>
> Christophe
>
> --
> Christophe Dutang
> Ph.D. student at ISFA, Lyon, France
> website: http://dutangc.free.fr
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
--
Christophe Dutang
Ph.D. student at ISFA, Lyon, France
website: http://dutangc.free.fr