[Rd] on the optim function
Christophe Dutang
dutangc at gmail.com
Fri Aug 6 17:25:02 CEST 2010
Thanks, Ravi, for your answer. You've convinced me!
Regards
Christophe
On 6 Aug 2010, at 16:48, Ravi Varadhan wrote:
> Christophe,
>
> This is a good question. By convention, most classical optimization methods
> (e.g. gradient-descent, quasi-Newton, and Newton-type) do not return an
> "iteration" counter, but they do return the number of times the gradient is
> computed. The two are equivalent in the sense that the gradient is evaluated
> exactly once per iteration. The optimization codes do, however, always return
> the number of times the objective function is evaluated, which is widely
> considered the most appropriate measure of the algorithm's computational
> effort. The reason is this: the actual step length taken at each iteration
> depends on a "safe-guard" rule such as a line search or a trust-region
> approach. These safe-guard rules may evaluate the function several times to
> ensure that the step is "safe", for example, that it actually reduces the
> function value. Therefore, monitoring the number of iterations alone does
> not give an accurate picture of the computational effort expended by the
> algorithm.
>
> Another point worth noting: when the user does not supply an analytic
> gradient function, most (if not all) gradient-based algorithms approximate
> the gradient numerically, typically with a first-order forward-difference
> scheme. As sketched below, this requires `p' additional evaluations of the
> objective function per gradient when the parameter vector is p-dimensional,
> which increases the computational effort far more than the safe-guarding
> does. It is therefore always a good idea to supply an analytic gradient
> whenever possible.
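>
> As a rough illustration (this is just a sketch of the idea, not optim's
> internal code), a forward-difference gradient costs one extra evaluation of
> the objective per coordinate:
>
> ## forward-difference approximation of the gradient of f at x:
> ## p extra calls to f for a p-dimensional x, on top of f(x) itself
> num.grad <- function(f, x, eps = 1e-7) {
>     f0 <- f(x)
>     sapply(seq_along(x), function(i) {
>         xi <- x
>         xi[i] <- xi[i] + eps
>         (f(xi) - f0) / eps
>     })
> }
> num.grad(function(x) sum(x^2), c(1, 2))   ## approximately c(2, 4)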
>
> Here is an example (based on the one from optim's help page) that
> illustrates the equivalence of the iteration count and the gradient count:
>
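> For reference, fr and grr below are the Rosenbrock "banana" function and its
> gradient, as defined on optim's help page:
>
> fr <- function(x) {   ## Rosenbrock Banana function
>     x1 <- x[1]
>     x2 <- x[2]
>     100 * (x2 - x1 * x1)^2 + (1 - x1)^2
> }
> grr <- function(x) {  ## Gradient of 'fr'
>     x1 <- x[1]
>     x2 <- x[2]
>     c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
>        200 *      (x2 - x1 * x1))
> }
>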
>> optim(c(-1.2,1), fr, grr, method="CG", control=list(maxit=50))
> $par
> [1] -0.9083598 0.8345932
>
> $value
> [1] 3.636803
>
> $counts
> function gradient 
>      202       51 
>
> $convergence
> [1] 1
>
> $message
> NULL
>
>
>> optim(c(-1.2,1), fr, method="CG", control=list(maxit=75))$counts
> function gradient 
>      304       76 
>>
> Note that when I specified `maxit = 50', I got a gradient count of 51, and
> when I specified `maxit = 75', I got a gradient count of 76.
>
> But look at the difference in function counts: there are 102 more function
> evaluations (304 versus 202), even though I ran only 25 more iterations.
> This tells you that the function is evaluated more than once per iteration.
>
> The same holds for BFGS and L-BFGS-B; you can check it for yourself.
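>
> So, as a practical answer to your question: for these gradient-based
> methods, the gradient count in $counts effectively plays the role of the
> missing iteration counter. A minimal sketch (assuming, as the CG runs above
> suggest, one gradient evaluation per iteration plus the initial one):
>
> res <- optim(c(-1.2, 1), fr, grr, method = "BFGS")
> res$counts[["gradient"]] - 1   ## rough iteration count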
>
>
> Hope this helps,
> Ravi.
>
>
> -----Original Message-----
> From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org]
> On Behalf Of Christophe Dutang
> Sent: Friday, August 06, 2010 5:15 AM
> To: r-devel at r-project.org
> Subject: [Rd] on the optim function
>
> Dear useRs,
>
> I have just discovered that the R optim function does not return the number
> of iterations.
>
> I wonder why, at lines 632-634 of the optim C source, the iter variable is
> not returned (for the BFGS method, for example)?
>
> Is there any trick to compute the iteration count from the number of
> function calls?
>
> Kind regards
>
> Christophe
>
> --
> Christophe Dutang
> Ph.D. student at ISFA, Lyon, France
> website: http://dutangc.free.fr
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
--
Christophe Dutang
Ph.D. student at ISFA, Lyon, France
website: http://dutangc.free.fr