[R] optim

Thu Aug 4 12:24:41 CEST 2005

nwew wrote:
> Dear R-helpers,
> 
> The function optim implements algorithms that I would like to use.
> 
> 
> I have function implemented in R, which given the parameters of which 
> minimization is to take place returns a scalar as well as the gradient.
> 
> Unfortunately optim requires two function _fn_ and _gr_ where fn returns the 
> function value and gr the gradient. Splitting my function in two functions 
> would be easy, however I am wondering if evaluating both is not doubling the 
> the very high computational costs. Most of the computational intensive 
> operations are identical if computing the function value and gradient.
> 
> Question: is there a way to tweek optim that only one function evaluation is 
> necessary? Are ther other implementations of the algorithm, which do assume 
> that the function to be minimized returns the function value and the gradient 
> as well?
> 
> Thanks
> Eryk.
> 

Hi, Eryk,

?optim does not *require* the "gr" argument. If one is not supplied, 
numerical gradients are used for which optim evaluates "fn" twice. 
However, in many cases analytical gradients can both improve numerical 
accuracy and computational speed.

Trivial example:

fn <- function(beta) {
   sum((y - x %*% beta)^2)
}

gr <- function(beta) {
   colSums(-2 * x * drop((y - x %*% beta)))
}

set.seed(1)
n <- 10000
p <- 5
g <- factor(rep(LETTERS[1:p], each = n/p))
x <- model.matrix(~g)
beta <- rnorm(p)
y <- drop(x %*% beta + rnorm(n))
start <- rep(0, p)

system.time(f1 <- optim(start, fn, gr, hessian = TRUE))
# [1] 0.47 0.00 0.48   NA   NA

system.time(f2 <- optim(start, fn, hessian = TRUE))
# [1] 0.54 0.00 0.53   NA   NA

f1$par
#[1] -0.6408643  0.2128109 -0.8378791  1.5983054  0.3366216

f2$par
#[1] -0.6408643  0.2128109 -0.8378791  1.5983054  0.3366216

f1$hessian
#      [,1] [,2] [,3] [,4] [,5]
#[1,] 20000 4000 4000 4000 4000
#[2,]  4000 4000    0    0    0
#[3,]  4000    0 4000    0    0
#[4,]  4000    0    0 4000    0
#[5,]  4000    0    0    0 4000

f2$hessian
#      [,1]         [,2] [,3]         [,4] [,5]
#[1,] 20000 4.000000e+03 4000 4.000000e+03 4000
#[2,]  4000 4.000000e+03    0 2.273737e-07    0
#[3,]  4000 0.000000e+00 4000 0.000000e+00    0
#[4,]  4000 2.273737e-07    0 4.000000e+03    0
#[5,]  4000 0.000000e+00    0 0.000000e+00 4000
#

Does this answer your question? Furthermore, have you read the chapter 
in MASS that discusses optim?

HTH,

--sundar