[R] How to determine sensible values for 'fnscale' and 'parscale' in optim

Fri May 16 15:04:59 CEST 2008

Dear R-help,

I'm using the 'optim' function to minimise functions, and have read the
documentation, but I'm still not sure how to determine sensible values to
use for the 'fnscale' and 'parscale' options.

If I have understood everything correctly, 'fnscale' should be used to scale
the objective function, so that for example if the default is 'sensible'
(or even 'optimal') for minimising 'f', one should use 'fnscale=1e-6' for
minimizing the function 'function(...) 1e-6 * f(...)'.
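
To make my understanding concrete, here is a minimal sketch. The test
function 'f' and its minimum at (1, 2) are made up for illustration; the
only thing taken from the documentation is that optimisation is performed
on 'fn(par)/fnscale'. If I have this right, both calls should end up at
essentially the same parameter values:

```r
# Hypothetical test function with minimum value 3 at (1, 2).
f <- function(x) sum((x - c(1, 2))^2) + 3
# The same function scaled down by a factor of 1e-6.
g <- function(x) 1e-6 * f(x)

# Default scaling for 'f' (default method, Nelder-Mead).
r1 <- optim(c(0, 0), f)
# 'fnscale=1e-6' undoes the scaling of 'g': optim works on g(x)/1e-6.
r2 <- optim(c(0, 0), g, control = list(fnscale = 1e-6))

r1$par
r2$par
# 'value' is reported on the scale of the function that was supplied.
r1$value
r2$value
```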

But in which range of numbers should 'f' lie for the default 'fnscale' to
be reasonable (with other options, such as 'reltol', at their defaults)?
I understand that if 'f' takes values around, e.g., 1e-10 (at least for
parameter values close to the optimal ones), I need to use 'fnscale'. But how
much should I scale?
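
For reference, the documentation of 'reltol' says that the algorithm stops
if it is unable to reduce the value by a factor of
'reltol * (abs(val) + reltol)' at a step. The sketch below only evaluates
that expression for two magnitudes of 'val' (the helper name 'threshold' is
mine, not part of optim). It suggests that the default tolerance becomes
much looser, relative to 'f', once 'f' is far smaller than 'reltol':

```r
# Default 'reltol' according to ?optim (about 1.5e-8).
reltol <- sqrt(.Machine$double.eps)

# Hypothetical helper: the quantity quoted in the 'reltol' documentation.
threshold <- function(val) reltol * (abs(val) + reltol)

# Threshold relative to the size of the function value itself.
rel1 <- threshold(1) / 1          # close to reltol itself
rel2 <- threshold(1e-10) / 1e-10  # roughly 150 times larger
c(rel1, rel2)
```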

The same applies to 'parscale'. How do I determine reasonable values?
To make the question a bit less theoretical, how would one go about
choosing good values of 'fnscale' and 'parscale' to use when finding,
for example, the MLEs of a bivariate normal distribution using optim?
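
As I read the documentation, optimisation is performed on 'par/parscale',
and the scaled parameters should be comparable in the sense that a unit
change in any of them produces about a unit change in the scaled function
value. Here is a minimal sketch of what I think this means, using a made-up
function 'h' whose two parameters differ in magnitude by a factor of 1e4
(the function, the starting values and the 'parscale' values are my own
assumptions, not recommendations):

```r
# Hypothetical badly scaled function with its minimum at (1, 2e4).
h <- function(p) (p[1] - 1)^2 + ((p[2] - 2e4) / 1e4)^2

start <- c(0, 1e4)

# Without 'parscale': both parameters are treated as being on the same scale.
r_raw <- optim(start, h, method = "BFGS")
# With 'parscale': optim works internally on p[1]/1 and p[2]/1e4.
r_scaled <- optim(start, h, method = "BFGS",
                  control = list(parscale = c(1, 1e4)))

r_raw$par
r_scaled$par
r_scaled$counts
```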

Here's code for this example:

-----------------------------------------------
library(MASS) # needed for mvrnorm
library(mvtnorm) # needed for dmvnorm

set.seed(20080516)
n=1000
mu1=3
mu2=5
sig1=7
sig2=20
rho=.5
sigmat=matrix(c(sig1^2,sig1*sig2*rho,sig1*sig2*rho,sig2^2),2)
xy=mvrnorm(n,c(mu1,mu2),sigmat) # n = 1000 observations from this
                                # distribution.

obj=function(par,xy) # The function to maximize.
{
mu=par[1:2]
sigmat=matrix(c(par[3]^2,par[3]*par[4]*par[5],par[3]*par[4]*par[5],par[4]^2),2)
mean(dmvnorm(xy, mu, sigmat, log=TRUE))
}

# Using optim to find the MLEs.
optim( c(5,5,10,10,.5), obj, control=list(fnscale=-1), xy=xy)

# We could of course also calculate the MLEs directly.
colMeans(xy)
apply(xy, 2, sd)*sqrt(1-1/n) # column-wise SDs, rescaled to the MLE
cor(xy)
-----------------------------------------------

Here optim converges to (approximately) the correct values, even with 
not very good initial values (though with method="CG" we do not get 
convergence without increasing maxit). But how should one choose 'fnscale'
and 'parscale' for faster or better convergence?

-- 
Karl Ove Hufthammer