# [R] How to improve the "OPTIM" results

kathie kathryn.lord2000 at gmail.com
Sun Apr 6 04:47:58 CEST 2008

Dear Spencer Graves,

I knew that the true optimum values are about (0.69, 0.0, 0.0, -0.3, 0.3,
2.3, -3.0, -3.0).  That's why I used these values as starting values.

Nonetheless, the last three estimates are still quite poor.

Regards,

Kathryn Lord

Spencer Graves wrote:
>
> Dear Kathryn:
>
>       I'm confused by your claim that, "Even though I used the true
> parameter values as initial values, the results are not very good."
>
>       When I ran it, 'optim' quit with $value = -35835, substantially
> less than fr2(theta0) = -0.3.
>
>       Could you please review your question?
>
> MORE GENERAL OPTIM ISSUES
>
>       It is known that 'optim' has problems.  Perhaps the simplest thing
> to do is to call 'optim' with each of the 'methods' in sequence, using
> the optimum found by each 'method' as the starting value for the next.
> When I do this, I often skip 'SANN', because it typically takes much
> more time than the other methods.  However, if there might be multiple
> local minima, then 'SANN' may be the best way to find a global minimum,
> though you may want to call 'optim' again with another method, starting
> from the optimal solution returned by 'SANN'.
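The method-chaining described above can be sketched as follows.  Here `fr` is a toy quadratic stand-in for the real objective (an assumption, not the poster's fr2), and 'SANN' is skipped as suggested:

```r
# Chain optim() methods, warm-starting each from the previous solution.
# fr is a toy stand-in objective (assumption); substitute your own function.
fr <- function(p) (p[1] - 1)^2 + (p[2] - 2)^2

p <- c(0, 0)                                   # initial value
for (m in c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B")) {
  fit <- optim(p, fr, method = m)
  p   <- fit$par                               # starting value for the next method
}
p                                              # should be close to c(1, 2)
```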
>
>       Another relatively simple thing to do is to call 'optim' with
> hessian = TRUE and then compute eigen(optim(..., hessian=TRUE)$hessian,
> symmetric=TRUE).  If 'optim' found an honest local minimum, all the
> eigenvalues will be positive.  For your example, the eigenvalues were as
> follows:
>
> 19000, 11000, 0.06, 0.008, 0.003, -0.0002, -0.06, -6000
>
>       This says that locally, the "minimum" identified was really a
> saddle point:  a parabola opening up in two dimensions (with
> curvatures of 19000 and 11000, respectively), a parabola opening down in
> one dimension (with curvature -6000), and relatively flat by comparison
> in the other 5 dimensions.  In cases like this, it should be moderately
> easy and fast to explore the dimension with the largest negative
> eigenvalue, holding the other dimensions fixed, until a local minimum is
> found in each direction.  Then 'optim' could be restarted from both of
> those local minima, and the better of the two solutions returned.
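The eigenvalue check can be written out like this, again with a toy objective (an assumption) in place of fr2:

```r
# Inspect the eigenvalues of the numeric Hessian at optim()'s solution.
# fr is a toy stand-in objective (assumption) whose minimum is at c(1, 2).
fr <- function(p) (p[1] - 1)^2 + (p[2] - 2)^2

fit <- optim(c(0, 0), fr, method = "BFGS", hessian = TRUE)
ev  <- eigen(fit$hessian, symmetric = TRUE)$values
all(ev > 0)   # TRUE at an honest local minimum; any negative value => saddle
```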
>
>       If all eigenvalues were positive, it might then be wise to restart
> with parscale = the square roots of the diagonal of the hessian;  I
> haven't tried this, but I believe it should work.  Using 'parscale' can
> fix convergence problems -- or create some where they did not exist
> initially.
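That restart might be sketched as below, taking 'parscale' from the square roots of the Hessian diagonal as suggested; since the suggestion is explicitly untried, treat this as an experiment rather than a recipe.  The objective is again a toy stand-in (assumption) with deliberately unequal curvatures:

```r
# Restart optim() with parscale built from the Hessian diagonal.
# fr is a toy objective (assumption) with very different curvatures per axis.
fr <- function(p) 1e4 * p[1]^2 + 1e-2 * p[2]^2

fit1 <- optim(c(1, 1), fr, method = "BFGS", hessian = TRUE)
ps   <- sqrt(abs(diag(fit1$hessian)))          # square roots of Hessian diagonal
fit2 <- optim(fit1$par, fr, method = "BFGS",
              control = list(parscale = ps))
fit2$value                                     # no worse than fit1$value
```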
>
>       I'm considering creating a package 'optimMLE' that would automate
> some of this and package it with common 'methods' that would assume that
> sum(fn(...)) was either a log(likelihood) or the negative of a
> log(likelihood), etc.  However, before I do, I need to make more
> progress on some of my other commitments and review RSiteSearch("optim",
> "fun") to make sure I'm not duplicating something that already exists.
> Anyone interested in this is welcome to contact me off-line.
>
>       Hope this helps.
>       Spencer
>
> kathie wrote:
>> Dear R users,
>>
>> I used 'optim' to minimize the objective function below.  Even though I
>> used the true parameter values as initial values, the results are not
>> very good.
>>
>> How could I improve my results?  Any suggestion will be greatly
>> appreciated.
>>
>> Regards,
>>
>> Kathryn Lord
>>
>> #------------------------------------------------------------------------------------------
>>
>> x = c( 0.35938587,  0.34889725,  0.20577608,  0.15298888, -4.24741146,
>>       -1.32943326,  0.29082527, -2.38156942, -6.62584473, -0.22596237,
>>        6.97893687,  0.22001081, -1.39729222, -5.17860124,  0.52456484,
>>        0.46791660,  0.03065136,  0.32155177, -1.12530013,  0.02608057,
>>       -0.22323154,  3.30955460,  0.54653982,  0.29403011,  0.47769874,
>>        2.42216260,  3.93518355,  0.23237890,  0.09647044, -0.48768933,
>>        0.37736377,  0.43739341, -0.02416010,  4.02788119,  0.07320802,
>>       -0.29393054,  0.25184609,  0.76044448, -3.34121918,  1.16028677,
>>       -0.60352008, -2.86454069, -0.84411691,  0.24841071, -0.11764954,
>>        5.92662106,  1.03932953, -6.21987657, -0.54763352,  0.20263192)  # data
>>
>> theta0 = c(log(2), 0, c(0, -.3, .3), log(c(10, .05, .05)))  # initial value (in fact, the true parameter value)
>> n = length(x)
>>
>> fr2 = function(theta)
>> {
>>   a1  = theta[1]; a2  = theta[2]
>>   mu1 = theta[3]; mu2 = theta[4]; mu3 = theta[5]
>>   g1  = theta[6]; g2  = theta[7]; g3  = theta[8]
>>
>>   w1 = exp(a1)/(1 + exp(a1) + exp(a2))
>>   w2 = exp(a2)/(1 + exp(a1) + exp(a2))
>>   w3 = 1 - w1 - w2
>>
>>   obj = ( (w1^2)/(2*sqrt(exp(g1)*pi))
>>         + (w2^2)/(2*sqrt(exp(g2)*pi))
>>         + (w3^2)/(2*sqrt(exp(g3)*pi))
>>         + 2*w1*w2*dnorm((mu1-mu2)/sqrt(exp(g1)+exp(g2)))/sqrt(exp(g1)+exp(g2))
>>         + 2*w1*w3*dnorm((mu1-mu3)/sqrt(exp(g1)+exp(g3)))/sqrt(exp(g1)+exp(g3))
>>         + 2*w2*w3*dnorm((mu2-mu3)/sqrt(exp(g2)+exp(g3)))/sqrt(exp(g2)+exp(g3))
>>         - (2/n)*sum( w1*dnorm((x-mu1)/sqrt(exp(g1)))/sqrt(exp(g1))
>>                    + w2*dnorm((x-mu2)/sqrt(exp(g2)))/sqrt(exp(g2))
>>                    + w3*dnorm((x-mu3)/sqrt(exp(g3)))/sqrt(exp(g3)) ))
>>
>>   return(obj)
>> }
>>
>> optim(theta0, fr2, method = "BFGS",
>>       control = list(fnscale = 10,
>>                      parscale = c(.01, .01, .01, .01, .01, 1, 1, 1),
>>                      maxit = 100000))
>>
>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help