[R] optim, L-BFGS-B | constrained bounds on parms?

Fri Sep 19 17:32:37 CEST 2014

One choice is to add a penalty to the objective to enforce the
constraint(s) along with bounds to keep the parameters from going wild.

This generally works reasonably well. Sometimes it helps to run just a
few iterations with a big penalty scale to force the parameters into a
feasible region, though a lot depends on the particular problem in my
experience, with some being straightforward and others needing a lot of
fiddle.

I suspect a math programming approach is overkill, though R does have
some packages for that. Your mileage may vary.

Note that L-BFGS-B used by R is a version for which Nocedal et al.
reported a bug in 2011 and provided a new Fortran code. I've recently
put up an implementation of the new code on r-forge under the optimizer
project. Still testing, but I think it's OK. You could also use Rvmmin
that has bounds, or nmkb from dfoptim (though you cannot start on bounds).

Best, JN

On 14-09-19 06:00 AM, r-help-request at r-project.org wrote:
> Message: 27
> Date: Fri, 19 Sep 2014 00:55:04 -0400
> From: Evan Cooch <evan.cooch at gmail.com>
> To: r-help at r-project.org
> Subject: [R] optim, L-BFGS-B | constrained bounds on parms?
> Message-ID: <541BB728.6030704 at gmail.com>
> Content-Type: text/plain; charset="UTF-8"
> 
> Or, something to that effect. Following is an example of what I'm 
> working with basic ABO blood type ML estimation from observed type 
> (phenotypic) frequencies. First, I generate a log-likelihood function. 
> mu[1] -> mu[2] are allele freqs for A and B alleles, respectively. Since 
> freq of O allele is redundant, I use 1-mu[1]-mu[2] for that. The terms 
> in the function are the probability expressions for the expected values 
> of each phenotype.
> 
> But, that is somewhat besides the point:
> 
> f_abo <- function(mu) { 
> 25*log(mu[1]^2+2*mu[1]*(1-mu[1]-mu[2]))+25*log(mu[2]^2+2*mu[2]*(1-mu[1]-mu[2]))+50*log(2*mu[1]*mu[2])+15*log((1-mu[1]-mu[2])^2) 
> }
> 
> 
> So, I want to come up with MLE for mu[1] and mu[2] (for alleleic freqs 
> for A and B alleles, respectively. Now, given the data, I know (from 
> having maximized this likelihood outside of R) that the MLE for mu[1] is 
> 0.37176, and for mu[2], the same -- mu[2]=0.371763. I confirm this in 
> MATLAB, and Maple, and Mathematica, using various non-linear 
> solvers/optimization routines. They all yielded recisely the right answers.
> 
> But, stuck trying to come up with a general approach to getting the 
> 'right estimates' in R, that doesn't rely on strong prior knowledge of 
> the parameters. I tried the following - I used L-BFGDS-B' because this 
> is a 'boxed' optimzation: mu[1] and mu[2] are both parameters on the 
> interval [0,1].
> 
>   results <- optim(c(0.3,0.3), f_abo,
>       method = "L-BFGS-B", lower=c(0.1,0.1), upper=c(0.9,0.9),
>        hessian = TRUE,control=list(fnscale=-1))
> 
> but that through the following error at me:
> 
> L-BFGS-B needs finite values of 'fn'
> 
> OK, fine. Taking that literally, and thinking a bit, clear that the 
> problem is that the upper bound on the parms creates the problem. So, I 
> try the crude approach of making the upper bound for each 0.5:
> 
> 
>   results <- optim(c(0.3,0.3), f_abo,
>       method = "L-BFGS-B", lower=c(0.1,0.1), upper=c(0.5,0.5),
>        hessian = TRUE,control=list(fnscale=-1))
> 
> 
> No errors this time, but no estimates either. At all.
> 
> OK -- so I 'cheat', and since I know that mu[1]=mu[2]=0.37176, I make 
> another change to the upper limit, using 0.4 for both parms:
> 
> 
> 
>   results <- optim(c(0.3,0.3), f_abo,
>       method = "L-BFGS-B", lower=c(0.1,0.1), upper=c(0.4,0.4),
>        hessian = TRUE,control=list(fnscale=-1))
> 
> 
> Works perfectly, and...right estimates too. ;-)
> 
> But, I could get there from here because I had prior knowledge of the 
> parameter values. In other words, I cheated (not a thinly veiled 
> suggestion that prior information is cheating, of course ;-)
> 
> What I'm trying to figure out is how to do a constrained optimization 
> with R, where mu[1] and mu[2] are estimated subject to the constraint that
> 
> 0 <= mu[1]+mu[2] <= 1
> 
> There seems to be no obvious way to impose that -- which creates a 
> problem for optim since if I set 'vague' bounds on the parms (as per 
> original attempt), optim tries combinations (like mu[1]=0.9, mu[2]=0.9), 
> which aren't plausible, given the constraint that 0 <= mu[1]+mu[2] <= 1. 
> Further, in this example, mu[1]=mu[2]. That might not be the case, and I 
> might need to set upper bound on a parameter to be >0.5. But, without 
> knowing which parameter, I'd need to set both from (say) 0.1 -> 0.9.
> 
> Is this possible with optim, or do I need to use a different package? If 
> I can get there from here using optim, what do I need to do, either to 
> my call to the optim routine, or the function that I pass to it?
> 
> This sort of thing is quite easy in (say) Maple. I simply execute
> 
> NLPSolve(f_abo,initialpoint={mu[1]=0.2,mu[2]=0.2},{mu[1]+mu[2]<=1},mu[1]=0.1..0.9,mu[2]=0.1..0.9,maximize);
> 
> where I'm telling the NLPSolve function that there is a constraint for 
> mu[1] and mu[2] (as above), which lets me set bounds on the parameter 
> over larger interval. Can I do the same in R?
> 
> Again, I'm trying to avoid having to use a 'good guess'. I know I can 
> gene count to come up with a quick and dirty starting point (the basis 
> for the EM algorithm commonly used for this), but again, I'm trying to 
> avoid that.
> 
> Thanks very much in advance.
> 
> 	[[alternative HTML version deleted]]