[R] optim, L-BFGS-B | constrained bounds on parms?
Prof J C Nash (U30A)
nashjc at uottawa.ca
Fri Sep 19 17:32:37 CEST 2014
One choice is to add a penalty to the objective to enforce the
constraint(s) along with bounds to keep the parameters from going wild.
This generally works reasonably well. Sometimes it helps to run just a
few iterations with a big penalty scale to force the parameters into a
feasible region, though a lot depends on the particular problem in my
experience, with some being straightforward and others needing a lot of
fiddle.
I suspect a math programming approach is overkill, though R does have
some packages for that. Your mileage may vary.
Note that L-BFGS-B used by R is a version for which Nocedal et al.
reported a bug in 2011 and provided a new Fortran code. I've recently
put up an implementation of the new code on r-forge under the optimizer
project. Still testing, but I think it's OK. You could also use Rvmmin
that has bounds, or nmkb from dfoptim (though you cannot start on bounds).
Best, JN
On 14-09-19 06:00 AM, r-help-request at r-project.org wrote:
> Message: 27
> Date: Fri, 19 Sep 2014 00:55:04 -0400
> From: Evan Cooch <evan.cooch at gmail.com>
> To: r-help at r-project.org
> Subject: [R] optim, L-BFGS-B | constrained bounds on parms?
> Message-ID: <541BB728.6030704 at gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> Or, something to that effect. Following is an example of what I'm
> working with basic ABO blood type ML estimation from observed type
> (phenotypic) frequencies. First, I generate a log-likelihood function.
> mu[1] -> mu[2] are allele freqs for A and B alleles, respectively. Since
> freq of O allele is redundant, I use 1-mu[1]-mu[2] for that. The terms
> in the function are the probability expressions for the expected values
> of each phenotype.
>
> But, that is somewhat besides the point:
>
> f_abo <- function(mu) {
> 25*log(mu[1]^2+2*mu[1]*(1-mu[1]-mu[2]))+25*log(mu[2]^2+2*mu[2]*(1-mu[1]-mu[2]))+50*log(2*mu[1]*mu[2])+15*log((1-mu[1]-mu[2])^2)
> }
>
>
> So, I want to come up with MLE for mu[1] and mu[2] (for alleleic freqs
> for A and B alleles, respectively. Now, given the data, I know (from
> having maximized this likelihood outside of R) that the MLE for mu[1] is
> 0.37176, and for mu[2], the same -- mu[2]=0.371763. I confirm this in
> MATLAB, and Maple, and Mathematica, using various non-linear
> solvers/optimization routines. They all yielded recisely the right answers.
>
> But, stuck trying to come up with a general approach to getting the
> 'right estimates' in R, that doesn't rely on strong prior knowledge of
> the parameters. I tried the following - I used L-BFGDS-B' because this
> is a 'boxed' optimzation: mu[1] and mu[2] are both parameters on the
> interval [0,1].
>
> results <- optim(c(0.3,0.3), f_abo,
> method = "L-BFGS-B", lower=c(0.1,0.1), upper=c(0.9,0.9),
> hessian = TRUE,control=list(fnscale=-1))
>
> but that through the following error at me:
>
> L-BFGS-B needs finite values of 'fn'
>
> OK, fine. Taking that literally, and thinking a bit, clear that the
> problem is that the upper bound on the parms creates the problem. So, I
> try the crude approach of making the upper bound for each 0.5:
>
>
> results <- optim(c(0.3,0.3), f_abo,
> method = "L-BFGS-B", lower=c(0.1,0.1), upper=c(0.5,0.5),
> hessian = TRUE,control=list(fnscale=-1))
>
>
> No errors this time, but no estimates either. At all.
>
> OK -- so I 'cheat', and since I know that mu[1]=mu[2]=0.37176, I make
> another change to the upper limit, using 0.4 for both parms:
>
>
>
> results <- optim(c(0.3,0.3), f_abo,
> method = "L-BFGS-B", lower=c(0.1,0.1), upper=c(0.4,0.4),
> hessian = TRUE,control=list(fnscale=-1))
>
>
> Works perfectly, and...right estimates too. ;-)
>
> But, I could get there from here because I had prior knowledge of the
> parameter values. In other words, I cheated (not a thinly veiled
> suggestion that prior information is cheating, of course ;-)
>
> What I'm trying to figure out is how to do a constrained optimization
> with R, where mu[1] and mu[2] are estimated subject to the constraint that
>
> 0 <= mu[1]+mu[2] <= 1
>
> There seems to be no obvious way to impose that -- which creates a
> problem for optim since if I set 'vague' bounds on the parms (as per
> original attempt), optim tries combinations (like mu[1]=0.9, mu[2]=0.9),
> which aren't plausible, given the constraint that 0 <= mu[1]+mu[2] <= 1.
> Further, in this example, mu[1]=mu[2]. That might not be the case, and I
> might need to set upper bound on a parameter to be >0.5. But, without
> knowing which parameter, I'd need to set both from (say) 0.1 -> 0.9.
>
> Is this possible with optim, or do I need to use a different package? If
> I can get there from here using optim, what do I need to do, either to
> my call to the optim routine, or the function that I pass to it?
>
> This sort of thing is quite easy in (say) Maple. I simply execute
>
> NLPSolve(f_abo,initialpoint={mu[1]=0.2,mu[2]=0.2},{mu[1]+mu[2]<=1},mu[1]=0.1..0.9,mu[2]=0.1..0.9,maximize);
>
> where I'm telling the NLPSolve function that there is a constraint for
> mu[1] and mu[2] (as above), which lets me set bounds on the parameter
> over larger interval. Can I do the same in R?
>
> Again, I'm trying to avoid having to use a 'good guess'. I know I can
> gene count to come up with a quick and dirty starting point (the basis
> for the EM algorithm commonly used for this), but again, I'm trying to
> avoid that.
>
> Thanks very much in advance.
>
> [[alternative HTML version deleted]]
More information about the R-help
mailing list