[Rd] Minor glitch in optim()
Martin Maechler
maechler at stat.math.ethz.ch
Fri Apr 20 09:55:44 CEST 2018
>>>>> J C Nash <profjcnash at gmail.com>
>>>>> on Tue, 17 Apr 2018 10:18:02 -0400 writes:
> Having worked with optim() and related programs for years, it surprised me
> that I haven't noticed this before, but optim() is inconsistent in how it
> deals with bounds constraints specified at infinity. Here's an example:
# optim-glitch-Ex.R
x0 <- c(1,2,3,4)
fnt <- function(x, fscale=10){
yy <- length(x):1
val <- sum((yy*x)^2)*fscale
}
grt <- function(x, fscale=10){
nn <- length(x)
yy <- nn:1
# gg <- rep(NA,nn)
gg <- 2*(yy^2)*x*fscale
gg
}
npar <- 4
lower <- -Inf
l2 <- rep(-Inf,npar)
a1 <- optim(x0, fnt, grt, lower=lower, method="BFGS") # works
a1
a1a<- optim(x0, fnt, grt, lower=l2, method="BFGS") # changes method!
a1a
> The first call uses BFGS method without warning. The second gives
> a warning that L-BFGS-B should be used, and from the output uses
> this.
> This is a bit of an edge case. My own preference would be for optim()
> to simply fail if bounds of any type are specified without L-BFGS-B
> as the method. I believe that gives clarity, even though infinite
> bounds imply an unconstrained problem.
> The behaviour where a scalar infinite bound is treated as unconstrained
> but a vector is not is inconsistent, however, and I think that at some
> point should be fixed.
I agree that it inconsistent. The help page mentions that
"non-trivial" bounds are treated like that, and if you'd
consider lower = c(-Inf, -Inf) to be "non-trivial", one could
say optim() behaved according to documentation and non-buggy,
but a much more reasonable definition of non-trivial bounds are
bounds that are not equivalent to (lower=-Inf, upper=+Inf) and
the vector versions *are* equivalent.
So I agree about fixing.
> point should be fixed. Possibly the easiest way is to treat infinite
> bounds specified as a vector the same as those specified as a scalar.
I agree.
> That is to adjust the code in File src/library/stats/R/optim.R
> in the block
> if((length(lower) > 1L || length(upper) > 1L ||
> lower[1L] != -Inf || upper[1L] != Inf)
> && !any(method == c("L-BFGS-B","Brent"))) {
> warning("bounds can only be used with method L-BFGS-B (or Brent)")
> method <- "L-BFGS-B"
> }
> Possibly
> if((any(is.finite(lower) || any(is.finite(upper))
> && !any(method == c("L-BFGS-B","Brent"))) {
> warning("bounds can only be used with method L-BFGS-B (or Brent)")
> method <- "L-BFGS-B"
> }
> Best, JN
I aim to go for the first line
if((any(lower > -Inf)) || any(upper < Inf))
which is 100% correct and easier to "read", even though your
version -- after adding the missing ')' -- is faster
and almost always equivalent.
Running checks now. Wont' be in R 3.5.0 of course, but then
probably in R 3.5.0 patched.
Martin Maechler
ETH Zurich
More information about the R-devel
mailing list