[Rd] bounds violations, infinite loops in optim/L-BFGS-B (PR#671)

Tue, 3 Oct 2000 11:45:37 -0400 (EDT)

  Different day, more or less the same story -- turning off optimization
(and leaving out -ffloat-store) leads to hanging in different places, but
not any fewer hanging cases out of 1000 bootstrap samples.  I could try
the no optimization/-ffloat-store combination, but I strongly suspect the
results would be the same.  (As I mentioned before, I also get hanging in
different places with and without adding parscale to the optim() control
list.)
  These aren't pathological samples, as far as I can tell.
  I'm afraid that fussing with compiler options isn't going to do it ...
given enough time I might be able to dig into the lbfgsb.c code and figure
out what ABNORMAL_TERMINATION_IN_LNSRCH means and (at least) how to modify
the iteration counter so that optim() bails out when it gets into this
kind of loop rather than looping indefinitely.
  On the other hand, I might choose to work on adapting a different set of
code -- the "nlminb" function in Splus5 appears to be based on a different
set of Netlib FORTRAN code (dmnfb,dmngb,dmnhb) -- don't know the
particular reasons for using a different algorithm here.

  I managed to strip down the full code that I use for doing the
bootstraps to about 58 lines (and I removed the embarrassing "<<-"
constructs), so that it if anyone wants to work on this they don't have to
come back to me for particular examples that hang for a particular
platform/set of compiler options/etc..  (It's still not pretty.)

  Ben Bolker

# Stripped-down file to find hanging optim() 

total <- rep(c(100,150,200,300,500,900),rep(2,6))
pred <- c(83,73,22,47,53,31,56,83,30,73,44,91)

nllfun2g.boot <- function(q,debug=FALSE,data.pred,data.total) {
  N1 <- q[1]
  N2 <- q[2]
  G <- q[3]
  maxval <- 1e8
  L <- cfunc2(data.total,N1,N2,G)
  r <- -sum(dbinom(data.pred, data.total, L, log=TRUE))
  r[!is.finite(r)] <- maxval*sign(r[!is.finite(r)])
  r
}

cfunc2 <- function(input,N1,N2,G=1.0,debug=FALSE){
  if (debug)cat(c(N1,N2,G),"\n")
  pmin(1,1/((1+((input - N1)/N2))^G))
}

fuzz <- 0.001

do.boot <- function(tot=1000,seed=101) {
  bootmat <- matrix(nrow = tot, ncol = 5)
  set.seed(seed)
  i <- 1
  while (i<=tot) {
    print(i)
    boot.ind <- sample(1:length(pred), rep = TRUE)
    boot.pred <- pred[boot.ind]  ## kluge!
    boot.total <- total[boot.ind]
    n <- try(optim(c(min(boot.total)-1,100,1),nllfun2g.boot,
                   lower=rep(fuzz,3),upper=c(min(boot.total)-fuzz,Inf,Inf),
                   method="L-BFGS-B",control=list(trace=FALSE,
                                       parscale=c(100,100,1)),
                   data.pred=boot.pred,
                   data.total=boot.total))
     if (is.null(class(n))) {
      bootmat[i,] <- c(n$val,n$par, n$convergence)
      i <- i+1
    }
  }
  bootmat
}

bootmat <- do.boot(5)

## find looping bootstrap sample

find.boot <- function(n,seed=101,cat=FALSE) {
  set.seed(seed)
  for (i in 1:n)
    boot.ind <- sample(1:length(pred), rep = TRUE)
  boot.pred <- pred[boot.ind]  ## kluge!
  boot.total <- total[boot.ind]
  if (cat) {
    cat(paste("boot.pred <- c(",paste(boot.pred,collapse=","),")",sep=""),"\n")
    cat(paste("boot.total <- c(",paste(boot.total,collapse=","),")",sep=""),"\n")
  }
  else list(boot.pred,boot.total)
}

On 3 Oct 2000, Peter Dalgaard BSA wrote:

> Kurt Hornik <Kurt.Hornik@ci.tuwien.ac.at> writes:
> 
> > >   Should ffloat-store be enabled in general for compilation on Linux
> > > boxes?
> > 
> > I am not sure it should.  We try to be as defensive about compiler flags
> > as possible.  (We use -D__NO_MATH_INLINES because it fixes a >>bug<<.)
> > I'd prefer modifying the code if possible ...
> 
> It definitely shouldn't... Basically what it does is to defeat the
> "guard digits" that are used in Intel FPUs (internally using 10 bytes
> rather than 8 for doubles). While there are some rare cases where this
> messes up the "endgame" of numerical algorithms, it increases overall
> numerical accuracy. Also, storing all floating-point values off-chip
> causes a substantial decrease in performance.
> 
> In the case(-s ?) I've seen, a routine was trying too hard to squeeze
> the last bit of precision out of a termination criterion.
> 
> Optimization is another likely culprit for this kind of trouble, so
> perhaps one should try turning it off for this code?
> 
> 

-- 
318 Carr Hall                                bolker@zoo.ufl.edu
Zoology Department, University of Florida    http://www.zoo.ufl.edu/bolker
Box 118525                                   (ph)  352-392-5697
Gainesville, FL 32611-8525                   (fax) 352-392-3704

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._