[R] using mle2 for multinomial model optimization

Ben Bolker bolker at ufl.edu
Fri Feb 12 19:59:19 CET 2010


Ravi Varadhan <rvaradhan <at> jhmi.edu> writes:

> 
> 
> Use "Nelder-Mead" as the method in optim.  This will give you the correct
solution.
> 
> m1 <- mle2(mfun, start=svec, vecpar=TRUE, fixed=svec[1:(l-1)], method="Nelder")
> 
> Hope this is helpful,
> Ravi.

  Well-meaning though it is, this solution doesn't get to the root
of the problem.  It does get the right answer in this case, but in
general it could get stuck. 

  The symptom (besides the error message, which you could overcome
using skip.hessian=TRUE -- I should make the package a little
smarter/more helpful about this) is that the default optimizer
gets stuck at its starting position.

  The underlying, big problem is that dmultinom() rounds its 'x' argument,
so that the likelihood surface has lots of little flat spots (see below).
  The second is that the minimum in the log-likelihood is quite shallow, 
so if you're not careful the optimizer finds the flat spot for large 
negative values and gets stuck on it.

  Here is some code that works, & that demonstrates the problems
(I left out a bunch of my false starts and confusions).

cohort<-c(50,54,50)
##imaginary harvest numbers of a cohort harvested 
## over 3 years

## avoid "l" as a variable name, easy to confuse with "1"
L <- length(cohort)+1    #nr of cell probabilities
h <- c(0.2,0.3,1)        #harvest rates for the 3 diferent years
S <- c(0.9,0.8)          #survival rates

## slightly more efficient way of getting predicted probabilities
predprob <- function(S=S,h=h) {
  L <- length(h)+1
  p <- h*c(1,cumprod((1-h[1:length(S)])*S))
  p[L] <- 1-sum(p[-L]) ## add one more value
  p
}

## version of dmultinom WITHOUT ROUNDING & without
##   error-checking
dmnom <- function(x,prob,log=FALSE) {
  r <- lgamma(sum(x) + 1) + sum(x * log(prob) - lgamma(x + 1))
  if (log) r else exp(r)
}

## pass S and h as parameters
## (would be more efficient to precompute predprob() since
##  it doesn't depend on parameters ...)
mfun <- function(d,S=S,h=h) {
  -dmnom(exp(d),prob=predprob(S,h),log=TRUE)
}

## use dmultinom instead (for illustrative purposes, below)
mfun0 <- function(d,S=S,h=h) {
  -dmultinom(exp(d),prob=predprob(S,h),log=TRUE)
}

## avoid using 'c' as a variable name 
c0 <- c(cohort,100)            ## untransformed initial values for the 
                               ## optimization,->100=N-x1-x2-x3
nvec<-c(rep("x",L-1),"N")     #names for the initial vector
nrs<-c(-L,1)            #nrs for the initial vector
svec = log(c0)             ##transformation of the data to avoid 
                                        #  constraints (x>0)
names(svec) <- paste(nvec,nrs,sep="")
parnames(mfun) <- names(svec)

library(bbmle)

## use bounded optimization
m1 = mle2(mfun2,start=svec,
  method="L-BFGS-B",lower=0,upper=5,
  vecpar=TRUE,fixed=svec[-L],
  data=list(S=S,h=h))

coef(m1)
plot(profile(m1B))

## diagnostic pictures
N1vec <- seq(0,5,length=500)
LLvec <-sapply(N1vec,
               function(x) { mfun0(c(svec[-L],N1=x),S=S,h=h)})

LLvec2 <-sapply(N1vec,
               function(x) { mfun(c(svec[-L],N1=x),S=S,h=h)})

plot(N1vec,llvec,type="l")
lines(N1vec,llvec2,col="red")

plot(N1vec[-1],diff(LLvec),type="l")
lines(N1vec[-1],diff(LLvec2),col=2)

## zoom in
plot(N1vec[-1],diff(llvec),type="l",
     xlim=c(1,4),ylim=c(-0.2,0.3))
lines(N1vec[-1],diff(llvec2),col=2,lwd=2)



More information about the R-help mailing list