[R] Re: ZINB by Newton Raphson??

Achim Zeileis Achim.Zeileis at uibk.ac.at
Tue Jun 22 22:52:26 CEST 2010


John,

thanks for the comments, very useful.

Just three short additions specific to the ZINB case:
   1. zeroinfl() does use BFGS (by default) and not Newton-Raphson
      because we have analytical gradients but not an analytical Hessian.
   2. If the default starting values do not work, zeroinfl() offers EM
      estimation of the starting values (which is typically followed by
      only a single iteration of BFGS); see the sketch below. EM is
      usually much more robust but slower, hence it is not the default
      in zeroinfl().
   3. I pointed the original poster to this information but he still
      insisted on Newton-Raphson for no obvious reason. As I didn't
      want to WTFM again on the list, I stopped using up bandwidth.
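
To make this concrete, a minimal sketch of both call patterns
(zeroinfl() is in the pscl package; the data frame 'dat' with count
response 'y' and regressor 'x' is made up purely for illustration):

  library("pscl")

  ## default: ML via optim()'s BFGS, using analytical gradients only
  fm1 <- zeroinfl(y ~ x, data = dat, dist = "negbin")

  ## EM estimation of the starting values, typically followed by a
  ## single BFGS pass; usually more robust but slower
  fm2 <- zeroinfl(y ~ x, data = dat, dist = "negbin",
                  control = zeroinfl.control(EM = TRUE))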

thx,
Z

On Tue, 22 Jun 2010, Prof. John C Nash wrote:

> I have not included the previous postings because they came out very 
> strangely on my mail reader. However, the question concerned the choice of 
> minimizer for the zeroinfl() function, which apparently allows any of the 
> current 6 methods of optim() for this purpose. The original poster wanted to 
> use Newton-Raphson.
>
> Newton-Raphson (or just Newton for simplicity) is commonly thought to be the 
> "best" way to approach optimization problems. I've had several people ask me 
> why the optimx() package (see OptimizeR project on r-forge -- probably soon 
> on CRAN, we're just tidying up) does not have such a choice. Since the 
> question comes up fairly frequently, here is a response. I caution that it is 
> based on my experience and others may get different mileage.
>
> My reasons for being cautious about Newton are as follows:
> 1) Newton's method needs a number of safeguards to avoid trouble with 
> singular or indefinite Hessians. These safeguards can be tricky to 
> implement well without hindering the progress of the optimization (a 
> bare-bones iteration is sketched after this list).
> 2) One needs both gradient and Hessian information, and both need to be 
> accurate. Numerical approximations are slow and often inadequate for 
> tough problems.
> 3) Far from a solution, Newton is often not very good, likely because the 
> Hessian is not like a nice quadratic over the whole space.
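>
> To make 1)-3) concrete, here is a rough, untested sketch of a bare 
> Newton iteration in R (gr and he are hypothetical functions returning 
> the gradient and Hessian; p0 is a starting vector):
>
>   newton <- function(p0, gr, he, maxit = 50, tol = 1e-8) {
>     par <- p0
>     for (i in seq_len(maxit)) {
>       g <- gr(par)
>       if (sqrt(sum(g^2)) < tol) break   # stop when the gradient is small
>       H <- he(par)
>       ## solve() fails when H is singular, and an indefinite H can give
>       ## an ascent direction -- this is where safeguarding is needed;
>       ## here we merely fall back to a crude plain gradient step
>       step <- tryCatch(solve(H, g), error = function(e) g)
>       par <- par - step
>     }
>     par
>   }
>
> Nothing here monitors the objective itself, so far from the solution 
> the raw step can easily make things worse (reason 3); real 
> implementations add line searches or trust regions.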
>
> Newton converges VERY well when it has a "close enough" start. If you 
> can find an operationally useful way to generate such starts, you 
> deserve awards like the Fields Medal.
>
> We have in our optimx work (Ravi Varadhan and I) developed a prototype 
> safeguarded Newton. As yet we have not included it in optimx(), but probably 
> will do so in a later version after we figure out what advice to give on 
> where it is appropriate to apply it.
>
> In the meantime, I would suggest that BFGS or L-BFGS-B are the closest 
> options in optim() and generally perform quite well. There are updates to 
> BFGS and CG on CRAN in the form of Rvmmin and Rcgmin, which are all-R 
> implementations that also handle box constraints. UCMINF is a very 
> similar implementation of the unconstrained algorithm and seems to have 
> the details done rather well -- although BFGS in optim() is based on my 
> work, I actually find UCMINF often does better. There are also nlm and 
> nlminb.
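>
> For instance (a hedged sketch; fn and gr are hypothetical objective and 
> gradient functions, p0 a starting vector):
>
>   res1 <- optim(p0, fn, gr, method = "BFGS")
>   res2 <- optim(p0, fn, gr, method = "L-BFGS-B",
>                 lower = 0)          # box constraint on all parameters
>   library("ucminf")
>   res3 <- ucminf(p0, fn, gr)
>   res4 <- nlminb(p0, fn, gradient = gr)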
>
> Via optimx() one can call these and several other minimizers, or even 
> "all.methods", though the latter is meant for learning about methods 
> rather than for solving individual problems.
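>
> A hedged illustration of the optimx() interface (same hypothetical fn, 
> gr and p0 as above):
>
>   library("optimx")
>   ## run and compare all available methods on one problem
>   res <- optimx(p0, fn, gr, control = list(all.methods = TRUE))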
>
> JN
>


