[Rd] Wrongly converging glm()

Fri Jul 21 15:22:58 CEST 2017

Please allow me to add my 3 cents.  Stopping an iterative optimization algorithm at an "appropriate" juncture is very tricky.  All one can say is that the algorithm terminated because it triggered a particular stopping criterion.  Good software will tell you why it stopped, i.e. which stopping criterion was triggered.  It is extremely difficult to guarantee that the triggered criterion is the correct one and that the answer obtained is trustworthy; it is up to the user to determine whether the answer makes sense.  When maximizing a likelihood function, it is perfectly reasonable to stop once the algorithm is no longer making progress in increasing the log-likelihood.  In that case, the software should print something like "algorithm terminated due to lack of improvement in log-likelihood."  Therefore, I don't see a need to issue any warning; it is enough to report the stopping criterion that terminated the algorithm.
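To make the point concrete, here is a minimal sketch in Python (not R, and not how glm() is implemented) of a Newton-Raphson fit that returns the stopping criterion along with the estimate.  The helper name fit_log_odds, the tolerance, the iteration limit and the form of the relative-change test are all assumptions chosen for illustration.

```python
import math

def fit_log_odds(successes, trials, tol=1e-8, max_iter=50):
    """Newton-Raphson for the log-odds b of a binomial proportion.

    Hypothetical helper: returns the estimate together with the
    stopping criterion that ended the iteration.
    """
    failures = trials - successes

    def loglik(b):
        # successes*b - trials*log(1 + exp(b)), kept finite for large |b|
        softplus = max(b, 0.0) + math.log1p(math.exp(-abs(b)))
        return successes * b - trials * softplus

    b, ll = 0.0, loglik(0.0)
    reason, iterations = "iteration limit reached", max_iter
    for it in range(1, max_iter + 1):
        p = 1.0 / (1.0 + math.exp(-b))  # P(success)
        q = 1.0 / (1.0 + math.exp(b))   # P(failure), computed directly
        score = successes * q - failures * p  # first derivative of loglik
        info = trials * p * q                 # minus the second derivative
        b += score / info                     # Newton step
        ll_new = loglik(b)
        # Assumed criterion: relative change in the log-likelihood
        converged = abs(ll_new - ll) / (abs(ll_new) + 0.1) < tol
        ll = ll_new
        if converged:
            reason, iterations = "lack of improvement in log-likelihood", it
            break
    return {"estimate": b, "loglik": ll, "iterations": iterations,
            "reason": reason}

# 7 successes in 10 trials: a finite maximum exists
interior = fit_log_odds(7, 10)
# 10 successes in 10 trials: the likelihood has no finite maximizer
boundary = fit_log_odds(10, 10)
for fit in (interior, boundary):
    print(fit["reason"], fit["iterations"], round(fit["estimate"], 3))
```

Both fits stop on the same criterion, but only the first estimate is near a true maximum; in the second the log-likelihood merely flattens out as the estimate grows.  The reported reason is accurate in both cases, and judging the answer is left to the user.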

Best,
Ravi

-----Original Message-----
From: R-devel [mailto:r-devel-bounces at r-project.org] On Behalf Of Therneau, Terry M., Ph.D.
Sent: Friday, July 21, 2017 8:04 AM
To: r-devel at r-project.org; Mark Leeds <markleeds2 at gmail.com>; jorismeys at gmail.com; westra.harmjan at outlook.com
Subject: Re: [Rd] Wrongly converging glm()

I'm chiming in late since I read the news in digest form, and I won't copy the entire conversation to date.

The issue raised comes up quite often in Cox models, so often that the Therneau and Grambsch book has a section on the issue (3.5, p 58).  After a few initial iterations the offending coefficient will increase by a constant at each iteration while the log-likelihood approaches an asymptote (essentially once the other coefficients "settle down").
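The same pattern is easy to reproduce on a toy problem.  The sketch below is in Python and uses a one-coefficient logistic regression on perfectly separated data rather than a Cox model; the data values, the helper names and the fixed number of iterations are assumptions made for illustration only.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def newton_trace(x, y, n_iter=15):
    """Run a fixed number of Newton steps for a no-intercept logistic
    regression and record (iteration, beta, step, loglik) each time."""
    beta, trace = 0.0, []
    for it in range(1, n_iter + 1):
        score = info = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(beta * xi)    # P(y = 1)
            q = sigmoid(-beta * xi)   # P(y = 0)
            score += xi * (yi * q - (1 - yi) * p)
            info += xi * xi * p * q
        step = score / info
        beta += step
        loglik = 0.0
        for xi, yi in zip(x, y):
            m = beta * xi if yi == 1 else -beta * xi
            # log P(observed y) = -log(1 + exp(-m)), computed stably
            loglik -= max(-m, 0.0) + math.log1p(math.exp(-abs(m)))
        trace.append((it, beta, step, loglik))
    return trace

# Hypothetical data: every negative x has y = 0 and every positive x has y = 1
x = [-2.0, -1.0, 1.0, 2.0]
y = [0, 0, 1, 1]
trace = newton_trace(x, y)
for it, beta, step, loglik in trace:
    print(f"iter {it:2d}  beta {beta:8.4f}  step {step:7.4f}  loglik {loglik:12.3e}")
```

After the first few iterations the step size settles to a nearly constant value while the log-likelihood creeps up towards zero, so a test based only on the change in log-likelihood will eventually declare convergence at whatever coefficient the iteration happens to have reached.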

The coxph routine tries to detect this case and print a warning, and this turns out to be very hard to do accurately.  I worked hard at tuning the threshold(s) for the message several years ago and finally gave up; my guess is that the warning misses > 5% of the cases where the problem is really present, and that 5% of the warnings that do print are incorrect.  (And these estimates may be too optimistic.)  Highly correlated predictors tend to trip it up, e.g., the truncated power spline basis used by the rcs function in Hmisc.

All in all, I am not completely sure whether the message does more harm than good.  I'd be quite reluctant to go down the same path again with the glm function.

Terry Therneau

______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel