“So, what I learned the hard way was: termination due to reasonable stopping criteria DOES NOT NECESSARILY EQUAL OPTIMAL.”

Yes, I agree, Mark.

Let me add another observation. In the “optimx” package, John Nash and I 
implemented a check for optimality conditions – the first- and second-order KKT 
conditions. This involves checking whether the gradient is sufficiently small 
and the Hessian is positive definite (for a local minimum) at the final parameter 
values. However, it can be quite time-consuming to compute these quantities, 
and in some problems checking the KKT conditions can take more effort than 
finding the solution! Furthermore, it is difficult to come up with good 
thresholds for declaring the gradient “small” and the Hessian “positive 
definite”, since these can depend upon the scale of the objective function and 
the parameters.
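For concreteness, here is a minimal sketch (in Python rather than R, and not the optimx implementation) of what such a numerical KKT check involves; the function name, tolerances, and finite-difference step are all illustrative:

```python
import numpy as np

def kkt_check(f, x, g_tol=1e-5, h_tol=1e-8, eps=1e-5):
    """Finite-difference check of first/second-order conditions at x.

    Returns (first_order_ok, second_order_ok).  Building the Hessian
    costs O(n^2) function evaluations, which is why the check can cost
    more than the optimization itself.
    """
    n = len(x)
    # First-order condition: gradient approximately zero (central differences).
    g = np.zeros(n)
    for i in range(n):
        e = np.zeros(n); e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    first_ok = np.linalg.norm(g, np.inf) <= g_tol * (1 + abs(f(x)))

    # Second-order condition: Hessian positive definite
    # (smallest eigenvalue positive, relative to the largest).
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = eps
            ej = np.zeros(n); ej[j] = eps
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * eps**2)
    evals = np.linalg.eigvalsh((H + H.T) / 2)
    second_ok = evals.min() > h_tol * max(1.0, evals.max())
    return first_ok, second_ok

# At the minimum of a simple quadratic, both conditions hold:
result = kkt_check(lambda x: (x[0] - 1)**2 + 2 * (x[1] + 3)**2,
                   np.array([1.0, -3.0]))
```

Note that the thresholds are made relative to the magnitude of the objective and of the Hessian eigenvalues; choosing those scalings well in general is exactly the hard part described above.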

Ravi

From: Mark Leeds [mailto:marklee...@gmail.com]
Sent: Friday, July 21, 2017 3:09 PM
To: Ravi Varadhan <ravi.varad...@jhu.edu>
Cc: Therneau, Terry M., Ph.D. <thern...@mayo.edu>; r-devel@r-project.org; 
jorism...@gmail.com; westra.harm...@outlook.com
Subject: Re: [Rd] Wrongly converging glm()

Hi Ravi: Well said. In John's Rvmmin package, he has return codes explaining 
the cause of termination. The codes returned were fine. The problem was that 
the model I was using could have multiple solutions (regardless of the data 
sent in), so even though the stopping criterion was reached, it turned out that 
one of the parameters (there were two parameters) could really have been 
anything and the same likelihood value would be returned. So, what I learned 
the hard way was: termination due to reasonable stopping criteria DOES NOT 
NECESSARILY EQUAL OPTIMAL. But I lived in the dark about this for a long time 
and only happened to notice it when playing around with the likelihood by 
fixing the offending parameter at various values and optimizing over the 
non-offending parameter. Thanks for the eloquent explanation.
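A toy illustration of that situation (a Python sketch, not the actual model): when the objective depends only on a combination of the parameters, every point along that ridge attains the same value, so any reasonable stopping criterion can fire at an essentially arbitrary parameter vector.

```python
# Negative log-likelihood that depends only on the sum a + b: every
# point on the line a + b = 1 attains the same minimal value, so an
# optimizer can legitimately stop at any of them.
def nll(a, b):
    return (a + b - 1.0) ** 2

# Two very different "solutions" give an identical objective value,
# which is exactly what fixing one parameter and re-optimizing reveals.
v1 = nll(0.5, 0.5)
v2 = nll(100.0, -99.0)
```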
                                                                                
  Mark

On Fri, Jul 21, 2017 at 9:22 AM, Ravi Varadhan <ravi.varad...@jhu.edu> wrote:
Please allow me to add my 3 cents. Stopping an iterative optimization 
algorithm at an "appropriate" juncture is very tricky. All one can say is that 
the algorithm terminated because it triggered a particular stopping criterion. 
Good software will tell you why it stopped, i.e., which stopping criterion was 
triggered. It is extremely difficult to make a failsafe guarantee that the 
triggered stopping criterion is the correct one and that the answer obtained is 
trustworthy. It is up to the user to determine whether the answer makes sense. 
In the case of maximizing a likelihood function, it is perfectly reasonable to 
stop when the algorithm has not made any progress in increasing the 
log-likelihood. In that case, the software should print out something like 
"algorithm terminated due to lack of improvement in log-likelihood." 
Therefore, I don't see a need to issue any warning, but simply to report the 
stopping criterion that was applied to terminate the algorithm.
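A minimal sketch of that reporting style (in Python, with illustrative criterion names rather than any package's actual codes): the optimizer returns *which* criterion fired instead of claiming optimality.

```python
def minimize_1d(f, x, step=0.5, f_tol=1e-10, g_tol=1e-8,
                max_iter=200, eps=1e-7):
    """Toy gradient descent that always reports *why* it stopped."""
    fx = f(x)
    for _ in range(max_iter):
        # Central-difference gradient estimate.
        g = (f(x + eps) - f(x - eps)) / (2 * eps)
        if abs(g) <= g_tol:
            return x, "gradient below tolerance"
        x_new = x - step * g
        f_new = f(x_new)
        if fx - f_new <= f_tol:
            return x_new, "no improvement in objective"
        x, fx = x_new, f_new
    return x, "iteration limit reached"

x_hat, reason = minimize_1d(lambda t: (t - 2.0) ** 2, 10.0)
```

The caller sees, for example, that the run ended for lack of improvement rather than because the gradient vanished, and can judge for itself whether the answer is trustworthy.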

Best,
Ravi

-----Original Message-----
From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Therneau, Terry M., Ph.D.
Sent: Friday, July 21, 2017 8:04 AM
To: r-devel@r-project.org; Mark Leeds <marklee...@gmail.com>; 
jorism...@gmail.com; westra.harm...@outlook.com
Subject: Re: [Rd] Wrongly converging glm()

I'm chiming in late since I read the news in digest form, and I won't copy the 
entire conversation to date.

The issue raised comes up quite often in Cox models, so often that the Therneau 
and Grambsch book has a section on the issue (Section 3.5, p. 58). After a few 
initial iterations the offending coefficient will increase by a constant at 
each iteration while the log-likelihood approaches an asymptote (essentially 
once the other coefficients "settle down").

The coxph routine tries to detect this case and print a warning, and this turns 
out to be very hard to do accurately. I worked hard at tuning the threshold(s) 
for the message several years ago and finally gave up; I am guessing that the 
warning misses > 5% of the cases where the issue is real, and that 5% of the 
warnings that do print are incorrect. (And these estimates may be too 
optimistic.) Highly correlated predictors tend to trip it up, e.g., the 
truncated power spline basis used by the rcs function in Hmisc.

All in all, I am not completely sure whether the message does more harm than 
good.  I'd be quite reluctant to go down the same path again with the glm 
function.

Terry Therneau
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

