As indicated, if optimizers checked Hessians on every occasion, R would
enrich all the computer manufacturers. In this case the problem is not
too large, so the check is worth doing.
However, for this problem the Hessian is being evaluated by numerical
approximation of the second partial derivatives, so the computed Hessian
may be almost a fiction relative to the analytic Hessian. I've seen
plenty of Hessian approximations that were not positive definite even
though the answers were OK.
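
As a minimal sketch of that kind of cross-check (not from the original
posts -- 'negloglik' and 'start' are placeholders for the poster's
objective and starting values, and numDeriv is assumed to be installed):

library(numDeriv)

opt <- optim(start, negloglik, method = "BFGS", hessian = TRUE)

## Hessian as optim approximates it (finite differences of the
## finite-difference gradient when no analytic gradient is supplied)
eigen(opt$hessian, symmetric = TRUE, only.values = TRUE)$values

## An independent numerical Hessian at the same point, for comparison;
## numDeriv uses Richardson extrapolation and is usually more accurate
H2 <- hessian(negloglik, opt$par)
eigen(H2, symmetric = TRUE, only.values = TRUE)$values

If the two sets of eigenvalues disagree near zero, the indefiniteness may
be an artifact of the approximation rather than of the solution.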
That Inf is allowed does not mean that it is recommended. R is very
tolerant of many things that are not generally good ideas. That can be
helpful for some computations, but still cause trouble. It seems that it
is not the problem here.
I did not look at all the results for this problem from optimx, but it
appeared that several results were lower than the optim(BFGS) one. Are
any of the optimx results acceptable? Note that optimx DOES offer to
check the KKT conditions, and defaults to doing so unless the problem is
large. That was included precisely because the optimizers generally
avoid this very expensive computation. But given the range of results
from the optimx answers using "all methods", I'd still want to do a lot
of testing of the results.
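
Sketch only (again with the placeholder 'negloglik' and 'start'): running
every available optimx method with the KKT tests requested explicitly.

library(optimx)

res <- optimx(start, negloglik,
              control = list(all.methods = TRUE, kkt = TRUE))

## one row per method; convcode, kkt1 (gradient test) and kkt2 (curvature
## test) are reported alongside the objective value reached
summary(res, order = value)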
This may be a useful case to point out that nonlinear optimization is
not a calculation that should be taken for granted. It is much less
reliable than most users think. I rarely find ANY problem for which all
the optimx methods return the same answer. You really do need to look at
the answers and make sure that they are meaningful.
JN
On 13-12-17 11:32 AM, Adelchi Azzalini wrote:
On Tue, 17 Dec 2013 08:27:36 -0500, Prof J C Nash (U30A) wrote:
PJCN> If you run all methods in package optimx, you will see results
PJCN> all over the western hemisphere. I suspect a problem with some
PJCN> nasty computational issues. Possibly the replacement of the
PJCN> function with Inf when any eigenvalues < 0 or nu < 0 is one
PJCN> source of this.
A value Inf is allowed, as indicated in this passage from the
documentation of optim:
Function fn can return NA or Inf if the function cannot be evaluated
at the supplied value, but the initial value must have a computable
finite value of fn.
Incidentally, the documentation of optimx includes the same sentence.
However, this aspect is not crucial anyway, since the point selected by
optim is within the feasible space (by a good margin), and evaluation of
the Hessian matrix occurs at this point.
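
As a toy illustration of the documented behaviour quoted above (this is
not the original likelihood), an objective may return Inf where it is
undefined as long as the starting value is finite:

f <- function(p) {
  if (p[2] <= 0) return(Inf)        # e.g. a scale parameter must be > 0
  -sum(dnorm(c(-1, 0, 2), mean = p[1], sd = p[2], log = TRUE))
}
optim(c(0, 1), f, method = "BFGS")  # the start c(0, 1) is feasible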
PJCN>
PJCN> Note that Hessian eigenvalues are not used to determine
PJCN> convergence in optimization methods. If they did, nobody who was
PJCN> under 100 would ever get promoted from junior lecturer if they
PJCN> needed to do this, because determining the Hessian from just the
PJCN> function requires two levels of approximate derivatives.
At the end of the optimization process, when a point is going to be
declared a minimum point, I expect that an optimizer checks that it
really *is* a minimum. It may do this in ways other than computing the
eigenvalues, but it must be done somehow. Actually, I first realized
the problem by attempting inversion (to get standard errors) under the
assumption of positive definiteness, and it failed. For instance,
mnormt:::pd.solve(opt$hessian)
says "x appears to be not positive definite". This check does not
involve a further level of approximation.
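
For completeness, two checks that can be applied directly to the Hessian
already returned by optim (using the same hypothetical 'opt' as above),
neither of which adds another layer of numerical differentiation:

ev <- eigen(opt$hessian, symmetric = TRUE, only.values = TRUE)$values
all(ev > 0)                          # TRUE only if positive definite

## Cholesky factorization succeeds if and only if the matrix is
## (numerically) positive definite
ok <- !inherits(try(chol(opt$hessian), silent = TRUE), "try-error")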
PJCN>
PJCN> If you want to get this problem reliably solved, I think you will
PJCN> need to
PJCN> 1) sort out a way to avoid the Inf values -- can you constrain
PJCN> the parameters away from such areas, or at least not use Inf.
PJCN> This messes up the gradient computation and hence the optimizers,
PJCN> and also the final Hessian.
PJCN> 2) work out an analytic gradient function.
PJCN>
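
A minimal sketch of what point (1) could look like, using the same
placeholder 'negloglik' and 'start' as before and assuming (purely for
illustration) that the second parameter is the one that must stay
positive: keep the optimizer away from the undefined region with box
constraints instead of returning Inf.

optim(start, negloglik, method = "L-BFGS-B",
      lower = c(-Inf, 1e-8), hessian = TRUE)

## nlminb accepts the same kind of bounds
nlminb(start, negloglik, lower = c(-Inf, 1e-8))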
In my earlier message, I indicated that this is a simplified version
of the real thing, which is function mst.mle of pkg 'sn'. What mst.mle
does is exactly what you indicated: it re-parameterizes the problem so
that we always stay within the feasible region, and it works with an
analytic gradient function (of the transformed parameters). The final
outcome is the same: we land on the same point.
However, once the (supposed) point of minimum has been found, the
Hessian matrix must be computed on the original parameterization,
to get standard errors.
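
A sketch of that last step, with made-up parameter names: optimize over a
transformed parameter (here the log of a positive quantity) so the search
stays feasible, then compute the Hessian of the original parameterization
at the solution to obtain standard errors.

library(numDeriv)

nll_orig  <- function(p) negloglik(p)                   # p = c(mu, sigma)
nll_trans <- function(q) negloglik(c(q[1], exp(q[2])))  # q = c(mu, log(sigma))

fit   <- optim(c(start[1], log(start[2])), nll_trans, method = "BFGS")
p_hat <- c(fit$par[1], exp(fit$par[2]))                 # back-transform

H  <- hessian(nll_orig, p_hat)     # Hessian in the original parameters
se <- sqrt(diag(solve(H)))         # standard errors, provided H is pos. def.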
Adelchi Azzalini
PJCN>
PJCN>
PJCN> > Date: Mon, 16 Dec 2013 16:09:46 +0100
PJCN> > From: Adelchi Azzalini <azzal...@stat.unipd.it>
PJCN> > To: r-help@r-project.org
PJCN> > Subject: [R] convergence=0 in optim and nlminb is real?
PJCN> > Message-ID:
PJCN> > <20131216160946.91858ff279db26bd65e18...@stat.unipd.it>
PJCN> > Content-Type: text/plain; charset=US-ASCII
PJCN> >
PJCN> > It must be the case that this issue has already been raised
PJCN> > before, but I did not manage to find it in past postings.
PJCN> >
PJCN> > In some cases, optim() and nlminb() declare a successful
PJCN> > convergence, but the corresponding Hessian is not
PJCN> > positive-definite. A simplified version of the original
PJCN> > problem is given in the code which for readability is placed
PJCN> > below this text. The example is built making use of package
PJCN> > 'sn', but this is only required to set up the example: the
PJCN> > question is about the outcome of the optimizers. At the end of
PJCN> > the run, a certain point is declared to correspond to a minimum
PJCN> > since 'convergence=0' is reported, but the eigenvalues of the
PJCN> > (numerically evaluated) Hessian matrix at that point are not
PJCN> > all positive.
PJCN> >
PJCN> > Any views on the cause of the problem? (i) the point does not
PJCN> > correspond to a real minimum, (ii) it does give a minimum but
PJCN> > the Hessian matrix is wrong, (iii) the eigenvalues are not
PJCN> > right. ...and, in case, how to get the real solution.
PJCN> >
PJCN> >
PJCN> > Adelchi Azzalini
PJCN>