Hi, I am ok with setting abs.tol=0. Here is an nlminb.patch that has this. There is just one line of code that has been added:
control$abs.tol <- 0 I have commented where this change happens. I am sorry if I was not being clear. I just wanted to have the authors to also have a look at the source of the problem. Regards, Ravi. ____________________________________________________________________ Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu ----- Original Message ----- From: Duncan Murdoch <murdoch.dun...@gmail.com> Date: Sunday, July 11, 2010 7:49 am Subject: Re: [R] Not nice behaviour of nlminb (windows 32 bit, version 2.11.1) To: Matthew Killeya <matthewkill...@googlemail.com> Cc: Ravi Varadhan <rvarad...@jhmi.edu>, Peter Ehlers <ehl...@ucalgary.ca>, r-help@r-project.org, ba...@stat.wisc.edu > On 11/07/2010 5:00 AM, Matthew Killeya wrote: > >Thanks. Seems to me the easiest sensible fix might be to change the > >default abs.tol=0 in R and add a warning in the help files? > > > > That was exactly my suggestion in the message to which Ravi was > replying, but he apparently has doubts. > > Duncan Murdoch > >Matt > > > >On 11 July 2010 01:41, Duncan Murdoch <murdoch.dun...@gmail.com> wrote: > > > > > >>On 10/07/2010 7:32 PM, Ravi Varadhan wrote: > >> > >> > >>>Hi, > >>> > >>>The best solution would be to identify where the problem is in the > FORTRAN > >>>code and correct it. However, this problem of premature > termination due to > >>>absolute function convergence is highly unlikely to occur in > practice. As > >>>John Nash noted, this is going to be highly unlikely for multi-dimensional > >>>parameters (it is also unlikely for one-dimensional problem). However, > >>>unless we understand the source of the problem, we cannot feel comfortable > >>>in saying with absolute certainty that this will not occur for n > > 1. > >>> Therefore, I would suggest that either we fix the problem at its > source or > >>>we set abs.tol=0, since there is little harm in doing so. > >>> > >>> > >>> > >>> > >>Just for future reference: that's not the kind of answer that > leads to > >>anything getting done. So I'll leave it to the authors of nlminb. > >> > >>Duncan Murdoch > >> > >> Ravi. > >> > >>>____________________________________________________________________ > >>> > >>>Ravi Varadhan, Ph.D. > >>>Assistant Professor, > >>>Division of Geriatric Medicine and Gerontology > >>>School of Medicine > >>>Johns Hopkins University > >>> > >>>Ph. (410) 502-2619 > >>>email: rvarad...@jhmi.edu > >>> > >>> > >>>----- Original Message ----- > >>>From: Duncan Murdoch <murdoch.dun...@gmail.com> > >>>Date: Saturday, July 10, 2010 7:32 am > >>>Subject: Re: [R] Not nice behaviour of nlminb (windows 32 bit, version > >>>2.11.1) > >>>To: Ravi Varadhan <rvarad...@jhmi.edu> > >>>Cc: Matthew Killeya <matthewkill...@googlemail.com>, Peter Ehlers > < > >>>ehl...@ucalgary.ca>, r-help@r-project.org, ba...@stat.wisc.edu > >>> > >>> > >>> > >>> > >>> > >>>>Ravi Varadhan wrote: > >>>> >Hi, > >>>> > > >>>> >The absolute function stopping criterion is not meant for any positive > >>>>objective function. It is meant for functions whose minimum is > 0. Here is > >>>>what David Gay's documentation from PORT says: > >>>> > > >>>> >"6 - absolute function convergence: |f (x)| < V(AFCTOL) = > V(31). This > >>>>test is only of interest in > >>>> >problems where f (x) = 0 means a ‘‘perfect fit’’, such as nonlinear > >>>>least-squares problems." > >>>> > Okay, I've taken a more careful look at the docs, and they > do say > >>>>that the return code 6 does not necessarily indicate convergence: > "The > >>>>desirable return codes are 3, 4, 5, and sometimes 6". So we > shouldn't by > >>>>default terminate on it, we should allow users to choose that if > they want > >>>>faster convergence to perfect fits. > >>>> Would changing the default for abs.tol to zero be a reasonable solution? > >>>> Duncan Murdoch > >>>> >For example, let us try a positive objective function: > >>>> > > >>>> > >>nlminb( obj = function(x) x^2 + 1, start=1, lower=-Inf, upper=Inf, > >>>>control=list(trace=TRUE)) > 0: 2.0000000: 1.00000 > >>>> > 1: 1.0000000: 0.00000 > >>>> > 2: 1.0000000: 0.00000 > >>>> >$par > >>>> >[1] 0 > >>>> > > >>>> >$objective > >>>> >[1] 1 > >>>> > > >>>> >$convergence > >>>> >[1] 0 > >>>> > > >>>> >$message > >>>> >[1] "relative convergence (4)" > >>>> > > >>>> >$iterations > >>>> >[1] 2 > >>>> > > >>>> >$evaluations > >>>> >function gradient 3 2 > > >>>> >Here the absolute function criterion does not kicks in. > > >>>> >Now let us try a function whose minimum value is 0. > >>>> > > >>>> > >>nlminb( obj = function(x) x^2, start=6, grad=function(x) 2*x, > >>>>lower=-Inf, upper=Inf, control=list(trace=TRUE) ) > >>>> >> > 0: 36.000000: 6.00000 > >>>> > 1: 4.0000000: 2.00000 > >>>> > 2: 4.9303807e-32: 2.22045e-16 > >>>> >$par > >>>> >[1] 2.220446e-16 > >>>> > > >>>> >$objective > >>>> >[1] 4.930381e-32 > >>>> > > >>>> >$convergence > >>>> >[1] 0 > >>>> > > >>>> >$message > >>>> >[1] "absolute function convergence (6)" > >>>> > > >>>> >$iterations > >>>> >[1] 2 > >>>> > > >>>> >$evaluations > >>>> >function gradient 4 3 >We see that convergence is > >>>>attained and that the stoppage is due to absolute function criterion. > >>>> > >>>>>Suppose, we now set abs.tol=0: > >>>>> > >>>> > > >>>> > >>nlminb( obj = function(x) x^2, start=6, grad=function(x) 2*x, > >>>>lower=-Inf, upper=Inf, control=list(trace=TRUE, abs.tol=0) ) > >>>> >> > 0: 36.000000: 6.00000 > >>>> > 1: 4.0000000: 2.00000 > >>>> > 2: 4.9303807e-32: 2.22045e-16 > >>>> > 3: 2.4308653e-63: -4.93038e-32 > >>>> > 4: 2.9962729e-95: -5.47382e-48 > >>>> > 5:1.4772766e-126: 1.21543e-63 > >>>> > 6:1.8208840e-158: 1.34940e-79 > >>>> > 7:8.9776511e-190: -2.99627e-95 > >>>> > 8:1.1065809e-221: -3.32653e-111 > >>>> > 9:5.4558652e-253: 7.38638e-127 > >>>> > 10:6.7248731e-285: 8.20053e-143 > >>>> > 11:3.3156184e-316: -1.82088e-158 > >>>> > 12: 0.0000000: -2.02159e-174 > >>>> > 13: 0.0000000: -2.02159e-174 > >>>> >$par > >>>> >[1] -2.021587e-174 > >>>> > > >>>> >$objective > >>>> >[1] 0 > >>>> > > >>>> >$convergence > >>>> >[1] 0 > >>>> > > >>>> >$message > >>>> >[1] "X-convergence (3)" > >>>> > > >>>> >$iterations > >>>> >[1] 13 > >>>> > > >>>> >$evaluations > >>>> >function gradient 15 13 > Now, we see that it > takes a > >>>>while to stop, eventhough it is clear that convergence has been attained > >>>>after 2 iterations. This demonstrates the need for the absolute > function > >>>>criterion for obj functions whose minimum is exactly 0. > Although, there is > >>>>nothing wrong with setting abs.tol=0, except for some loss of > computational > >>>>efficiency. >Now, let us get back to Matthew' example: > >>>> > > >>>> > >>nlminb( obj = function(x) x, start=1, lower=-2, upper=2, > >>>>control=list(trace=TRUE)) > 0: 1.0000000: 1.00000 > >>>> > 1: 0.0000000: 0.00000 > >>>> >$par > >>>> >[1] 0 > >>>> > > >>>> >$objective > >>>> >[1] 0 > >>>> > > >>>> >$convergence > >>>> >[1] 0 > >>>> > > >>>> >$message > >>>> >[1] "absolute function convergence (6)" > >>>> > > >>>> >$iterations > >>>> >[1] 1 > >>>> > > >>>> >$evaluations > >>>> >function gradient 2 2 > >>nlminb( obj = > function(x) x, > >>>>start=1, lower=-2, upper=2, control=list(trace=TRUE, abs.tol=0)) > > 0: > >>>> 1.0000000: 1.00000 > >>>> > 1: 0.0000000: 0.00000 > >>>> > 2: -2.0000000: -2.00000 > >>>> > 3: -2.0000000: -2.00000 > >>>> >$par > >>>> >[1] -2 > >>>> > > >>>> >$objective > >>>> >[1] -2 > >>>> > > >>>> >$convergence > >>>> >[1] 0 > >>>> > > >>>> >$message > >>>> >[1] "both X-convergence and relative convergence (5)" > >>>> > > >>>> >$iterations > >>>> >[1] 3 > >>>> > > >>>> >$evaluations > >>>> >function gradient 3 3 > > >>>> >Thus it is evident that setting abs.tol=0 is a reasonable, general > >>>>solution for functions whose minimum value is non-zero, because > it protects > >>>>against premature termination at iteration `n' whenever |f(x_n)| > < abs.tol. > >>>> The only limitation being that of loss of efficiency in problems > where > >>>>f(x*) = 0. where x* is the local minimum. > >>>> > > >>>> >Ravi. > >>>> >____________________________________________________________________ > >>>> > > >>>> >Ravi Varadhan, Ph.D. > >>>> >Assistant Professor, > >>>> >Division of Geriatric Medicine and Gerontology > >>>> >School of Medicine > >>>> >Johns Hopkins University > >>>> > > >>>> >Ph. (410) 502-2619 > >>>> >email: rvarad...@jhmi.edu > >>>> > > >>>> > > >>>> >----- Original Message ----- > >>>> >From: Duncan Murdoch <murdoch.dun...@gmail.com> > >>>> >Date: Friday, July 9, 2010 6:54 pm > >>>> >Subject: Re: [R] Not nice behaviour of nlminb (windows 32 bit, > version > >>>>2.11.1) > >>>> >To: Matthew Killeya <matthewkill...@googlemail.com> > >>>> >Cc: Peter Ehlers <ehl...@ucalgary.ca>, Ravi Varadhan < > >>>>rvarad...@jhmi.edu>, r-help@r-project.org, ba...@stat.wisc.edu > >>>> > > >>>> > > >>>> > >>On 09/07/2010 6:09 PM, Matthew Killeya wrote: > >>>> >> >Yes clearly a bug... there are numerous variations ... > problem seems > >>>>to be > >>>> >> >for a linear function whenever the first function valuation > is 1. > >>>> >> > Not at all. You can get the same problem on a quadratic > that > >>>>happens to have a zero at an inconvenient place, e.g. > >>>> >> nlminb( obj = function(x) x^2-25, start=6, lower=-Inf, > upper=Inf ) > >>>> >> Ravi's workaround of setting the abs.tol to zero fixes this > example, > >>>>but I think it's pretty clear from the documentation that the > whole thing > >>>>was designed for positive objective functions, so I wouldn't > count on his > >>>>workaround solving all the problems. > >>>> >> Duncan Murdoch > >>>> >> >e.g. two more examples: > >>>> >> > nlminb( obj = function(x) x+1, start=0, lower=-Inf, > upper=Inf ) > >>>> >> > nlminb( obj = function(x) x+2, start=-1, lower=-Inf, > upper=Inf ) > >>>> >> > > >>>> >> >(I wasn't sure where best to report a bug, so emailed the > help list) > >>>> >> > > >>>> >> >On 9 July 2010 22:10, Peter Ehlers <ehl...@ucalgary.ca> wrote: > >>>> >> > > >>>> >> > >>Actually, it looks like any value other than 1.0 > >>>> >> >>(and in (lower, upper)) for start will work. > >>>> >> >> > >>>> >> >> -Peter Ehlers > >>>> >> >> > >>>> >> >> > >>>> >> >>On 2010-07-09 14:45, Ravi Varadhan wrote: > >>>> >> >> > >>>> >> >> >>>Setting abs.tol = 0 works! This turns-off the absolute > >>>>function > >>>> >> >>>convergence > >>>> >> >>>criterion. > >>>> >> >>> > >>>> >> >>> > >>>> >> >>> nlminb( objective=function(x) x, start=1, lower=-2, upper=2, > >>>> >> >>> control=list(abs.tol=0)) > >>>> >> >>>$par > >>>> >> >>>[1] -2 > >>>> >> >>> > >>>> >> >>>$objective > >>>> >> >>>[1] -2 > >>>> >> >>> > >>>> >> >>>$convergence > >>>> >> >>>[1] 0 > >>>> >> >>> > >>>> >> >>>$message > >>>> >> >>>[1] "both X-convergence and relative convergence (5)" > >>>> >> >>> > >>>> >> >>>$iterations > >>>> >> >>>[1] 3 > >>>> >> >>> > >>>> >> >>>$evaluations > >>>> >> >>>function gradient > >>>> >> >>> 3 3 > >>>> >> >>> > >>>> >> >>> > >>>> >> >>>This is clearly a bug. > >>>> >> >>> > >>>> >> >>> > >>>> >> >>>Ravi. > >>>> >> >>> > >>>> >> >>>-----Original Message----- > >>>> >> >>>From: r-help-boun...@r-project.org [ > >>>> >> >>>On > >>>> >> >>>Behalf Of Ravi Varadhan > >>>> >> >>>Sent: Friday, July 09, 2010 4:42 PM > >>>> >> >>>To: 'Duncan Murdoch'; 'Matthew Killeya' > >>>> >> >>>Cc: r-help@r-project.org; ba...@stat.wisc.edu > >>>> >> >>>Subject: Re: [R] Not nice behaviour of nlminb (windows 32 > bit, > >>>>version > >>>> >> >>>2.11.1) > >>>> >> >>> > >>>> >> >>>Duncan, `nlminb' is not intended for non-negative > functions only. > >>>> There > >>>> >> >>>is > >>>> >> >>>indeed something strange happening in the algorithm! > >>>> >> >>> > >>>> >> >>>start<- 1.0 # converges to wrong minimum > >>>> >> >>> > >>>> >> >>>startp<- 1.0 + .Machine$double.eps # correct > >>>> >> >>> > >>>> >> >>>startm<- 1.0 - .Machine$double.eps # correct > >>>> >> >>> > >>>> >> >>> nlminb( objective=obj, start=start, lower=-2, upper=2) > >>>> >> >>> $par > >>>> >> >>>[1] 0 > >>>> >> >>> > >>>> >> >>>$objective > >>>> >> >>>[1] 0 > >>>> >> >>> > >>>> >> >>>$convergence > >>>> >> >>>[1] 0 > >>>> >> >>> > >>>> >> >>>$message > >>>> >> >>>[1] "absolute function convergence (6)" > >>>> >> >>> > >>>> >> >>>$iterations > >>>> >> >>>[1] 1 > >>>> >> >>> > >>>> >> >>>$evaluations > >>>> >> >>>function gradient > >>>> >> >>> 2 2 > >>>> >> >>> > >>>> >> >>> > >>>> >> >>> >>>>nlminb( objective=obj, start=startp, lower=-2, > upper=2) > >>>> >> >>>> > >>>> >> >>>> >>>$par > >>>> >> >>>[1] -2 > >>>> >> >>> > >>>> >> >>>$objective > >>>> >> >>>[1] -2 > >>>> >> >>> > >>>> >> >>>$convergence > >>>> >> >>>[1] 0 > >>>> >> >>> > >>>> >> >>>$message > >>>> >> >>>[1] "both X-convergence and relative convergence (5)" > >>>> >> >>> > >>>> >> >>>$iterations > >>>> >> >>>[1] 3 > >>>> >> >>> > >>>> >> >>>$evaluations > >>>> >> >>>function gradient > >>>> >> >>> 3 3 > >>>> >> >>> > >>>> >> >>> > >>>> >> >>> >>>>nlminb( objective=obj, start=startm, lower=-2, > upper=2) > >>>> >> >>>> > >>>> >> >>>> >>>$par > >>>> >> >>>[1] -2 > >>>> >> >>> > >>>> >> >>>$objective > >>>> >> >>>[1] -2 > >>>> >> >>> > >>>> >> >>>$convergence > >>>> >> >>>[1] 0 > >>>> >> >>> > >>>> >> >>>$message > >>>> >> >>>[1] "both X-convergence and relative convergence (5)" > >>>> >> >>> > >>>> >> >>>$iterations > >>>> >> >>>[1] 3 > >>>> >> >>> > >>>> >> >>>$evaluations > >>>> >> >>>function gradient > >>>> >> >>> 3 3 > >>>> >> >>> > >>>> >> >>> > >>>> >> >>> From the convergence message the `absolute function convergence' > >>>>seems to > >>>> >> >>> be > >>>> >> >>>the culprit, although I do not understand why that stopping > >>>>criterion is > >>>> >> >>>becoming effective, when the algorithm is started at x=1, > but not > >>>>at any > >>>> >> >>>other values. The documentation in IPORT makes it clear > that this > >>>> >> >>>criterion > >>>> >> >>>is effective only for functions where f(x*) = 0, where x* > is a > >>>>local > >>>> >> >>>minimum. In this example, x=0 is not a local minimum for > f(x), so > >>>>that > >>>> >> >>>criterion should not apply. > >>>> >> >>> > >>>> >> >>> > >>>> >> >>>Ravi. > >>>> >> >>> > >>>> >> >>> > >>>> >> >>>-----Original Message----- > >>>> >> >>>From: r-help-boun...@r-project.org [ > >>>> >> >>>On > >>>> >> >>>Behalf Of Duncan Murdoch > >>>> >> >>>Sent: Friday, July 09, 2010 3:45 PM > >>>> >> >>>To: Matthew Killeya > >>>> >> >>>Cc: r-help@r-project.org; ba...@stat.wisc.edu > >>>> >> >>>Subject: Re: [R] Not nice behaviour of nlminb (windows 32 > bit, > >>>>version > >>>> >> >>>2.11.1) > >>>> >> >>> > >>>> >> >>>On 09/07/2010 10:37 AM, Matthew Killeya wrote: > >>>> >> >>> > >>>> >> >>> >>>> nlminb( obj = function(x) x, start=1, lower=-Inf, > >>>>upper=Inf ) > >>>> >> >>>> > >>>> >> >>>> > >>>> >> >>>> >>>If you read the PORT documentation carefully, > you'll > >>>>see that their > >>>> >> >>>convergence criteria are aimed at minimizing positive functions. > >>>> (They > >>>> >> >>>never state this explicitly, as far as I can see.) So one > >>>>stopping > >>>> >> >>>criterion is that |f(x)|< abs.tol, and that's what it > found for > >>>>you. I > >>>> >> >>>don't know if there's a way to turn this off. > >>>> >> >>> > >>>> >> >>>Doug or Deepayan, do you know if nlminb can be made to > work on > >>>>functions > >>>> >> >>>that go negative? > >>>> >> >>> > >>>> >> >>>Duncan Murdoch > >>>> >> >>> > >>>> >> >>> $par > >>>> >> >>> >>>>[1] 0 > >>>> >> >>>> > >>>> >> >>>>$objective > >>>> >> >>>>[1] 0 > >>>> >> >>>> > >>>> >> >>>>$convergence > >>>> >> >>>>[1] 0 > >>>> >> >>>> > >>>> >> >>>>$message > >>>> >> >>>>[1] "absolute function convergence (6)" > >>>> >> >>>> > >>>> >> >>>>$iterations > >>>> >> >>>>[1] 1 > >>>> >> >>>> > >>>> >> >>>>$evaluations > >>>> >> >>>>function gradient > >>>> >> >>>> 2 2 > >>>> >> >>>> > >>>> >> >>>> [[alternative HTML version deleted]] > >>>> >> >>>> > >>>> >> >>>> > >>>> >> >>>> > > >>>> >> > > >>>> > >>>> > > > > >
nlminb <- function (start, objective, gradient = NULL, hessian = NULL, ..., scale = 1, control = list(), lower = -Inf, upper = Inf) { par <- as.double(start) names(par) <- names(start) n <- length(par) iv <- integer(78 + 3 * n) v <- double(130 + (n * (n + 27))/2) .Call(R_port_ivset, 2, iv, v) # patch by Ravi Varadhan control$abs.tol <- 0 # if (length(control)) { nms <- names(control) if (!is.list(control) || is.null(nms)) stop("control argument must be a named list") cpos <- c(eval.max = 17, iter.max = 18, trace = 19, abs.tol = 31, rel.tol = 32, x.tol = 33, step.min = 34, step.max = 35, scale.init = 38, sing.tol = 37, diff.g = 42) pos <- pmatch(nms, names(cpos)) if (any(nap <- is.na(pos))) { warning(paste("unrecognized control element(s) named `", paste(nms[nap], collapse = ", "), "' ignored", sep = "")) pos <- pos[!nap] control <- control[!nap] } ivpars <- pos < 4 if (any(ivpars)) iv[cpos[pos[ivpars]]] <- as.integer(unlist(control[ivpars])) if (any(!ivpars)) v[cpos[pos[!ivpars]]] <- as.double(unlist(control[!ivpars])) } obj <- quote(objective(.par, ...)) rho <- new.env(parent = environment()) assign(".par", par, envir = rho) grad <- hess <- low <- upp <- NULL if (!is.null(gradient)) { grad <- quote(gradient(.par, ...)) if (!is.null(hessian)) { if (is.logical(hessian)) stop("Logical `hessian' argument not allowed. See documentation.") hess <- quote(hessian(.par, ...)) } } if (any(lower != -Inf) || any(upper != Inf)) { low <- rep(as.double(lower), length.out = length(par)) upp <- rep(as.double(upper), length.out = length(par)) } else low <- upp <- numeric(0L) .Call(R_port_nlminb, obj, grad, hess, rho, low, upp, d = rep(as.double(scale), length.out = length(par)), iv, v) ans <- list(par = get(".par", envir = rho)) ans$objective <- v[10L] ans$convergence <- as.integer(if (iv[1] %in% 3L:6L) 0L else 1L) ans$message <- if (19 <= iv[1] && iv[1] <= 43) { if (any(B <- iv[1] == cpos)) sprintf("'control' component '%s' = %g, is out of range", names(cpos)[B], v[iv[1]]) else sprintf("V[IV[1]] = V[%d] = %g is out of range (see PORT docu.)", iv[1], v[iv[1]]) } else switch(as.character(iv[1]), `3` = "X-convergence (3)", `4` = "relative convergence (4)", `5` = "both X-convergence and relative convergence (5)", `6` = "absolute function convergence (6)", `7` = "singular convergence (7)", `8` = "false convergence (8)", `9` = "function evaluation limit reached without convergence (9)", `10` = "iteration limit reached without convergence (9)", `14` = "storage has been allocated (?) (14)", `15` = "LIV too small (15)", `16` = "LV too small (16)", `63` = "fn cannot be computed at initial par (63)", `65` = "gr cannot be computed at initial par (65)") if (is.null(ans$message)) ans$message <- paste("See PORT documentation. Code (", iv[1], ")", sep = "") ans$iterations <- iv[31] ans$evaluations <- c(`function` = iv[6], gradient = iv[30]) ans }
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.