That's consistent/not surprising if the problem lies in the numerical gradient calculation step ...
On 2019-05-06 10:06 a.m., Ravi Varadhan wrote: > Optim's Nelder-Mead works correctly for this example. > > >> optim(par=10, fn=fn, method="Nelder-Mead") > x=10, ret=100.02 (memory) > x=11, ret=121 (calculate) > x=9, ret=81 (calculate) > x=8, ret=64 (calculate) > x=6, ret=36 (calculate) > x=4, ret=16 (calculate) > x=0, ret=0 (calculate) > x=-4, ret=16 (calculate) > x=-4, ret=16 (memory) > x=2, ret=4 (calculate) > x=-2, ret=4 (calculate) > x=1, ret=1 (calculate) > x=-1, ret=1 (calculate) > x=0.5, ret=0.25 (calculate) > x=-0.5, ret=0.25 (calculate) > x=0.25, ret=0.0625 (calculate) > x=-0.25, ret=0.0625 (calculate) > x=0.125, ret=0.015625 (calculate) > x=-0.125, ret=0.015625 (calculate) > x=0.0625, ret=0.00390625 (calculate) > x=-0.0625, ret=0.00390625 (calculate) > x=0.03125, ret=0.0009765625 (calculate) > x=-0.03125, ret=0.0009765625 (calculate) > x=0.015625, ret=0.0002441406 (calculate) > x=-0.015625, ret=0.0002441406 (calculate) > x=0.0078125, ret=6.103516e-05 (calculate) > x=-0.0078125, ret=6.103516e-05 (calculate) > x=0.00390625, ret=1.525879e-05 (calculate) > x=-0.00390625, ret=1.525879e-05 (calculate) > x=0.001953125, ret=3.814697e-06 (calculate) > x=-0.001953125, ret=3.814697e-06 (calculate) > x=0.0009765625, ret=9.536743e-07 (calculate) > $par > [1] 0 > > $value > [1] 0 > > $counts > function gradient > 32 NA > > $convergence > [1] 0 > > $message > NULL > > > > > ________________________________ > From: R-devel <r-devel-boun...@r-project.org> on behalf of Duncan Murdoch > <murdoch.dun...@gmail.com> > Sent: Friday, May 3, 2019 8:18:44 AM > To: peter dalgaard > Cc: Florian Gerber; r-devel@r-project.org > Subject: Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when > working with parent environments > > > It looks as though this happens when calculating numerical gradients: x > is reduced by eps, and fn is called; then x is increased by eps, and fn > is called again. No check is made that x has other references after the > first call to fn. > > I'll put together a patch if nobody else gets there first... > > Duncan Murdoch > > On 03/05/2019 7:13 a.m., peter dalgaard wrote: >> Yes, I think you are right. I was at first confused by the fact that after >> the optim() call, >> >>> environment(fn)$xx >> [1] 10 >>> environment(fn)$ret >> [1] 100.02 >> >> so not 9.999, but this could come from x being assigned the final value >> without calling fn. >> >> -pd >> >> >>> On 3 May 2019, at 11:58 , Duncan Murdoch <murdoch.dun...@gmail.com> wrote: >>> >>> Your results below make it look like a bug in optim(): it is not >>> duplicating a value when it should, so changes to x affect xx as well. >>> >>> Duncan Murdoch >>> >>> On 03/05/2019 4:41 a.m., Serguei Sokol wrote: >>>> On 03/05/2019 10:31, Serguei Sokol wrote: >>>>> On 02/05/2019 21:35, Florian Gerber wrote: >>>>>> Dear all, >>>>>> >>>>>> when using optim() for a function that uses the parent environment, I >>>>>> see the following unexpected behavior: >>>>>> >>>>>> makeFn <- function(){ >>>>>> xx <- ret <- NA >>>>>> fn <- function(x){ >>>>>> if(!is.na(xx) && x==xx){ >>>>>> cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="") >>>>>> return(ret) >>>>>> } >>>>>> xx <<- x; ret <<- sum(x^2) >>>>>> cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="") >>>>>> ret >>>>>> } >>>>>> fn >>>>>> } >>>>>> fn <- makeFn() >>>>>> optim(par=10, fn=fn, method="L-BFGS-B") >>>>>> # x=10, ret=100 (calculate) >>>>>> # x=10.001, ret=100.02 (calculate) >>>>>> # x=9.999, ret=100.02 (memory) >>>>>> # $par >>>>>> # [1] 10 >>>>>> # >>>>>> # $value >>>>>> # [1] 100 >>>>>> # (...) >>>>>> >>>>>> I would expect that optim() does more than 3 function evaluations and >>>>>> that the optimization converges to 0. >>>>>> >>>>>> Same problem with optim(par=10, fn=fn, method="BFGS"). >>>>>> >>>>>> Any ideas? >>>>> I don't have an answer but may be an insight. For some mysterious >>>>> reason xx is getting changed when in should not. Consider: >>>>>> fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in >>>>> x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx >>>>> <<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) >>>>>> optim(par=10, fn=fn, method="L-BFGS-B") >>>>> 1 in x,xx,ret= 10 NA NA >>>>> out x,xx,ret= 10 10 100 >>>>> 2 in x,xx,ret= 10.001 10 100 >>>>> out x,xx,ret= 10.001 10.001 100.02 >>>>> 3 in x,xx,ret= 9.999 9.999 100.02 >>>>> $par >>>>> [1] 10 >>>>> >>>>> $value >>>>> [1] 100 >>>>> >>>>> $counts >>>>> function gradient >>>>> 1 1 >>>>> >>>>> $convergence >>>>> [1] 0 >>>>> >>>>> $message >>>>> [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" >>>>> >>>>> At the third call, xx has value 9.999 while it should have kept the >>>>> value 10.001. >>>>> >>>> A little follow-up: if you untie the link between xx and x by replacing >>>> the expression "xx <<- x" by "xx <<- x+0" it works as expected: >>>> > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in >>>> x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- >>>> x+0; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) >>>> > optim(par=10, fn=fn, method="L-BFGS-B") >>>> 1 in x,xx,ret= 10 NA NA >>>> out x,xx,ret= 10 10 100 >>>> 2 in x,xx,ret= 10.001 10 100 >>>> out x,xx,ret= 10.001 10.001 100.02 >>>> 3 in x,xx,ret= 9.999 10.001 100.02 >>>> out x,xx,ret= 9.999 9.999 99.98 >>>> 4 in x,xx,ret= 9 9.999 99.98 >>>> out x,xx,ret= 9 9 81 >>>> 5 in x,xx,ret= 9.001 9 81 >>>> out x,xx,ret= 9.001 9.001 81.018 >>>> 6 in x,xx,ret= 8.999 9.001 81.018 >>>> out x,xx,ret= 8.999 8.999 80.982 >>>> 7 in x,xx,ret= 1.776357e-11 8.999 80.982 >>>> out x,xx,ret= 1.776357e-11 1.776357e-11 3.155444e-22 >>>> 8 in x,xx,ret= 0.001 1.776357e-11 3.155444e-22 >>>> out x,xx,ret= 0.001 0.001 1e-06 >>>> 9 in x,xx,ret= -0.001 0.001 1e-06 >>>> out x,xx,ret= -0.001 -0.001 1e-06 >>>> 10 in x,xx,ret= -1.334475e-23 -0.001 1e-06 >>>> out x,xx,ret= -1.334475e-23 -1.334475e-23 1.780823e-46 >>>> 11 in x,xx,ret= 0.001 -1.334475e-23 1.780823e-46 >>>> out x,xx,ret= 0.001 0.001 1e-06 >>>> 12 in x,xx,ret= -0.001 0.001 1e-06 >>>> out x,xx,ret= -0.001 -0.001 1e-06 >>>> $par >>>> [1] -1.334475e-23 >>>> $value >>>> [1] 1.780823e-46 >>>> $counts >>>> function gradient >>>> 4 4 >>>> $convergence >>>> [1] 0 >>>> $message >>>> [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" >>>> Serguei. >>>> ______________________________________________ >>>> R-devel@r-project.org mailing list >>>> https://stat.ethz.ch/mailman/listinfo/r-devel >>>> >>> >>> ______________________________________________ >>> R-devel@r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-devel >> > > ______________________________________________ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel