[Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
Dear all, when using optim() for a function that uses the parent environment, I see the following unexpected behavior: makeFn <- function(){ xx <- ret <- NA fn <- function(x){ if(!is.na(xx) && x==xx){ cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="") return(ret) } xx <<- x; ret <<- sum(x^2) cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="") ret } fn } fn <- makeFn() optim(par=10, fn=fn, method="L-BFGS-B") # x=10, ret=100 (calculate) # x=10.001, ret=100.02 (calculate) # x=9.999, ret=100.02 (memory) # $par # [1] 10 # # $value # [1] 100 # (...) I would expect that optim() does more than 3 function evaluations and that the optimization converges to 0. Same problem with optim(par=10, fn=fn, method="BFGS"). Any ideas? See also my related post: https://stackoverflow.com/questions/53826521/r-optim-unexpected-behavior-when-working-with-parent-environments platform x86_64-pc-linux-gnu arch x86_64 os linux-gnu system x86_64, linux-gnu status major 3 minor 6.0 year 2019 month 04 day 26 svn rev 76424 language R version.string R version 3.6.0 (2019-04-26) nickname Planting of a Tree Best, Florian __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
On 02/05/2019 21:35, Florian Gerber wrote: Dear all, when using optim() for a function that uses the parent environment, I see the following unexpected behavior: makeFn <- function(){ xx <- ret <- NA fn <- function(x){ if(!is.na(xx) && x==xx){ cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="") return(ret) } xx <<- x; ret <<- sum(x^2) cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="") ret } fn } fn <- makeFn() optim(par=10, fn=fn, method="L-BFGS-B") # x=10, ret=100 (calculate) # x=10.001, ret=100.02 (calculate) # x=9.999, ret=100.02 (memory) # $par # [1] 10 # # $value # [1] 100 # (...) I would expect that optim() does more than 3 function evaluations and that the optimization converges to 0. Same problem with optim(par=10, fn=fn, method="BFGS"). Any ideas? I don't have an answer but may be an insight. For some mysterious reason xx is getting changed when in should not. Consider: > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) > optim(par=10, fn=fn, method="L-BFGS-B") 1 in x,xx,ret= 10 NA NA out x,xx,ret= 10 10 100 2 in x,xx,ret= 10.001 10 100 out x,xx,ret= 10.001 10.001 100.02 3 in x,xx,ret= 9.999 9.999 100.02 $par [1] 10 $value [1] 100 $counts function gradient 1 1 $convergence [1] 0 $message [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" At the third call, xx has value 9.999 while it should have kept the value 10.001. Serguei. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
On 03/05/2019 10:31, Serguei Sokol wrote: On 02/05/2019 21:35, Florian Gerber wrote: Dear all, when using optim() for a function that uses the parent environment, I see the following unexpected behavior: makeFn <- function(){ xx <- ret <- NA fn <- function(x){ if(!is.na(xx) && x==xx){ cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="") return(ret) } xx <<- x; ret <<- sum(x^2) cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="") ret } fn } fn <- makeFn() optim(par=10, fn=fn, method="L-BFGS-B") # x=10, ret=100 (calculate) # x=10.001, ret=100.02 (calculate) # x=9.999, ret=100.02 (memory) # $par # [1] 10 # # $value # [1] 100 # (...) I would expect that optim() does more than 3 function evaluations and that the optimization converges to 0. Same problem with optim(par=10, fn=fn, method="BFGS"). Any ideas? I don't have an answer but may be an insight. For some mysterious reason xx is getting changed when in should not. Consider: > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) > optim(par=10, fn=fn, method="L-BFGS-B") 1 in x,xx,ret= 10 NA NA out x,xx,ret= 10 10 100 2 in x,xx,ret= 10.001 10 100 out x,xx,ret= 10.001 10.001 100.02 3 in x,xx,ret= 9.999 9.999 100.02 $par [1] 10 $value [1] 100 $counts function gradient 1 1 $convergence [1] 0 $message [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" At the third call, xx has value 9.999 while it should have kept the value 10.001. A little follow-up: if you untie the link between xx and x by replacing the expression "xx <<- x" by "xx <<- x+0" it works as expected: > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- x+0; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) > optim(par=10, fn=fn, method="L-BFGS-B") 1 in x,xx,ret= 10 NA NA out x,xx,ret= 10 10 100 2 in x,xx,ret= 10.001 10 100 out x,xx,ret= 10.001 10.001 100.02 3 in x,xx,ret= 9.999 10.001 100.02 out x,xx,ret= 9.999 9.999 99.98 4 in x,xx,ret= 9 9.999 99.98 out x,xx,ret= 9 9 81 5 in x,xx,ret= 9.001 9 81 out x,xx,ret= 9.001 9.001 81.018 6 in x,xx,ret= 8.999 9.001 81.018 out x,xx,ret= 8.999 8.999 80.982 7 in x,xx,ret= 1.776357e-11 8.999 80.982 out x,xx,ret= 1.776357e-11 1.776357e-11 3.155444e-22 8 in x,xx,ret= 0.001 1.776357e-11 3.155444e-22 out x,xx,ret= 0.001 0.001 1e-06 9 in x,xx,ret= -0.001 0.001 1e-06 out x,xx,ret= -0.001 -0.001 1e-06 10 in x,xx,ret= -1.334475e-23 -0.001 1e-06 out x,xx,ret= -1.334475e-23 -1.334475e-23 1.780823e-46 11 in x,xx,ret= 0.001 -1.334475e-23 1.780823e-46 out x,xx,ret= 0.001 0.001 1e-06 12 in x,xx,ret= -0.001 0.001 1e-06 out x,xx,ret= -0.001 -0.001 1e-06 $par [1] -1.334475e-23 $value [1] 1.780823e-46 $counts function gradient 4 4 $convergence [1] 0 $message [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" Serguei. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R problems with lapack with gfortran
Dear Thomas, thank you for your input. I've debugged one of the packages and I confirm that the breakage is related to passing of strings from C to Fortran. Indeed, BLAS and LAPACK define a large number of subroutines that take one or more explicit single-character strings as arguments. Other than that, BLAS has only one function (xerbla), which takes a string of unspecified length, LAPACK only has four (ilaenv, ilaenv2stage, lsamen, xerbla). The C interfaces to BLAS/LAPACK from Netlib depend on the historic behavior that explicit single-character strings are interoperable, concretely CBLAS and LAPACKE provide C interfaces/code that calls into Fortran BLAS/LAPACK without passing the 1s as lengths of the character strings (functions taking a string of unspecified length are trivial and re-implemented in C). This has been working fine for very many years as the Fortran code never needed to access the length it knew was 1. R has been using the same practice, which long predates ISO_C_BINDING/BIND(C), and I've seen online discussions where people assumed interoperability of length 1 strings, once mentioning also a citation from Fortran 2003 Handbook that says "A Fortran character string with a length other than 1 is not interoperable" (which invites interpretation that length 1 strings were ). I am not an expert to say whether the current Fortran standard requires that interoperability and I assume that it does not given this gfortran change. This gfortran change breaks this interoperability: if a C function calls a Fortran function, passing it a single-character string for a parameter taking explicit single-character Fortran string, it may crash. I've debugged one case with R package BDgraph, this example "library(BDgraph); data.sim <- bdgraph.sim( n = 70, p = 5, size = 7, vis = TRUE )" crashes due to corruption of C stack by Fortran function DPOSV, when compiled with the new gfortran and with -O2. To see the problem, one can just look at the disassembly of DPOSV (LAPACK), neither the package nor R is not necessary: SUBROUTINE DPOSV( UPLO, N, NRHS, A, LDA, B, LDB, INFO ) CHARACTER UPLO In one case, DPOSV calls DPOTRS before returning. The new gfortran with -O2 performs tail-call optimization, jumping to DPOTRS. In the annotated disassembly snippet, at 11747f1, DPOSV tries to ensure that there is constant 1 as string length of UPLO when tail-calling into DPOTRS, so it writes it to stack where there already should have been 1 as length of UPLO passed to DPOSV. But this argument has not been passed to DPOSV, so this causes stack corruption. 1174ce: 0f 85 62 ff ff ff jne 117436 <== jump if ERROR CALL DPOTRS( UPLO, N, NRHS, A, LDA, B, LDB, INFO ) 1174d4: 48 8b 04 24 mov (%rsp),%rax <=== rax holds LDB 1174d8: 4c 89 7c 24 68 mov %r15,0x68(%rsp) <=== save INFO to output param 1174dd: 49 89 d8 mov %rbx,%r8 <== pass LDA as LDA 1174e0: 4c 89 e1 mov %r12,%rcx <= pass A as A 1174e3: 4c 8b 4c 24 08 mov 0x8(%rsp),%r9 <= pass B as B 1174e8: 4c 89 ea mov %r13,%rdx <= pass NRHS as NRHS 1174eb: 48 89 ee mov %rbp,%rsi <= pass N as N 1174ee: 4c 89 f7 mov %r14,%rdi <= pass UPLO as UPLO 1174f1: 48 c7 44 24 70 01 00 movq $0x1,0x70(%rsp) <=== pass 1 hidden arg on stack CORRUPTS C STACK 1174f8: 00 00 1174fa: 48 89 44 24 60 mov %rax,0x60(%rsp) <=== pass LDB as LDB (stack) END 1174ff: 48 83 c4 28 add $0x28,%rsp <== remove 5 vars from stack (sframe) 117503: 5b pop %rbx 117504: 5d pop %rbp 117505: 41 5c pop %r12 117507: 41 5d pop %r13 117509: 41 5e pop %r14 11750b: 41 5f pop %r15 <=== restore register to level before call CALL DPOTRS( UPLO, N, NRHS, A, LDA, B, LDB, INFO ) 11750d: e9 de 56 ef ff jmpq cbf0 <=== tail call to dpotrs Note that DPOSV never uses the length of the string (UPLO) from the hidden argument, the compiler clearly knows that its length is 1. In calls where the length is passed in registers, this does not cause trouble (like LSAME) and indeed is needed as the registers have different values IF( .NOT.LSAME( UPLO, 'U' ) .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN 117448: b9 01 00 00 00 mov $0x1,%ecx 11744d: ba 01 00 00 00 mov $0x1,%edx 117452: 48 8d 35 bb 12 09 00 lea 0x912bb(%rip),%rsi # 1a8714 117459: 4c 89 f7 mov %r14,%rdi 11745c: e8 1f 3d ef ff callq b180 bu
Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
Your results below make it look like a bug in optim(): it is not duplicating a value when it should, so changes to x affect xx as well. Duncan Murdoch On 03/05/2019 4:41 a.m., Serguei Sokol wrote: On 03/05/2019 10:31, Serguei Sokol wrote: On 02/05/2019 21:35, Florian Gerber wrote: Dear all, when using optim() for a function that uses the parent environment, I see the following unexpected behavior: makeFn <- function(){ xx <- ret <- NA fn <- function(x){ if(!is.na(xx) && x==xx){ cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="") return(ret) } xx <<- x; ret <<- sum(x^2) cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="") ret } fn } fn <- makeFn() optim(par=10, fn=fn, method="L-BFGS-B") # x=10, ret=100 (calculate) # x=10.001, ret=100.02 (calculate) # x=9.999, ret=100.02 (memory) # $par # [1] 10 # # $value # [1] 100 # (...) I would expect that optim() does more than 3 function evaluations and that the optimization converges to 0. Same problem with optim(par=10, fn=fn, method="BFGS"). Any ideas? I don't have an answer but may be an insight. For some mysterious reason xx is getting changed when in should not. Consider: fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) optim(par=10, fn=fn, method="L-BFGS-B") 1 in x,xx,ret= 10 NA NA out x,xx,ret= 10 10 100 2 in x,xx,ret= 10.001 10 100 out x,xx,ret= 10.001 10.001 100.02 3 in x,xx,ret= 9.999 9.999 100.02 $par [1] 10 $value [1] 100 $counts function gradient 1 1 $convergence [1] 0 $message [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" At the third call, xx has value 9.999 while it should have kept the value 10.001. A little follow-up: if you untie the link between xx and x by replacing the expression "xx <<- x" by "xx <<- x+0" it works as expected: > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- x+0; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) > optim(par=10, fn=fn, method="L-BFGS-B") 1 in x,xx,ret= 10 NA NA out x,xx,ret= 10 10 100 2 in x,xx,ret= 10.001 10 100 out x,xx,ret= 10.001 10.001 100.02 3 in x,xx,ret= 9.999 10.001 100.02 out x,xx,ret= 9.999 9.999 99.98 4 in x,xx,ret= 9 9.999 99.98 out x,xx,ret= 9 9 81 5 in x,xx,ret= 9.001 9 81 out x,xx,ret= 9.001 9.001 81.018 6 in x,xx,ret= 8.999 9.001 81.018 out x,xx,ret= 8.999 8.999 80.982 7 in x,xx,ret= 1.776357e-11 8.999 80.982 out x,xx,ret= 1.776357e-11 1.776357e-11 3.155444e-22 8 in x,xx,ret= 0.001 1.776357e-11 3.155444e-22 out x,xx,ret= 0.001 0.001 1e-06 9 in x,xx,ret= -0.001 0.001 1e-06 out x,xx,ret= -0.001 -0.001 1e-06 10 in x,xx,ret= -1.334475e-23 -0.001 1e-06 out x,xx,ret= -1.334475e-23 -1.334475e-23 1.780823e-46 11 in x,xx,ret= 0.001 -1.334475e-23 1.780823e-46 out x,xx,ret= 0.001 0.001 1e-06 12 in x,xx,ret= -0.001 0.001 1e-06 out x,xx,ret= -0.001 -0.001 1e-06 $par [1] -1.334475e-23 $value [1] 1.780823e-46 $counts function gradient 4 4 $convergence [1] 0 $message [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" Serguei. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
Yes, I think you are right. I was at first confused by the fact that after the optim() call, > environment(fn)$xx [1] 10 > environment(fn)$ret [1] 100.02 so not 9.999, but this could come from x being assigned the final value without calling fn. -pd > On 3 May 2019, at 11:58 , Duncan Murdoch wrote: > > Your results below make it look like a bug in optim(): it is not duplicating > a value when it should, so changes to x affect xx as well. > > Duncan Murdoch > > On 03/05/2019 4:41 a.m., Serguei Sokol wrote: >> On 03/05/2019 10:31, Serguei Sokol wrote: >>> On 02/05/2019 21:35, Florian Gerber wrote: Dear all, when using optim() for a function that uses the parent environment, I see the following unexpected behavior: makeFn <- function(){ xx <- ret <- NA fn <- function(x){ if(!is.na(xx) && x==xx){ cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="") return(ret) } xx <<- x; ret <<- sum(x^2) cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="") ret } fn } fn <- makeFn() optim(par=10, fn=fn, method="L-BFGS-B") # x=10, ret=100 (calculate) # x=10.001, ret=100.02 (calculate) # x=9.999, ret=100.02 (memory) # $par # [1] 10 # # $value # [1] 100 # (...) I would expect that optim() does more than 3 function evaluations and that the optimization converges to 0. Same problem with optim(par=10, fn=fn, method="BFGS"). Any ideas? >>> I don't have an answer but may be an insight. For some mysterious >>> reason xx is getting changed when in should not. Consider: fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in >>> x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx >>> <<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) optim(par=10, fn=fn, method="L-BFGS-B") >>> 1 in x,xx,ret= 10 NA NA >>> out x,xx,ret= 10 10 100 >>> 2 in x,xx,ret= 10.001 10 100 >>> out x,xx,ret= 10.001 10.001 100.02 >>> 3 in x,xx,ret= 9.999 9.999 100.02 >>> $par >>> [1] 10 >>> >>> $value >>> [1] 100 >>> >>> $counts >>> function gradient >>>11 >>> >>> $convergence >>> [1] 0 >>> >>> $message >>> [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" >>> >>> At the third call, xx has value 9.999 while it should have kept the >>> value 10.001. >>> >> A little follow-up: if you untie the link between xx and x by replacing >> the expression "xx <<- x" by "xx <<- x+0" it works as expected: >> > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in >> x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- >> x+0; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) >> > optim(par=10, fn=fn, method="L-BFGS-B") >> 1 in x,xx,ret= 10 NA NA >> out x,xx,ret= 10 10 100 >> 2 in x,xx,ret= 10.001 10 100 >> out x,xx,ret= 10.001 10.001 100.02 >> 3 in x,xx,ret= 9.999 10.001 100.02 >> out x,xx,ret= 9.999 9.999 99.98 >> 4 in x,xx,ret= 9 9.999 99.98 >> out x,xx,ret= 9 9 81 >> 5 in x,xx,ret= 9.001 9 81 >> out x,xx,ret= 9.001 9.001 81.018 >> 6 in x,xx,ret= 8.999 9.001 81.018 >> out x,xx,ret= 8.999 8.999 80.982 >> 7 in x,xx,ret= 1.776357e-11 8.999 80.982 >> out x,xx,ret= 1.776357e-11 1.776357e-11 3.155444e-22 >> 8 in x,xx,ret= 0.001 1.776357e-11 3.155444e-22 >> out x,xx,ret= 0.001 0.001 1e-06 >> 9 in x,xx,ret= -0.001 0.001 1e-06 >> out x,xx,ret= -0.001 -0.001 1e-06 >> 10 in x,xx,ret= -1.334475e-23 -0.001 1e-06 >> out x,xx,ret= -1.334475e-23 -1.334475e-23 1.780823e-46 >> 11 in x,xx,ret= 0.001 -1.334475e-23 1.780823e-46 >> out x,xx,ret= 0.001 0.001 1e-06 >> 12 in x,xx,ret= -0.001 0.001 1e-06 >> out x,xx,ret= -0.001 -0.001 1e-06 >> $par >> [1] -1.334475e-23 >> $value >> [1] 1.780823e-46 >> $counts >> function gradient >> 44 >> $convergence >> [1] 0 >> $message >> [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" >> Serguei. >> __ >> R-devel@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel >> > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
It looks as though this happens when calculating numerical gradients: x is reduced by eps, and fn is called; then x is increased by eps, and fn is called again. No check is made that x has other references after the first call to fn. I'll put together a patch if nobody else gets there first... Duncan Murdoch On 03/05/2019 7:13 a.m., peter dalgaard wrote: Yes, I think you are right. I was at first confused by the fact that after the optim() call, environment(fn)$xx [1] 10 environment(fn)$ret [1] 100.02 so not 9.999, but this could come from x being assigned the final value without calling fn. -pd On 3 May 2019, at 11:58 , Duncan Murdoch wrote: Your results below make it look like a bug in optim(): it is not duplicating a value when it should, so changes to x affect xx as well. Duncan Murdoch On 03/05/2019 4:41 a.m., Serguei Sokol wrote: On 03/05/2019 10:31, Serguei Sokol wrote: On 02/05/2019 21:35, Florian Gerber wrote: Dear all, when using optim() for a function that uses the parent environment, I see the following unexpected behavior: makeFn <- function(){ xx <- ret <- NA fn <- function(x){ if(!is.na(xx) && x==xx){ cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="") return(ret) } xx <<- x; ret <<- sum(x^2) cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="") ret } fn } fn <- makeFn() optim(par=10, fn=fn, method="L-BFGS-B") # x=10, ret=100 (calculate) # x=10.001, ret=100.02 (calculate) # x=9.999, ret=100.02 (memory) # $par # [1] 10 # # $value # [1] 100 # (...) I would expect that optim() does more than 3 function evaluations and that the optimization converges to 0. Same problem with optim(par=10, fn=fn, method="BFGS"). Any ideas? I don't have an answer but may be an insight. For some mysterious reason xx is getting changed when in should not. Consider: fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) optim(par=10, fn=fn, method="L-BFGS-B") 1 in x,xx,ret= 10 NA NA out x,xx,ret= 10 10 100 2 in x,xx,ret= 10.001 10 100 out x,xx,ret= 10.001 10.001 100.02 3 in x,xx,ret= 9.999 9.999 100.02 $par [1] 10 $value [1] 100 $counts function gradient 11 $convergence [1] 0 $message [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" At the third call, xx has value 9.999 while it should have kept the value 10.001. A little follow-up: if you untie the link between xx and x by replacing the expression "xx <<- x" by "xx <<- x+0" it works as expected: > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- x+0; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) > optim(par=10, fn=fn, method="L-BFGS-B") 1 in x,xx,ret= 10 NA NA out x,xx,ret= 10 10 100 2 in x,xx,ret= 10.001 10 100 out x,xx,ret= 10.001 10.001 100.02 3 in x,xx,ret= 9.999 10.001 100.02 out x,xx,ret= 9.999 9.999 99.98 4 in x,xx,ret= 9 9.999 99.98 out x,xx,ret= 9 9 81 5 in x,xx,ret= 9.001 9 81 out x,xx,ret= 9.001 9.001 81.018 6 in x,xx,ret= 8.999 9.001 81.018 out x,xx,ret= 8.999 8.999 80.982 7 in x,xx,ret= 1.776357e-11 8.999 80.982 out x,xx,ret= 1.776357e-11 1.776357e-11 3.155444e-22 8 in x,xx,ret= 0.001 1.776357e-11 3.155444e-22 out x,xx,ret= 0.001 0.001 1e-06 9 in x,xx,ret= -0.001 0.001 1e-06 out x,xx,ret= -0.001 -0.001 1e-06 10 in x,xx,ret= -1.334475e-23 -0.001 1e-06 out x,xx,ret= -1.334475e-23 -1.334475e-23 1.780823e-46 11 in x,xx,ret= 0.001 -1.334475e-23 1.780823e-46 out x,xx,ret= 0.001 0.001 1e-06 12 in x,xx,ret= -0.001 0.001 1e-06 out x,xx,ret= -0.001 -0.001 1e-06 $par [1] -1.334475e-23 $value [1] 1.780823e-46 $counts function gradient 44 $convergence [1] 0 $message [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" Serguei. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] mccollect with NULL in R 3.6
On Thu, May 2, 2019 at 7:24 PM Tomas Kalibera wrote: > > On 5/1/19 12:25 AM, Gergely Daróczi wrote: > > Dear All, > > > > I'm running into issues with calling mccollect on a list containing NULL > > using R 3.6 (this used to work in 3.5.3): > > > > jobs <- lapply( > > list(NULL, 'foobar'), > > function(x) mcparallel(identity(x))) > > mccollect(jobs, wait = FALSE, timeout = 0) > > #> Error in names(res) <- pnames[match(s, pids)] : > > #> 'names' attribute [2] must be the same length as the vector [1] > > > > Note, setting a "name" for the jobs does not help, but the above works with > > "wait=TRUE", and also if I change the order of NULL and "foobar", although > > in that case, the second value (NULL) is ommitted. It also works with > > mclapply fine. > > > > Any ideas/suggestion on how to get mccollect work with the above example?. > > NULL is not a valid job identification. Perhaps mccollect() could give a > clearer error message, but I don't see, given its documentation, what > else than throwing an error it should do. What is the problem you were > trying to solve? Thank you very much for looking into this! What was interesting to me is that it used to work before 3.6 -- I have a script iterating over a list of data frames to train models, but it started to fail with the 3.6 release. The "NULL is not a valid job identification" problem doesn't seem to stand for my production job, as each list element has a proper name, but I think I can reproduce this with this minimal example as well: library(parallel) jobs <- lapply(1:2, function(x) { mcparallel(if (x == 1) NULL else x, name = as.character(x)) }) mccollect(jobs, wait = FALSE, timeout = 2) #> Error in names(res) <- pnames[match(s, pids)] : #> 'names' attribute [1] must be the same length as the vector [0] So the jobs have proper name, but the NULL return value is causing problems. Note, that it only causes problems when the NULL value is the first, eg switching 1 and 2 works, also running this on 1:3 and returning NULL on 2 etc. Now, I'm aware that 7 months ago this was added to the docs at https://github.com/wch/r-source/commit/f0d15be765dcf92e2349429428d49cd5b212abb4 that NULL should not be returned, so it seems to be a user error on my end, but it seems to fail only when NULL is the first returned element in mccollect, and working OK eg when NULL is the 2nd or other element (although with side effects, eg missed elements). So maybe failing with an explicit error message whenever mccollect hits a NULL for consistency might help here instead of skipping "delivered.result <- delivered.result + 1L" when the returned value is not raw at https://github.com/wch/r-source/commit/f0d15be765dcf92e2349429428d49cd5b212abb4#diff-e634fbaed323aac88667e7826865b160R72 ? Or even better (at least for my use case), maybe allowing to return NULL and throwing the warning on line 108 in that case. Thanks for considering this, Gergely > > Best > Tomas > > > > > Thanks, > > Gergely > > > > [[alternative HTML version deleted]] > > > > __ > > R-devel@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-devel > > __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Strange error messages from parallel::mcparallel family under 3.6.0
Dear All, Since upgrading to 3.6.0, I've been getting a strange error messages from the child process when using mcparallel/mccollect. Before filing a report in the Bugzilla, I want to figure out whether I had been doing something wrong all this time and R 3.6.0 has exposed it, or whether something else is going on. # Background # Ultimately, what I want to do is to be able to set a time limit for an expression to be evaluated that would be enforced even inside compiled code. (R.utils::withTimeout() uses base::setTimeLimit(), which can only enforce within R code.) # Implementation # The approach that my implementation, statnet.common::forkTimeout() (source attached for convenience), uses is to call mcparallel() to evaluate the expression in a child process, then mccollect() with wait=FALSE and a timeout to give it a chance to finish. If it runs past the timeout, the child process is killed and an onTimeout value is returned. (This only works on Unix-alikes, but it's better than nothing.) # The problem # Since 3.6.0---and I've tested fresh installs of 3.6.0 and 3.5.3 side- by-side---I've been getting strange messages. Running source("forkTimeout.R") # attached repeat print(forkTimeout({Sys.sleep(1);TRUE}, timeout=3)) results in [1] TRUE [1] TRUE Error in mcexit(0L) : ignoring SIGPIPE signal [1] TRUE [1] TRUE Error in mcexit(0L) : ignoring SIGPIPE signal [1] TRUE [1] TRUE [1] TRUE until interrupted. Running options(error=traceback) repeat print(forkTimeout({Sys.sleep(1);TRUE}, timeout=3)) results in sporadic messages of the form: Error in mcexit(0L) : ignoring SIGPIPE signal 6: selectChildren(jobs, timeout) 5: parallel::mccollect(child, wait = FALSE, timeout = timeout) at forkTimeout.R#75 4: withCallingHandlers(expr, warning = function(w) invokeRestart("muffleWarning")) 3: suppressWarnings(parallel::mccollect(child, wait = FALSE, timeout = timeout)) at forkTimeout.R#75 2: forkTimeout({ Sys.sleep(1) ... 1: print(forkTimeout({ Sys.sleep(1) ... So, these messages do not appear to prevent the child process from returning valid output, but I've never seen them before R 3.6.0, so I wonder if I am doing something wrong. Session info is also attached. Thanks in advance, Pavel -- Pavel Krivitsky Lecturer in Statistics National Institute of Applied Statistics Research Australia (NIASRA) School of Mathematics and Applied Statistics | Building 39C Room 154 University of Wollongong NSW 2522 Australia T +61 2 4221 3713 Web (NIASRA): http://niasra.uow.edu.au/index.html Web (Personal): http://www.krivitsky.net/research ORCID: -0002-9101-3362 NOTICE: This email is intended for the addressee named and may contain confidential information. If you are not the intended recipient, please delete it and notify the sender. Please consider the environment before printing this email. R version 3.6.0 (2019-04-26) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Debian GNU/Linux buster/sid Matrix products: default BLAS: /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3 LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3 locale: [1] LC_CTYPE=en_AU.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_AU.utf8LC_COLLATE=en_AU.UTF-8 [5] LC_MONETARY=en_AU.utf8LC_MESSAGES=en_AU.UTF-8 [7] LC_PAPER=en_AU.utf8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_AU.utf8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] compiler_3.6.0 tools_3.6.0parallel_3.6.0 #' Evaluate an \R expression with a hard time limit by forking a process #' #' This function uses #' #ifndef windows #' [parallel::mcparallel()], #' #endif #' #ifdef windows #' `parallel::mcparallel()`, #' #endif #' so the time limit is not #' enforced on Windows. However, unlike functions using [setTimeLimit()], the time #' limit is enforced even on native code. #' #' @param expr expression to be evaluated. #' @param timeout number of seconds to wait for the expression to #' evaluate. #' @param unsupported a character vector of length 1 specifying how to #' handle a platform that does not support #' #ifndef windows #' [parallel::mcparallel()], #' #endif #' #ifdef windows #' `parallel::mcparallel()`, #' #endif #' \describe{ #' #' \item{`"warning"` or `"message"`}{Issue a warning or a message, #' respectively, then evaluate the expression without the time limit #' enforced.} #' #' \item{`"error"`}{Stop with an error.} #' #' \item{`"silent"`}{Evaluate the expression without the time limit #' enforced, without any notice.} #' #' } Partial matching is used. #' @param onTimeout Value to be returned on time-out. #' #' @return Result of evaluating `expr` if completed, `onTimeout` #' otherwise. #' #' @note `onTimeout` can itself be an expre
Re: [Rd] mccollect with NULL in R 3.6
On 5/3/19 3:04 PM, Gergely Daróczi wrote: On Thu, May 2, 2019 at 7:24 PM Tomas Kalibera wrote: On 5/1/19 12:25 AM, Gergely Daróczi wrote: Dear All, I'm running into issues with calling mccollect on a list containing NULL using R 3.6 (this used to work in 3.5.3): jobs <- lapply( list(NULL, 'foobar'), function(x) mcparallel(identity(x))) mccollect(jobs, wait = FALSE, timeout = 0) #> Error in names(res) <- pnames[match(s, pids)] : #> 'names' attribute [2] must be the same length as the vector [1] Note, setting a "name" for the jobs does not help, but the above works with "wait=TRUE", and also if I change the order of NULL and "foobar", although in that case, the second value (NULL) is ommitted. It also works with mclapply fine. Any ideas/suggestion on how to get mccollect work with the above example?. NULL is not a valid job identification. Perhaps mccollect() could give a clearer error message, but I don't see, given its documentation, what else than throwing an error it should do. What is the problem you were trying to solve? Thank you very much for looking into this! What was interesting to me is that it used to work before 3.6 -- I have a script iterating over a list of data frames to train models, but it started to fail with the 3.6 release. The "NULL is not a valid job identification" problem doesn't seem to stand for my production job, as each list element has a proper name, but I think I can reproduce this with this minimal example as well: library(parallel) jobs <- lapply(1:2, function(x) { mcparallel(if (x == 1) NULL else x, name = as.character(x)) }) mccollect(jobs, wait = FALSE, timeout = 2) #> Error in names(res) <- pnames[match(s, pids)] : #> 'names' attribute [1] must be the same length as the vector [0] So the jobs have proper name, but the NULL return value is causing problems. Note, that it only causes problems when the NULL value is the first, eg switching 1 and 2 works, also running this on 1:3 and returning NULL on 2 etc. Thanks for the clarification, I got confused by that you wrote about "calling mccollect on a list containing NULL", so I was answering a different question. Now I see you actually meant collecting results from jobs that return NULL, and even though that writing jobs that return NULL is incorrect, I see that there is really a bug that causes the R error. Now, I'm aware that 7 months ago this was added to the docs at https://github.com/wch/r-source/commit/f0d15be765dcf92e2349429428d49cd5b212abb4 that NULL should not be returned, so it seems to be a user error on my end, but it seems to fail only when NULL is the first returned element in mccollect, and working OK eg when NULL is the 2nd or other element (although with side effects, eg missed elements). NULL has been reserved even well before that documentation update, and if it somewhat worked as a result, it had been a coincidence. The documentation change was only to make it more obvious, but it already said "‘mccollect’ will return ‘NULL’ for a terminating job that has sent its results already after which the job is no longer available." So maybe failing with an explicit error message whenever mccollect hits a NULL for consistency might help here instead of skipping "delivered.result <- delivered.result + 1L" when the returned value is not raw at https://github.com/wch/r-source/commit/f0d15be765dcf92e2349429428d49cd5b212abb4#diff-e634fbaed323aac88667e7826865b160R72 ? Or even better (at least for my use case), maybe allowing to return NULL and throwing the warning on line 108 in that case. Thanks for the report, I've fixed mccollect() in R-devel to return NULL from the job. To be ported to R-pathed unless problems appear. Best Tomas Thanks for considering this, Gergely Best Tomas Thanks, Gergely [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R problems with lapack with gfortran
Hi Tomas, thanks a lot for your analysis. I have created https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90329 for this, and put you in CC (if your e-mail address for GCC bugzilla is still current). Regards Thomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] mccollect with NULL in R 3.6
On Fri, May 3, 2019 at 4:34 PM Tomas Kalibera wrote: > > On 5/3/19 3:04 PM, Gergely Daróczi wrote: > > On Thu, May 2, 2019 at 7:24 PM Tomas Kalibera > > wrote: > >> On 5/1/19 12:25 AM, Gergely Daróczi wrote: > >>> Dear All, > >>> > >>> I'm running into issues with calling mccollect on a list containing NULL > >>> using R 3.6 (this used to work in 3.5.3): > >>> > >>> jobs <- lapply( > >>> list(NULL, 'foobar'), > >>> function(x) mcparallel(identity(x))) > >>> mccollect(jobs, wait = FALSE, timeout = 0) > >>> #> Error in names(res) <- pnames[match(s, pids)] : > >>> #> 'names' attribute [2] must be the same length as the vector [1] > >>> > >>> Note, setting a "name" for the jobs does not help, but the above works > >>> with > >>> "wait=TRUE", and also if I change the order of NULL and "foobar", although > >>> in that case, the second value (NULL) is ommitted. It also works with > >>> mclapply fine. > >>> > >>> Any ideas/suggestion on how to get mccollect work with the above example?. > >> NULL is not a valid job identification. Perhaps mccollect() could give a > >> clearer error message, but I don't see, given its documentation, what > >> else than throwing an error it should do. What is the problem you were > >> trying to solve? > > Thank you very much for looking into this! > > > > What was interesting to me is that it used to work before 3.6 -- I > > have a script iterating over a list of data frames to train models, > > but it started to fail with the 3.6 release. > > > > The "NULL is not a valid job identification" problem doesn't seem to > > stand for my production job, as each list element has a proper name, > > but I think I can reproduce this with this minimal example as well: > > > > library(parallel) > > jobs <- lapply(1:2, function(x) { > > mcparallel(if (x == 1) NULL else x, name = as.character(x)) > > }) > > mccollect(jobs, wait = FALSE, timeout = 2) > > #> Error in names(res) <- pnames[match(s, pids)] : > > #> 'names' attribute [1] must be the same length as the vector [0] > > > > So the jobs have proper name, but the NULL return value is causing > > problems. Note, that it only causes problems when the NULL value is > > the first, eg switching 1 and 2 works, also running this on 1:3 and > > returning NULL on 2 etc. > Thanks for the clarification, I got confused by that you wrote about > "calling mccollect on a list containing NULL", so I was answering a > different question. Now I see you actually meant collecting results from > jobs that return NULL, and even though that writing jobs that return > NULL is incorrect, I see that there is really a bug that causes the R error. > > Now, I'm aware that 7 months ago this was added to the docs at > > https://github.com/wch/r-source/commit/f0d15be765dcf92e2349429428d49cd5b212abb4 > > that NULL should not be returned, so it seems to be a user error on my > > end, but it seems to fail only when NULL is the first returned element > > in mccollect, and working OK eg when NULL is the 2nd or other element > > (although with side effects, eg missed elements). > NULL has been reserved even well before that documentation update, and > if it somewhat worked as a result, it had been a coincidence. The > documentation change was only to make it more obvious, but it already > said "‘mccollect’ will return ‘NULL’ for a terminating job that has sent > its results already after which the job is no longer available." > > So maybe failing with an explicit error message whenever mccollect > > hits a NULL for consistency might help here instead of skipping > > "delivered.result <- delivered.result + 1L" when the returned value is > > not raw at > > https://github.com/wch/r-source/commit/f0d15be765dcf92e2349429428d49cd5b212abb4#diff-e634fbaed323aac88667e7826865b160R72 > > ? Or even better (at least for my use case), maybe allowing to return > > NULL and throwing the warning on line 108 in that case. > > Thanks for the report, I've fixed mccollect() in R-devel to return NULL > from the job. To be ported to R-pathed unless problems appear. Thank you very much for the great news and your work! > > Best > Tomas > > > > > Thanks for considering this, > > Gergely > > > > >> Best > >> Tomas > >> > >>> Thanks, > >>> Gergely > >>> > >>>[[alternative HTML version deleted]] > >>> > >>> __ > >>> R-devel@r-project.org mailing list > >>> https://stat.ethz.ch/mailman/listinfo/r-devel > >> > __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel