Re: [Rd] optim(…, method=‘L-BFGS-B’) stops with an error message while violating the lower bound
> Spencer Graves
> on Sat, 8 Oct 2016 18:03:43 -0500 writes:

[...]

> 2. It would be interesting to know if the
> current algorithm behind optim and optimx with
> method='L-BFGS-B' incorporates Morales and Nocedal (2011)
> 'Remark on “Algorithm 778: L-BFGS-B: Fortran Subroutines
> for Large-Scale Bound Constrained Optimization”'. I
> created this vignette and started this threat hoping that
> someone on the R Core team might decide it's worth
> checking things like that.

Well, I hope you mean "thread" rather than "threat" ;-)

I've now looked at the reference above, which is indeed quite interesting.

doi 10.1145/2049662.2049669
--> http://dl.acm.org/citation.cfm?doid=2049662.2049669

A "free" (pre-publication, I assume) version of the manuscript is
http://www.eecs.northwestern.edu/~morales/PSfiles/acm-remark.pdf

The authors, Morales and Nocedal, the second being one of the authors of the original L-BFGS-B (1997) paper, make two remarks. The second concerns the "machine epsilon" used, and I can assure you that R's optim() version never suffered from that problem: we have always used a C translation of the Fortran code, which uses DBL_EPSILON. R's (main) source file for that is .../src/appl/lbfgsb.c, e.g., here:
https://svn.r-project.org/R/trunk/src/appl/lbfgsb.c

OTOH, their remark 1 is very relevant and promises faster / more reliable convergence. I'd be "happy" if optim() could gain a new option, say, "L-BFGS-B-2011", which would incorporate what they call "modified L-BFGS-B".

However, I did not find published code to go together with their remark. Ideally, some of you interested in this would provide a patch against the above lbfgsb.c file.

Martin Maechler,
ETH Zurich

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
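For readers who want to try the method under discussion, here is a minimal sketch of a bound-constrained call to optim() with method = "L-BFGS-B". The objective `fn`, its gradient `gr`, the starting point and the bounds are made up for illustration and do not come from this thread. The last line shows `.Machine$double.eps`, the R-level value of the double-precision machine epsilon that C calls DBL_EPSILON.

```r
# Illustrative objective (an assumption, not from the thread):
# squared distance to the point (-1, 2).
fn <- function(p) sum((p - c(-1, 2))^2)
gr <- function(p) 2 * (p - c(-1, 2))   # analytic gradient of fn

# Both parameters are constrained to be non-negative, so the
# unconstrained minimum at (-1, 2) is infeasible in the first coordinate.
res <- optim(par = c(1, 1), fn = fn, gr = gr,
             method = "L-BFGS-B", lower = c(0, 0))

res$par          # the first coordinate should end up on the lower bound
res$convergence  # 0 indicates successful convergence
res$message      # termination message reported by the L-BFGS-B code

# Double-precision machine epsilon as seen from R.
.Machine$double.eps
```

Supplying the gradient explicitly keeps the sketch independent of how finite differences are taken near a bound.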
[Rd] Bug/Inconsistency in merge() with all.x when first nonmatching column in y is matrix
I've noticed inconsistent behavior with merge() when using all.x=TRUE. After some digging I found the following test cases:

1) The snippet below doesn't work as expected, as the non-matching columns of rows in a but not b take the value from the first matching row instead of being NA:

--- Snip >>>
NUM<-25;
a <- data.frame(id=factor(letters[1:NUM]), qq=rep(NA, NUM), rr=rep(1.0,NUM))
b <- data.frame(id=c("e","a","f","y","x"))
b$mm <- as.vector(c(1,2,3.1,4.0,NA))%o%3.14
b$nn <- rep("from b", 5)
merge(a,b,by="id",all.x=TRUE)
<<< Snip ---

2) The modified snippet below works as expected:

--- Snip >>>
NUM<-25;
a <- data.frame(id=factor(letters[1:NUM]), qq=rep(NA, NUM), rr=rep(1.0,NUM))
b <- data.frame(id=c("e","a","f","y","x"))
b$nn <- rep("from b", 5)
b$mm <- as.vector(c(1,2,3.1,4.0,NA))%o%3.14
merge(a,b,by="id",all.x=TRUE)
<<< Snip ---

In src/library/base/R/merge.R:154, I see the following:

--- Snip >>>
for(i in seq_along(y)) {
    ## do it this way to invoke methods for e.g. factor
    if(is.matrix(y[[1]])) y[[1]][zap, ] <- NA
    else is.na(y[[i]]) <- zap
}
<<< Snip ---

Changing the '1's in the if statement to 'i's fixes this issue for me, i.e.:

--- Snip >>>
for(i in seq_along(y)) {
    ## do it this way to invoke methods for e.g. factor
    if(is.matrix(y[[i]])) y[[i]][zap, ] <- NA
    else is.na(y[[i]]) <- zap
}
<<< Snip ---

I'm actually not sure if the "if statement" is even needed (the "else" case seems to handle matrices just fine).

--Russ Hamilton
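The proposed fix can be tried outside of merge() with a small stand-alone sketch. The helper `blank_rows()` and the toy data frame below are hypothetical names and data made up for illustration; only the loop body is taken from the message above. The second part probes the closing question: as far as I can tell, the default `is.na<-` method uses plain vector indexing, so on a matrix with more than one column it blanks a single element rather than a whole row. The test case in the message uses a one-column matrix, where the two branches coincide.

```r
# Hypothetical stand-alone copy of the corrected loop; `blank_rows`
# is an illustrative name, not a function in base R.
blank_rows <- function(y, zap) {
    for (i in seq_along(y)) {
        ## do it this way to invoke methods for e.g. factor
        if (is.matrix(y[[i]])) y[[i]][zap, ] <- NA
        else is.na(y[[i]]) <- zap
    }
    y
}

# Toy data frame with a character column and a two-column matrix column.
y <- data.frame(nn = rep("from b", 3), stringsAsFactors = FALSE)
y$mm <- matrix(1:6, nrow = 3)

out <- blank_rows(y, zap = 2L)
out   # row 2 should be NA in nn and in both columns of mm

# What the "else" branch alone does to a two-column matrix:
m <- matrix(1:6, nrow = 3)
is.na(m) <- 2L
m     # only the element at linear index 2 (row 2, column 1) becomes NA
```

If that reading is right, the is.matrix() branch is still needed for matrix columns with more than one column.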
Re: [Rd] optim(…, method=‘L-BFGS-B’) stops with an error message while violating the lower bound
I believe the code can be found here: http://users.iems.northwestern.edu/~nocedal/lbfgsb.html.

Specifically, lbfgsb.f in version 3.0 starts:

c     This is a modified version of L-BFGS-B. Minor changes in the updated
c     code appear preceded by a line comment as follows
c
c     c-jlm-jn
c
c     Major changes are described in the accompanying paper:
c
c         Jorge Nocedal and Jose Luis Morales, Remark on "Algorithm 778:
c         L-BFGS-B: Fortran Subroutines for Large-Scale Bound Constrained
c         Optimization" (2011). To appear in ACM Transactions on
c         Mathematical Software,
c
c     The paper describes an improvement and a correction to Algorithm 778.
c     It is shown that the performance of the algorithm can be improved
c     significantly by making a relatively simple modication to the subspace
c     minimization phase. The correction concerns an error caused by the use
c     of routine dpmeps to estimate machine precision.

It is released under the New 3-clause BSD license, so porting it to C for inclusion into R should be OK as long as the i's are dotted and t's crossed.

Avi

On Mon, Oct 10, 2016 at 5:54 AM, Martin Maechler wrote:
>> Spencer Graves
>> on Sat, 8 Oct 2016 18:03:43 -0500 writes:
>
> [...]
>
> > 2. It would be interesting to know if the
> > current algorithm behind optim and optimx with
> > method='L-BFGS-B' incorporates Morales and Nocedal (2011)
> > 'Remark on “Algorithm 778: L-BFGS-B: Fortran Subroutines
> > for Large-Scale Bound Constrained Optimization”'. I
> > created this vignette and started this threat hoping that
> > someone on the R Core team might decide it's worth
> > checking things like that.
>
> well I hope you mean "thread" rather "threat" ;-)
>
> I've now looked at the reference above, which is indeed quite
> interesting.
> doi 10.1145/2049662.2049669
> --> http://dl.acm.org/citation.cfm?doid=2049662.2049669
> A "free" (pre-publication I assume) version of the manuscript is
> http://www.eecs.northwestern.edu/~morales/PSfiles/acm-remark.pdf
>
> The authors, Morales and Nocedal, the 2nd one being one of the
> original L-BFGS-B(1997) paper, make two remarks, the 2nd one
> about the "machine epsilon" used, and I can assure you that R's
> optim() version never suffered from that; we've always been
> using a C translation of the fortran code, and then used DBL_EPSILON.
> R's (main) source file for that is in .../src/appl/lbfgsb.c, e.g., here
> https://svn.r-project.org/R/trunk/src/appl/lbfgsb.c
>
> OTOH, their remark 1 is very relevant and promising faster /
> more reliable convergence.
> I'd be "happy" if optim() could gain a new option, say, "L-BFGS-B-2011"
> which would incorporate what they call "modified L-BFGS-B".
>
> However, I did not find published code to go together with their
> remark.
> Ideally, some of you interested in this, would provide a patch
> against the above lbfgsb.c file
>
> Martin Maechler,
> ETH Zurich
[Rd] PKG_LIBS in make child processes
[cross-posted from bioc-devel list]

Hi all,

I have a subtle question related to how R CMD SHLIB handles variables in make child processes. In more detail: I am the maintainer of the 'msa' package which has been in Bioconductor since April 2015. This package integrates three open-source libraries for multiple sequence alignment. This is organized in the following way: in src/, there are three sub-directories, one for each of the libraries (plus another one for a garbage collector library, but that is not relevant at this point). src/Makevars is made such that the libraries are compiled individually to static libraries in their respective sub-directory, then these static libraries are copied to src/, and finally the static libraries are integrated into msa.so. The Makevars file looks as follows:

PKG_LIBS=`${R_HOME}/bin${R_ARCH_BIN}/Rscript -e "if (Sys.info()['sysname'] == 'Darwin') cat('-Wl,-all_load ./libgc.a ./libClustalW.a ./libClustalOmega.a ./libMuscle.a') else cat('-Wl,--whole-archive ./libgc.a ./libClustalW.a ./libClustalOmega.a ./libMuscle.a -Wl,--no-whole-archive')"`
PKG_CXXFLAGS=-I"./gc-7.2/include" -I"./Muscle/" -I"./ClustalW/src" -I"./ClustalOmega/src"

.PHONY: all mylibs

all: $(SHLIB)

$(SHLIB): mylibs

mylibs: build_gc build_muscle build_clustalw build_clustalomega

build_gc:
	make --file=msaMakefile --directory=gc-7.2
	@echo ""
	@echo "-- GC -"
	@echo ""
	@echo "- Compilation finished -"
	@echo ""

build_muscle:
	make --file=msaMakefile --directory=Muscle
	@echo ""
	@echo " MUSCLE "
	@echo ""
	@echo "- Compilation finished -"
	@echo ""

build_clustalw:
	make --file=msaMakefile --directory=ClustalW
	@echo ""
	@echo "--- ClustalW ---"
	@echo ""
	@echo "- Compilation finished -"
	@echo ""

build_clustalomega:
	make --file=msaMakefile --directory=ClustalOmega
	@echo ""
	@echo "- ClustalOmega -"
	@echo ""
	@echo "- Compilation finished -"
	@echo ""

This has always worked on Linux and Mac OS so far.
Now I have received an error report from a user who cannot install the package on a 64-bit openSUSE 13.1 system using R 3.3.1. It turned out that R CMD SHLIB as called in the make child processes (make target 'build_muscle' above) uses the value of PKG_LIBS defined in the first line of the top-level Makevars file shown above (which of course does not work and makes no sense), while this does not happen on any other Unix-like system I have tried so far (Ubuntu, CentOS, Mac OS).

Maybe somebody can shed some light on how variables defined inside the Makevars file propagate to child processes.

Thanks so much in advance!

Best regards,
Ulrich
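For readers unfamiliar with the PKG_LIBS line in the Makevars above: the backquoted Rscript call picks the linker flags by platform. The sketch below restates that one-liner as an ordinary R function so the two branches are easier to read. `msa_link_flags()` is a hypothetical name used here for illustration and is not part of the msa package. The final line is only a diagnostic suggestion, not an explanation of the reported behaviour: it prints two environment variables that, if run from within a child build, might help show what that child process has inherited.

```r
# Hypothetical helper mirroring the inline Rscript expression in PKG_LIBS;
# `msa_link_flags` is an illustrative name, not part of the msa package.
msa_link_flags <- function(sysname = Sys.info()[["sysname"]]) {
    libs <- "./libgc.a ./libClustalW.a ./libClustalOmega.a ./libMuscle.a"
    if (sysname == "Darwin") {
        # macOS linker: load all members of the static archives
        paste("-Wl,-all_load", libs)
    } else {
        # GNU-style linkers: wrap the archives in --whole-archive
        paste("-Wl,--whole-archive", libs, "-Wl,--no-whole-archive")
    }
}

cat(msa_link_flags("Darwin"), "\n")
cat(msa_link_flags("Linux"), "\n")

# Diagnostic only: values visible in the current process environment.
Sys.getenv(c("PKG_LIBS", "MAKEFLAGS"))
```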