Re: [Rd] R problems with lapack with gfortran
On Sat, May 04, 2019 at 06:05:48PM +0200, peter dalgaard wrote: > > - figure out Fortran2003 specification for C/Fortran interoperability > -- this _sounds_ like the right solution, but I don't think many > understand how to use it and what is implied (in particular, will it > require making changes to LAPACK itself?) This is probably the best solution as it should allow portability to any Fortran processor that supports F2003 or newer standard. See the gtk-fortran project. It has a python program that was used to generate the needed ISO C interfaces. -- Steve __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] read.table() fails with https in R 3.6 but not in R 3.5
In versions of R prior to 3.6.0 the following invocation succeeds, returning the data frame shown: > read.table("https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text";, > header=TRUE) Dekade Anzahl 11900 11467254 21910 13023370 31920 13434601 41930 13296355 51940 12121250 61950 13191131 71960 10587420 81970 10944129 91980 11279439 10 1990 12052652 But in version 3.6.0 it fails: > read.table("https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text";, > header=TRUE) Error in file(file, "rt") : cannot open the connection to 'https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text' In addition: Warning message: In file(file, "rt") : cannot open URL 'https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text': HTTP status was '403 Forbidden' The table at this URL is generated by a query processor and the same failure happens in 3.6.0 with other queries at this website. This website does not appear to serve data via http: replacing https by http in the above gives the same results, and in 3.6.0 the error message contains the URL with http but in the warning message the URL is with https. I have also tried a few other websites that serve (non-generated) tabular data via https (e.g. https://graphchallenge.s3.amazonaws.com/synthetic/gc3/Theory-16-25-81-Bk.tsv) and with these read.table() succeeds in 3.6.0, so the problem isn't https in general. Maybe it has to do with the page being generated rather than static? There's only one reference to https in the 3.6.0 NEWS, concerning libcurl; I can't tell if it's relevant. In case it matters, this is with R packaged for openSUSE, and I've found the above difference between 3.5 and 3.6 on both openSUSE Leap 15.0 and openSUSE Tumbleweed. Steve Berman __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R problems with lapack with gfortran
On Sat, May 04, 2019 at 06:42:47PM +0200, Thomas König wrote: > > > - figure out Fortran2003 specification for C/Fortran interoperability > > -- this _sounds_ like the right solution, but I don't think many > > understand how to use it and what is implied (in particular, will > > it require making changes to LAPACK itself?) > > That would actually be fairly easy. If you declare the subroutines > BIND(C), as in > >subroutine foo(a,b) BIND(C,name="foo_") >real a >character*1 b >end > > you will get the calling signature that you already have in your C > sources. > > This also has the advantage of being standards compliant, and would be > probably be the preferred method. > With the caveat that one may need to use the VALUE attribute to account for pass-by-value vs pass-by-reference. -- Steve __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Porting R example datasets to GNU Octave
Hi, R Developers, I'm interested in porting the R example datasets package to GNU Octave and Matlab. Would you have objections to my doing so? This would involve transforming the example data and metadata into a format that Octave understands, and porting all of the datasets' Example code pieces to Octave M-code. (This would require no work on your part; it'd be my project.) I think this would be a benefit to the scientific programming community. In addition to helping Octave users, having code for identical example data sets in both languages would serve as a Rosetta Stone for not only users moving from R to Octave, but for users coming from Octave or Matlab to R. Since R's datasets package is GPL, I think I'd be within my rights to just do this. But I wanted to ask first, to make sure I didn't ruffle any feathers. I would include documentation indicating that R is the original source (well, intermediate source) for these datasets, and have links pointing back to R's documentation. Cheers, Andrew Janke __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Incorrect date range in austres example dataset?
Hi, R developers, It seems there might be an issue with the "austres" example dataset in the "datasets" package. The description in austres.Rd says it's "measured quarterly from March 1971 to March 1994". But there are only 89 observations in the data as defined in the source code. By my count, that only brings you up to about March 1993. Is there an issue with the data transcription, or the Description? I'm looking at the source code from the R 3.6.0 source distribution. Cheers, Andrew Janke __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Bug in function boxplot's axis labeling
Dear developeRs, I appreciate that boxplot now labels the axes with variable names per default. However, with argument "horizontal=TRUE" (which I always use), the default axis labels are mixed up, as can e.g. be seen with require(boot) boxplot(time ~ poison, poisons, horizontal=TRUE) The correct labels would be obtained by boxplot(time ~ poison, poisons, horizontal=TRUE, xlab="time", ylab="poison") Best, Ulrike -- ## ## Prof. Ulrike Groemping ## FB II ## Beuth University of Applied Sciences Berlin ## ## prof.beuth-hochschule.de/groemping ## Phone: +49(0)30 4504 5127 ## Fax: +49(0)30 4504 66 5127 ## Home office: +49(0)30 394 04 863 ## __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R problems with lapack with gfortran
On 5/4/19 6:49 PM, Steve Kargl wrote: On Sat, May 04, 2019 at 06:42:47PM +0200, Thomas König wrote: - figure out Fortran2003 specification for C/Fortran interoperability -- this _sounds_ like the right solution, but I don't think many understand how to use it and what is implied (in particular, will it require making changes to LAPACK itself?) That would actually be fairly easy. If you declare the subroutines BIND(C), as in subroutine foo(a,b) BIND(C,name="foo_") real a character*1 b end you will get the calling signature that you already have in your C sources. This also has the advantage of being standards compliant, and would be probably be the preferred method. With the caveat that one may need to use the VALUE attribute to account for pass-by-value vs pass-by-reference. This seems clean solution, but as I said before not easy, because currently the tradition is to call the Fortran interface directly from C (not via any C wrappers). This means one could not substitute LAPACK/BLAS at dynamic linking time, unless all LAPACK/BLAS implementations agreed on such a C interface. Now the substitution is based on the original Fortran interface. In case of R, if we only used the included reference BLAS/LAPACK, we could do this, define our wrappers, say "c_dgemm" for "dgemm", change R to call via that interface, ask maintainers of all packages to change their code to call via their interface, and this should work with all Fortran 2003 compilers. But, R is often used also with optimized BLAS/LAPACK implementations that can be substituted at dynamic linking time. And there we could do nothing at R level to help: we cannot generate such wrappers for an existing LAPACK/BLAS implementation (we don't have the source code, the compiler, etc). It would be certainly a good thing if BLAS/LAPACK, with all implementation and uses, switched to a way that is compliant with current Fortran standard. But this should best start with the reference BLAS/LAPACK, continue with other BLAS/LAPACK implementations, and then with systems using those libraries, including R and its packages. Unless/before this happens, it would really be great if we could still use gfortran to build and use this fundamental software library. Best Tomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] read.table() fails with https in R 3.6 but not in R 3.5
On 04.05.19 19:04, Stephen Berman wrote: > In versions of R prior to 3.6.0 the following invocation succeeds, > returning the data frame shown: > >> read.table("https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text";, >> header=TRUE) >Dekade Anzahl > 11900 11467254 > 21910 13023370 > 31920 13434601 > 41930 13296355 > 51940 12121250 > 61950 13191131 > 71960 10587420 > 81970 10944129 > 91980 11279439 > 10 1990 12052652 > > But in version 3.6.0 it fails: > >> read.table("https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text";, >> header=TRUE) > Error in file(file, "rt") : > cannot open the connection to > 'https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text' > In addition: Warning message: > In file(file, "rt") : > cannot open URL > 'https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text': > HTTP status was '403 Forbidden' I can reproduce the behavior on Debian using the CRAN supplied package for R 3.6.0. Trying to read the page with 'curl' produces also a 403 error plus some HTML text (in German) explaining that I am treated as a 'robot' due to the supplied User-Agent (here: curl/7.52.1). One suggested solution is to adjust that value which does solve the issue: > options(HTTPUserAgent='mozilla') > read.table("https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text";, header=TRUE) Dekade Anzahl 11900 11467254 21910 13023370 31920 13434601 41930 13296355 51940 12121250 61950 13191131 71960 10587420 81970 10944129 91980 11279439 10 1990 12052652 Other solutions are to simulate a login or to get in touch with DWDS directly. Greetings Ralf -- Ralf Stubner Senior Software Engineer / Trainer daqana GmbH Dortustraße 48 14467 Potsdam T: +49 331 23 61 93 11 F: +49 331 23 61 93 90 M: +49 162 20 91 196 Mail: ralf.stub...@daqana.com Sitz: Potsdam Register: AG Potsdam HRB 27966 Ust.-IdNr.: DE300072622 Geschäftsführer: Dr.-Ing. Stefan Knirsch, Prof. Dr. Dr. Karl-Kuno Kunze signature.asc Description: OpenPGP digital signature __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Porting R example datasets to GNU Octave
On 5 May 2019 at 10:47, Andrew Janke wrote: | I'm interested in porting the R example datasets package to GNU Octave | and Matlab. Would you have objections to my doing so? You don't even have to ask... [...] | Since R's datasets package is GPL, I think I'd be within my rights to | just do this. But I wanted to ask first, to make sure I didn't ruffle | any feathers. I would include documentation indicating that R is the | original source (well, intermediate source) for these datasets, and have | links pointing back to R's documentation. That is the right way to do that. Respect both copyright (citing and referencing source) and licensing (by picking a license compatible with GPL 2 or later; many of us just prefer to stick to GPL which Octave uses too). Dirk, in no way speaking for R Core but just handing out his $0.02 -- http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Error in glm(..., family=quasi(..., variance=list(...)))
> Wollschlaeger, Daniel > on Fri, 26 Apr 2019 15:13:36 + writes: > In a glm() call using a quasi() family, one may define a custom variance function in the form of a "list containing components varfun, validmu, dev.resids, initialize and name" (quoting the help page for family). In trying to do so, I run into the following issue that I have not seen discussed previously: > x <- runif(1000, min=0, max=1) > y <- x + rnorm(1000, mean=0, sd=1)*x^(3/4) > vf <- function(mu) { abs(mu)^(3/4) } > vm <- function(mu) { rep(TRUE, length(mu)) } > dr <- function(y, mu, wt) { (y-mu)^2 } > it <- expression({ n <- rep.int(1, nobs); mustart <- y }) > glm(y ~ x, family=quasi(link="identity", variance=list(varfun=vf, validmu=vm, dev.resids=dr, initialize=it, name="custom"))) > This gives "Error in switch(vtemp, constant = { : EXPR must be a length 1 vector" > from line 576 in file family.R > (https://github.com/wch/r-source/blob/5a156a0865362bb8381dcd69ac335f5174a4f60c/src/library/stats/R/family.R#L576). > I believe this is due to line 573 "vtemp <- substitute(variance)" and 574 > "if (!is.character(vtemp)) vtemp <- deparse(vtemp)" where vtemp becomes a > length 2 character vector because, by default, deparse() breaks lines at > width.cutoff = 60L characters. In stepping through quasi() during debug, > setting vtemp <- (vtemp, collapse=" ") on line 576 avoids the error. but really in this case, neither substitute() nor deparse() should be called ! > A workaround from https://tolstoy.newcastle.edu.au/R/help/05/06/6795.html appears to be to define one's own complete quasi2() function with the desired variance function pre-stored. > Is this known/expected, or should I file a bug? and you *have* filed a bug report. Thank you! --> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17560 Note that the above R-help "workaround" goes back to 2005, and I find it amazing this has never been taken up. I've committed fix to this bug avoiding the misleading substitute() and deparse() altogether in this case. The plan is to port the bug fix also to "R 3.6.0 patched" so the problem would be solved in the next released version of R. > Many thanks and best regards > Daniel >> sessionInfo() > R version 3.6.0 (2019-04-26) .. Thank you again! Martin __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R problems with lapack with gfortran
On 5/6/19 12:57 PM, Janne Blomqvist wrote: > On Mon, May 6, 2019 at 11:55 AM Tomas Kalibera > wrote: >> On 5/4/19 6:49 PM, Steve Kargl wrote: >>> On Sat, May 04, 2019 at 06:42:47PM +0200, Thomas König wrote: > - figure out Fortran2003 specification for C/Fortran interoperability > -- this _sounds_ like the right solution, but I don't think many > understand how to use it and what is implied (in particular, will > it require making changes to LAPACK itself?) That would actually be fairly easy. If you declare the subroutines BIND(C), as in subroutine foo(a,b) BIND(C,name="foo_") real a character*1 b end you will get the calling signature that you already have in your C sources. This also has the advantage of being standards compliant, and would be probably be the preferred method. >>> With the caveat that one may need to use the VALUE attribute to >>> account for pass-by-value vs pass-by-reference. >> This seems clean solution, but as I said before not easy, because >> currently the tradition is to call the Fortran interface directly from C >> (not via any C wrappers). This means one could not substitute >> LAPACK/BLAS at dynamic linking time, unless all LAPACK/BLAS >> implementations agreed on such a C interface. Now the substitution is >> based on the original Fortran interface. >> >> In case of R, if we only used the included reference BLAS/LAPACK, we >> could do this, define our wrappers, say "c_dgemm" for "dgemm", change R >> to call via that interface, ask maintainers of all packages to change >> their code to call via their interface, and this should work with all >> Fortran 2003 compilers. >> >> But, R is often used also with optimized BLAS/LAPACK implementations >> that can be substituted at dynamic linking time. And there we could do >> nothing at R level to help: we cannot generate such wrappers for an >> existing LAPACK/BLAS implementation (we don't have the source code, the >> compiler, etc). >> >> It would be certainly a good thing if BLAS/LAPACK, with all >> implementation and uses, switched to a way that is compliant with >> current Fortran standard. But this should best start with the reference >> BLAS/LAPACK, continue with other BLAS/LAPACK implementations, and then >> with systems using those libraries, including R and its packages. >> Unless/before this happens, it would really be great if we could still >> use gfortran to build and use this fundamental software library. > Hi, > > I don't think modifying your own builtin LAPACK makes sense, as you > mention that would make R incompatible with another LAPACK provided by > the system. And modifying LAPACK upstream by adding BIND(C) wouldn't > work either, as that would break all the existing Fortran code calling > LAPACK (as LAPACK is F77-style implicit interfaces, the Fortran caller > has no knowledge of the interface and thus it must match the compiler > default Fortran ABI). I meant creating another layer of functions using BIND(C), which will have different names (e.g. c_dgemm), and call into the original Fortran functions (e.g. dgemm). Fortran programs could still call into the original Fortran functions. Something like c_dgemm in Chapter 10.2 of "Numerical Computing with Modern Fortran" by Richard J. Hanson and Tim Hopkins, the code is available from https://archive.siam.org/books/ot134/Chapter10/index.php. So LAPACK/BLACK implementations won't break their applications by adding these new functions. Also indeed it would be useful to have C prototypes matching those interfaces distributed with BLAS/LAPACK. > So the remaining place where this could be fixed would be in your C > prototypes for LAPACK functions, so that they match what LAPACK > expects. Arguably that's the correct approach anyway. I suppose for > this task extending the GFortran -fc-prototypes option to generate C > prototypes for external functions as well would help? Well but the non-BIND(C) interface is different for different Fortran compilers, and it changed even between gfortran versions (the type of the lengths). So we would not be able to switch BLAS/LAPACK implementations anymore at dynamic linking time. We would still have to talk to all R package maintainers that manage packages that call directly to BLAS/LAPACK. I still think -fc-prototypes would be useful, certainly it should work for all BIND(C) procedures (the 20190426 does not for the dgemm example from the book), and it would be useful even for non-BIND(C) procedures. > AFAICS, this interface mismatch problem affects other Fortran > compilers as well, just that by sheer luck this has worked mostly so > far (that is, other Fortran compilers also expect a hidden string > length argument, with no exception for length==1 strings). But with > increasingly sophisticated interprocedural optimizations such sins can > no longer be forgiven. Best Tomas [[alterna
Re: [Rd] R problems with lapack with gfortran
On Mon, May 6, 2019 at 11:55 AM Tomas Kalibera wrote: > > On 5/4/19 6:49 PM, Steve Kargl wrote: > > On Sat, May 04, 2019 at 06:42:47PM +0200, Thomas König wrote: > >>> - figure out Fortran2003 specification for C/Fortran interoperability > >>> -- this _sounds_ like the right solution, but I don't think many > >>> understand how to use it and what is implied (in particular, will > >>> it require making changes to LAPACK itself?) > >> That would actually be fairly easy. If you declare the subroutines > >> BIND(C), as in > >> > >> subroutine foo(a,b) BIND(C,name="foo_") > >> real a > >> character*1 b > >> end > >> > >> you will get the calling signature that you already have in your C > >> sources. > >> > >> This also has the advantage of being standards compliant, and would be > >> probably be the preferred method. > >> > > With the caveat that one may need to use the VALUE attribute to > > account for pass-by-value vs pass-by-reference. > > This seems clean solution, but as I said before not easy, because > currently the tradition is to call the Fortran interface directly from C > (not via any C wrappers). This means one could not substitute > LAPACK/BLAS at dynamic linking time, unless all LAPACK/BLAS > implementations agreed on such a C interface. Now the substitution is > based on the original Fortran interface. > > In case of R, if we only used the included reference BLAS/LAPACK, we > could do this, define our wrappers, say "c_dgemm" for "dgemm", change R > to call via that interface, ask maintainers of all packages to change > their code to call via their interface, and this should work with all > Fortran 2003 compilers. > > But, R is often used also with optimized BLAS/LAPACK implementations > that can be substituted at dynamic linking time. And there we could do > nothing at R level to help: we cannot generate such wrappers for an > existing LAPACK/BLAS implementation (we don't have the source code, the > compiler, etc). > > It would be certainly a good thing if BLAS/LAPACK, with all > implementation and uses, switched to a way that is compliant with > current Fortran standard. But this should best start with the reference > BLAS/LAPACK, continue with other BLAS/LAPACK implementations, and then > with systems using those libraries, including R and its packages. > Unless/before this happens, it would really be great if we could still > use gfortran to build and use this fundamental software library. Hi, I don't think modifying your own builtin LAPACK makes sense, as you mention that would make R incompatible with another LAPACK provided by the system. And modifying LAPACK upstream by adding BIND(C) wouldn't work either, as that would break all the existing Fortran code calling LAPACK (as LAPACK is F77-style implicit interfaces, the Fortran caller has no knowledge of the interface and thus it must match the compiler default Fortran ABI). So the remaining place where this could be fixed would be in your C prototypes for LAPACK functions, so that they match what LAPACK expects. Arguably that's the correct approach anyway. I suppose for this task extending the GFortran -fc-prototypes option to generate C prototypes for external functions as well would help? AFAICS, this interface mismatch problem affects other Fortran compilers as well, just that by sheer luck this has worked mostly so far (that is, other Fortran compilers also expect a hidden string length argument, with no exception for length==1 strings). But with increasingly sophisticated interprocedural optimizations such sins can no longer be forgiven. -- Janne Blomqvist __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] read.table() fails with https in R 3.6 but not in R 3.5
On Mon, 6 May 2019 11:12:25 +0200 Ralf Stubner wrote: > On 04.05.19 19:04, Stephen Berman wrote: >> In versions of R prior to 3.6.0 the following invocation succeeds, >> returning the data frame shown: >> >>> read.table("https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text";, >>> header=TRUE) >>Dekade Anzahl >> 11900 11467254 >> 21910 13023370 >> 31920 13434601 >> 41930 13296355 >> 51940 12121250 >> 61950 13191131 >> 71960 10587420 >> 81970 10944129 >> 91980 11279439 >> 10 1990 12052652 >> >> But in version 3.6.0 it fails: >> >>> read.table("https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text";, >>> header=TRUE) >> Error in file(file, "rt") : >> cannot open the connection to >> 'https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text' >> In addition: Warning message: >> In file(file, "rt") : >> cannot open URL >> 'https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text': >> HTTP status was '403 Forbidden' > > I can reproduce the behavior on Debian using the CRAN supplied package > for R 3.6.0. Trying to read the page with 'curl' produces also a 403 > error plus some HTML text (in German) explaining that I am treated as a > 'robot' due to the supplied User-Agent (here: curl/7.52.1). One > suggested solution is to adjust that value which does solve the issue: > > > options(HTTPUserAgent='mozilla') I confirm that works for me, too. Thanks! FWIW, the default value of HTTPUserAgent in R 3.6 here is "R (3.6.0 x86_64-suse-linux-gnu x86_64 linux-gnu)", and using this (in R 3.6) fails as I reported, while the default value of HTTPUserAgent in R 3.5 here is "R (3.5.0 x86_64-suse-linux-gnu x86_64 linux-gnu)" and using that (in R 3.5) succeeds. However, setting HTTPUserAgent in R 3.5 to "libcurl/7.60.0" fails just as it does in 3.6. It's not clear to me if this particular website is being too restrictive or if R 3.6 should deal with it, or at least mention the issue in NEWS or somewhere else. Steve Berman __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
Optim's Nelder-Mead works correctly for this example. > optim(par=10, fn=fn, method="Nelder-Mead") x=10, ret=100.02 (memory) x=11, ret=121 (calculate) x=9, ret=81 (calculate) x=8, ret=64 (calculate) x=6, ret=36 (calculate) x=4, ret=16 (calculate) x=0, ret=0 (calculate) x=-4, ret=16 (calculate) x=-4, ret=16 (memory) x=2, ret=4 (calculate) x=-2, ret=4 (calculate) x=1, ret=1 (calculate) x=-1, ret=1 (calculate) x=0.5, ret=0.25 (calculate) x=-0.5, ret=0.25 (calculate) x=0.25, ret=0.0625 (calculate) x=-0.25, ret=0.0625 (calculate) x=0.125, ret=0.015625 (calculate) x=-0.125, ret=0.015625 (calculate) x=0.0625, ret=0.00390625 (calculate) x=-0.0625, ret=0.00390625 (calculate) x=0.03125, ret=0.0009765625 (calculate) x=-0.03125, ret=0.0009765625 (calculate) x=0.015625, ret=0.0002441406 (calculate) x=-0.015625, ret=0.0002441406 (calculate) x=0.0078125, ret=6.103516e-05 (calculate) x=-0.0078125, ret=6.103516e-05 (calculate) x=0.00390625, ret=1.525879e-05 (calculate) x=-0.00390625, ret=1.525879e-05 (calculate) x=0.001953125, ret=3.814697e-06 (calculate) x=-0.001953125, ret=3.814697e-06 (calculate) x=0.0009765625, ret=9.536743e-07 (calculate) $par [1] 0 $value [1] 0 $counts function gradient 32 NA $convergence [1] 0 $message NULL From: R-devel on behalf of Duncan Murdoch Sent: Friday, May 3, 2019 8:18:44 AM To: peter dalgaard Cc: Florian Gerber; r-devel@r-project.org Subject: Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments It looks as though this happens when calculating numerical gradients: x is reduced by eps, and fn is called; then x is increased by eps, and fn is called again. No check is made that x has other references after the first call to fn. I'll put together a patch if nobody else gets there first... Duncan Murdoch On 03/05/2019 7:13 a.m., peter dalgaard wrote: > Yes, I think you are right. I was at first confused by the fact that after > the optim() call, > >> environment(fn)$xx > [1] 10 >> environment(fn)$ret > [1] 100.02 > > so not 9.999, but this could come from x being assigned the final value > without calling fn. > > -pd > > >> On 3 May 2019, at 11:58 , Duncan Murdoch wrote: >> >> Your results below make it look like a bug in optim(): it is not >> duplicating a value when it should, so changes to x affect xx as well. >> >> Duncan Murdoch >> >> On 03/05/2019 4:41 a.m., Serguei Sokol wrote: >>> On 03/05/2019 10:31, Serguei Sokol wrote: On 02/05/2019 21:35, Florian Gerber wrote: > Dear all, > > when using optim() for a function that uses the parent environment, I > see the following unexpected behavior: > > makeFn <- function(){ > xx <- ret <- NA > fn <- function(x){ > if(!is.na(xx) && x==xx){ > cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="") > return(ret) > } > xx <<- x; ret <<- sum(x^2) > cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="") > ret > } > fn > } > fn <- makeFn() > optim(par=10, fn=fn, method="L-BFGS-B") > # x=10, ret=100 (calculate) > # x=10.001, ret=100.02 (calculate) > # x=9.999, ret=100.02 (memory) > # $par > # [1] 10 > # > # $value > # [1] 100 > # (...) > > I would expect that optim() does more than 3 function evaluations and > that the optimization converges to 0. > > Same problem with optim(par=10, fn=fn, method="BFGS"). > > Any ideas? I don't have an answer but may be an insight. For some mysterious reason xx is getting changed when in should not. Consider: > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) > optim(par=10, fn=fn, method="L-BFGS-B") 1 in x,xx,ret= 10 NA NA out x,xx,ret= 10 10 100 2 in x,xx,ret= 10.001 10 100 out x,xx,ret= 10.001 10.001 100.02 3 in x,xx,ret= 9.999 9.999 100.02 $par [1] 10 $value [1] 100 $counts function gradient 11 $convergence [1] 0 $message [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" At the third call, xx has value 9.999 while it should have kept the value 10.001. >>> A little follow-up: if you untie the link between xx and x by replacing >>> the expression "xx <<- x" by "xx <<- x+0" it works as expected: >>> > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in >>> x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- >>> x+0; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) >>> > optim(par=10, fn=fn, method="L-BFGS-B") >>> 1 in x,xx,ret= 10 NA NA >>> out x,xx,ret= 10 10 100 >>> 2 in x,xx,ret= 10.001 10 100
Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
That's consistent/not surprising if the problem lies in the numerical gradient calculation step ... On 2019-05-06 10:06 a.m., Ravi Varadhan wrote: > Optim's Nelder-Mead works correctly for this example. > > >> optim(par=10, fn=fn, method="Nelder-Mead") > x=10, ret=100.02 (memory) > x=11, ret=121 (calculate) > x=9, ret=81 (calculate) > x=8, ret=64 (calculate) > x=6, ret=36 (calculate) > x=4, ret=16 (calculate) > x=0, ret=0 (calculate) > x=-4, ret=16 (calculate) > x=-4, ret=16 (memory) > x=2, ret=4 (calculate) > x=-2, ret=4 (calculate) > x=1, ret=1 (calculate) > x=-1, ret=1 (calculate) > x=0.5, ret=0.25 (calculate) > x=-0.5, ret=0.25 (calculate) > x=0.25, ret=0.0625 (calculate) > x=-0.25, ret=0.0625 (calculate) > x=0.125, ret=0.015625 (calculate) > x=-0.125, ret=0.015625 (calculate) > x=0.0625, ret=0.00390625 (calculate) > x=-0.0625, ret=0.00390625 (calculate) > x=0.03125, ret=0.0009765625 (calculate) > x=-0.03125, ret=0.0009765625 (calculate) > x=0.015625, ret=0.0002441406 (calculate) > x=-0.015625, ret=0.0002441406 (calculate) > x=0.0078125, ret=6.103516e-05 (calculate) > x=-0.0078125, ret=6.103516e-05 (calculate) > x=0.00390625, ret=1.525879e-05 (calculate) > x=-0.00390625, ret=1.525879e-05 (calculate) > x=0.001953125, ret=3.814697e-06 (calculate) > x=-0.001953125, ret=3.814697e-06 (calculate) > x=0.0009765625, ret=9.536743e-07 (calculate) > $par > [1] 0 > > $value > [1] 0 > > $counts > function gradient > 32 NA > > $convergence > [1] 0 > > $message > NULL > > > > > > From: R-devel on behalf of Duncan Murdoch > > Sent: Friday, May 3, 2019 8:18:44 AM > To: peter dalgaard > Cc: Florian Gerber; r-devel@r-project.org > Subject: Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when > working with parent environments > > > It looks as though this happens when calculating numerical gradients: x > is reduced by eps, and fn is called; then x is increased by eps, and fn > is called again. No check is made that x has other references after the > first call to fn. > > I'll put together a patch if nobody else gets there first... > > Duncan Murdoch > > On 03/05/2019 7:13 a.m., peter dalgaard wrote: >> Yes, I think you are right. I was at first confused by the fact that after >> the optim() call, >> >>> environment(fn)$xx >> [1] 10 >>> environment(fn)$ret >> [1] 100.02 >> >> so not 9.999, but this could come from x being assigned the final value >> without calling fn. >> >> -pd >> >> >>> On 3 May 2019, at 11:58 , Duncan Murdoch wrote: >>> >>> Your results below make it look like a bug in optim(): it is not >>> duplicating a value when it should, so changes to x affect xx as well. >>> >>> Duncan Murdoch >>> >>> On 03/05/2019 4:41 a.m., Serguei Sokol wrote: On 03/05/2019 10:31, Serguei Sokol wrote: > On 02/05/2019 21:35, Florian Gerber wrote: >> Dear all, >> >> when using optim() for a function that uses the parent environment, I >> see the following unexpected behavior: >> >> makeFn <- function(){ >> xx <- ret <- NA >> fn <- function(x){ >> if(!is.na(xx) && x==xx){ >> cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="") >> return(ret) >> } >> xx <<- x; ret <<- sum(x^2) >> cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="") >> ret >> } >> fn >> } >> fn <- makeFn() >> optim(par=10, fn=fn, method="L-BFGS-B") >> # x=10, ret=100 (calculate) >> # x=10.001, ret=100.02 (calculate) >> # x=9.999, ret=100.02 (memory) >> # $par >> # [1] 10 >> # >> # $value >> # [1] 100 >> # (...) >> >> I would expect that optim() does more than 3 function evaluations and >> that the optimization converges to 0. >> >> Same problem with optim(par=10, fn=fn, method="BFGS"). >> >> Any ideas? > I don't have an answer but may be an insight. For some mysterious > reason xx is getting changed when in should not. Consider: >> fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in > x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx > <<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) >> optim(par=10, fn=fn, method="L-BFGS-B") > 1 in x,xx,ret= 10 NA NA > out x,xx,ret= 10 10 100 > 2 in x,xx,ret= 10.001 10 100 > out x,xx,ret= 10.001 10.001 100.02 > 3 in x,xx,ret= 9.999 9.999 100.02 > $par > [1] 10 > > $value > [1] 100 > > $counts > function gradient > 11 > > $convergence > [1] 0 > > $message > [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" > > At the third call, xx has value 9.999 while it should have kept the > value 10.001. > A little follow-up: if you untie the link between xx and x by replacing the expression "xx <<- x"
Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
It seems that it's an old bug that was found in some other packages, but at that time not optim: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=15958 and that Duncan Murdoch posted a patch already last Friday :) Thomas Am 06.05.2019 um 16:40 schrieb Ben Bolker: That's consistent/not surprising if the problem lies in the numerical gradient calculation step ... On 2019-05-06 10:06 a.m., Ravi Varadhan wrote: Optim's Nelder-Mead works correctly for this example. optim(par=10, fn=fn, method="Nelder-Mead") x=10, ret=100.02 (memory) x=11, ret=121 (calculate) x=9, ret=81 (calculate) x=8, ret=64 (calculate) x=6, ret=36 (calculate) x=4, ret=16 (calculate) x=0, ret=0 (calculate) x=-4, ret=16 (calculate) x=-4, ret=16 (memory) x=2, ret=4 (calculate) x=-2, ret=4 (calculate) x=1, ret=1 (calculate) x=-1, ret=1 (calculate) x=0.5, ret=0.25 (calculate) x=-0.5, ret=0.25 (calculate) x=0.25, ret=0.0625 (calculate) x=-0.25, ret=0.0625 (calculate) x=0.125, ret=0.015625 (calculate) x=-0.125, ret=0.015625 (calculate) x=0.0625, ret=0.00390625 (calculate) x=-0.0625, ret=0.00390625 (calculate) x=0.03125, ret=0.0009765625 (calculate) x=-0.03125, ret=0.0009765625 (calculate) x=0.015625, ret=0.0002441406 (calculate) x=-0.015625, ret=0.0002441406 (calculate) x=0.0078125, ret=6.103516e-05 (calculate) x=-0.0078125, ret=6.103516e-05 (calculate) x=0.00390625, ret=1.525879e-05 (calculate) x=-0.00390625, ret=1.525879e-05 (calculate) x=0.001953125, ret=3.814697e-06 (calculate) x=-0.001953125, ret=3.814697e-06 (calculate) x=0.0009765625, ret=9.536743e-07 (calculate) $par [1] 0 $value [1] 0 $counts function gradient 32 NA $convergence [1] 0 $message NULL From: R-devel on behalf of Duncan Murdoch Sent: Friday, May 3, 2019 8:18:44 AM To: peter dalgaard Cc: Florian Gerber; r-devel@r-project.org Subject: Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments It looks as though this happens when calculating numerical gradients: x is reduced by eps, and fn is called; then x is increased by eps, and fn is called again. No check is made that x has other references after the first call to fn. I'll put together a patch if nobody else gets there first... Duncan Murdoch On 03/05/2019 7:13 a.m., peter dalgaard wrote: Yes, I think you are right. I was at first confused by the fact that after the optim() call, environment(fn)$xx [1] 10 environment(fn)$ret [1] 100.02 so not 9.999, but this could come from x being assigned the final value without calling fn. -pd On 3 May 2019, at 11:58 , Duncan Murdoch wrote: Your results below make it look like a bug in optim(): it is not duplicating a value when it should, so changes to x affect xx as well. Duncan Murdoch On 03/05/2019 4:41 a.m., Serguei Sokol wrote: On 03/05/2019 10:31, Serguei Sokol wrote: On 02/05/2019 21:35, Florian Gerber wrote: Dear all, when using optim() for a function that uses the parent environment, I see the following unexpected behavior: makeFn <- function(){ xx <- ret <- NA fn <- function(x){ if(!is.na(xx) && x==xx){ cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="") return(ret) } xx <<- x; ret <<- sum(x^2) cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="") ret } fn } fn <- makeFn() optim(par=10, fn=fn, method="L-BFGS-B") # x=10, ret=100 (calculate) # x=10.001, ret=100.02 (calculate) # x=9.999, ret=100.02 (memory) # $par # [1] 10 # # $value # [1] 100 # (...) I would expect that optim() does more than 3 function evaluations and that the optimization converges to 0. Same problem with optim(par=10, fn=fn, method="BFGS"). Any ideas? I don't have an answer but may be an insight. For some mysterious reason xx is getting changed when in should not. Consider: fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) optim(par=10, fn=fn, method="L-BFGS-B") 1 in x,xx,ret= 10 NA NA out x,xx,ret= 10 10 100 2 in x,xx,ret= 10.001 10 100 out x,xx,ret= 10.001 10.001 100.02 3 in x,xx,ret= 9.999 9.999 100.02 $par [1] 10 $value [1] 100 $counts function gradient 11 $convergence [1] 0 $message [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" At the third call, xx has value 9.999 while it should have kept the value 10.001. A little follow-up: if you untie the link between xx and x by replacing the expression "xx <<- x" by "xx <<- x+0" it works as expected: > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- x+0; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) > optim(par=10, fn=fn, method="L-BFGS-B") 1 in x,xx,ret= 10 NA NA out x,xx,ret= 10 10 100 2 in x,xx,ret= 10.001
Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
On 06/05/2019 18:21, Thomas Petzoldt wrote: It seems that it's an old bug that was found in some other packages, but at that time not optim: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=15958 I think that the bug description is a little bit misleading. The bug is not in fact that "<<-" produce a reference instead of a copy (that's normal) but in fact that some C or Fortran code modifies a variable "in place" without taking care if there are some references on it or not. Serguei (just splitting hairs) __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel