Re: [Rd] R problems with lapack with gfortran

2019-05-06 Thread Steve Kargl
On Sat, May 04, 2019 at 06:05:48PM +0200, peter dalgaard wrote:
> 
> - figure out Fortran2003 specification for C/Fortran interoperability
> -- this _sounds_ like the right solution, but I don't think many
> understand how to use it and what is implied (in particular, will it
> require making changes to LAPACK itself?)

This is probably the best solution as it should allow portability
to any Fortran processor that supports F2003 or newer standard.
See the gtk-fortran project.  It has a python program that was
used to generate the needed ISO C interfaces.

-- 
Steve

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] read.table() fails with https in R 3.6 but not in R 3.5

2019-05-06 Thread Stephen Berman
In versions of R prior to 3.6.0 the following invocation succeeds,
returning the data frame shown:

> read.table("https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text";,
>  header=TRUE)
   Dekade   Anzahl
11900 11467254
21910 13023370
31920 13434601
41930 13296355
51940 12121250
61950 13191131
71960 10587420
81970 10944129
91980 11279439
10   1990 12052652

But in version 3.6.0 it fails:

> read.table("https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text";,
>  header=TRUE)
Error in file(file, "rt") :
  cannot open the connection to 
'https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text'
In addition: Warning message:
In file(file, "rt") :
  cannot open URL 
'https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text': 
HTTP status was '403 Forbidden'

The table at this URL is generated by a query processor and the same
failure happens in 3.6.0 with other queries at this website.  This
website does not appear to serve data via http: replacing https by http
in the above gives the same results, and in 3.6.0 the error message
contains the URL with http but in the warning message the URL is with
https.  I have also tried a few other websites that serve
(non-generated) tabular data via https
(e.g. 
https://graphchallenge.s3.amazonaws.com/synthetic/gc3/Theory-16-25-81-Bk.tsv)
and with these read.table() succeeds in 3.6.0, so the problem isn't
https in general.  Maybe it has to do with the page being generated
rather than static?  There's only one reference to https in the 3.6.0
NEWS, concerning libcurl; I can't tell if it's relevant.

In case it matters, this is with R packaged for openSUSE, and I've found
the above difference between 3.5 and 3.6 on both openSUSE Leap 15.0 and
openSUSE Tumbleweed.

Steve Berman

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R problems with lapack with gfortran

2019-05-06 Thread Steve Kargl
On Sat, May 04, 2019 at 06:42:47PM +0200, Thomas König wrote:
> 
> > - figure out Fortran2003 specification for C/Fortran interoperability
> > -- this _sounds_ like the right solution, but I don't think many
> > understand how to use it and what is implied (in particular, will
> > it require making changes to LAPACK itself?)
> 
> That would actually be fairly easy.  If you declare the subroutines
> BIND(C), as in
> 
>subroutine foo(a,b) BIND(C,name="foo_")
>real a
>character*1 b
>end
> 
> you will get the calling signature that you already have in your C
> sources.
> 
> This also has the advantage of being standards compliant, and would be
> probably be the preferred method.
> 

With the caveat that one may need to use the VALUE attribute to
account for pass-by-value vs pass-by-reference.

-- 
Steve

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Porting R example datasets to GNU Octave

2019-05-06 Thread Andrew Janke
Hi, R Developers,

I'm interested in porting the R example datasets package to GNU Octave
and Matlab. Would you have objections to my doing so?

This would involve transforming the example data and metadata into a
format that Octave understands, and porting all of the datasets' Example
code pieces to Octave M-code. (This would require no work on your part;
it'd be my project.)

I think this would be a benefit to the scientific programming community.
In addition to helping Octave users, having code for identical example
data sets in both languages would serve as a Rosetta Stone for not only
users moving from R to Octave, but for users coming from Octave or
Matlab to R.

Since R's datasets package is GPL, I think I'd be within my rights to
just do this. But I wanted to ask first, to make sure I didn't ruffle
any feathers. I would include documentation indicating that R is the
original source (well, intermediate source) for these datasets, and have
links pointing back to R's documentation.

Cheers,
Andrew Janke

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Incorrect date range in austres example dataset?

2019-05-06 Thread Andrew Janke
Hi, R developers,

It seems there might be an issue with the "austres" example dataset in
the "datasets" package. The description in austres.Rd says it's
"measured quarterly from March 1971 to March 1994". But there are only
89 observations in the data as defined in the source code. By my count,
that only brings you up to about March 1993. Is there an issue with the
data transcription, or the Description?

I'm looking at the source code from the R 3.6.0 source distribution.

Cheers,
Andrew Janke

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Bug in function boxplot's axis labeling

2019-05-06 Thread Ulrike Grömping

Dear developeRs,

I appreciate that boxplot now labels the axes with variable names per 
default. However, with argument "horizontal=TRUE" (which I always use), 
the default axis labels are mixed up, as can e.g. be seen with


require(boot)
boxplot(time ~ poison, poisons, horizontal=TRUE)

The correct labels would be obtained by

boxplot(time ~ poison, poisons, horizontal=TRUE, xlab="time", ylab="poison")

Best, Ulrike

--
##
## Prof. Ulrike Groemping
## FB II
## Beuth University of Applied Sciences Berlin
##
## prof.beuth-hochschule.de/groemping
## Phone: +49(0)30 4504 5127
## Fax:   +49(0)30 4504 66 5127
## Home office: +49(0)30 394 04 863
##

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R problems with lapack with gfortran

2019-05-06 Thread Tomas Kalibera

On 5/4/19 6:49 PM, Steve Kargl wrote:

On Sat, May 04, 2019 at 06:42:47PM +0200, Thomas König wrote:

- figure out Fortran2003 specification for C/Fortran interoperability
-- this _sounds_ like the right solution, but I don't think many
understand how to use it and what is implied (in particular, will
it require making changes to LAPACK itself?)

That would actually be fairly easy.  If you declare the subroutines
BIND(C), as in

subroutine foo(a,b) BIND(C,name="foo_")
real a
character*1 b
end

you will get the calling signature that you already have in your C
sources.

This also has the advantage of being standards compliant, and would be
probably be the preferred method.


With the caveat that one may need to use the VALUE attribute to
account for pass-by-value vs pass-by-reference.


This seems clean solution, but as I said before not easy, because 
currently the tradition is to call the Fortran interface directly from C 
(not via any C wrappers). This means one could not substitute 
LAPACK/BLAS at dynamic linking time, unless all LAPACK/BLAS 
implementations agreed on such a C interface. Now the substitution is 
based on the original Fortran interface.


In case of R, if we only used the included reference BLAS/LAPACK, we 
could do this, define our wrappers, say "c_dgemm" for "dgemm", change R 
to call via that interface, ask maintainers of all packages to change 
their code to call via their interface, and this should work with all 
Fortran 2003 compilers.


But, R is often used also with optimized BLAS/LAPACK implementations 
that can be substituted at dynamic linking time. And there we could do 
nothing at R level to help: we cannot generate such wrappers for an 
existing LAPACK/BLAS implementation (we don't have the source code, the 
compiler, etc).


It would be certainly a good thing if BLAS/LAPACK, with all 
implementation and uses, switched to a way that is compliant with 
current Fortran standard. But this should best start with the reference 
BLAS/LAPACK, continue with other BLAS/LAPACK implementations, and then 
with systems using those libraries, including R and its packages. 
Unless/before this happens, it would really be great if we could still 
use gfortran to build and use this fundamental software library.


Best
Tomas

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] read.table() fails with https in R 3.6 but not in R 3.5

2019-05-06 Thread Ralf Stubner
On 04.05.19 19:04, Stephen Berman wrote:
> In versions of R prior to 3.6.0 the following invocation succeeds,
> returning the data frame shown:
> 
>> read.table("https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text";,
>>  header=TRUE)
>Dekade   Anzahl
> 11900 11467254
> 21910 13023370
> 31920 13434601
> 41930 13296355
> 51940 12121250
> 61950 13191131
> 71960 10587420
> 81970 10944129
> 91980 11279439
> 10   1990 12052652
> 
> But in version 3.6.0 it fails:
> 
>> read.table("https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text";,
>>  header=TRUE)
> Error in file(file, "rt") :
>   cannot open the connection to 
> 'https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text'
> In addition: Warning message:
> In file(file, "rt") :
>   cannot open URL 
> 'https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text': 
> HTTP status was '403 Forbidden'

I can reproduce the behavior on Debian using the CRAN supplied package
for R 3.6.0. Trying to read the page with 'curl' produces also a 403
error plus some HTML text (in German) explaining that I am treated as a
'robot' due to the supplied User-Agent (here: curl/7.52.1). One
suggested solution is to adjust that value which does solve the issue:

 > options(HTTPUserAgent='mozilla')
>
read.table("https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text";,
header=TRUE)
   Dekade   Anzahl
11900 11467254
21910 13023370
31920 13434601
41930 13296355
51940 12121250
61950 13191131
71960 10587420
81970 10944129
91980 11279439
10   1990 12052652

Other solutions are to simulate a login or to get in touch with DWDS
directly.

Greetings
Ralf

-- 
Ralf Stubner
Senior Software Engineer / Trainer

daqana GmbH
Dortustraße 48
14467 Potsdam

T: +49 331 23 61 93 11
F: +49 331 23 61 93 90
M: +49 162 20 91 196
Mail: ralf.stub...@daqana.com

Sitz: Potsdam
Register: AG Potsdam HRB 27966
Ust.-IdNr.: DE300072622
Geschäftsführer: Dr.-Ing. Stefan Knirsch, Prof. Dr. Dr. Karl-Kuno Kunze



signature.asc
Description: OpenPGP digital signature
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Porting R example datasets to GNU Octave

2019-05-06 Thread Dirk Eddelbuettel


On 5 May 2019 at 10:47, Andrew Janke wrote:
| I'm interested in porting the R example datasets package to GNU Octave
| and Matlab. Would you have objections to my doing so?

You don't even have to ask...

[...]

| Since R's datasets package is GPL, I think I'd be within my rights to
| just do this. But I wanted to ask first, to make sure I didn't ruffle
| any feathers. I would include documentation indicating that R is the
| original source (well, intermediate source) for these datasets, and have
| links pointing back to R's documentation.

That is the right way to do that. Respect both copyright (citing and
referencing source) and licensing (by picking a license compatible with GPL 2
or later; many of us just prefer to stick to GPL which Octave uses too).

Dirk, in no way speaking for R Core but just handing out his $0.02

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Error in glm(..., family=quasi(..., variance=list(...)))

2019-05-06 Thread Martin Maechler
> Wollschlaeger, Daniel 
> on Fri, 26 Apr 2019 15:13:36 + writes:

> In a glm() call using a quasi() family, one may define a custom variance 
function in the form of a "list containing components varfun, validmu, 
dev.resids, initialize and name" (quoting the help page for family). In trying 
to do so, I run into the following issue that I have not seen discussed 
previously:

> x  <- runif(1000, min=0, max=1)
> y  <- x + rnorm(1000, mean=0, sd=1)*x^(3/4)
> vf <- function(mu) { abs(mu)^(3/4) }
> vm <- function(mu) { rep(TRUE, length(mu)) }
> dr <- function(y, mu, wt) { (y-mu)^2 }
> it <- expression({ n <- rep.int(1, nobs); mustart <- y })
> glm(y ~ x, family=quasi(link="identity", variance=list(varfun=vf, 
validmu=vm, dev.resids=dr, initialize=it, name="custom")))

> This gives "Error in switch(vtemp, constant = { : EXPR must be a length 1 
vector"
 > from line 576 in file family.R
 > (https://github.com/wch/r-source/blob/5a156a0865362bb8381dcd69ac335f5174a4f60c/src/library/stats/R/family.R#L576).
 > I believe this is due to line 573 "vtemp <- substitute(variance)" and 574 
 > "if (!is.character(vtemp)) vtemp <- deparse(vtemp)" where vtemp becomes a 
 > length 2 character vector because, by default, deparse() breaks lines at 
 > width.cutoff = 60L characters. In stepping through quasi() during debug, 
 > setting vtemp <- (vtemp, collapse=" ") on line 576 avoids the error.

but really in this case, neither substitute() nor deparse()
should be called !

> A workaround from https://tolstoy.newcastle.edu.au/R/help/05/06/6795.html 
appears to be to define one's own complete quasi2() function with the desired 
variance function pre-stored.

> Is this known/expected, or should I file a bug?

and you *have* filed a bug report.  Thank you!
--> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17560

Note that the above R-help "workaround" goes back to 2005, and I
find it amazing this has never been taken up.

I've committed fix to this bug avoiding the misleading
substitute() and deparse() altogether in this case.

The plan is to port the bug fix also to "R 3.6.0 patched" so the
problem would be solved in the next released version of R.


> Many thanks and best regards
> Daniel

>> sessionInfo()
> R version 3.6.0 (2019-04-26)
..

Thank you again!
Martin

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R problems with lapack with gfortran

2019-05-06 Thread Tomas Kalibera
On 5/6/19 12:57 PM, Janne Blomqvist wrote:
> On Mon, May 6, 2019 at 11:55 AM Tomas Kalibera  
> wrote:
>> On 5/4/19 6:49 PM, Steve Kargl wrote:
>>> On Sat, May 04, 2019 at 06:42:47PM +0200, Thomas König wrote:
> - figure out Fortran2003 specification for C/Fortran interoperability
> -- this _sounds_ like the right solution, but I don't think many
> understand how to use it and what is implied (in particular, will
> it require making changes to LAPACK itself?)
 That would actually be fairly easy.  If you declare the subroutines
 BIND(C), as in

  subroutine foo(a,b) BIND(C,name="foo_")
  real a
  character*1 b
  end

 you will get the calling signature that you already have in your C
 sources.

 This also has the advantage of being standards compliant, and would be
 probably be the preferred method.

>>> With the caveat that one may need to use the VALUE attribute to
>>> account for pass-by-value vs pass-by-reference.
>> This seems clean solution, but as I said before not easy, because
>> currently the tradition is to call the Fortran interface directly from C
>> (not via any C wrappers). This means one could not substitute
>> LAPACK/BLAS at dynamic linking time, unless all LAPACK/BLAS
>> implementations agreed on such a C interface. Now the substitution is
>> based on the original Fortran interface.
>>
>> In case of R, if we only used the included reference BLAS/LAPACK, we
>> could do this, define our wrappers, say "c_dgemm" for "dgemm", change R
>> to call via that interface, ask maintainers of all packages to change
>> their code to call via their interface, and this should work with all
>> Fortran 2003 compilers.
>>
>> But, R is often used also with optimized BLAS/LAPACK implementations
>> that can be substituted at dynamic linking time. And there we could do
>> nothing at R level to help: we cannot generate such wrappers for an
>> existing LAPACK/BLAS implementation (we don't have the source code, the
>> compiler, etc).
>>
>> It would be certainly a good thing if BLAS/LAPACK, with all
>> implementation and uses, switched to a way that is compliant with
>> current Fortran standard. But this should best start with the reference
>> BLAS/LAPACK, continue with other BLAS/LAPACK implementations, and then
>> with systems using those libraries, including R and its packages.
>> Unless/before this happens, it would really be great if we could still
>> use gfortran to build and use this fundamental software library.
> Hi,
>
> I don't think modifying your own builtin LAPACK makes sense, as you
> mention that would make R incompatible with another LAPACK provided by
> the system. And modifying LAPACK upstream by adding BIND(C) wouldn't
> work either, as that would break all the existing Fortran code calling
> LAPACK (as LAPACK is F77-style implicit interfaces, the Fortran caller
> has no knowledge of the interface and thus it must match the compiler
> default Fortran ABI).
I meant creating another layer of functions using BIND(C), which will 
have different names (e.g. c_dgemm), and call into the original Fortran 
functions (e.g. dgemm). Fortran programs could still call into the 
original Fortran functions. Something like c_dgemm in Chapter 10.2 of 
"Numerical Computing with Modern Fortran" by Richard J. Hanson and Tim 
Hopkins, the code is available from 
https://archive.siam.org/books/ot134/Chapter10/index.php. So 
LAPACK/BLACK implementations won't break their applications by adding 
these new functions. Also indeed it would be useful to have C prototypes 
matching those interfaces distributed with BLAS/LAPACK.
> So the remaining place where this could be fixed would be in your C
> prototypes for LAPACK functions, so that they match what LAPACK
> expects. Arguably that's the correct approach anyway. I suppose for
> this task extending the GFortran -fc-prototypes option to generate C
> prototypes for external functions as well would help?

Well but the non-BIND(C) interface is different for different Fortran 
compilers, and it changed even between gfortran versions (the type of 
the lengths). So we would not be able to switch BLAS/LAPACK 
implementations anymore at dynamic linking time. We would still have to 
talk to all R package maintainers that manage packages that call 
directly to BLAS/LAPACK.

I still think -fc-prototypes would be useful, certainly it should work 
for all BIND(C) procedures (the 20190426 does not for the dgemm example 
from the book), and it would be useful even for non-BIND(C) procedures.

> AFAICS, this interface mismatch problem affects other Fortran
> compilers as well, just that by sheer luck this has worked mostly so
> far (that is, other Fortran compilers also expect a hidden string
> length argument, with no exception for length==1 strings). But with
> increasingly sophisticated interprocedural optimizations such sins can
> no longer be forgiven.

Best
Tomas



[[alterna

Re: [Rd] R problems with lapack with gfortran

2019-05-06 Thread Janne Blomqvist
On Mon, May 6, 2019 at 11:55 AM Tomas Kalibera  wrote:
>
> On 5/4/19 6:49 PM, Steve Kargl wrote:
> > On Sat, May 04, 2019 at 06:42:47PM +0200, Thomas König wrote:
> >>> - figure out Fortran2003 specification for C/Fortran interoperability
> >>> -- this _sounds_ like the right solution, but I don't think many
> >>> understand how to use it and what is implied (in particular, will
> >>> it require making changes to LAPACK itself?)
> >> That would actually be fairly easy.  If you declare the subroutines
> >> BIND(C), as in
> >>
> >> subroutine foo(a,b) BIND(C,name="foo_")
> >> real a
> >> character*1 b
> >> end
> >>
> >> you will get the calling signature that you already have in your C
> >> sources.
> >>
> >> This also has the advantage of being standards compliant, and would be
> >> probably be the preferred method.
> >>
> > With the caveat that one may need to use the VALUE attribute to
> > account for pass-by-value vs pass-by-reference.
>
> This seems clean solution, but as I said before not easy, because
> currently the tradition is to call the Fortran interface directly from C
> (not via any C wrappers). This means one could not substitute
> LAPACK/BLAS at dynamic linking time, unless all LAPACK/BLAS
> implementations agreed on such a C interface. Now the substitution is
> based on the original Fortran interface.
>
> In case of R, if we only used the included reference BLAS/LAPACK, we
> could do this, define our wrappers, say "c_dgemm" for "dgemm", change R
> to call via that interface, ask maintainers of all packages to change
> their code to call via their interface, and this should work with all
> Fortran 2003 compilers.
>
> But, R is often used also with optimized BLAS/LAPACK implementations
> that can be substituted at dynamic linking time. And there we could do
> nothing at R level to help: we cannot generate such wrappers for an
> existing LAPACK/BLAS implementation (we don't have the source code, the
> compiler, etc).
>
> It would be certainly a good thing if BLAS/LAPACK, with all
> implementation and uses, switched to a way that is compliant with
> current Fortran standard. But this should best start with the reference
> BLAS/LAPACK, continue with other BLAS/LAPACK implementations, and then
> with systems using those libraries, including R and its packages.
> Unless/before this happens, it would really be great if we could still
> use gfortran to build and use this fundamental software library.

Hi,

I don't think modifying your own builtin LAPACK makes sense, as you
mention that would make R incompatible with another LAPACK provided by
the system. And modifying LAPACK upstream by adding BIND(C) wouldn't
work either, as that would break all the existing Fortran code calling
LAPACK (as LAPACK is F77-style implicit interfaces, the Fortran caller
has no knowledge of the interface and thus it must match the compiler
default Fortran ABI).

So the remaining place where this could be fixed would be in your C
prototypes for LAPACK functions, so that they match what LAPACK
expects. Arguably that's the correct approach anyway. I suppose for
this task extending the GFortran -fc-prototypes option to generate C
prototypes for external functions as well would help?

AFAICS, this interface mismatch problem affects other Fortran
compilers as well, just that by sheer luck this has worked mostly so
far (that is, other Fortran compilers also expect a hidden string
length argument, with no exception for length==1 strings). But with
increasingly sophisticated interprocedural optimizations such sins can
no longer be forgiven.

-- 
Janne Blomqvist

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] read.table() fails with https in R 3.6 but not in R 3.5

2019-05-06 Thread Stephen Berman
On Mon, 6 May 2019 11:12:25 +0200 Ralf Stubner  wrote:

> On 04.05.19 19:04, Stephen Berman wrote:
>> In versions of R prior to 3.6.0 the following invocation succeeds,
>> returning the data frame shown:
>>
>>> read.table("https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text";,
>>> header=TRUE)
>>Dekade   Anzahl
>> 11900 11467254
>> 21910 13023370
>> 31920 13434601
>> 41930 13296355
>> 51940 12121250
>> 61950 13191131
>> 71960 10587420
>> 81970 10944129
>> 91980 11279439
>> 10   1990 12052652
>>
>> But in version 3.6.0 it fails:
>>
>>> read.table("https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text";,
>>> header=TRUE)
>> Error in file(file, "rt") :
>>   cannot open the connection to
>> 'https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text'
>> In addition: Warning message:
>> In file(file, "rt") :
>>   cannot open URL
>> 'https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text':
>> HTTP status was '403 Forbidden'
>
> I can reproduce the behavior on Debian using the CRAN supplied package
> for R 3.6.0. Trying to read the page with 'curl' produces also a 403
> error plus some HTML text (in German) explaining that I am treated as a
> 'robot' due to the supplied User-Agent (here: curl/7.52.1). One
> suggested solution is to adjust that value which does solve the issue:
>
>  > options(HTTPUserAgent='mozilla')

I confirm that works for me, too.  Thanks!  FWIW, the default value of
HTTPUserAgent in R 3.6 here is "R (3.6.0 x86_64-suse-linux-gnu x86_64
linux-gnu)", and using this (in R 3.6) fails as I reported, while the
default value of HTTPUserAgent in R 3.5 here is "R (3.5.0
x86_64-suse-linux-gnu x86_64 linux-gnu)" and using that (in R 3.5)
succeeds.  However, setting HTTPUserAgent in R 3.5 to "libcurl/7.60.0"
fails just as it does in 3.6.  It's not clear to me if this particular
website is being too restrictive or if R 3.6 should deal with it, or at
least mention the issue in NEWS or somewhere else.

Steve Berman

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments

2019-05-06 Thread Ravi Varadhan
Optim's Nelder-Mead works correctly for this example.


> optim(par=10, fn=fn, method="Nelder-Mead")
x=10, ret=100.02 (memory)
x=11, ret=121 (calculate)
x=9, ret=81 (calculate)
x=8, ret=64 (calculate)
x=6, ret=36 (calculate)
x=4, ret=16 (calculate)
x=0, ret=0 (calculate)
x=-4, ret=16 (calculate)
x=-4, ret=16 (memory)
x=2, ret=4 (calculate)
x=-2, ret=4 (calculate)
x=1, ret=1 (calculate)
x=-1, ret=1 (calculate)
x=0.5, ret=0.25 (calculate)
x=-0.5, ret=0.25 (calculate)
x=0.25, ret=0.0625 (calculate)
x=-0.25, ret=0.0625 (calculate)
x=0.125, ret=0.015625 (calculate)
x=-0.125, ret=0.015625 (calculate)
x=0.0625, ret=0.00390625 (calculate)
x=-0.0625, ret=0.00390625 (calculate)
x=0.03125, ret=0.0009765625 (calculate)
x=-0.03125, ret=0.0009765625 (calculate)
x=0.015625, ret=0.0002441406 (calculate)
x=-0.015625, ret=0.0002441406 (calculate)
x=0.0078125, ret=6.103516e-05 (calculate)
x=-0.0078125, ret=6.103516e-05 (calculate)
x=0.00390625, ret=1.525879e-05 (calculate)
x=-0.00390625, ret=1.525879e-05 (calculate)
x=0.001953125, ret=3.814697e-06 (calculate)
x=-0.001953125, ret=3.814697e-06 (calculate)
x=0.0009765625, ret=9.536743e-07 (calculate)
$par
[1] 0

$value
[1] 0

$counts
function gradient
  32   NA

$convergence
[1] 0

$message
NULL





From: R-devel  on behalf of Duncan Murdoch 

Sent: Friday, May 3, 2019 8:18:44 AM
To: peter dalgaard
Cc: Florian Gerber; r-devel@r-project.org
Subject: Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working 
with parent environments


It looks as though this happens when calculating numerical gradients:  x
is reduced by eps, and fn is called; then x is increased by eps, and fn
is called again.  No check is made that x has other references after the
first call to fn.

I'll put together a patch if nobody else gets there first...

Duncan Murdoch

On 03/05/2019 7:13 a.m., peter dalgaard wrote:
> Yes, I think you are right. I was at first confused by the fact that after 
> the optim() call,
>
>> environment(fn)$xx
> [1] 10
>> environment(fn)$ret
> [1] 100.02
>
> so not 9.999, but this could come from x being assigned the final value 
> without calling fn.
>
> -pd
>
>
>> On 3 May 2019, at 11:58 , Duncan Murdoch  wrote:
>>
>> Your results below make it look like a bug in optim():  it is not 
>> duplicating a value when it should, so changes to x affect xx as well.
>>
>> Duncan Murdoch
>>
>> On 03/05/2019 4:41 a.m., Serguei Sokol wrote:
>>> On 03/05/2019 10:31, Serguei Sokol wrote:
 On 02/05/2019 21:35, Florian Gerber wrote:
> Dear all,
>
> when using optim() for a function that uses the parent environment, I
> see the following unexpected behavior:
>
> makeFn <- function(){
>   xx <- ret <- NA
>   fn <- function(x){
>  if(!is.na(xx) && x==xx){
>  cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="")
>  return(ret)
>  }
>  xx <<- x; ret <<- sum(x^2)
>  cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="")
>  ret
>   }
>   fn
> }
> fn <- makeFn()
> optim(par=10, fn=fn, method="L-BFGS-B")
> # x=10, ret=100 (calculate)
> # x=10.001, ret=100.02 (calculate)
> # x=9.999, ret=100.02 (memory)
> # $par
> # [1] 10
> #
> # $value
> # [1] 100
> # (...)
>
> I would expect that optim() does more than 3 function evaluations and
> that the optimization converges to 0.
>
> Same problem with optim(par=10, fn=fn, method="BFGS").
>
> Any ideas?
 I don't have an answer but may be an insight. For some mysterious
 reason xx is getting changed when in should not. Consider:
> fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in
 x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx
 <<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}})
> optim(par=10, fn=fn, method="L-BFGS-B")
 1 in x,xx,ret= 10 NA NA
 out x,xx,ret= 10 10 100
 2 in x,xx,ret= 10.001 10 100
 out x,xx,ret= 10.001 10.001 100.02
 3 in x,xx,ret= 9.999 9.999 100.02
 $par
 [1] 10

 $value
 [1] 100

 $counts
 function gradient
 11

 $convergence
 [1] 0

 $message
 [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"

 At the third call, xx has value 9.999 while it should have kept the
 value 10.001.

>>> A little follow-up: if you untie the link between xx and x by replacing
>>> the expression "xx <<- x" by "xx <<- x+0" it works as expected:
>>>   > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in
>>> x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<-
>>> x+0; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}})
>>>   > optim(par=10, fn=fn, method="L-BFGS-B")
>>> 1 in x,xx,ret= 10 NA NA
>>> out x,xx,ret= 10 10 100
>>> 2 in x,xx,ret= 10.001 10 100

Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments

2019-05-06 Thread Ben Bolker


  That's consistent/not surprising if the problem lies in the numerical
gradient calculation step ...

On 2019-05-06 10:06 a.m., Ravi Varadhan wrote:
> Optim's Nelder-Mead works correctly for this example.
> 
> 
>> optim(par=10, fn=fn, method="Nelder-Mead")
> x=10, ret=100.02 (memory)
> x=11, ret=121 (calculate)
> x=9, ret=81 (calculate)
> x=8, ret=64 (calculate)
> x=6, ret=36 (calculate)
> x=4, ret=16 (calculate)
> x=0, ret=0 (calculate)
> x=-4, ret=16 (calculate)
> x=-4, ret=16 (memory)
> x=2, ret=4 (calculate)
> x=-2, ret=4 (calculate)
> x=1, ret=1 (calculate)
> x=-1, ret=1 (calculate)
> x=0.5, ret=0.25 (calculate)
> x=-0.5, ret=0.25 (calculate)
> x=0.25, ret=0.0625 (calculate)
> x=-0.25, ret=0.0625 (calculate)
> x=0.125, ret=0.015625 (calculate)
> x=-0.125, ret=0.015625 (calculate)
> x=0.0625, ret=0.00390625 (calculate)
> x=-0.0625, ret=0.00390625 (calculate)
> x=0.03125, ret=0.0009765625 (calculate)
> x=-0.03125, ret=0.0009765625 (calculate)
> x=0.015625, ret=0.0002441406 (calculate)
> x=-0.015625, ret=0.0002441406 (calculate)
> x=0.0078125, ret=6.103516e-05 (calculate)
> x=-0.0078125, ret=6.103516e-05 (calculate)
> x=0.00390625, ret=1.525879e-05 (calculate)
> x=-0.00390625, ret=1.525879e-05 (calculate)
> x=0.001953125, ret=3.814697e-06 (calculate)
> x=-0.001953125, ret=3.814697e-06 (calculate)
> x=0.0009765625, ret=9.536743e-07 (calculate)
> $par
> [1] 0
> 
> $value
> [1] 0
> 
> $counts
> function gradient
>   32   NA
> 
> $convergence
> [1] 0
> 
> $message
> NULL
> 
> 
> 
> 
> 
> From: R-devel  on behalf of Duncan Murdoch 
> 
> Sent: Friday, May 3, 2019 8:18:44 AM
> To: peter dalgaard
> Cc: Florian Gerber; r-devel@r-project.org
> Subject: Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when 
> working with parent environments
> 
> 
> It looks as though this happens when calculating numerical gradients:  x
> is reduced by eps, and fn is called; then x is increased by eps, and fn
> is called again.  No check is made that x has other references after the
> first call to fn.
> 
> I'll put together a patch if nobody else gets there first...
> 
> Duncan Murdoch
> 
> On 03/05/2019 7:13 a.m., peter dalgaard wrote:
>> Yes, I think you are right. I was at first confused by the fact that after 
>> the optim() call,
>>
>>> environment(fn)$xx
>> [1] 10
>>> environment(fn)$ret
>> [1] 100.02
>>
>> so not 9.999, but this could come from x being assigned the final value 
>> without calling fn.
>>
>> -pd
>>
>>
>>> On 3 May 2019, at 11:58 , Duncan Murdoch  wrote:
>>>
>>> Your results below make it look like a bug in optim():  it is not 
>>> duplicating a value when it should, so changes to x affect xx as well.
>>>
>>> Duncan Murdoch
>>>
>>> On 03/05/2019 4:41 a.m., Serguei Sokol wrote:
 On 03/05/2019 10:31, Serguei Sokol wrote:
> On 02/05/2019 21:35, Florian Gerber wrote:
>> Dear all,
>>
>> when using optim() for a function that uses the parent environment, I
>> see the following unexpected behavior:
>>
>> makeFn <- function(){
>>   xx <- ret <- NA
>>   fn <- function(x){
>>  if(!is.na(xx) && x==xx){
>>  cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="")
>>  return(ret)
>>  }
>>  xx <<- x; ret <<- sum(x^2)
>>  cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="")
>>  ret
>>   }
>>   fn
>> }
>> fn <- makeFn()
>> optim(par=10, fn=fn, method="L-BFGS-B")
>> # x=10, ret=100 (calculate)
>> # x=10.001, ret=100.02 (calculate)
>> # x=9.999, ret=100.02 (memory)
>> # $par
>> # [1] 10
>> #
>> # $value
>> # [1] 100
>> # (...)
>>
>> I would expect that optim() does more than 3 function evaluations and
>> that the optimization converges to 0.
>>
>> Same problem with optim(par=10, fn=fn, method="BFGS").
>>
>> Any ideas?
> I don't have an answer but may be an insight. For some mysterious
> reason xx is getting changed when in should not. Consider:
>> fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in
> x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx
> <<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}})
>> optim(par=10, fn=fn, method="L-BFGS-B")
> 1 in x,xx,ret= 10 NA NA
> out x,xx,ret= 10 10 100
> 2 in x,xx,ret= 10.001 10 100
> out x,xx,ret= 10.001 10.001 100.02
> 3 in x,xx,ret= 9.999 9.999 100.02
> $par
> [1] 10
>
> $value
> [1] 100
>
> $counts
> function gradient
> 11
>
> $convergence
> [1] 0
>
> $message
> [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"
>
> At the third call, xx has value 9.999 while it should have kept the
> value 10.001.
>
 A little follow-up: if you untie the link between xx and x by replacing
 the expression "xx <<- x"

Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments

2019-05-06 Thread Thomas Petzoldt
It seems that it's an old bug that was found in some other packages, but 
at that time not optim:


https://bugs.r-project.org/bugzilla/show_bug.cgi?id=15958

and that Duncan Murdoch posted a patch already last Friday :)

Thomas

Am 06.05.2019 um 16:40 schrieb Ben Bolker:

   That's consistent/not surprising if the problem lies in the numerical
gradient calculation step ...

On 2019-05-06 10:06 a.m., Ravi Varadhan wrote:

Optim's Nelder-Mead works correctly for this example.



optim(par=10, fn=fn, method="Nelder-Mead")

x=10, ret=100.02 (memory)
x=11, ret=121 (calculate)
x=9, ret=81 (calculate)
x=8, ret=64 (calculate)
x=6, ret=36 (calculate)
x=4, ret=16 (calculate)
x=0, ret=0 (calculate)
x=-4, ret=16 (calculate)
x=-4, ret=16 (memory)
x=2, ret=4 (calculate)
x=-2, ret=4 (calculate)
x=1, ret=1 (calculate)
x=-1, ret=1 (calculate)
x=0.5, ret=0.25 (calculate)
x=-0.5, ret=0.25 (calculate)
x=0.25, ret=0.0625 (calculate)
x=-0.25, ret=0.0625 (calculate)
x=0.125, ret=0.015625 (calculate)
x=-0.125, ret=0.015625 (calculate)
x=0.0625, ret=0.00390625 (calculate)
x=-0.0625, ret=0.00390625 (calculate)
x=0.03125, ret=0.0009765625 (calculate)
x=-0.03125, ret=0.0009765625 (calculate)
x=0.015625, ret=0.0002441406 (calculate)
x=-0.015625, ret=0.0002441406 (calculate)
x=0.0078125, ret=6.103516e-05 (calculate)
x=-0.0078125, ret=6.103516e-05 (calculate)
x=0.00390625, ret=1.525879e-05 (calculate)
x=-0.00390625, ret=1.525879e-05 (calculate)
x=0.001953125, ret=3.814697e-06 (calculate)
x=-0.001953125, ret=3.814697e-06 (calculate)
x=0.0009765625, ret=9.536743e-07 (calculate)
$par
[1] 0

$value
[1] 0

$counts
function gradient
   32   NA

$convergence
[1] 0

$message
NULL





From: R-devel  on behalf of Duncan Murdoch 

Sent: Friday, May 3, 2019 8:18:44 AM
To: peter dalgaard
Cc: Florian Gerber; r-devel@r-project.org
Subject: Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working 
with parent environments


It looks as though this happens when calculating numerical gradients:  x
is reduced by eps, and fn is called; then x is increased by eps, and fn
is called again.  No check is made that x has other references after the
first call to fn.

I'll put together a patch if nobody else gets there first...

Duncan Murdoch

On 03/05/2019 7:13 a.m., peter dalgaard wrote:

Yes, I think you are right. I was at first confused by the fact that after the 
optim() call,


environment(fn)$xx

[1] 10

environment(fn)$ret

[1] 100.02

so not 9.999, but this could come from x being assigned the final value without 
calling fn.

-pd



On 3 May 2019, at 11:58 , Duncan Murdoch  wrote:

Your results below make it look like a bug in optim():  it is not duplicating a 
value when it should, so changes to x affect xx as well.

Duncan Murdoch

On 03/05/2019 4:41 a.m., Serguei Sokol wrote:

On 03/05/2019 10:31, Serguei Sokol wrote:

On 02/05/2019 21:35, Florian Gerber wrote:

Dear all,

when using optim() for a function that uses the parent environment, I
see the following unexpected behavior:

makeFn <- function(){
   xx <- ret <- NA
   fn <- function(x){
  if(!is.na(xx) && x==xx){
  cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="")
  return(ret)
  }
  xx <<- x; ret <<- sum(x^2)
  cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="")
  ret
   }
   fn
}
fn <- makeFn()
optim(par=10, fn=fn, method="L-BFGS-B")
# x=10, ret=100 (calculate)
# x=10.001, ret=100.02 (calculate)
# x=9.999, ret=100.02 (memory)
# $par
# [1] 10
#
# $value
# [1] 100
# (...)

I would expect that optim() does more than 3 function evaluations and
that the optimization converges to 0.

Same problem with optim(par=10, fn=fn, method="BFGS").

Any ideas?

I don't have an answer but may be an insight. For some mysterious
reason xx is getting changed when in should not. Consider:

fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in

x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx
<<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}})

optim(par=10, fn=fn, method="L-BFGS-B")

1 in x,xx,ret= 10 NA NA
out x,xx,ret= 10 10 100
2 in x,xx,ret= 10.001 10 100
out x,xx,ret= 10.001 10.001 100.02
3 in x,xx,ret= 9.999 9.999 100.02
$par
[1] 10

$value
[1] 100

$counts
function gradient
 11

$convergence
[1] 0

$message
[1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"

At the third call, xx has value 9.999 while it should have kept the
value 10.001.


A little follow-up: if you untie the link between xx and x by replacing
the expression "xx <<- x" by "xx <<- x+0" it works as expected:
   > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in
x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<-
x+0; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}})
   > optim(par=10, fn=fn, method="L-BFGS-B")
1 in x,xx,ret= 10 NA NA
out x,xx,ret= 10 10 100
2 in x,xx,ret= 10.001

Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments

2019-05-06 Thread Serguei Sokol

On 06/05/2019 18:21, Thomas Petzoldt wrote:
It seems that it's an old bug that was found in some other packages, 
but at that time not optim:


https://bugs.r-project.org/bugzilla/show_bug.cgi?id=15958
I think that the bug description is a little bit misleading. The bug is 
not in fact that "<<-" produce a reference instead of a copy (that's 
normal) but in fact that some C or Fortran code modifies a variable "in 
place" without taking care if there are some references on it or not.


Serguei (just splitting hairs)

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel