Re: [Rd] encoding argument of source() in 3.5.0

2018-06-05 Thread Tomas Kalibera

Thanks for the report, fixed in R-devel (74848).

Best
Tomas

On 06/04/2018 02:41 PM, NELSON, Michael wrote:


On R 3.5.0 (Mac)

The issue appears when using the default (libcurl) method and specifying the
encoding.

Note that using method='internal' in conjunction with encoding causes a
segfault (it works when encoding is not set).

urlR <- "http://home.versanet.de/~s-berman/source2.R"
# works
url_default <- url(urlR)
scan(url_default, "")
# Read 7 items
# [1] "source.test2"   "<-" "function()" "{"  "print(\"Non-ascii:" "äöüß\")"
# [7] "}"

url_default_en <- url(urlR, encoding = "UTF-8")
scan(url_default_en, "")
# Read 0 items
# character(0)
url_internal <- url(urlR, method = 'internal')
scan(url_internal, "")
# Read 7 items
# [1] "source.test2"   "<-" "function()" "{"  "print(\"Non-ascii:" "äöüß\")"
# [7] "}"

url_internal_en <- url(urlR, encoding = "UTF-8", method = 'internal')
#scan(url_internal_en, "")
#*** caught segfault ***
#  address 0x0, cause 'memory not mapped'

url_libcurl <- url(urlR, method = 'libcurl')
scan(url_libcurl, "")
# Read 7 items
# [1] "source.test2"   "<-" "function()" "{"  "print(\"Non-ascii:" "äöüß\")"
# [7] "}"
url_libcurl_en <- url(urlR, encoding = "UTF-8", method = 'libcurl')
scan(url_libcurl_en, "")
# Read 0 items
# character(0)


Michael


From: R-devel [r-devel-boun...@r-project.org] on behalf of Stephen Berman 
[stephen.ber...@gmx.net]
Sent: Monday, 4 June 2018 7:26 PM
To: Martin Maechler
Cc: R-devel
Subject: Re: [Rd] encoding argument of source() in 3.5.0

On Mon, 4 Jun 2018 10:44:11 +0200 Martin Maechler wrote:

peter dalgaard on Sun, 3 Jun 2018 23:51:24 +0200 writes:

 > Looks like this actually comes from readLines(), nothing
 > to do with source() as such: In current R-devel (still):

 >> f <- file("http://home.versanet.de/~s-berman/source2.R", encoding="UTF-8")
 >> readLines(f)
 > character(0)
 >> close(f)
 >> f <- file("http://home.versanet.de/~s-berman/source2.R")
 >> readLines(f)
 > [1] "source.test2 <- function() {"   "print(\"Non-ascii: äöüß\")"
 > [3] "}"

 > -pd

and that's not even readLines(), but rather how exactly the
connection is defined [even in your example above]

   > urlR <- "http://home.versanet.de/~s-berman/source2.R"
   > readLines(urlR, encoding="UTF-8")
   [1] "source.test2 <- function() {"   "print(\"Non-ascii: äöüß\")"
   [3] "}"
   > f <- file(urlR, encoding = "UTF-8")
   > readLines(f)
   character(0)

and the same behavior with scan()  instead of readLines() :


scan(urlR,"") # works

Read 7 items
[1] "source.test2"   "<-" "function()" "{"
[5] "print(\"Non-ascii:" "äöüß\")"            "}"

scan(f,"") # fails

Read 0 items
character(0)

So it seems as if the bug is in the file() [or url()] C code.
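
For readers following along: the two `encoding` arguments compared above do different jobs, which is why the two calls take different code paths. Passed to readLines(), it only marks the returned strings; passed to file() or url(), it asks the connection to re-encode the input while reading. A minimal sketch, assuming a local temporary file as a stand-in for the remote source2.R so that no network access is needed (`txt`, `tmp`, `marked` and `converted` are illustrative names, not from the thread):

```r
# Illustrative stand-in for the remote file: one line of R code with
# non-ASCII characters, written to disk as UTF-8 bytes.
txt <- "print(\"Non-ascii: \u00e4\u00f6\u00fc\u00df\")"
tmp <- tempfile(fileext = ".R")
writeLines(enc2utf8(txt), tmp, useBytes = TRUE)

# encoding= on readLines() converts nothing; it only marks the
# returned strings as UTF-8.
marked <- readLines(tmp, encoding = "UTF-8")
Encoding(marked)

# encoding= on the connection asks R to re-encode the input from UTF-8
# to the native encoding while reading. This is the code path that
# returned character(0) for http:// URLs in the reports above.
con <- file(tmp, encoding = "UTF-8")
converted <- readLines(con)
close(con)
```

On a local file both variants read the line; the reports in this thread concern the second variant only when the connection points at a remote URL.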

Yes, the problem seems to be restricted to loading files from a
(non-local) URL; i.e. this works fine on my computer:

   > source("file:///home/steve/prog/R/source2.R", encoding="UTF-8")

Also, I noticed this works too:

   > read.table("http://home.versanet.de/~s-berman/table2", encoding="UTF-8", skip=1)

where (if I read the source correctly) using `skip=1' makes read.table()
call readLines().  (The read.table() invocation also works without
`skip'.)


But then we also have to consider Windows .. where I think most changes have
happened during the  R-3.4.4 --> R-3.5.0  transition.

Yes, please.  I need (or at least it would be convenient) to be able to
load R code containing non-ascii characters from the web under
MS-Windows.
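
Until a fixed R is available, one possible workaround (a sketch only, not something proposed in this thread) builds on the observation above that readLines() still works when the URL is passed directly: read the lines, mark them as UTF-8, then parse and evaluate them. `source_utf8` is a hypothetical helper name, and the demonstration uses a local temporary copy of the test file in place of the URL:

```r
# Hypothetical helper: read code from a file name or URL without a
# connection-level encoding, then parse and evaluate it.
source_utf8 <- function(path_or_url, envir = parent.frame()) {
  # encoding= here only marks the strings as UTF-8; nothing is re-encoded.
  lines <- readLines(path_or_url, encoding = "UTF-8", warn = FALSE)
  exprs <- parse(text = lines, encoding = "UTF-8", keep.source = FALSE)
  # eval() on an expression vector evaluates each expression in turn.
  eval(exprs, envir = envir)
  invisible(NULL)
}

# Demonstration on a local copy of the test file from this thread.
code <- c("source.test2 <- function() {",
          "print(\"Non-ascii: \u00e4\u00f6\u00fc\u00df\")",
          "}")
tmp <- tempfile(fileext = ".R")
writeLines(enc2utf8(code), tmp, useBytes = TRUE)

env <- new.env()
source_utf8(tmp, envir = env)
```

Unlike source(), this sketch does not echo expressions or keep source references; it is only meant to show that the connection-level `encoding` can be avoided.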

Steve Berman

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] histoRicalg -- project to document older methods used by R and transfer knowledge

2018-06-05 Thread J C Nash

After some thought, I decided r-devel was probably the best of the R lists
for this item. Do feel free to share, as the purpose is to improve documentation
and identify potential issues.

John Nash



The R Consortium has awarded some modest funding for "histoRicalg",
a project to document and transfer knowledge of some older algorithms
used by R and by other computational systems. These older codes
are mainly in Fortran, but some are in C, with the original implementations
possibly in other programming languages. My efforts
were prompted by finding some apparent bugs in codes, which could be either
from the original programs or else the implementations. Two examples
in particular -- in nlm() and in optim::L-BFGS-B -- gave impetus
to the project.

As a first task, I am hoping to establish a "Working Group on
Algorithms Used in R" to identify and prioritize issues and to
develop procedures for linking older and younger workers to enable
the transfer of knowledge. Expressions of interest are welcome,
either to me (nashjc _at_ uottawa.ca) or to the mailing list
(https://lists.r-consortium.org/g/rconsortium-project-histoRicalg).
A preliminary web-site is at https://gitlab.com/nashjc/histoRicalg.

While active membership of the Working Group is desirable, given
the nature of this project, I anticipate that most members will
contribute mainly by providing timely and pertinent ideas. Some
may not even be R users, since the underlying algorithms are used
by other computing systems and the documentation effort has many
common features. We will also need participation of younger
workers willing to learn about the methods that underlie the
computations in R.



Re: [Rd] Byte-compilation failure on different architectures / low-memory systems

2018-06-05 Thread Dirk Eddelbuettel


On 4 June 2018 at 20:06, Tomas Kalibera wrote:
| thanks for the report. Access to the test system is not necessary, the 
| memory requirements of the byte-code compiler are usually 
| platform-independent and specifically with this package I can reproduce 
| that they are very high. We'll have a look at what we can do; certainly there 
| should at least be a way to recover and use the uncompiled version when 
| memory allocation fails, this is already done by the JIT but not when 
| compiling during installation. Before R or the package is patched, the 
| only workaround for memory constrained systems is probably to disable 
| byte-compilation of this package, as I read you are doing already.

Yes. And as a shortcut, we just turned it off unconditionally, i.e. on all
build architectures. That worked fine, as per

   https://buildd.debian.org/status/package.php?p=fbasics

where it has been built everywhere we have R 3.5.0 (some 20 or so platforms).

The fix you suggest sounds ideal: if possible recover, and maybe WARN.
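
As an illustration of the "recover, and maybe WARN" idea at the level of a single function (a sketch only; this is not how R CMD INSTALL is implemented, and `compile_or_fallback` is a hypothetical name), one could wrap compiler::cmpfun() so that a failed compilation falls back to the uncompiled function:

```r
library(compiler)

# Hypothetical helper: try to byte-compile a function; if compilation
# signals an error, warn and return the original, uncompiled function.
compile_or_fallback <- function(f) {
  tryCatch(
    cmpfun(f),
    error = function(e) {
      warning("byte-compilation failed, using uncompiled version: ",
              conditionMessage(e), call. = FALSE)
      f
    }
  )
}

add_one <- compile_or_fallback(function(x) x + 1)
add_one(1)
```

This assumes the failure surfaces as an ordinary R error that tryCatch() can intercept; severe memory exhaustion may not always be recoverable this way.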

Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
