Re: [Rd] Runnable R packages

2019-02-07 Thread Peter Meissner
Doesn't Rtools provide everything needed to build R packages and R on
Windows - including gcc?
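
One quick way to check whether a compiler is reachable from an R session, as a minimal sketch: `has_tool` is a hypothetical helper name, and the check only looks at the PATH, so it says nothing about whether the toolchain is complete.

```r
# Hypothetical helper: TRUE if the command `cmd` is found on the PATH.
has_tool <- function(cmd = "gcc") {
  unname(nzchar(Sys.which(cmd)))
}

has_tool("gcc")   # compiler
has_tool("make")  # build tool usually needed alongside it
```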

On Sat, Feb 2, 2019 at 22:29, Abs Spurdle wrote:

> Creating an .exe file isn't necessarily difficult.
> The main problems are that you have to write and compile the C (or other)
> files.
> Otherwise, the complexity depends on the level of inter-process
> communication that's required.
>
> Simply starting R with some initial conditions is easy.
> Even if you want to prompt the user to install missing packages, it isn't
> necessarily difficult.
>
> It would be possible to take this one step further, and write an .exe
> builder, that automates the process of creating .exe files.
> Obviously, it would require a compiler and supporting libraries.
> I have a preference for GCC, and I'm not sure if you can run GCC on Windows
> without Cygwin.
>
> I may (or may not) look into this further in a few weeks' time.
>
>
> On Sun, Feb 3, 2019 at 2:27 AM Barry Rowlingson <
> b.rowling...@lancaster.ac.uk> wrote:
>
> > I don't think anyone denies that you *could* make an EXE to do all
> > that. The discussion is on *how easy* it should be to create a single
> > file that contains an initial "main" function plus a set of bundled
> > code (potentially as a package) and which when run will install its
> > package code (which is contained in itself; it's not in a repo),
> > install dependencies, and run the main() function.
> >
> > Now, I could build a self-executable shar file that bundled a package
> > together with a script to do all the above. But if there was a "RUN"
> > command in R, and a convention that a function called "foo::main"
> > would be run by `R CMD RUN foo_1.1.1.tar.gz` then it would be so much
> > easier to develop and test.
> >
> > If people think this adds value, then if they want to offer that value
> > to me as $ or £, I'd consider writing it if their total value was more
> > than my cost.
> >
> > Barry
> >

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
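
A rough sketch of what the proposed `R CMD RUN foo_1.1.1.tar.gz` convention might do, written as ordinary R functions. `R CMD RUN` does not exist in R; the helper names below are hypothetical, the entry point name "main" is only the convention suggested in the thread, and installing dependencies is left out.

```r
# Hypothetical helpers illustrating the idea; not part of R.

# Derive the package name from a source tarball such as "foo_1.1.1.tar.gz".
pkg_name_from_tarball <- function(tarball) {
  sub("_.*$", "", basename(tarball))
}

# Look up an exported function in an installed package and call it.
call_entry_point <- function(pkg, entry = "main", ..., lib.loc = NULL) {
  ns <- loadNamespace(pkg, lib.loc = lib.loc)
  f <- getExportedValue(ns, entry)
  stopifnot(is.function(f))
  f(...)
}

# Install the tarball into a throwaway library, then run its entry point.
# Dependencies are NOT resolved here (repos = NULL installs only the file).
run_tarball <- function(tarball, entry = "main", ...) {
  lib <- tempfile("runlib")
  dir.create(lib)
  install.packages(tarball, lib = lib, repos = NULL, type = "source")
  call_entry_point(pkg_name_from_tarball(tarball), entry, ..., lib.loc = lib)
}

# run_tarball("foo_1.1.1.tar.gz")  # would install foo and call foo::main()
```

Installing into a temporary library keeps the user's main library untouched, at the cost of reinstalling on every run.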


[Rd] Bug Report: read.table with UTF-8 encoded file imports infinity symbol as Integer 8

2019-02-07 Thread David Byrne
Bug
Using read.table(file, encoding="UTF-8") to import a UTF-8 encoded
file containing the infinity symbol ('∞') results in the infinity
symbol being imported as the number 8. Other Unicode characters seem
unaffected, for example Zhe (ж).

Expected Behavior:
The imported data.frame should represent the infinity symbol as
'Inf' so that normal mathematical operations can be performed on it.
Stack Overflow Post:
I posted a question on Stack Overflow, where another member was able
to reproduce the issue. The question can be found
at:
https://stackoverflow.com/questions/54522196/r-read-table-with-utf-8-encoded-file-reads-infinity-symbol-as-8-int

Method to Reproduce - 1:
A simple way to reproduce this issue is to use RStudio. In the
console, type the following:
> read.table(text=" ∞", encoding="UTF-8")

The result is a data.frame with a single value of 8.

Repeating the same with ж gives the expected result.
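
The same check can be written so that it does not depend on how the console or editor encodes the typed character. This is only a sketch: `check_infinity_read` is a hypothetical name, and the "affected" flag simply tests for the integer 8 described above.

```r
# Hypothetical check: read the infinity symbol (U+221E, given as an escape)
# through read.table and report what came back.
check_infinity_read <- function() {
  res <- read.table(text = "\u221E", encoding = "UTF-8",
                    stringsAsFactors = FALSE)
  value <- res[[1]]
  list(
    class       = class(value),                     # "integer" when affected
    affected    = identical(value, 8L),             # TRUE if read as 8
    utf8_locale = isTRUE(l10n_info()[["UTF-8"]])    # is the session UTF-8?
  )
}

str(check_infinity_read())
```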

Method to Reproduce - 2:
Create a .csv file containing the infinity and Zhe characters (I have
attached the file for convenience; hopefully it is not rejected by your
email service). Launch an interactive session using

> r --vanilla

Enter the following statement taking care to replace the
 with the appropriate one:

> read.table("/unicode_chars.csv", sep=",", encoding="UTF-8")


This should result in a two element data.frame; the first being the
incorrect value of 8 with an additional  and the second the
correct value of Zhe.

Note the additional  prefixed to the front of the '8'. This
appears to be a hidden character for the purposes of letting editors
know the encoding. The following link has some explanation; however, it
states that this is caused by Excel. I created the file with
Notepad, not Excel.

https://medium.freecodecamp.org/a-quick-tale-about-feff-the-invisible-character-cd25cd4630e7
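
Whether a given file starts with the UTF-8 byte order mark (bytes EF BB BF) can be checked directly, independent of the editor that wrote it. A minimal sketch; `has_utf8_bom` is a hypothetical helper name.

```r
# Hypothetical helper: TRUE if the file begins with the UTF-8 BOM (EF BB BF).
has_utf8_bom <- function(path) {
  first <- readBin(path, what = "raw", n = 3L)
  identical(first, as.raw(c(0xEF, 0xBB, 0xBF)))
}

# Example with two temporary files, one with and one without a BOM.
with_bom <- tempfile(fileext = ".csv")
no_bom   <- tempfile(fileext = ".csv")
writeBin(c(as.raw(c(0xEF, 0xBB, 0xBF)), charToRaw("x,y\n")), with_bom)
writeBin(charToRaw("x,y\n"), no_bom)

has_utf8_bom(with_bom)
has_utf8_bom(no_bom)
```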

System Details:
OS:
> Windows 10.0.17134 Build 17134


R Version:
> platform       x86_64-w64-mingw32
> arch           x86_64
> os             mingw32
> system         x86_64, mingw32
> status
> major          3
> minor          4.1
> year           2017
> month          06
> day            30
> svn rev        72865
> language       R
> version.string R version 3.4.1 (2017-06-30)
> nickname       Single Candle


Re: [Rd] Bug Report: read.table with UTF-8 encoded file imports infinity symbol as Integer 8

2019-02-07 Thread peter dalgaard
This doesn't seem to be happening on macOS, in either Terminal or RStudio
(R 3.5.1, R-devel, R-patched). So it is probably Windows-specific.

-pd


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] Bug Report: read.table with UTF-8 encoded file imports infinity symbol as Integer 8

2019-02-07 Thread David Byrne
I can confirm that it doesn't happen on Ubuntu 18.04.1, so Peter is
most likely correct; it looks like it's Windows-specific.




Re: [Rd] Bug Report: read.table with UTF-8 encoded file imports infinity symbol as Integer 8

2019-02-07 Thread Daniel Possenriede
There seems to be something odd with "∞" on Windows (and not only with
read.table).
In the native encoding (CP-1252 in my case), "∞" gets converted to "8":

x <-  "∞"
Encoding(x)
#> [1] "unknown"
print(x)
#> [1] "8"
charToRaw(x)
#> [1] 38

"∞" is indeed "8"

identical(x, "8")
#> [1] TRUE

Everything seems fine if  "∞" is UTF-8 encoded.

y <- "\u221E"
Encoding(y)
#> [1] "UTF-8"
print(y)
#> [1]  "∞"
charToRaw(y)
#> [1] e2 88 9e

Unless the string is converted back to native encoding.

format(y)
#> [1] "8"

This ought to be "", equivalently to

format("∝")
#> [1] ""
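
A portable way to ask whether a character can be represented in a given single-byte encoding is to attempt the conversion with iconv and look for NA. This is only a sketch: `is_representable` is a hypothetical helper, "latin1" stands in for CP-1252, and the result relies on the platform's iconv returning NA for non-convertible input. It does not reproduce the Windows behaviour above, where the conversion appears to substitute a look-alike "8" instead of failing.

```r
# Hypothetical helper: TRUE if every character of x can be converted
# from UTF-8 to the target encoding without loss.
is_representable <- function(x, to = "latin1") {
  !is.na(iconv(enc2utf8(x), from = "UTF-8", to = to))
}

is_representable("\u221E")  # infinity symbol
is_representable("\u00E9")  # e with acute accent
is_representable("8")
```

Keeping such strings in UTF-8 (for example by writing them as "\u221E" escapes) avoids the round trip through the native encoding.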

Session Info:

si <- sessionInfo()
si$running
#> [1] "Windows 10 x64 (build 17134)"
si$R.version$version.string
#> [1] "R version 3.5.2 (2018-12-20)"
si$locale
#> [1] "LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252"






Re: [Rd] Bug Report: read.table with UTF-8 encoded file imports infinity symbol as Integer 8

2019-02-07 Thread Paul McQuesten
Windows Notepad prefixes UTF-8 files with a Byte Order Mark (\UFEFF).
Per https://en.wikipedia.org/wiki/Byte_order_mark, this is permitted in
UTF-8, but not required.
I suppose that there are other Windows programs which do likewise (in
addition to Excel and Notepad).
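
When reading such a file in R, the connection encoding "UTF-8-BOM" can be passed as fileEncoding so that a leading byte order mark is dropped. A minimal sketch using a temporary file with ASCII content; the file and variable names are made up for the example.

```r
# Write a small CSV that starts with the UTF-8 BOM (EF BB BF).
path <- tempfile(fileext = ".csv")
writeBin(c(as.raw(c(0xEF, 0xBB, 0xBF)), charToRaw("x,y\n")), path)

# fileEncoding = "UTF-8-BOM" asks the connection to strip the BOM on read.
clean <- read.table(path, sep = ",", fileEncoding = "UTF-8-BOM",
                    stringsAsFactors = FALSE)
clean$V1
```

Note that fileEncoding re-encodes the input while reading, whereas the encoding argument used in the bug report only marks the resulting strings.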

"The Unicode Standard permits the BOM in UTF-8
but does not require or recommend its use. Byte order has no meaning in
UTF-8, so its only use in UTF-8 is to signal at the start that the text
stream is encoded in UTF-8, or that it was converted to UTF-8 from a stream
that contained an optional BOM. The standard also does not recommend
removing a BOM when it is there, so that round-tripping between encodings
does not lose information, and so that code that relies on it continues to
work. The IETF recommends that if a protocol either (a) always uses UTF-8,
or (b) has some other way to indicate what encoding is being used, then it
"SHOULD forbid use of U+FEFF as a signature.""

On Thu, Feb 7, 2019 at 8:10 AM Daniel Possenriede 
wrote:

> There seems to be something odd with "∞" on Windows (and not only with
> read.table)
> In native encoding (cp-1252 in my case), "∞" gets converted to "8"
>
> x <-  "∞"
> Encoding(x)
> #> [1] "unknown"
> print(x)
> #> [1] "8"
> charToRaw(x)
> #> [1] 38
>
> "∞" is indeed "8"
>
> identical(x, "8")
> #> [1] TRUE
>
> Everything seems fine if  "∞" is UTF-8 encoded.
>
> y <- "\u221E"
> Encoding(y)
> #> [1] "UTF-8"
> print(y)
> #> [1]  "∞"
> charToRaw(y)
> #> [1] e2 88 9e
>
> Unless the string is converted back to native encoding.
>
> format(y)
> #> [1] "8"
>
> This ought to be "", equivalently to
>
> format("∝")
> #> [1] ""
>
> Session Info:
>
> si <- sessionInfo()
> si$running
> #> [1] "Windows 10 x64 (build 17134)"
> si$R.version$version.string
> #> [1] "R version 3.5.2 (2018-12-20)"
> si$locale
> #> [1]
>
> "LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252"
>
>
>
> On Thu, Feb 7, 2019 at 14:33, David Byrne <
> david.byrne...@gmail.com> wrote:
>
> > I can confirm that it doesn't happen on Ubuntu 18.04.1 so Peter is
> > most likely correct; it looks like its Windows specific.
> >
> > On Thu, 7 Feb 2019 at 12:55, peter dalgaard  wrote:
> > >
> > > This doesn't seem to be happening on MacOS, neither in Terminal nor
> > RStudio, (R 3.5.1, R-devel, R-patched). So probably Windows specific.
> > >
> > > -pd
> > >
> > > > On 7 Feb 2019, at 11:17 , David Byrne 
> > wrote:
> > > >
> > > > Bug
> > > > Using read.table(file, encoding="UTF-8") to import a UTF-8 encoded
> > > > file containing the infinity symbol (' ∞ ') results in the infinity
> > > > symbol imported as the number 8. Other Unicode characters seem
> > > > unaffected, example, Zhe: ж
> > > >
> > > > Expected Behavior:
> > > > The imported data.frame should represent the infinity symbol as the
> > > > expected 'Inf' so that normal mathematical operations can be
> processed
> > > >
> > > > Stack Overflow Post:
> > > > I created a question on Stack Overflow where one other member was
> able
> > > > to reproduce the same issues I was having. This question can be found
> > > > at:
> > > >
> >
> https://stackoverflow.com/questions/54522196/r-read-table-with-utf-8-encoded-file-reads-infinity-symbol-as-8-int
> > > >
> > > > Method to Reproduce - 1:
> > > > A simple method to reproduce this issues is to use R-Studio: In the
> > > > console, type the following:
> > > >> read.table(text=" ∞", encoding="UTF-8")
> > > >
> > > > The result should be a data.frame with a single value of '8'
> > > >
> > > > Repeating the same with ж Results in correct expected behavior
> > > >
> > > > Method to Reproduce - 2:
> > > > Create a .csv file containing the infinity and Zhe characters (I have
> > > > attached the file for convenience, hopefully it is no rejected by
> your
> > > > email service). Launch an interactive session using
> > > >
> > > >> r --vanilla
> > > >
> > > > Enter the following statement taking care to replace the
> > > >  with the appropriate one:
> > > >
> > > >> read.table("/unicode_chars.csv", sep=",",
> > encoding="UTF-8")
> > > >
> > > >
> > > > This should result in a two element data.frame; the first being the
> > > > incorrect value of 8 with an additional  and the second the
> > > > correct value of Zhe.
> > > >
> > > > Note the additional  prefixed to the front of the '8'. This
> > > > appears to be a hidden character for the purposes of letting editors
> > > > know the encoding. The following link has some explanation however,
> it
> > > > state