Re: [Rd] Output mis-encoded on Windows w/ RGui 3.5.1 in strange case

2018-07-18 Thread Tomas Kalibera
Thanks, I can now reproduce it. It is a bug that is easy to fix, and I
will do so shortly.

FYI, it can be reproduced simply by running these two lines in RGui:

list()
encodeString("apple")

Best
Tomas

On 07/17/2018 05:16 PM, Kevin Ushey wrote:
> Sorry, I should have been more clear -- if I write the contents of
> that script to a file called 'encoding.R' and source that, then I see
> the reported behavior.
>
> Here's something standalone that you should hopefully be able to copy
> + paste into RGui to reproduce:
>
> code <- '
> x <- 1
> print(list())
> save(x, file = tempfile())
> output <- encodeString("apple")
> print(output)
> '
>
> file <- tempfile(fileext = ".R")
> writeLines(code, con = file)
> source(file)
>
> When I run this, I see:
>
> > code <- '
> + x <- 1
> + print(list())
> + save(x, file = tempfile())
> + output <- encodeString("apple")
> + print(output)
> + '
> > file <- tempfile(fileext = ".R")
> > writeLines(code, con = file)
> > source(file)
> list()
> [1] "\002ÿþapple\003ÿþ"
>
> This is with today's R-devel:
>
> > sessionInfo()
> R Under development (unstable) (2018-07-16 r74967)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 10 x64 (build 17134)
>
> Matrix products: default
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] compiler_3.6.0
>
> I realize the example looks incomplete, but it seems like each step is
> required to reproduce the strange behavior:
>
> 1) You need to print an empty list,
> 2) You need to invoke save() after printing that empty list,
> 3) Then, attempts to call encodeString() will produce the strange output.
>
> For what it's worth, it may be related to a behavior I'm seeing where
> the first name printed for an R list is quoted with backticks even
> when not necessary:
>
> > list(x = 1, y = 2)
> $`x`
> [1] 1
>
> $y
> [1] 2
>
> Thanks,
> Kevin
>
> On Tue, Jul 17, 2018 at 6:12 AM Tomas Kalibera wrote:
>> Hi Kevin,
>>
>> The extra bytes you are seeing are escapes for UTF-8 strings used in
>> input to the RGui console. ASCII strings have recently started being
>> converted to UTF-8, so you now get these escapes for ASCII strings as
>> well. RGui understands these escapes and converts from UTF-8 to wide
>> characters before printing on Windows. The escapes should not be used
>> unless printing to the RGui console.
>>
>> I suppose you managed to leak the escapes, but I cannot reproduce it.
>> The example you sent seems incomplete ("x" is not used, it is not clear
>> what encoding.R is, and it is not clear where encodeString is run), and
>> none of the variations I ran leaked the escapes on R-devel. Please
>> clarify the example if you believe it is a bug. Please also use current
>> R-devel (I relatively recently fixed a bug in decoding these escaped
>> strings; it is perhaps unlikely, but not impossible, that it is
>> related).
>>
>> Best
>> Tomas
>>
>> On 07/16/2018 10:01 PM, Kevin Ushey wrote:
>>> Given the following R script:
>>>
>>>  x <- 1
>>>  print(list())
>>>  save(x, file = tempfile())
>>>  output <- encodeString("apple")
>>>  print(output)
>>>
>>> If I source this script from RGui on Windows, I see the output:
>>>
>>>  > source("encoding.R")
>>>  list()
>>>  [1] "\002ÿþapple\003ÿþ"
>>>
>>> That is, it's as though R has injected what looks like byte order
>>> marks around the encoded string:
>>>
>>>  > charToRaw(output)
>>>   [1] 02 ff fe 61 70 70 6c 65 03 ff fe
>>>
>>> FWIW I see the same output in R-patched and R-devel. Any idea what
>>> might be going on? For what it's worth, I don't see the same issue
>>> with R as run from the terminal.
>>>
>>> Thanks,
>>> Kevin
>>>
>>> __
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
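
A minimal sketch in R of what the leaked escapes look like at the byte
level, and of how one might strip them from an affected string. It assumes
the marker bytes are exactly the ones shown by charToRaw() in the report
(02 ff fe before the text, 03 ff fe after it). The helpers drop_marker()
and strip_rgui_escapes() are hypothetical names used for illustration, not
functions in R, and this is a workaround sketch rather than the fix that
was applied to R itself:

```r
# Marker bytes as they appear in the charToRaw() output from the report.
utf8_in  <- as.raw(c(0x02, 0xff, 0xfe))  # opens an escaped UTF-8 run
utf8_out <- as.raw(c(0x03, 0xff, 0xfe))  # closes it

# Hypothetical helper: remove every occurrence of 'marker' from 'bytes'.
drop_marker <- function(bytes, marker) {
  n <- length(bytes)
  m <- length(marker)
  if (n < m) return(bytes)
  keep <- rep(TRUE, n)
  i <- 1L
  while (i <= n - m + 1L) {
    if (all(bytes[i:(i + m - 1L)] == marker)) {
      keep[i:(i + m - 1L)] <- FALSE   # drop the whole marker
      i <- i + m
    } else {
      i <- i + 1L
    }
  }
  bytes[keep]
}

# Hypothetical helper: strip both the opening and the closing marker.
strip_rgui_escapes <- function(bytes) {
  drop_marker(drop_marker(bytes, utf8_in), utf8_out)
}

# The eleven bytes from the original report.
leaked <- as.raw(c(0x02, 0xff, 0xfe,
                   0x61, 0x70, 0x70, 0x6c, 0x65,
                   0x03, 0xff, 0xfe))
cleaned <- rawToChar(strip_rgui_escapes(leaked))
print(cleaned)
```

Working on raw vectors avoids any dependence on the session encoding,
since the ff and fe bytes are not valid UTF-8 on their own.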




Re: [Rd] Output mis-encoded on Windows w/ RGui 3.5.1 in strange case

2018-07-18 Thread Tomas Kalibera
Fixed in R-devel and R-patched,
Tomas



Re: [Rd] Tiny bug in lm()?

2018-07-18 Thread Martin Maechler
> Brett Presnell 
> on Sun, 24 Jun 2018 13:57:04 +0100 writes:

> I meant ncol(y) of course.

> Brett Presnell  writes:

>> I suppose that this never affects anything, but in line
>> 57 of lm.R, where the coefficients are defined for an
>> empty model, when y is a matrix, shouldn't the value be
>> matrix(,0,nrow(y)) rather than matrix(,0,3)?

Yes ("ncol(y)") and actually it should be 'double', not
'logical'.

OTOH: Multivariate empty models are probably pretty rare, so no
  wonder this has never been reported the last 13.374 years
  this has been in the sources.

Of course, I will still fix it.  Thank you, Brett, for reporting!

Martin
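
A small sketch in R of the two points raised above: the hard-coded column
count and the storage type. It assumes a response matrix y with two
columns; the variable names are illustrative, and the replacement shown
here (an empty double vector as data) is one way to get a 'double' matrix
of the right shape, not necessarily the exact expression committed to
lm.R:

```r
# A multivariate response: 5 observations, 2 response columns.
y <- matrix(rnorm(10), nrow = 5, ncol = 2)

# The expression discussed in the thread: the data argument is missing,
# so the matrix is filled with logical NA, and the column count is fixed
# at 3 regardless of y.
old <- matrix(, 0, 3)

# One possible replacement: zero rows, one column per response, and
# double storage.
new <- matrix(numeric(0), 0L, ncol(y))

print(dim(old)); print(typeof(old))
print(dim(new)); print(typeof(new))
```

With zero rows the difference rarely matters in practice, which fits the
observation that nobody had reported it.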



Re: [Rd] Output mis-encoded on Windows w/ RGui 3.5.1 in strange case

2018-07-18 Thread Kevin Ushey
Thank you for the quick fix! I could've sworn the 'save()' dance was a
necessary part of the reproducible example, but evidently not ...


Re: [Rd] MARGIN in base::unique.matrix() and base::unique.array()

2018-07-18 Thread Kurt Hornik
> Hervé Pagès writes:

Thanks for spotting this.

With c74978 I just committed, we now get

R> unique(matrix(1:10, ncol=2), MARGIN=1:3)
Error in unique.matrix(matrix(1:10, ncol = 2), MARGIN = 1:3) : 
  MARGIN = 1,2,3 is invalid for dim = 5,2
Calls: unique -> unique.matrix
R> unique(matrix(1:10, ncol=2), MARGIN=3)
Error in unique.matrix(matrix(1:10, ncol = 2), MARGIN = 3) : 
  MARGIN = 3 is invalid for dim = 5,2
Calls: unique -> unique.matrix

Best
-k

> Hi,
> The man page for base::unique.matrix() and base::unique.array() says
> that MARGIN is expected to be a single integer. OTOH the code in charge
> of checking the user supplied MARGIN is:

>   if (length(MARGIN) > ndim || any(MARGIN > ndim))
>       stop(gettextf("MARGIN = %d is invalid for dim = %d",
>                     MARGIN, dx), domain = NA)

> which doesn't really make sense.

> As a consequence the user gets an obscure error message when specifying
> a MARGIN that satisfies the above check but is in fact invalid:

>   > unique(matrix(1:10, ncol=2), MARGIN=1:2)
>   Error in args[[MARGIN]] <- !duplicated.default(temp, fromLast = fromLast,  :
>     object of type 'symbol' is not subsettable

> Also the code used by the above check to generate the error message
> is broken:

>   > unique(matrix(1:10, ncol=2), MARGIN=1:3)
>   Error in sprintf(gettext(fmt, domain = domain), ...) :
>     arguments cannot be recycled to the same length

>   > unique(matrix(1:10, ncol=2), MARGIN=3)
>   Error in unique.matrix(matrix(1:10, ncol = 2), MARGIN = 3) :
>     c("MARGIN = 3 is invalid for dim = 5", "MARGIN = 3 is invalid for dim = 2")

> Thanks,
> H.

> -- 
> Hervé Pagès

> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024

> E-mail: hpa...@fredhutch.org
> Phone:  (206) 667-5791
> Fax:(206) 667-1319
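
A hedged sketch in R of why the quoted message construction misbehaves and
of one way to build a single-line message like the ones shown at the top
of this message. check_margin() is a hypothetical validator written for
illustration; it is not the code in base R. It follows the man page's
statement that MARGIN is a single integer, which is an assumption about
the intended contract:

```r
# Hypothetical validator: require one MARGIN value in 1..length(dx) and
# report all values of MARGIN and dx in a single message.
check_margin <- function(MARGIN, dx) {
  ndim <- length(dx)
  if (length(MARGIN) != 1L || is.na(MARGIN) || MARGIN < 1 || MARGIN > ndim)
    stop(sprintf("MARGIN = %s is invalid for dim = %s",
                 paste(MARGIN, collapse = ","),
                 paste(dx, collapse = ",")),
         call. = FALSE)
  invisible(TRUE)
}

dx <- dim(matrix(1:10, ncol = 2))

# sprintf() is vectorised over its arguments, so formatting a length-2
# 'dx' with %d produces two strings instead of one.
two_msgs <- sprintf("MARGIN = %d is invalid for dim = %d", 3L, dx)
print(two_msgs)

# Collapsing the vectors first yields exactly one message.
msg <- tryCatch(check_margin(1:3, dx), error = conditionMessage)
print(msg)
```

Using %s with paste(collapse = ",") sidesteps the recycling of
vector arguments entirely, at the cost of formatting the numbers before
they reach sprintf().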



Re: [Rd] base::mean not consistent about NA/NaN

2018-07-18 Thread Tomas Kalibera
Yes, the performance overhead of fixing this at the R level would be too
large, and it would complicate the code significantly. The result of
binary operations involving NA and NaN is hardware dependent (it comes
down to the propagation of the NaN payload): on some hardware it actually
works the way we would like and NA is returned, but on other hardware you
get NaN, or sometimes NA and sometimes NaN. C compiler optimizations can
also re-order code, as mentioned in ?NaN, and external numerical
libraries do not distinguish NA from NaN (NA is an R concept). So I am
afraid this is unfixable.

The disclaimer mentioned by Duncan is in ?NaN/?NA, which I think is OK:
there are so many numerical functions through which one might run into
these problems that it would be infeasible to document them all. Some
functions will in fact preserve NA, and we would not let NA turn into NaN
unnecessarily, but the disclaimer says it is something not to depend on.


Tomas
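
For code that does need a predictable answer, a minimal R sketch of
deciding the result from the inputs instead of relying on how NA and NaN
propagate through arithmetic. Both function names are hypothetical, and
the rule "NA wins over NaN" is an assumption chosen for this
illustration, not something base R promises:

```r
# is.na() is TRUE for both NA and NaN; is.nan() is TRUE only for NaN.
classify_missing <- function(x) {
  ifelse(is.nan(x), "NaN", ifelse(is.na(x), "NA", "value"))
}

# Hypothetical wrapper: inspect the inputs first, so the result does not
# depend on element order or on the platform's NaN payload handling.
mean_na_first <- function(x) {
  if (anyNA(x)) {
    if (any(is.na(x) & !is.nan(x))) return(NA_real_)  # a true NA is present
    return(NaN)                                       # only NaN is present
  }
  mean(x)
}

print(classify_missing(c(1, NA, NaN)))
print(mean_na_first(c(NA, NaN)))
print(mean_na_first(c(NaN, NA)))
```

The extra pass over the data is the kind of overhead described above,
which is why it is better placed in user code that needs it than in
mean() itself.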

On 07/03/2018 11:12 AM, Jan Gorecki wrote:
> Thank you for the interesting examples.
> I would find it useful to document this behavior in `?mean` as well. The
> `+` operator is also affected, but the `sum` function is not.
> For mean, NA / NaN could be handled in the loop in summary.c. I assume
> that the performance penalty of a fix is the reason why this
> inconsistency still exists.
> Jan
>
> On Mon, Jul 2, 2018 at 8:28 PM, Barry Rowlingson
> <b.rowling...@lancaster.ac.uk> wrote:
>> And for a starker example of this (documented) inconsistency,
>> arithmetic addition is not commutative:
>>
>>   > NA + NaN
>>   [1] NA
>>   > NaN + NA
>>   [1] NaN
>>
>> On Mon, Jul 2, 2018 at 5:32 PM, Duncan Murdoch wrote:
>>> On 02/07/2018 11:25 AM, Jan Gorecki wrote:
>>>> Hi,
>>>> base::mean is not consistent in terms of handling NA/NaN.
>>>> The mean should not depend on the order of its arguments, but
>>>> currently it does.
>>>
>>> The result of mean() can depend on the order even with regular numbers.
>>> For example,
>>>
>>>   > x <- rep(c(1, 10^(-15)), 100)
>>>   > mean(sort(x)) - 0.5
>>>   [1] 5.551115e-16
>>>   > mean(rev(sort(x))) - 0.5
>>>   [1] 0
>>>
>>>>   mean(c(NA, NaN))
>>>>   #[1] NA
>>>>   mean(c(NaN, NA))
>>>>   #[1] NaN
>>>>
>>>> I created an issue so that, in case of no replies here, its status
>>>> can be looked up at:
>>>> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17441
>>>
>>> The help page for ?NaN says,
>>>
>>> "Computations involving NaN will return NaN or perhaps NA: which of
>>> those two is not guaranteed and may depend on the R platform (since
>>> compilers may re-order computations)."
>>>
>>> And ?NA says,
>>>
>>> "Numerical computations using NA will normally result in NA: a possible
>>> exception is where NaN is also involved, in which case either might
>>> result (which may depend on the R platform). "
>>>
>>> So I doubt if this inconsistency will be fixed.
>>>
>>> Duncan Murdoch
>>>
>>>> Best,
>>>> Jan



[Rd] missing news entry?

2018-07-18 Thread Benjamin Tyner

Hi,

Unless I am mistaken, this enhancement to gc():


r73749 | luke | 2017-11-18 13:26:25 -0500 (Sat, 18 Nov 2017) | 2 lines

Added 'full' argument to gc() with default 'TRUE' for now.



appears to be lacking an entry in doc/NEWS.Rd. Just FYI, in case there 
is capacity to add one.


Regards

Ben
