Re: [Rd] Output mis-encoded on Windows w/ RGui 3.5.1 in strange case
Thanks, I can now reproduce and it is a bug that is easy to fix, I will do so shortly.

Fyi it can be reproduced simply by running these two lines in Rgui:

list()
encodeString("apple")

Best
Tomas

On 07/17/2018 05:16 PM, Kevin Ushey wrote:
> Sorry, I should have been more clear -- if I write the contents of
> that script to a file called 'encoding.R' and source that, then I see
> the reported behavior.
>
> Here's something standalone that you should hopefully be able to copy
> + paste into RGui to reproduce:
>
> code <- '
> x <- 1
> print(list())
> save(x, file = tempfile())
> output <- encodeString("apple")
> print(output)
> '
>
> file <- tempfile(fileext = ".R")
> writeLines(code, con = file)
> source(file)
>
> When I run this, I see:
>
> > code <- '
> + x <- 1
> + print(list())
> + save(x, file = tempfile())
> + output <- encodeString("apple")
> + print(output)
> + '
> > file <- tempfile(fileext = ".R")
> > writeLines(code, con = file)
> > source(file)
> list()
> [1] "\002ÿþapple\003ÿþ"
>
> This is with today's R-devel:
>
> > sessionInfo()
> R Under development (unstable) (2018-07-16 r74967)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 10 x64 (build 17134)
>
> Matrix products: default
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> loaded via a namespace (and not attached):
> [1] compiler_3.6.0
>
> I realize the example looks incomplete, but it seems like each step is
> required to reproduce the strange behavior:
>
> 1) You need to print an empty list,
> 2) You need to invoke save() after printing that empty list,
> 3) Then, attempts to call encodeString() will produce the strange output.
>
> For what it's worth, it may be related to a behavior I'm seeing where
> the first name printed for an R list is quoted with backticks even
> when not necessary:
>
> > list(x = 1, y = 2)
> $`x`
> [1] 1
>
> $y
> [1] 2
>
> Thanks,
> Kevin
>
> On Tue, Jul 17, 2018 at 6:12 AM Tomas Kalibera wrote:
>> Hi Kevin,
>>
>> the extra bytes you are seeing are escapes for UTF-8 strings used in
>> input to RGui console. Recently ascii strings are converted to UTF-8 so
>> you would get these escapes for ascii strings now as well. RGui
>> understands these escapes and converts from UTF-8 to wide characters
>> before printing on Windows. The escapes should not be used unless
>> printing to RGui console.
>>
>> I suppose you managed to leak the escapes but I cannot reproduce, the
>> example you sent seems incomplete ("x" not used, not clear what
>> encoding.R is, not clear where the encodeString is run) and none of the
>> variations I ran leaked the escapes on R-devel. Please clarify the
>> example if you believe it is a bug. Please also use current R-devel
>> (I've relatively recently fixed a bug in decoding these escaped strings,
>> perhaps unlikely, but not impossible it could be related).
>>
>> Best
>> Tomas
>>
>> On 07/16/2018 10:01 PM, Kevin Ushey wrote:
>>> Given the following R script:
>>>
>>> x <- 1
>>> print(list())
>>> save(x, file = tempfile())
>>> output <- encodeString("apple")
>>> print(output)
>>>
>>> If I source this script from RGui on Windows, I see the output:
>>>
>>> > source("encoding.R")
>>> list()
>>> [1] "\002ÿþapple\003ÿþ"
>>>
>>> That is, it's as though R has injected what looks like byte order
>>> marks around the encoded string:
>>>
>>> > charToRaw(output)
>>> [1] 02 ff fe 61 70 70 6c 65 03 ff fe
>>>
>>> FWIW I see the same output in R-patched and R-devel. Any idea what
>>> might be going on? For what it's worth, I don't see the same issue
>>> with R as run from the terminal.
>>>
>>> Thanks,
>>> Kevin

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
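
For readers who want to inspect the leaked bytes themselves, here is a minimal sketch in R. It rebuilds the raw vector that charToRaw(output) printed in the report and strips the leading and trailing three-byte sequences. Treating 02 ff fe and 03 ff fe as the start and end markers is an assumption read off the reported bytes, and strip_rgui_escapes() is a hypothetical helper, not a function in R or RGui.

```r
# Bytes as printed by charToRaw(output) in the report above.
leaked <- as.raw(c(0x02, 0xff, 0xfe,
                   0x61, 0x70, 0x70, 0x6c, 0x65,
                   0x03, 0xff, 0xfe))

# Hypothetical helper (not part of R): drop the surrounding markers
# if, and only if, both are present. The marker values are an
# assumption taken from the bytes shown in the thread.
strip_rgui_escapes <- function(bytes) {
  start <- as.raw(c(0x02, 0xff, 0xfe))
  end   <- as.raw(c(0x03, 0xff, 0xfe))
  n <- length(bytes)
  if (n >= 6 &&
      identical(bytes[1:3], start) &&
      identical(bytes[(n - 2):n], end)) {
    # seq_len() keeps this safe when nothing sits between the markers
    bytes <- bytes[seq_len(n - 6) + 3]
  }
  bytes
}

cleaned <- rawToChar(strip_rgui_escapes(leaked))
print(cleaned)
```

This is only a diagnostic aid for confirming what was leaked; as the thread goes on to say, the actual fix was made in R itself.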
Re: [Rd] Output mis-encoded on Windows w/ RGui 3.5.1 in strange case
Fixed in R-devel and R-patched,
Tomas
Re: [Rd] Tiny bug in lm()?
> Brett Presnell
>     on Sun, 24 Jun 2018 13:57:04 +0100 writes:

> I meant ncol(y) of course.

> Brett Presnell writes:
>> I suppose that this never affects anything, but in line
>> 57 of lm.R, where the coefficients are defined for an
>> empty model, when y is a matrix, shouldn't the value be
>> matrix(,0,nrow(y)) rather than matrix(,0,3)?

Yes ("ncol(y)") and actually it should be 'double', not 'logical'.

OTOH: Multivariate empty models are probably pretty rare, so no wonder this has never been reported the last 13.374 years this has been in the sources.

Of course, I will still fix it.

Thank you, Brett, for reporting!
Martin
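
To make the two points above concrete (wrong column count, wrong storage type), here is a small sketch in R. The response matrix y is made up for illustration, and the "proposed" expression is only one way to write what the thread describes; it is not necessarily the change that was committed to lm.R.

```r
# Illustrative two-column response; the values are arbitrary.
y <- matrix(c(1, 2, 3, 4, 5, 6), ncol = 2)

# Value discussed in the thread: always 3 columns, and filled with
# the default logical NA because no data argument is given.
current <- matrix(, 0, 3)

# One possible corrected form (an assumption, not the committed fix):
# one column per response, stored as double.
proposed <- matrix(NA_real_, 0, ncol(y))

dim(current);  typeof(current)
dim(proposed); typeof(proposed)
```

With a two-column y the first matrix still has three columns and type "logical", while the second has two columns and type "double".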
Re: [Rd] Output mis-encoded on Windows w/ RGui 3.5.1 in strange case
Thank you for the quick fix! I could've sworn the 'save()' dance was a necessary part of the reproducible example, but evidently not ...
Re: [Rd] MARGIN in base::unique.matrix() and base::unique.array()
> Hervé Pagès writes:

Thanks for spotting this.

With c74978 I just committed, we now get

R> unique(matrix(1:10, ncol=2), MARGIN=1:3)
Error in unique.matrix(matrix(1:10, ncol = 2), MARGIN = 1:3) :
  MARGIN = 1,2,3 is invalid for dim = 5,2
Calls: unique -> unique.matrix
R> unique(matrix(1:10, ncol=2), MARGIN=3)
Error in unique.matrix(matrix(1:10, ncol = 2), MARGIN = 3) :
  MARGIN = 3 is invalid for dim = 5,2
Calls: unique -> unique.matrix

Best
-k

> Hi,

> The man page for base::unique.matrix() and base::unique.array() says
> that MARGIN is expected to be a single integer. OTOH the code in charge
> of checking the user supplied MARGIN is:

>   if (length(MARGIN) > ndim || any(MARGIN > ndim))
>       stop(gettextf("MARGIN = %d is invalid for dim = %d",
>                     MARGIN, dx), domain = NA)

> which doesn't really make sense.

> As a consequence the user gets an obscure error message when specifying
> a MARGIN that satisfies the above check but is in fact invalid:

>   > unique(matrix(1:10, ncol=2), MARGIN=1:2)
>   Error in args[[MARGIN]] <- !duplicated.default(temp, fromLast = fromLast, :
>     object of type 'symbol' is not subsettable

> Also the code used by the above check to generate the error message
> is broken:

>   > unique(matrix(1:10, ncol=2), MARGIN=1:3)
>   Error in sprintf(gettext(fmt, domain = domain), ...) :
>     arguments cannot be recycled to the same length

>   > unique(matrix(1:10, ncol=2), MARGIN=3)
>   Error in unique.matrix(matrix(1:10, ncol = 2), MARGIN = 3) :
>     c("MARGIN = 3 is invalid for dim = 5", "MARGIN = 3 is invalid for dim = 2")

> Thanks,
> H.

> --
> Hervé Pagès

> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024

> E-mail: hpa...@fredhutch.org
> Phone: (206) 667-5791
> Fax: (206) 667-1319
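
The broken message in the report comes from sprintf() being vectorised over its arguments. The sketch below reproduces that effect outside of unique.matrix() and shows one way to get a single message of the form printed after the fix. Collapsing with paste() is an assumption about how such a message can be built; the thread does not show the code committed in c74978.

```r
MARGIN <- 3L
dx <- c(5L, 2L)   # dim of matrix(1:10, ncol = 2)

# sprintf() recycles MARGIN against dx, so one format string
# yields two messages, as in the report.
vectorised <- sprintf("MARGIN = %d is invalid for dim = %d", MARGIN, dx)
print(vectorised)

# Collapse each vector to one string first, then format once.
# (Illustrative only; not necessarily the committed implementation.)
format_margin_error <- function(MARGIN, dx) {
  sprintf("MARGIN = %s is invalid for dim = %s",
          paste(MARGIN, collapse = ","),
          paste(dx, collapse = ","))
}

single <- format_margin_error(1:3, dx)
print(single)
```

format_margin_error() is a hypothetical name used only for this illustration.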
Re: [Rd] base::mean not consistent about NA/NaN
Yes, the performance overhead of fixing this at R level would be too large and it would complicate the code significantly. The result of binary operations involving NA and NaN is hardware dependent (the propagation of NaN payload) - on some hardware, it actually works the way we would like - NA is returned - but on some hardware you get NaN or sometimes NA and sometimes NaN. Also there are C compiler optimizations re-ordering code, as mentioned in ?NaN. Then there are also external numerical libraries that do not distinguish NA from NaN (NA is an R concept). So I am afraid this is unfixable.

The disclaimer mentioned by Duncan is in ?NaN/?NA, which I think is ok - there are so many numerical functions through which one might run into these problems that it would be infeasible to document them all. Some functions in fact will preserve NA, and we would not let NA turn into NaN unnecessarily, but the disclaimer says it is something not to depend on.

Tomas

On 07/03/2018 11:12 AM, Jan Gorecki wrote:
> Thank you for interesting examples.
> I would find useful to document this behavior also in `?mean`, while `+`
> operator is also affected, the `sum` function is not.
> For mean, NA / NaN could be handled in loop in summary.c. I assume that
> performance penalty of fix is the reason why this inconsistency still
> exists.
> Jan
>
> On Mon, Jul 2, 2018 at 8:28 PM, Barry Rowlingson <b.rowling...@lancaster.ac.uk> wrote:
>> And for a starker example of this (documented) inconsistency,
>> arithmetic addition is not commutative:
>>
>> > NA + NaN
>> [1] NA
>> > NaN + NA
>> [1] NaN
>>
>> On Mon, Jul 2, 2018 at 5:32 PM, Duncan Murdoch wrote:
>>> On 02/07/2018 11:25 AM, Jan Gorecki wrote:
>>>> Hi,
>>>> base::mean is not consistent in terms of handling NA/NaN.
>>>> Mean should not depend on order of its arguments while currently it is.
>>>
>>> The result of mean() can depend on the order even with regular numbers.
>>> For example,
>>>
>>> > x <- rep(c(1, 10^(-15)), 100)
>>> > mean(sort(x)) - 0.5
>>> [1] 5.551115e-16
>>> > mean(rev(sort(x))) - 0.5
>>> [1] 0
>>>
>>>> mean(c(NA, NaN))
>>>> #[1] NA
>>>> mean(c(NaN, NA))
>>>> #[1] NaN
>>>>
>>>> I created issue so in case of no replies here status of it can be looked up at:
>>>> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17441
>>>
>>> The help page for ?NaN says,
>>>
>>> "Computations involving NaN will return NaN or perhaps NA: which of
>>> those two is not guaranteed and may depend on the R platform (since
>>> compilers may re-order computations)."
>>>
>>> And ?NA says,
>>>
>>> "Numerical computations using NA will normally result in NA: a possible
>>> exception is where NaN is also involved, in which case either might
>>> result (which may depend on the R platform). "
>>>
>>> So I doubt if this inconsistency will be fixed.
>>>
>>> Duncan Murdoch
>>>
>>>> Best,
>>>> Jan
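
Since the thread concludes that the result of mixing NA and NaN is not something to depend on, code that needs a predictable answer has to test for the two values explicitly. Below is a minimal user-level sketch in R. mean_na_first() is a hypothetical helper, not part of base R, and the rule it applies (a true NA always wins over NaN) is just one possible convention; it also adds an extra pass over the data, which is the kind of overhead discussed above.

```r
# is.na() is TRUE for both NA and NaN; is.nan() is TRUE only for NaN.
is.na(c(NA, NaN))    # TRUE TRUE
is.nan(c(NA, NaN))   # FALSE TRUE

# Hypothetical helper: return NA whenever a true NA is present,
# regardless of where it sits in the vector; otherwise defer to mean().
mean_na_first <- function(x) {
  has_true_na <- any(is.na(x) & !is.nan(x))
  if (has_true_na) return(NA_real_)
  mean(x)
}

a <- mean_na_first(c(NA, NaN))
b <- mean_na_first(c(NaN, NA))
identical(a, b)
```

Both calls go through the early return, so the answer no longer depends on the order of the elements or on how the platform propagates NaN payloads.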
[Rd] missing news entry?
Hi,

Unless I am mistaken, this enhancement to gc():

r73749 | luke | 2017-11-18 13:26:25 -0500 (Sat, 18 Nov 2017) | 2 lines

Added 'full' argument to gc() with default 'TRUE' for now.

appears to be lacking an entry in doc/NEWS.Rd.

Just FYI, in case there is capacity to add one.

Regards
Ben