Re: [Rd] length of `...`
Has anyone noticed the r-devel thread "stopifnot() does not stop at first non-TRUE argument", starting with
https://stat.ethz.ch/pipermail/r-devel/2017-May/074179.html ?

I mentioned (function(...)nargs())(...) in
https://stat.ethz.ch/pipermail/r-devel/2017-May/074294.html .

Something like ..elt(n) is switch(n, ...) . I mentioned it in
https://stat.ethz.ch/pipermail/r-devel/2017-May/074270.html .
See also the response in
https://stat.ethz.ch/pipermail/r-devel/2017-May/074282.html .

By the way, because 'stopifnot' in R 3.5.0 has arguments other than '...', it might be better to use match.call(expand.dots=FALSE)$... instead of match.call()[-1L] .

---

> Joris Meys
> on Fri, 4 May 2018 15:37:27 +0200 writes:

> The one difference I see, is the necessity to pass the dots to the function
> dotlength :

> dotlength <- function(...) nargs()

> myfun <- function(..., someArg = 1){
>   n1 <- ...length()
>   n2 <- dotlength()
>   n3 <- dotlength(...)
>   return(c(n1, n2, n3))
> }

> myfun(stop("A"), stop("B"), someArg = stop("c"))

> I don't really see immediately how one can replace the C definition with
> Hadley's solution without changing how the function has to be used.

Yes, of course: nargs() can only be applied to the function
inside which it is used, and hence n2 <- dotlength() must be 0.
Thank you, Joris

> Personally, I have no preference over the use, but changing it now would
> break code dependent upon ...length() imho. Unless I'm overlooking
> something of course.

Yes. OTOH, as it's been very new, one could consider deprecating it,
and advertise, say, .length(...) instead of ...length()
[yes, in spite of the fact that the pure-R solution is slower than a
primitive; both are fast enough for all purposes]

But such a deprecation cycle typically entails time, more writing,
etc., not something I've time for just these days.

Martin

> On Fri, May 4, 2018 at 3:02 PM, Martin Maechler wrote:

>> > Hervé Pagès
>> > on Thu, 3 May 2018 08:55:20 -0700 writes:
>>
>> > Hi,
>> > It would be great if one of the experts could comment on the
>> > difference between Hadley's dotlength and ...length? The fact
>> > that someone bothered to implement a new primitive for that
>> > when there seems to be a very simple and straightforward R-only
>> > solution suggests that there might be some gotchas/pitfalls with
>> > the R-only solution.
>>
>> Namely
>>
>> > dotlength <- function(...) nargs()
>>
>> > (This is subtly different from calling nargs() directly as it will
>> > only count the elements in ...)
>>
>> > Hadley
>>
>>
>> Well, I was the "someone". In the past I had seen (and used myself)
>>
>> length(list(...))
>>
>> and of course that was not usable.
>> I knew of some substitute() / match.call() tricks [but I think
>> did not know Bill's cute substitute(...()) !] at the time, but
>> found them too esoteric.
>>
>> Additionally and importantly, ...length() and ..elt(n) were
>> developed "synchronously", and the R-substitutes for ..elt()
>> definitely are less trivial (I did not find one at the time), as
>> Duncan's example to Bill's proposal has shown, so I had looked
>> at .Primitive() solutions of both.
>>
>> In hindsight I should have asked here for advice, but maybe at
>> the time I had been a bit frustrated by the results of some of
>> my RFCs ((nothing specific in mind !))
>>
>> But __if__ there's really no example where current (3.5.0 and newer)
>>
>> ...length()
>>
>> differs from Hadley's dotlength()
>> I'd be very happy to replace ...length 's C based definition by
>> Hadley's beautiful minimal solution.
>>
>> Martin

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
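[Editor's sketch, not part of the original messages.] The difference discussed in this thread can be tried out roughly as follows. It assumes R 3.5.0 or newer (for ...length()); the name count_dots is hypothetical and only a variant of Joris's myfun(). None of the three counting approaches forces the arguments, so the stop() calls are never evaluated.

```r
# Hadley's R-only helper from the thread: nargs() counts the arguments
# supplied to the function in which it is called.
dotlength <- function(...) nargs()

# Hypothetical variant of Joris's myfun(); the arguments are never forced.
count_dots <- function(..., someArg = 1) {
  c(primitive  = ...length(),      # elements of this function's own '...'
    no_dots    = dotlength(),      # nothing passed on, so nargs() is 0
    with_dots  = dotlength(...),   # '...' has to be passed on explicitly
    match_call = length(match.call(expand.dots = FALSE)$...))
}

res <- count_dots(stop("A"), stop("B"), someArg = stop("c"))
print(res)
```

The point of the sketch is the usage difference Joris describes: ...length() is called without arguments, while the R-only helper only works when the dots are handed to it.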
[Rd] Sys.timezone (timedatectl) unnecessarily warns loudly
Dear R-devels,

The timedatectl binary used by Sys.timezone does not always work reliably. If it fails, a warning is raised, unnecessarily, because later on Sys.timezone gets the timezone successfully from /etc/timezone. This obviously might not be true for other Linux OSes, but it solves the issue for a simple dockerized Ubuntu 16.04.

Current behavior

R Under development (unstable) (2018-05-04 r74695) -- "Unsuffered Consequences"

Sys.timezone()
#Failed to create bus connection: No such file or directory
#[1] "Etc/UTC"
#Warning message:
#In system("timedatectl", intern = TRUE) :
#  running command 'timedatectl' had status 1

There was a small discussion where I initially commented on it:
https://github.com/wch/r-source/commit/9866ac2ad1e2f1c4565ae829ba33b5b98a08d10d#r28867164

The patch below makes the timedatectl call silent. Both suppressWarnings and ignore.stderr are required: the former deals with the R warning, the latter with the message that timedatectl prints directly to the console.

diff --git src/library/base/R/datetime.R src/library/base/R/datetime.R
index 6b34267936..b81c049f3e 100644
--- src/library/base/R/datetime.R
+++ src/library/base/R/datetime.R
@@ -73,7 +73,7 @@ Sys.timezone <- function(location = TRUE)
     ## First try timedatectl: should work on any modern Linux
     ## as part of systemd (and probably nowhere else)
     if (nzchar(Sys.which("timedatectl"))) {
-        inf <- system("timedatectl", intern = TRUE)
+        inf <- suppressWarnings(system("timedatectl", intern = TRUE, ignore.stderr=TRUE))
         ## typical format:
         ## " Time zone: Europe/London (GMT, +)"
         ## " Time zone: Europe/Vienna (CET, +0100)"

Regards,
Jan Gorecki
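[Editor's sketch, not part of the original message.] The two mechanisms the patch combines can be seen in isolation in a small wrapper. The helper name quiet_lines is hypothetical and not part of base R; the sketch only assumes the documented 'intern' and 'ignore.stderr' arguments of system().

```r
# Hypothetical helper: run a command quietly and return its stdout lines,
# or character(0) if the binary is not on the PATH.
quiet_lines <- function(cmd) {
  if (!nzchar(Sys.which(cmd))) return(character(0))
  # suppressWarnings() hides R's "running command ... had status N" warning;
  # ignore.stderr = TRUE discards what the command itself writes to stderr.
  suppressWarnings(system(cmd, intern = TRUE, ignore.stderr = TRUE))
}

inf <- quiet_lines("timedatectl")

# With intern = TRUE, a non-zero exit code is still reported through the
# "status" attribute, so a caller can fall back to another method
# without any console noise.
status <- attr(inf, "status")
failed <- !is.null(status) && status != 0L
```

Silencing the call this way does not hide the failure from the code itself: the exit status remains available for a fallback such as reading /etc/timezone.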
Re: [Rd] download.file does not process gz files correctly (truncates them?)
Thanks for the comments, feedback, and improvements.

I still argue that the current behavior causes more harm than good. First of all, it increases the risk of code that does not work on all platforms, and cross-platform code is, I'd say, one of the strengths and design goals of R. To write cross-platform code, a developer basically needs to specify argument 'mode'.

A second problem is that people who work on non-Windows platforms will not be aware of this problem. Yes, adding this Windows-specific behavior to the help on all platforms will help a bit (thanks for doing that). However, since there are so many non-Windows users out there who write documentation, vignettes, blog posts, and host classes and workshops, it is quite likely that you'll see things like "Download the data file using `download.file(url, file)` and then ...". Boom, a "beginner" on Windows will have problems and even the non-Windows instructor may not know what's going on, and quickly lots of time is wasted.

A third problem is wasted bandwidth because the same file has to be downloaded a second time. If the default is changed to mode="wb" and someone truly needs mode="w", the penalty should be smaller because such text-based files are likely to be much smaller than binary files, which are often several GiB these days.

What could lower the risk for the above, and help the user and helpers, is to give an informative warning whenever 'mode' is not specified, e.g.

  The file 'NNN' is downloaded as a text file (mode = "w"). If you
  meant to download it as a binary file, specify mode = "wb".

Deprecating the default mode="w" on Windows can be done in steps, e.g. by making the argument mandatory for a while. This could be done on all platforms because we're already all affected, i.e. we need to specify 'mode' to avoid surprises.

Even if the default won't change, below are some more comments/observations that are related to the current implementation of download.file() on Windows:

ADD MORE EXTENSIONS?

What about case-insensitive matching, e.g. data.ZIP and data.Rdata?

A quick scan of the R source code suggests that R is also working with the following filename extensions (using various case styles):

* Rbin (src/library/tools/R/install.R)
* rda, Rda (tests/reg-tests-1a.R)
* rdb (src/library/tools/R/install.R)
* rds, RDS, Rds (src/library/tools/R/install.R)
* rdx (src/library/tools/R/install.R)
* RData, Rdata, rdata (src/library/tools/R/install.R)

Should the tar extension also be added?

What about binary image formats that R produces, e.g. filename extensions bmp, jpg, jpeg, pdf, png, tif, tiff?

What about all the other file extensions that we know for sure are binary?

VECTORIZATION:

For some values of the 'method' argument, the current implementation will download the same file differently depending on other files downloaded at the same time. For example, here a PNG file is downloaded in text mode and its content is translated:

> urls <- c("https://www.r-project.org/logo/Rlogo.png")
> download.file(urls, destfile = basename(urls), method = "libcurl")
trying URL 'https://www.r-project.org/logo/Rlogo.png'
Content length 48148 bytes (47 KB)
downloaded 47 KB

> file.size(basename(urls))
[1] 48281

But if we throw in a "known" binary extension, the PNG file is downloaded as binary:

> urls <- c("https://www.r-project.org/logo/Rlogo.png",
>           "https://cran.r-project.org/bin/windows/contrib/3.6/future_1.8.1.zip")
> download.file(urls, destfile = basename(urls), method = "libcurl")
trying URL 'https://www.r-project.org/logo/Rlogo.png'
trying URL 'https://cran.r-project.org/bin/windows/contrib/3.6/future_1.8.1.zip'

> file.size(basename(urls))
[1] 48148 527069

Best,

Henrik

On Fri, May 4, 2018 at 1:18 AM, Martin Maechler wrote:
>> Joris Meys
>> on Fri, 4 May 2018 10:00:07 +0200 writes:
>
> > On Fri, May 4, 2018 at 8:34 AM, Tomas Kalibera
> > wrote:
>
> >> The current heuristic/hack is in line with the
> >> compatibility approach: it detects files that are
> >> obviously binary, so it changes the default behavior only
> >> for cases when it would obviously cause damage.
> >>
> >> Tomas
>
> > Well, I was trying to download a .gz file and
> > download.file() didn't detect that. Reason for that is
> > obviously that the link doesn't contain .gz but %2Egz ,
> > using the ASCII code for the dot instead of the dot
> > itself. That's general practice in a lot of links.
>
> > Hence I propose to change the line in download.file() that
> > does this check to:
>
> > if (missing(mode) && length(grep("\\.(gz|bz2|xz|tgz|zip|rda|RData)$",
> > URLdecode(url
>
> > using URLdecode() ensures that .gz, .RData etc will be
> > detected correctly in an encoded URL.
>
> > Cheers Joris
>
> Makes sense to me and I plan to add it when also adding '.rds'
>
> { OTOH, after reading the thread about this: Shouldn't you make
> your code more robust and use mode = "wb" (or "