[Rd] Appetite for eliminating dependency on Perl
Preamble: I am in no way opposed to Perl in general - I love Perl and probably always will.

R currently has Perl as both a build-time and run-time dependency. This adds about 200 MB, give or take, to the required environment size (as measured in CentOS - looks like it might be a bit smaller in Ubuntu?). Not such a huge deal, really, but the actual benefit R gets from the dependency is quite small.

From my poking around in the R sources (using `git grep -P '\bperl\b(?! ?= ?(?:TRUE|FALSE))'` as a filter), it looks like it's only used in the following nooks & crannies:

* tools/help2man.pl
* tools/install-info.pl
* configure: INSTALL_INFO="perl \$(top_srcdir)/tools/install-info.pl"
* m4/R.m4: INSTALL_INFO="perl \$(top_srcdir)/tools/install-info.pl"

Ultimately that's only two scripts. `help2man.pl` seems to be part of the build process, but not used at runtime. `install-info.pl` seems like it may be runnable at runtime, but it requires user initiation to run, at which point the user is expected to have perl installed.

Either one of them could probably be ported to another language pretty easily, maybe even R.

Anything else I missed? If someone were to volunteer the porting work, would there be any appetite for eliminating the dependency?

-Ken

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Bug filed on unzip() function
Hi,

A few days ago I filed a bug report on the unzip() function:

https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14462

I haven't gotten any comments yet, so I thought I'd ask for comments here.

I also see on the description of R-devel that the list "also receives all (filtered, i.e. non-spam!) bug reports from R-bugs", but I don't see it here.

Eventually I would like to help unzip() gain large-file support, such as is offered by http://info-zip.org/UnZip.html version 6.0. A corresponding zip() function would be nice too.

Thanks.

-Ken
Re: [Rd] Bug filed on unzip() function
On Thu, Dec 23, 2010 at 11:22 PM, Marc Schwartz wrote:

> Also, I don't know what the typical response time has been on Bugzilla once
> a bug report is filed. Perhaps something could be noted there so that bug
> reporters might have some expectation that a comment/reply might be
> forthcoming within X days of filing. After that time frame, some recommended
> form of follow up communication could take place as a tickler/reminder of
> sorts.

Well, as a concrete data point - nobody's yet commented on the bug report, or on this list, about the original issue I brought up:

https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14462

I haven't filed bug reports before, but in your experience does Warnocking like this happen frequently?

-Ken
[Rd] [patch] giving library() a 'version' argument
I've made a small enhancement to R that would help developers better control what versions of code we're using where.

Basically, to load a package in R, one currently does:

    library(whateverPackage)

and with the enhancement, you can ensure that you're getting at least version X of the package:

    library(whateverPackage, version=3.14)

Reasons one might want this include:

* you know that in version X some bug was fixed
* you know that in version X some feature was added
* that's the first version you've actually tested it with & you don't want to vouch for earlier versions without testing
* you develop on one machine & deploy on another machine you don't control, and you want runtime checks that the sysadmin installed what they were supposed to install

In general, I have an interest in helping R get better at various things that would help it play in a "production environment", for various values of that term. =)

The attached patch is made against revision 58980 of https://svn.r-project.org/R/trunk . I think this is the first patch I've submitted to the R core, so please let me know if anything's amiss, or of course if there are reservations about the approach.

Thanks.

--
Ken Williams, Senior Research Scientist
WindLogics
http://windlogics.com

CONFIDENTIALITY NOTICE: This e-mail message is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution of any kind is strictly prohibited. If you are not the intended recipient, please contact the sender via reply e-mail and destroy all copies of the original message. Thank you.
Re: [Rd] [patch] giving library() a 'version' argument
Apparently the patch file got eaten. Let me try again with a .txt extension.

-Ken

> -----Original Message-----
> From: Ken Williams
> Sent: Wednesday, April 11, 2012 10:28 AM
> To: r-devel@r-project.org
> Subject: [patch] giving library() a 'version' argument
>
> I've made a small enhancement to R that would help developers better
> control what versions of code we're using where.
> [...]

Index: src/library/base/man/library.Rd
===
--- src/library/base/man/library.Rd (revision 58980)
+++ src/library/base/man/library.Rd (working copy)
@@ -21,7 +21,7 @@
         character.only = FALSE, logical.return = FALSE,
         warn.conflicts = TRUE, quietly = FALSE,
         keep.source = getOption("keep.source.pkgs"),
-        verbose = getOption("verbose"))
+        verbose = getOption("verbose"), version)
 
 require(package, lib.loc = NULL, quietly = FALSE,
         warn.conflicts = TRUE,
@@ -59,6 +59,9 @@
   \item{quietly}{a logical. If \code{TRUE}, no message confirming
     package loading is printed, and most often, no errors/warnings are
     printed if package loading fails.}
+  \item{version}{the minimum acceptable version of the package to load.
+    If a lesser version is found, the package will not be loaded and an
+    exception will be thrown.}
 }
 \details{
   \code{library(package)} and \code{require(package)} both load the
@@ -189,6 +192,10 @@
 search()# "splines", too
 detach("package:splines")
 
+# To require a specific minimum version:
+library(splines, version = '2.14')
+detach("package:splines")
+
 # if the package name is in a character vector, use
 pkg <- "splines"
 library(pkg, character.only = TRUE)
Index: src/library/base/R/library.R
===
--- src/library/base/R/library.R (revision 58980)
+++ src/library/base/R/library.R (working copy)
@@ -32,7 +32,7 @@
 function(package, help, pos = 2, lib.loc = NULL, character.only = FALSE,
          logical.return = FALSE, warn.conflicts = TRUE,
          quietly = FALSE, keep.source = getOption("keep.source.pkgs"),
-         verbose = getOption("verbose"))
+         verbose = getOption("verbose"), version)
 {
     if (!missing(keep.source))
         warning("'keep.source' is deprecated and will be ignored")
@@ -276,6 +276,11 @@
                     stop(gettextf("%s is not a valid installed package",
                                   sQuote(package)), domain = NA)
                 pkgInfo <- readRDS(pfile)
+                if (!missing(version)) {
+                    pver <- pkgInfo$DESCRIPTION["Version"]
+                    if (compareVersion(pver, as.character(version)) < 0)
+                        stop("Version ", version, " of '", package, "' required, but only ", pver, " is available")
+                }
                 testRversion(pkgInfo, package, pkgpath)
                 ## avoid any bootstrapping issues by these exemptions
                 if(!package %in% c("datasets", "grDevices", "graphics", "methods",
@@ -332,10 +337,18 @@
                 stop(gettextf("package %s does not have a NAMESPACE and should be re-installed",
                               sQuote(package)), domain = NA)
         }
-        if (verbose && !newpackage)
-            warning(gettextf("package %s already present in search()",
-                             sQuote(package)), domain = NA)
+        if (!newpackage) {
+            if (verbose)
+                warning(gettextf("package %s already present in search()",
+                                 sQuote(package)), domain = NA)
+            if (!missing(version)) {
+                pver <- packageVersion(package)
+                if (compareVersion(as.character(pver), as.character(version)) < 0)
+                    stop("Version ", version, " of '", package,"' required, ",
+                         "but a lesser version ", pver, " is already loaded")
+            }
+        }
 
     }
     else if(!missing(help)) {
         if(!character.only)
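For readers unfamiliar with `compareVersion()`, which the patch above relies on, here is a small sketch of its return values. The version strings are made up for illustration and do not refer to any real package release.

```r
# compareVersion() compares two version strings component by component.
# It returns -1, 0, or 1 when the first string is older than, equal to,
# or newer than the second.
older <- compareVersion("2.13.1", "2.14")
same  <- compareVersion("2.14", "2.14")
newer <- compareVersion("2.14.1", "2.14")

# The patch stops when the result is negative, i.e. when the version
# found is older than the requested minimum.
too_old <- older < 0
```

Note that both arguments are character strings, which is why the patch wraps its inputs in `as.character()`.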
Re: [Rd] [patch] giving library() a 'version' argument
> -----Original Message-----
> From: Prof Brian Ripley [mailto:rip...@stats.ox.ac.uk]
> Sent: Thursday, April 12, 2012 7:54 AM
> To: Duncan Murdoch
> Cc: Ken Williams; r-devel@r-project.org
> Subject: Re: [Rd] [patch] giving library() a 'version' argument
>
> A very important point is that library() *had* a 'version' argument for
> several years, and this is not what it did.

That is unfortunate. So such a mechanism would need to use a different argument name.

For completeness in this thread, I dug up the fact that it seems to have been removed in the 2.9.0 release:

    o Support for versioned installs (R CMD INSTALL --with-package-versions
      and install.packages(installWithVers = TRUE)) has been removed.
      Packages installed with versioned names will be ignored.

I'll address Duncan's concerns in a separate message.

-Ken
Re: [Rd] [patch] giving library() a 'version' argument
> -----Original Message-----
> From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com]
> Sent: Thursday, April 12, 2012 7:22 AM
> To: Ken Williams
> Cc: r-devel@r-project.org
> Subject: Re: [Rd] [patch] giving library() a 'version' argument
>
> On 12-04-11 11:28 AM, Ken Williams wrote:
> >
> > Reasons one might want this include:
> >
> >    * you know that in version X some bug was fixed
> >    * you know that in version X some feature was added
> >    * that's the first version you've actually tested it with & you don't
> >      want to vouch for earlier versions without testing
> >    * you develop on one machine & deploy on another machine you don't
> >      control, and you want runtime checks that the sysadmin installed what
> >      they were supposed to install
>
> I don't really see the need for this. Packages already have a scheme for
> requiring a particular version of a package, so this would only be useful in
> scripts run outside of packages.

The main distinction here is that the existing package mechanism enforces version requirements at *install* time, but this mechanism enforces it at *run* time. So this indeed applies well to scripts run outside packages, but it's also useful inside packages when they're loading their dependencies at runtime.

I was trying to illustrate that with the 4 bullet points above (especially the last one) but I should have said so explicitly.

It can happen very easily that constraints that were satisfied at install time get out of whack by subsequent package installations, but the violations go undetected. The result can be breakage, whether dramatic or subtle.

The main hats targeted here are really people (like me, of course) who are trying to "productionize" results, not so much people who are doing offline analysis. In a production system

> But what if your script requires a particular
> (perhaps obsolete) version of a package? This change only puts a lower
> bound on the version number, and version requirements can be more
> elaborate than that.

Certainly true; this was meant as a first iteration, and support for the more elaborate requirements specifications could certainly be added.

The more elaborate specs actually illustrate the need for a runtime mechanism nicely - if code X (which may be a package, or a script, it doesn't matter) requires exactly version 3.14 of package B, and someone in the production team upgrades version 3.14 to version 3.78 because "it's faster" or "it's less buggy" or "we just like to have the latest version of everything all the time", then someone needs to be alerted to the problem. One alternative solution would be to use a full-fledged package management system like RPM or Deb to track all the dependencies, but yikes, that doesn't sound fun.

> I think my advice would be:
>
> 1. Put your code in a package, and use the version specifications there.
>
> 2. If you must write it in a script, then put a version test at the top,
> using packageVersion().

Certainly those are alternatives, but to us they are somewhat unsatisfactory. The first option doesn't help with the crux of the problem, which is runtime enforcement. The second is essentially the same solution I've proposed, but doesn't help anyone outside our organization who has the same problem.

-Ken
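The "version test at the top" of a script that Duncan suggests could look roughly like the sketch below. The package name `stats` and the bounds are placeholders, chosen only so the check passes on a stock R installation; a real script would substitute its own package and tested versions.

```r
# Hypothetical script preamble: check a package version before doing any work.
pkg <- "stats"                 # placeholder package name
v <- packageVersion(pkg)       # returns an object of class "package_version"

# Lower bound only, which is what the proposed patch enforces.
stopifnot(v >= "2.14.0")

# Lower and upper bound together, for code tested against a known range.
in_range <- v >= "2.14.0" && v < "99.0.0"
stopifnot(in_range)
```

The comparison operators work directly against strings here because `package_version` objects have their own comparison methods.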
Re: [Rd] [patch] giving library() a 'version' argument
> -----Original Message-----
> From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com]
> Sent: Thursday, April 12, 2012 12:27 PM
> To: Ken Williams
> Cc: r-devel@r-project.org
> Subject: Re: [Rd] [patch] giving library() a 'version' argument
>
> I haven't tested it, but according to the documentation in Writing R
> Extensions, the dependencies are enforced at the time library() is called.

Oh, I hadn't suspected that. I can look into testing that; if it's true then of course that changes this all. I probably won't be able to do that for a few days because I'll be traveling though.

I've never noticed a package failing to load at runtime because its prereq-version dependency wasn't met though.

> [...]
> But a single line at the top of the script would fix this:
>
> stopifnot(packageVersion("foo") == "3.14")

For the most common use case, that would look more like:

    stopifnot(compareVersion(as.character(packageVersion("foo")), "3.14") >= 0)

which gets less declarative, and I'd argue less clear about exactly what it's trying to enforce. And I can see myself (& presumably others) getting that comparison operator backwards a lot, having to look it up each time or copy-paste it from other code.

And then that still doesn't add nice error messages; that would be yet more code.

*And*, it doesn't actually behave correctly if the package is already loaded by other code, because it might have been loaded from a different location than the one that would be found in the packageVersion() call. (Or am I maybe wrong about what packageVersion() does in that case? I don't think the docs specify that behavior.)

For prior art on this whole concept, a useful precedent is the 'use()' function in Perl, which accepts a version argument, even though there is also robust version checking at installation/testing time.

> Another problem with putting this into library() is that packages aren't
> always loaded by library(): there is require(), and there are implicit
> loads triggered by dependencies of other packages.

That's not really a problem. If someone wants to enforce a runtime dependency, they stick the enforcement line into their code, and it will correctly stop if the criterion is not met.

-Ken
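To make the "nice error messages" point concrete, here is a minimal sketch of a helper that wraps the lower-bound check. `require_min_version` is a hypothetical name, not a function in base R, and `stats` is used only because it is always installed.

```r
# Hypothetical helper, not part of base R.
require_min_version <- function(pkg, min_version) {
  installed <- as.character(packageVersion(pkg))
  # compareVersion() needs character strings; a negative result means
  # the installed version is older than the requested minimum.
  if (compareVersion(installed, as.character(min_version)) < 0) {
    stop("Version ", min_version, " of '", pkg, "' required, but only ",
         installed, " is available", call. = FALSE)
  }
  invisible(installed)
}

# Passes on any reasonably recent R.
ok <- require_min_version("stats", "2.14.0")

# Fails with a readable message; captured here instead of halting.
failed <- tryCatch(require_min_version("stats", "9999.0"),
                   error = function(e) conditionMessage(e))
```

Keeping the comparison inside one function means the direction of the operator only has to be gotten right once.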
Re: [Rd] [patch] giving library() a 'version' argument
> -----Original Message-----
> From: Roebuck,Paul L [mailto:proeb...@mdanderson.org]
> Sent: Thursday, April 12, 2012 1:03 PM
> To: R-devel
> Cc: Ken Williams
> Subject: Re: [Rd] [patch] giving library() a 'version' argument
>
> On 4/12/12 10:11 AM, Ken Williams wrote:
>
> >> On 4/12/12 7:22 AM, Duncan Murdoch wrote:
> > [SNIP]
> > ...
> > The main hats targeted here are really people (like me, of course) who
> > are trying to "productionize" results, not so much people who are
> > doing offline analysis. In a production system
> >
> >> But what if your script requires a particular (perhaps obsolete)
> >> version of a package? This change only puts a lower bound on the
> >> version number, and version requirements can be more elaborate than
> >> that.
> >
> > Certainly true; this was meant as a first iteration, and support for
> > the more elaborate requirements specifications could certainly be added.
> >
> > The more elaborate specs actually illustrate the need for a runtime
> > mechanism nicely - if code X (which may be a package, or a script, it
> > doesn't matter) requires exactly version 3.14 of package B, and
> > someone in the production team upgrades version 3.14 to version 3.78
> > because "it's faster" or "it's less buggy" or "we just like to have
> > the latest version of everything all the time", then someone needs to
> > be alerted to the problem. One alternative solution would be to use a
> > full-fledged package management system like RPM or Deb to track all the
> > dependencies, but yikes, that doesn't sound fun.
>
> I appreciate your contribution of both time and energy.
>
> But I think the existing library() method is sufficient without this
> modification.
> It's essentially syntactic sugar for:
>
> library(MASS); stopifnot(packageVersion("MASS") >= "7.3")

I was about to write back & say "that's not correct, if '7.10' is installed, a string comparison will do the wrong thing."

But apparently it does the *right* thing, because the 'numeric_version' class implements the comparison operator.

I'd still prefer to "Huffman-code it" to something shorter, to encourage people to use it, but I can see why others could consider it good enough.

I could contribute a doc patch to the 'numeric_version' man page to make it clearer what's available. The 3 comparisons there happen to turn out the same way when done as a string comparison.

I also do still have a question about what packageVersion() does when a package is already loaded - does it go look for it again, or does it check the version of what's already loaded? A doc patch could help here too.

-Ken
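The difference between the two kinds of comparison can be seen in a short sketch. The version strings "7.10" and "7.3" are the ones from the message above.

```r
# Plain string comparison goes character by character, so "7.10" sorts
# before "7.3" because "1" comes before "3".
as_strings <- "7.10" >= "7.3"

# numeric_version compares component by component, so 10 is treated as
# greater than 3 and the check comes out the other way.
as_versions <- numeric_version("7.10") >= "7.3"
```

`packageVersion()` returns an object that inherits from `numeric_version`, which is why the one-line `stopifnot()` form behaves as intended.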
Re: [Rd] [patch] giving library() a 'version' argument
From: Martin Maechler [maech...@stat.math.ethz.ch]
> Indeed nowadays, packageDescription() *) *does*
> use the correct package version, by inspecting the "path"
> attribute of the package, in the same way as
> searchpaths()

Yeah, that's what I suspected, but only from reading the code of packageDescription(). It doesn't seem to mention this in the docs. And I wasn't 100% confident from reading the code, in case it was a 'promise' or something like that.

I'm willing to write a doc patch but it'll take a few days, I'm traveling.

--
Ken Williams, Senior Research Scientist
Applied Mathematics Group, WindLogics Inc.
[Rd] data.frame() args in transform()
Starting in SVN revision 47035 (which shows up in the R-2-9-0 line), transform.data.frame() started accepting arguments like 'row.names' and 'stringsAsFactors' to be passed through to the data.frame() function. It looks like this was an unintentional side-effect of letting multiple columns be added properly.

Given that this has been implemented for quite a while, should it now be documented? It's a little strange to support it though - the 'stringsAsFactors' argument might be handy, but 'check.names', 'row.names', and 'check.rows' are perhaps questionable.

If it's desirable to document the behavior, here's a possible patch.

Index: src/library/base/man/transform.Rd
===
--- src/library/base/man/transform.Rd (revision 61043)
+++ src/library/base/man/transform.Rd (working copy)
@@ -27,6 +27,10 @@
   \code{_data}. The tags are matched against \code{names(_data)}, and for
   those that match, the value replace the corresponding variable in
   \code{_data}, and the others are appended to \code{_data}.
+
+  \code{transform.data.frame} also accepts the additional named
+  arguments that the \code{data.frame} function accepts,
+  e.g. \code{stringsAsFactors}.
 }
 \value{
   The modified value of \code{_data}.

--
Ken Williams, Senior Research Scientist
WindLogics
http://windlogics.com
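A small sketch of the pass-through behavior described above. This assumes that transform.data.frame() still hands unmatched named arguments to data.frame(), as the message reports; the data frame and column names are invented for illustration, and the result may differ in R versions where this has changed.

```r
# Toy data frame; 'x' and 'y' are made-up column names.
df <- data.frame(x = 1:2)

# 'stringsAsFactors' does not match an existing column, so it is handed
# on to data.frame() together with the new column 'y' rather than being
# added as a column of its own.
out <- transform(df, y = c("a", "b"), stringsAsFactors = TRUE)

y_is_factor <- is.factor(out$y)
```

If the argument were treated as an ordinary new column, `out` would have a third column named `stringsAsFactors`; checking `names(out)` is a quick way to see which behavior a given R version has.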
[Rd] Doc patch for Sys.time and system.time
Here’s a patch that adds ‘seealso’ entries to Sys.time and system.time docs, to help people who forget what the distinction is between them.

Patch was made against https://svn.r-project.org/R/trunk@61454 .

-Ken
Re: [Rd] Doc patch for Sys.time and system.time
Duncan noticed that either the sending server (Gmail - shouldn't be the case) or receiving server stripped out the attachment. Here it is again, inline.

-Ken

===
From 99766dd8f16804ecddc73f6169be3e42b916b8fa Mon Sep 17 00:00:00 2001
From: Ken Williams
Date: Thu, 27 Dec 2012 09:58:21 -0600
Subject: [PATCH] Add system.time link to Sys.time documentation, and vice versa.

diff --git a/src/library/base/man/Sys.time.Rd b/src/library/base/man/Sys.time.Rd
index d34571b..f0b0c50 100644
--- a/src/library/base/man/Sys.time.Rd
+++ b/src/library/base/man/Sys.time.Rd
@@ -41,6 +41,8 @@ Sys.Date()
   string.
 
   \code{\link{Sys.timezone}}.
+
+  \code{\link{system.time}} for measuring elapsed/CPU time of expressions.
 }
 \examples{\donttest{
 Sys.time()
diff --git a/src/library/base/man/system.time.Rd b/src/library/base/man/system.time.Rd
index 5cd79b7..ad21267 100644
--- a/src/library/base/man/system.time.Rd
+++ b/src/library/base/man/system.time.Rd
@@ -38,6 +38,8 @@ unix.time(expr, gcFirst = TRUE)
 }
 \seealso{
   \code{\link{proc.time}}, \code{\link{time}} which is for time series.
+
+  \code{\link{Sys.time}} to get the current date & time.
 }
 \examples{
 require(stats)
--
1.7.9
===

On Thu, Dec 27, 2012 at 10:08 AM, Ken Williams wrote:

> Here's a patch that adds 'seealso' entries to Sys.time and system.time
> docs, to help people who forget what the distinction is between them.
>
> Patch was made against https://svn.r-project.org/R/trunk@61454 .
>
> -Ken
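For anyone who mixes the two functions up, a short sketch of the distinction the patch documents. The loop being timed is arbitrary and only there to give system.time() something to measure.

```r
# Sys.time() returns the current date and time as a POSIXct value.
now <- Sys.time()

# system.time() evaluates an expression and reports how long it took,
# as a "proc_time" object with user, system and elapsed components.
timing <- system.time(for (i in 1:1e5) sqrt(i))
elapsed <- timing[["elapsed"]]
```

In short, `Sys.time()` answers "what time is it?" while `system.time()` answers "how long did this take?".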