On 03/20/2014 01:28 PM, Ted Byers wrote:
On Thu, Mar 20, 2014 at 3:14 PM, Hervé Pagès <hpa...@fhcrc.org> wrote:

On 03/20/2014 03:52 AM, Duncan Murdoch wrote:

On 14-03-20 2:15 AM, Dan Tenenbaum wrote:

----- Original Message -----
From: "David Winsemius" <dwinsem...@comcast.net>
To: "Jeroen Ooms" <jeroen.o...@stat.ucla.edu>
Cc: "r-devel" <r-devel@r-project.org>
Sent: Wednesday, March 19, 2014 11:03:32 PM
Subject: Re: [Rd] [RFC] A case for freezing CRAN

On Mar 19, 2014, at 7:45 PM, Jeroen Ooms wrote:

On Wed, Mar 19, 2014 at 6:55 PM, Michael Weylandt <michael.weyla...@gmail.com> wrote:

Reading this thread again, is it a fair summary of your position to say "reproducibility by default is more important than giving users access to the newest bug fixes and features by default"? It's certainly arguable, but I'm not sure I'm convinced: I'd imagine that the ratio of new work being done vs. reproductions is rather high, and the current setup optimizes for that already.

I think that separating development from released branches can give us both reliability/reproducibility (stable branch) as well as new features (unstable branch). The user gets to pick (and you can pick both!). The same is true for r-base: when using a 'released' version you get 'stable' base packages that are up to 12 months old. If you want to have the latest stuff, you download a nightly build of r-devel. For regular users and reproducible research it is recommended to use the stable branch. However, if you are a developer (e.g. a package author) you might want to develop/test/check your work with the latest r-devel. I think that extending the R release cycle to CRAN would result both in more stable released versions of R and in more freedom for package authors to implement rigorous change in the unstable branch.
When writing a script that is part of a production pipeline, or a Sweave paper that should be reproducible 10 years from now, or a book on using R, you use the stable version of R, which is guaranteed to behave the same over time. However, when developing packages that should be compatible with the upcoming release of R, you use r-devel, which has the latest versions of other CRAN and base packages.

As I remember ... the example demonstrating the need for this was a case where the headers of a website extract were misinterpreted as data in one version of pkg:XML and not in another. That seems fairly unconvincing. Data cleaning and validation is a basic task of data analysis. It also seems excessive to assert that it is the responsibility of CRAN to maintain a synced binary archive that will be available in ten years.

CRAN already does this: the bin/windows/contrib directory has subdirectories going back to 1.7, with packages dated October 2004. I don't see why it is burdensome to continue to archive these. It would be nice if source versions had a similar archive.

The bin/windows/contrib directories are updated every day for active R versions. It's only when Uwe decides that a version is no longer worth active support that he stops doing updates, and it "freezes". A consequence of this is that the snapshots preserved in those older directories are unlikely to match what someone who keeps up to date with R releases is using. Their purpose is to make sure that those older versions aren't completely useless, but they aren't what Jeroen was asking for.

But it is almost completely useless from a reproducibility point of view to get random package versions. For example, if some people try to use R-2.13.2 today to reproduce an analysis that was published 2 years ago, they'll get Matrix 1.0-4 on Windows, Matrix 1.0-3 on Mac, and Matrix 1.1-2-2 on Unix.
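One way an author can guard against this today, independent of any CRAN policy change, is to record the exact package versions used at publication time and reinstall them later from the CRAN source archive. A minimal sketch, assuming the contributed `remotes` package is available and that the archived source still builds on the newer system:

```r
# Sketch: record and later restore exact package versions.
# Assumes the contributed 'remotes' package; install_version() fetches old
# sources from CRAN's src/contrib/Archive, so this only works where the
# archived source still compiles on the current system.

# At publication time, record exactly what was used:
writeLines(capture.output(sessionInfo()), "sessionInfo.txt")

# Years later, reinstall the version named in that record, e.g. Matrix 1.0-1:
remotes::install_version("Matrix", version = "1.0-1",
                         repos = "https://cloud.r-project.org")
```

This recovers a specific version from source, but it does not solve the default-resolution problem the thread is about: users who simply call `install.packages()` still get whatever the repository happens to serve.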
And none of them, of course, is what was used by the authors of the paper (they used Matrix 1.0-1, which is what was current when they ran their analysis).

Initially this discussion brought back nightmares of DLL hell on Windows. Those as ancient as I will remember that well. But now the focus seems to be on reproducibility, with what strikes me as a seriously flawed notion of what reproducibility means. Herve Pages mentions the risk of irreproducibility across three minor revisions of version 1.0 of Matrix.
If you use R-2.13.2, you get Matrix 1.1-2-2 on Linux. AFAIK this is the most recent version of Matrix, which aims to be compatible with the most current version of R (i.e. R 3.0.3). However, it has never been tested with R-2.13.2. I'm not saying that it should be - that would be a big waste of resources of course. All I'm saying is that it doesn't make sense to serve by default a version that is known to be incompatible with the version of R being used. It's very likely to not even install properly.

As for the apparently small differences between the versions you get on Windows and Mac: the Matrix package was just an example. With other packages you get (again, if you use R-2.13.2):

            src        win     mac
  abc       1.8        1.5     1.4
  ape       3.1-1      3.0-1   2.8
  BaSTA     1.9.3      1.1     1.0
  bcrm      0.4.3      0.2     0.1
  BMA       3.16.2.3   3.15    3.14.1
  Boruta    3.0.0      1.6     1.5
  ...

Are the differences big enough? Also note that back in October 2011, people using R-2.13.2 would get e.g. ape 2.7-3 on Linux, Windows and Mac. Wouldn't it make sense that people using R-2.13.2 today get the same? Why would anybody use R-2.13.2 today if not to run again some code that was written and used two years ago to obtain some important results?

Cheers,
H.
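A date-stamped snapshot repository is one way to get the behaviour Hervé describes - everyone resolving the same package versions regardless of platform. A minimal sketch, assuming a snapshot service with dated URLs (the service, URL scheme, and date shown here are illustrative assumptions, not part of base R or of CRAN itself):

```r
# Sketch: point R at a frozen, date-stamped view of CRAN so that everyone
# who reruns the analysis resolves identical package versions on any OS.
# The snapshot URL below is an assumed example of such a service's scheme.
snapshot <- "https://packagemanager.posit.co/cran/2014-03-20"
options(repos = c(CRAN = snapshot))

install.packages("Matrix")  # resolves to whatever was current on that date
```

Recording the snapshot date alongside the script is then enough to pin the whole dependency set, rather than pinning each package individually.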
My gut reaction would be that if the results are not reproducible across such minor revisions of one library, they are probably just so much BS.

I am trained in mathematical ecology, with more than a couple of decades of post-doc experience working with risk assessment in the private sector. When I need to do an analysis, I will repeat it myself in multiple products, as well as in C++ or FORTRAN code I have hand-crafted myself (and when I wrote number-crunching code myself, I would do so in multiple programming languages - C++, Java, FORTRAN - applying rigorous QA procedures to each program/library I developed). Back when I was a grad student, I would not even show the results to my supervisor, let alone try to publish them, unless the results were reproducible across ALL the tools I used. If there was a discrepancy, I would debug that before discussing the results with anyone. Surely, it is the responsibility of the journals' editors and reviewers to apply a similar practice.

The concept of reproducibility used to this point in this discussion might be adequate from a programmer's perspective (except in my lab), but it is wholly inadequate from a scientist's perspective. I maintain that if you have the original data, and repeating the analysis using the latest versions of R and the available, relevant packages gives results that are not consistent with the originally reported results, then the original results are probably due to a bug either in the R script, in R itself, or in the packages used.

Therefore, of the concerns I see raised in this discussion, the principal one is that of package developers who fail to pay sufficient attention to backwards compatibility: a new version ought not to break any code that executes fine using previous versions. That is not a trivial task, and it may require contributors to obtain the assistance of a software engineer.
I am sure anyone on this list who programs in C++ knows how the ANSI committees handle change management. The introduction of new features is largely irrelevant to backwards compatibility (though there are exceptions), but features to be removed are handled by declaring them deprecated, and leaving them in that condition for years. That tells anyone using the language that they ought to plan to adapt their code to work when the deprecated feature is finally removed.

I am responsible for maintaining code (involving distributed computing) to which many companies integrate their systems, and I am careful to ensure that no change I make breaks their integration into my system, even though I often have to add new features. I don't add features lightly, and I have yet to remove features. When that eventually happens, the old feature will be deprecated, so that the other companies have plenty of time to adapt their integration code.

I do not know whether CRAN ought to have any responsibility for this sort of change management, or whether it has already assumed some of it, but I would argue that the package developers have the primary responsibility for doing this right.

Just my $0.05 (the penny no longer exists in Canada).

Cheers,

Ted

R.E. (Ted) Byers, Ph.D., Ed.D.
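For R packages specifically, base R already supports the deprecate-then-remove cycle Ted describes, via `.Deprecated()` and `.Defunct()`. A minimal sketch (the function names are hypothetical, not from any real package):

```r
# Hypothetical package functions illustrating the deprecation cycle.
new_summary <- function(x) summary(x)

old_summary <- function(x) {
  .Deprecated("new_summary")  # warns callers, but keeps working for now
  new_summary(x)
}

old_summary(1:10)  # still works, with a deprecation warning

# After a few release cycles the body becomes .Defunct("new_summary"),
# which turns the warning into an error, and eventually the function is
# removed entirely - giving downstream code years of notice.
```

Following this cycle is what keeps old scripts running across package upgrades; skipping straight to removal is exactly the backwards-compatibility failure discussed above.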
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpa...@fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel