No attempt to summarize the thread, but a few highlighted points:

 o Karl's suggestion of versioned / dated access to the repo by adding a
   layer to webaccess is (as usual) nice. It works on the 'supply' side.
   But Jeroen's problem is on the 'demand' side. Even when we know that an
   analysis was done on 20xx-yy-zz, and we reconstruct CRAN as of that day,
   it only gives us a 'ceiling' estimate of what was on the machine. In
   production or lab environments, installations get stale. Maybe packages
   were already a year old? To me, this is an issue that needs to be
   addressed on the 'demand' side, by the user. But just writing out
   version numbers is not good enough.

 o Roger correctly notes that R scripts and packages are just one issue.
   Compilers, libraries and the OS matter. To me, the natural approach
   these days would be to think of something based on Docker or Vagrant
   or (if you must) VirtualBox. The newer alternatives make snapshotting
   very cheap (e.g. by using Linux LXC). That approach reproduces a full
   environment as best as we can while still ignoring the hardware layer
   (and some readers may recall the infamous Pentium bug of two decades
   ago).

 o Reproducibility will probably remain the responsibility of study
   authors. If an investigator on a mega-grant wants to (or needs to)
   freeze, they do have the tools now. Letting the needs of a few push
   work onto those already overloaded (i.e. CRAN) and change the workflow
   of everybody is a non-starter.

 o As Terry noted, Jeroen made some strong claims about exactly how flawed
   the existing system is and keeps coming back to the example of 'a JSS
   paper that cannot be re-run'. I would really like to see empirics on
   this. Studies of reproducibility appear to be publishable these days,
   so maybe some enterprising grad student wants to run with the idea of
   actually _testing_ this. We may be above Terry's 0/30 and nearer to
   Kevin's 'low'/30. But let's bring some data to the debate.

 o Overall, I would tend to think that our CRAN standards of releasing
   with tests, examples, and checks on every build and release already do
   a much better job of keeping things tidy and workable than most if not
   all other related / similar open source projects. I would of course
   welcome contradictory examples.

Dirk

-- 
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel