Jari Oksanen <jari.oksa...@oulu.fi> writes:

> On 21/03/2014, at 10:40 AM, Rainer M Krug wrote:
>
>> This is a long and (mainly) interesting discussion, which is fanning out
>> in many different directions, and I think many are not that relevant to
>> the OP's suggestion.
>>
>> I see the advantages of having such a dynamic CRAN, but also of having a
>> more stable CRAN. I prefer CRAN as it is now, but in many cases a more
>> stable CRAN might be an advantage. So having releases of CRAN might make
>> sense. But then there is the archiving issue of CRAN.
>>
>> The suggestion was made to move the responsibility away from CRAN and
>> the R infrastructure to the user / researcher to guarantee that the
>> results can be re-run years later. It would be nice to have this built
>> into CRAN, but let's stick to the scenario that the user should care for
>> reproducibility.
>
> There are two different problems that alternate in the discussion:
> reproducibility and breakage of CRAN dependencies. Frozen CRAN could
> make *approximate* reproducibility easier to achieve, but real
> reproducibility needs stricter solutions. Actual sessionInfo() is
> minimal information, but re-building a spitting image of the old
> environment may still be demanding (but in many cases this does not
> matter).
>
> Another problem is that CRAN is so volatile that new versions of
> packages break other packages or old scripts. Here the main problem is
> how package developers work. Freezing CRAN would not change that: if
> package maintainers release breaking code, that would be frozen. I
> think that most packages do not make a distinction between development
> and release branches, and CRAN policy won't change that.
>
> I can sympathize with package maintainers having 150 reverse
> dependencies. My main package only has ~50, and I certainly
> won't test them all with a new release. I sometimes tried, but I could
> not even get all those built because they had other dependencies on
> packages that failed. Even those that I could test failed to detect
> problems (in one case all examples were \dontrun and nicely passed the
> tests). I only wish that if people *really* depend on my package, they
> would test it against the R-Forge version and alert me before CRAN
> releases, but that is not very likely (I guess many dependencies are not
> *really* necessary, but only concern marginal features of the package,
> but CRAN forces one to declare those).

Breakage of CRAN packages is a problem on which I cannot comment much. I
have no idea how this could be solved unless one introduces more checks,
which nobody wants. CRAN is a (more or less) open repository for packages
written by engineers / programmers but also scientists of other fields -
and that is the strength of CRAN - a central repository to find packages
which conform to a minimal standard and format.

>
> Still a few words about reproducibility of scripts: this can be hardly
> achieved with good coverage, because many scripts are so very ad
> hoc. When I edit and review manuscripts for journals, I very often get
> Sweave or knitr scripts that "just work", where "just" means "just so
> and so". Often they do not work at all, because they had some
> undeclared private functionalities or stray files in the author
> workspace that did not travel with the Sweave document.

That is one reason why I *always* start my R sessions with --vanilla and
have a local initialization script which I call manually.

> I think these
> -- published scientific papers -- are the main field where the code
> really should be reproducible, but they often are the hardest to
> reproduce.

And this is completely out of the hands of R / CRAN / ... and in the
hands of journals and authors. But R could provide a framework to make
this easier, in the form of a package which provides functions to make
this a one-command approach.

> Nothing CRAN people do can help with sloppy code scientists
> write for publications. You know, they are scientists -- not
> engineers.

Absolutely - and I am also a sloppy scientist - I put my code online,
but hope that not many people ask me later about it.
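
To make the "one-command" idea a bit more concrete, here is a rough sketch
of what such a helper might look like. The function name
snapshot_session() and the file names are purely hypothetical (no such
function exists in R or, as far as I know, in any CRAN package); it only
relies on sessionInfo() and base file functions, and it records versions
rather than archiving any sources.

```r
# Hypothetical helper, for illustration only: it is not part of R or of
# any existing package. It records what sessionInfo() knows about the
# running session so that the environment can be documented next to
# the analysis.
snapshot_session <- function(dir = "session-archive") {
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)

  # Human-readable record: R version, platform, locale, loaded packages.
  info <- sessionInfo()
  writeLines(capture.output(print(info)),
             file.path(dir, "sessionInfo.txt"))

  # Machine-readable table of attached and loaded-only packages.
  pkgs <- c(info$otherPkgs, info$loadedOnly)
  versions <- data.frame(
    package = as.character(names(pkgs)),
    version = vapply(pkgs, function(p) as.character(p$Version),
                     character(1), USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
  write.csv(versions, file.path(dir, "packages.csv"), row.names = FALSE)

  invisible(versions)
}
```

Called as snapshot_session() at the end of a script run from an
R --vanilla session, this would leave sessionInfo.txt and packages.csv
beside the results. Fetching the matching source tarballs is deliberately
left out: download.packages(..., type = "source") retrieves whatever
version the repository currently offers, which is not necessarily the
version that was recorded.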

Cheers,

Rainer

>
> Cheers, Jari Oksanen
>>
>> Leaving the issue of compilation out, a package which creates a
>> custom installation of the R version, including the source of the R
>> version used and the sources of the packages in a format that can be
>> compiled on Linux, given that the relevant dependencies are installed,
>> would be a huge step forward.
>>
>> I know - compilation on Windows (and sometimes Mac) is a serious
>> problem, but to archive *all* binaries and to re-compile all older
>> versions of R and all packages would be an impossible task.
>>
>> Apart from that - doing your analysis in a Virtual Machine and then
>> simply archiving this Virtual Machine would also be an option, but only
>> for the more tech-savvy users.
>>
>> In a nutshell: I think a package would be able to provide the solution
>> for a local archiving to make it possible to re-run the simulation with
>> the same tools at a later stage - although guarantees would not be
>> possible.
>>
>> Cheers,
>>
>> Rainer
>> --
>> Rainer M. Krug
>> email: Rainer<at>krugs<dot>de
>> PGP: 0x0F52F982
>>
>> ______________________________________________
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>

--
Rainer M. Krug
email: Rainer<at>krugs<dot>de
PGP: 0x0F52F982