On Thu, Mar 20, 2014 at 5:11 PM, Tim Triche, Jr. <tim.tri...@gmail.com> wrote:
> That doesn't make sense.
>
> If an API changes (e.g. in Matrix) and a program written against the old
> API can no longer run, that is a very different issue than if the same
> numbers (data) give different results. The latter is what I am guessing
> you address. The former is what I believe most people are concerned about
> here. Or at least I hope that's so.
>

The problem you describe is the classic case of a failure of backward
compatibility. That is completely different from the question of
reproducibility or replicability. And, since I, among others, noticed
the question of reproducibility had arisen, I felt a need to primarily
address that.

I do not have a quibble with anything else you wrote (or with anything
in this thread related to the issue of backward compatibility), and I
have enough experience to know both that it is a hard problem and that
there are a number of different solutions people have used. Appropriate
management of deprecation of features is one, and the use of code
freezes is another. Version control is a third. Each option carries its
own advantages and disadvantages.

> It's more an issue of usability than reproducibility in such a case, far
> as I can tell (see e.g.
> http://liorpachter.wordpress.com/2014/03/18/reproducibility-vs-usability/).
> If the same data produces substantially different results (not
> attributable to e.g. better handling of machine precision and so forth,
> although that could certainly be a bugaboo in many cases... anyone who has
> programmed numerical routines in FORTRAN already knows this) then yes,
> that's a different type of bug. But in order to uncover the latter type of
> bug, the code has to run in the first place. After a while it becomes
> rather impenetrable if no thought is given to these changes.
>
> So the Bioconductor solution, as Herve noted, is to have freezes and
> releases. There can be old bugs enshrined in people's code due to using
> old versions, and those can be traced even after many releases have come
> and gone, because there is a point-in-time snapshot of about when these
> things occurred. As with (say) ANSI C++, deprecation notices stay in place
> for a year before anything is actually done to remove a function or break
> an API. It's not impossible, it just requires more discipline than
> declaring that the same program should be written multiple times on
> multiple platforms every time. The latter isn't an efficient use of
> anyone's time.
>
> Most of these analyses are not about putting a man on the moon or making
> sure a dam does not break. They're relatively low-consequence exploratory
> sorties. If something comes of them, it would be nice to have a
> point-in-time reference to check and see whether the original results were
> hooey. That's a lot quicker and more efficient than rewriting everything
> from scratch (which, in some fields, simply ensures things won't get
> checked).
>
> My $0.02, since we do still have those to bedevil cashiers.
>
> Statistics is the grammar of science.
> Karl Pearson <http://en.wikipedia.org/wiki/The_Grammar_of_Science>
>

Cheers

Ted

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel