On Thu, Mar 20, 2014 at 5:11 PM, Tim Triche, Jr. <tim.tri...@gmail.com> wrote:

> That doesn't make sense.
>
> If an API changes (e.g. in Matrix) and a program written against the old
> API can no longer run, that is a very different issue than if the same
> numbers (data) give different results.  The latter is what I am guessing
> you address.  The former is what I believe most people are concerned about
> here.  Or at least I hope that's so.
>
The problem you describe is the classic case of a failure of backward
compatibility.  That is completely different from the question of
reproducibility or replicability.  And since I, among others, noticed that
the question of reproducibility had arisen, I felt a need to address that
first.

I do not have a quibble with anything else you wrote (or with anything in
this thread related to the issue of backward compatibility), and I have
enough experience to know both that it is a hard problem and that there are
a number of different solutions people have used.  Appropriate management
of deprecation of features is one, and the use of code freezes is another.
Version control is a third.  Each option carries its own advantages and
disadvantages.
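
To make the first of those options concrete, here is a minimal sketch of a
managed deprecation.  It is in Python rather than R, it is not taken from any
package discussed in this thread, and the names old_mean and new_mean are
hypothetical.  The point is only that the old entry point keeps running and
keeps returning the same number while it warns that it is going away.

```python
import warnings


def new_mean(values):
    """Hypothetical replacement function."""
    values = list(values)
    return sum(values) / len(values)


def old_mean(values):
    """Hypothetical deprecated entry point, kept so old programs still run."""
    warnings.warn(
        "old_mean() is deprecated; use new_mean() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    # Delegate to the replacement so the same data give the same result.
    return new_mean(values)


# Record warnings so the deprecation notice can be inspected afterwards.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_mean([1.0, 2.0, 3.0])
```

In this sketch a program written against the old name still runs (backward
compatibility) and still gets the same answer from the same data
(reproducibility); the two properties are checked separately.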


> It's more an issue of usability than reproducibility in such a case, far
> as I can tell (see e.g.
> http://liorpachter.wordpress.com/2014/03/18/reproducibility-vs-usability/).  
> If the same data produces substantially different results (not
> attributable to e.g. better handling of machine precision and so forth,
> although that could certainly be a bugaboo in many cases... anyone who has
> programmed numerical routines in FORTRAN already knows this) then yes,
> that's a different type of bug.  But in order to uncover the latter type of
> bug, the code has to run in the first place.  After a while it becomes
> rather impenetrable if no thought is given to these changes.
>
> So the Bioconductor solution, as Herve noted, is to have freezes and
> releases.  There can be old bugs enshrined in people's code due to using
> old versions, and those can be traced even after many releases have come
> and gone, because there is a point-in-time snapshot of about when these
> things occurred.  As with (say) ANSI C++, deprecation notices stay in place
> for a year before anything is actually done to remove a function or break
> an API.  It's not impossible, it just requires more discipline than
> declaring that the same program should be written multiple times on
> multiple platforms every time.  The latter isn't an efficient use of
> anyone's time.
>
> Most of these analyses are not about putting a man on the moon or making
> sure a dam does not break.  They're relatively low-consequence exploratory
> sorties.  If something comes of them, it would be nice to have a
> point-in-time reference to check and see whether the original results were
> hooey.  That's a lot quicker and more efficient than rewriting everything
> from scratch (which, in some fields, simply ensures things won't get
> checked).
>
> My $0.02, since we do still have those to bedevil cashiers.
>
>
>
> Statistics is the grammar of science.
> Karl Pearson <http://en.wikipedia.org/wiki/The_Grammar_of_Science>
>
>
Cheers

Ted


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
