That doesn't make sense.

If an API changes (e.g. in Matrix) and a program written against the old
API can no longer run, that is a very different issue from the same
numbers (data) giving different results.  The latter is what I am guessing
you address; the former is what I believe most people are concerned about
here.  Or at least I hope that's so.

It's more an issue of usability than reproducibility in such a case, as far
as I can tell (see e.g.
http://liorpachter.wordpress.com/2014/03/18/reproducibility-vs-usability/).
 If the same data produce substantially different results (not
attributable to, e.g., better handling of machine precision and so forth,
although that can certainly be a bugaboo in many cases... anyone who has
programmed numerical routines in FORTRAN already knows this), then yes,
that's a different type of bug.  But in order to uncover that type of
bug, the code has to run in the first place.  After a while the code
becomes rather impenetrable if no thought is given to managing these changes.
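
To make the machine-precision point concrete, a minimal sketch in base R
(nothing package- or version-specific): floating-point addition is not
associative, so merely regrouping a sum changes the low-order bits.

    (0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3)  # FALSE
    print((0.1 + 0.2) + 0.3, digits = 17)   # 0.60000000000000009
    print(0.1 + (0.2 + 0.3), digits = 17)   # 0.59999999999999998

A compiler, BLAS, or package update can shift exactly this kind of thing
without anyone having made an error; that's a different beast from a
genuine bug.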

So the Bioconductor solution, as Hervé noted, is to have freezes and
releases.  There can be old bugs enshrined in people's code due to using
old versions, and those can be traced even after many releases have come
and gone, because there is a point-in-time snapshot of roughly when these
things occurred.  As with (say) ANSI C++, deprecation notices stay in place
for a year before anything is actually done to remove a function or break
an API.  It's not impossible; it just requires more discipline than
declaring that the same program should be written multiple times on
multiple platforms every time.  The latter isn't an efficient use of
anyone's time.
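
For what it's worth, reaching back to a point-in-time version of a single
package is already possible via the CRAN source archive.  A sketch, using
the Matrix 1.0-1 example from below (assumes a working build toolchain;
the URL follows CRAN's current Archive layout):

    # Install a specific archived version straight from CRAN's Archive area.
    url <- "http://cran.r-project.org/src/contrib/Archive/Matrix/Matrix_1.0-1.tar.gz"
    install.packages(url, repos = NULL, type = "source")

But that works per-package and by hand; a Bioconductor-style freeze hands
you the whole mutually consistent set at once.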

Most of these analyses are not about putting a man on the moon or making
sure a dam does not break.  They're relatively low-consequence exploratory
sorties.  If something comes of them, it would be nice to have a
point-in-time reference to check and see whether the original results were
hooey.  That's a lot quicker and more efficient than rewriting everything
from scratch (which, in some fields, simply ensures things won't get
checked).
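
Recording that point-in-time reference costs next to nothing at analysis
time.  A minimal sketch in base R, saved alongside the results:

    # Write down exactly which R and package versions produced the output.
    writeLines(capture.output(sessionInfo()), "sessionInfo.txt")

Whoever checks the work later at least knows which snapshot to dig up.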

My $0.02, since we do still have those to bedevil cashiers.



Statistics is the grammar of science.
Karl Pearson <http://en.wikipedia.org/wiki/The_Grammar_of_Science>


On Thu, Mar 20, 2014 at 1:28 PM, Ted Byers <r.ted.by...@gmail.com> wrote:

> On Thu, Mar 20, 2014 at 3:14 PM, Hervé Pagès <hpa...@fhcrc.org> wrote:
>
> > On 03/20/2014 03:52 AM, Duncan Murdoch wrote:
> >
> >> On 14-03-20 2:15 AM, Dan Tenenbaum wrote:
> >>
> >>>
> >>>
> >>> ----- Original Message -----
> >>>
> >>>> From: "David Winsemius" <dwinsem...@comcast.net>
> >>>> To: "Jeroen Ooms" <jeroen.o...@stat.ucla.edu>
> >>>> Cc: "r-devel" <r-devel@r-project.org>
> >>>> Sent: Wednesday, March 19, 2014 11:03:32 PM
> >>>> Subject: Re: [Rd] [RFC] A case for freezing CRAN
> >>>>
> >>>>
> >>>> On Mar 19, 2014, at 7:45 PM, Jeroen Ooms wrote:
> >>>>
> >>>>  On Wed, Mar 19, 2014 at 6:55 PM, Michael Weylandt
> >>>>> <michael.weyla...@gmail.com> wrote:
> >>>>>
> >>>>>> Reading this thread again, is it a fair summary of your position
> >>>>>> to say "reproducibility by default is more important than giving
> >>>>>> users access to the newest bug fixes and features by default?"
> >>>>>> It's certainly arguable, but I'm not sure I'm convinced: I'd
> >>>>>> imagine that the ratio of new work being done vs reproductions is
> >>>>>> rather high and the current setup optimizes for that already.
> >>>>>>
> >>>>>
> >>>>> I think that separating development from released branches can give
> >>>>> us both reliability/reproducibility (stable branch) and new features
> >>>>> (unstable branch). The user gets to pick (and you can pick both!).
> >>>>> The same is true for r-base: when using a 'released' version you get
> >>>>> 'stable' base packages that are up to 12 months old. If you want to
> >>>>> have the latest stuff you download a nightly build of r-devel. For
> >>>>> regular users and reproducible research it is recommended to use the
> >>>>> stable branch. However if you are a developer (e.g. a package
> >>>>> author) you might want to develop/test/check your work with the
> >>>>> latest r-devel.
> >>>>>
> >>>>> I think that extending the R release cycle to CRAN would result in
> >>>>> both more stable released versions of R and more freedom for package
> >>>>> authors to implement rigorous changes in the unstable branch. When
> >>>>> writing a script that is part of a production pipeline, a Sweave
> >>>>> paper that should be reproducible 10 years from now, or a book on
> >>>>> using R, you use a stable version of R, which is guaranteed to
> >>>>> behave the same over time. However, when developing packages that
> >>>>> should be compatible with the upcoming release of R, you use
> >>>>> r-devel, which has the latest versions of other CRAN and base
> >>>>> packages.
> >>>>>
> >>>>
> >>>>
> >>>> As I remember ... the example demonstrating the need for this was the
> >>>> XML package, where an extract from a website had its headers
> >>>> misinterpreted as data in one version of pkg:XML but not in another.
> >>>> That seems fairly unconvincing. Data cleaning and validation is a
> >>>> basic task of data analysis. It also seems excessive to assert that
> >>>> it is the responsibility of CRAN to maintain a synced binary archive
> >>>> that will be available in ten years.
> >>>>
> >>>
> >>>
> >>> CRAN already does this; the bin/windows/contrib directory has
> >>> subdirectories going back to 1.7, with packages dated October 2004. I
> >>> don't see why it is burdensome to continue to archive these. It would
> >>> be nice if source versions had a similar archive.
> >>>
> >>
> >> The bin/windows/contrib directories are updated every day for active R
> >> versions.  It's only when Uwe decides that a version is no longer worth
> >> active support that he stops doing updates, and it "freezes".  A
> >> consequence of this is that the snapshots preserved in those older
> >> directories are unlikely to match what someone who keeps up to date with
> >> R releases is using.  Their purpose is to make sure that those older
> >> versions aren't completely useless, but they aren't what Jeroen was
> >> asking for.
> >>
> >
> > But it is almost completely useless from a reproducibility point of
> > view to get random package versions. For example if some people try
> > to use R-2.13.2 today to reproduce an analysis that was published
> > 2 years ago, they'll get Matrix 1.0-4 on Windows, Matrix 1.0-3 on Mac,
> > and Matrix 1.1-2-2 on Unix. And none of them of course is what was used
> > by the authors of the paper (they used Matrix 1.0-1, which is what was
> > current when they ran their analysis).
> >
>
> Initially this discussion brought back nightmares of DLL hell on Windows.
> Those as ancient as I am will remember that well.  Now, though, the focus
> seems to be on reproducibility, but with what strikes me as a seriously
> flawed notion of what reproducibility means.
>
> Hervé Pagès mentions the risk of irreproducibility across three minor
> revisions of version 1.0 of Matrix.  My gut reaction would be that if the
> results are not reproducible across such minor revisions of one library,
> they are probably just so much BS.  I am trained in mathematical ecology,
> with more than a couple of decades of post-doc experience working with
> risk assessment in the private sector.  When I need to do an analysis, I
> will repeat it myself in multiple products, as well as in C++ or FORTRAN
> code I have hand-crafted (and when I wrote number-crunching code myself, I
> would do so in multiple programming languages - C++, Java, FORTRAN -
> applying rigorous QA procedures to each program/library I developed).  Back
> when I was a grad student, I would not even show the results to my
> supervisor, let alone try to publish them, unless the results were
> reproducible across ALL the tools I used.  If there was a discrepancy, I
> would debug it before discussing the results with anyone.  Surely, it is the
> responsibility of the journals' editors and reviewers to apply a similar
> practice.
>
> The concept of reproducibility used to this point in this discussion might
> be adequate from a programmer's perspective (except in my lab), but it is
> wholly inadequate from a scientist's perspective.  I maintain that if you
> have the original data and repeat the analysis using the latest version of
> R and the available, relevant packages, and the results are not consistent
> with the originally reported results, then the original results are
> probably due to a bug either in the R script or in R or the packages used.
> Therefore, of the concerns I see raised in this discussion, the principal
> one is that of package developers who fail to pay sufficient attention to
> backwards compatibility: a new version ought not break any code that
> executes fine using previous versions.  That is not a trivial task, and
> may require contributors to obtain the assistance of a software engineer.
> I am sure anyone on this list who programs in C++ knows how the ANSI
> committees handle change management.  Introducing new features is largely
> irrelevant to backwards compatibility (though there are exceptions);
> features to be removed are handled by declaring them deprecated, and
> leaving them in that condition for years.  That tells anyone using the
> language that they ought to plan to adapt their code to work when the
> deprecated feature is finally removed.
>
> I am responsible for maintaining code (involving distributed computing) to
> which many companies integrate their systems, and I am careful to ensure
> that no change I make breaks their integration into my system, even though
> I often have to add new features.  And I don't add features lightly, and
> have yet to remove features.  When that eventually happens, the old feature
> will be deprecated, so that the other companies have plenty of time to
> adapt their integration code.  I do not know whether CRAN ought to have any
> responsibility for this sort of change management, or if they have assumed
> some responsibility for some of it, but I would argue that the package
> developers have the primary responsibility for doing this right.
>
> Just my $0.05 (the penny no longer exists in Canada)
>
> Cheers
>
> Ted
> R.E. (Ted) Byers, Ph.D., Ed.D.
>


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
