Re: [Rd] NY Times article

Jb Sat, 10 Jan 2009 11:53:43 -0800

Hi all,

One relatively easy solution is to include R and all relevant versions of packages used in (a) a reproduction archive and/or (b) packaged inside a virtual machine. With storage space cheap and sources of both R and packages available (and easy, free, cross-platform virtual machine solutions and Linux), one can distribute not only one's own code and data but also everything that was required to do the analyses, down to the OS.

So far in our own work we've just included relevant package versions, but we will probably start including R as well for our next projects.

Hope this brainstorm helps (and credit to Ben Hansen and Mark Fredrickson for these ideas).


Jake

Jake Bowers
http://jakebowers.org

On Jan 10, 2009, at 1:19 PM, "Nicholas Lewin-Koh" <[email protected]> wrote:

Hi,
Unfortunately one of the cornerstones of the validation paradigm
in the clinical world (as I understand it) is that
consistency/repeatability, documentation (of the program's consistency
and maintenance) and adherence to regulatory requirements take
supremacy, ironically, even over correctness. So you still get
ridiculous requirements from regulatory bodies like type III sums of
squares, last observation carried forward, etc. These edifices are
very hard to change; I know of people who have worked their whole
careers just to get the FDA to allow other treatment of missing data.

So what does this have to do with R? This comes down to the point
you made below about the R development cycle incorporating bug fixes
into new releases, and not supporting old versions. I think this has
been rehashed many times, and is not likely to change. So how to
move R into the clinic? From a practical perspective all the
development and interoperability features of R are very nice,
but how to maintain things in a way that if the underlying
R platform changes the tool or method does not, and furthermore
how to manage this in a cost effective way so that it can't
be argued that it is cheaper to pay for SAS???

These are not necessarily questions that R core has to answer,
as the burden of proof of validation is really in the hands of the
company/organization doing the submission. We just like to pretend that
the large price we pay for our SAS support means we can shift liability
:)

Rambling again,
Nicholas


On Fri, 09 Jan 2009 17:07:31 -0600, "Kevin R. Coombes"
<[email protected]> said:

Hi Nicholas,

You raise a very good point. As an R user (who develops a couple of
packages for our own local use), I sometimes find myself cringing in
anticipation of a new R (or BioConductor) release. In my perception
(which is almost certainly exaggerated, but that's why I emphasize that
it is only an opinion), clever theoretical arguments in favor of
structural changes have a tendency to outweigh practical considerations
of backwards compatibility.

One of my own interests is in "reproducible research", and I've been
pushing hard here at M.D. Anderson to get people to use Sweave to
enhance the reproducibility of their own analyses. But, more often than
I would like, I find that reports written in Sweave do not survive the
transition from one version of R to the next, because either the core
implementation or one of the packages they depend on has changed in some
small but meaningful way.

For our own packages, we have been adding extensive regression testing
to ensure that the same numbers come out of various computations, in
order to see the effects of either the changes that we make or the
changes in the packages we depend on. But doing this in a nontrivial
way with real data leads to test suites that take a long time to run,
and so cannot be incorporated in the nightly builds used by CRAN.
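
A minimal sketch of this kind of numerical regression test in R, not the actual test suite described above: `fit_summary`, `check_against_reference`, and the reference file name are hypothetical names chosen for illustration, and the built-in `cars` data set stands in for real data. The idea is to store the numbers from a known-good run and compare later runs against them with `all.equal`.

```r
# Hypothetical analysis step whose numeric output should stay stable
# across R and package upgrades.
fit_summary <- function(data) {
  fit <- lm(dist ~ speed, data = data)
  unname(coef(fit))
}

# Compare a result against a stored reference. On the first run the
# reference file does not exist yet, so it is created and the check passes.
check_against_reference <- function(result, ref_file, tolerance = 1e-8) {
  if (!file.exists(ref_file)) {
    saveRDS(result, ref_file)
    return(TRUE)
  }
  reference <- readRDS(ref_file)
  isTRUE(all.equal(result, reference, tolerance = tolerance))
}

# Assumed location for the reference values; a real project would keep
# this file under version control rather than in a temporary directory.
ref_file <- file.path(tempdir(), "fit_summary_reference.rds")

result <- fit_summary(cars)
ok <- check_against_reference(result, ref_file)
```

If `ok` becomes FALSE after an upgrade, the computed numbers have changed beyond the tolerance. The tolerance value here is an assumption and would need tuning for a given analysis.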

We also encourage our analysts to include a "sessionInfo()" command in
an appendix to each report so we are certain to document what versions
of packages were used.
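
As an illustration, assuming a plain-text record is acceptable (the output file name is hypothetical), the session details can be captured as below. In a Sweave document, `toLatex(sessionInfo())` inside a chunk with `results=tex` gives a LaTeX-formatted version instead.

```r
# Capture the R version, platform, and loaded package versions.
info <- sessionInfo()
info_lines <- capture.output(print(info))

# Write the record alongside the report; the file name is an assumption.
out_file <- file.path(tempdir(), "sessionInfo.txt")
writeLines(info_lines, out_file)

# LaTeX-formatted lines, suitable for a Sweave appendix.
info_tex <- toLatex(info)
```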

I suspect that the sort of validation you want will have to rely on an
extensive regression test suite to make certain that the things you
need remain stable from one release to another. That, and you'll have to
be slow about upgrading (which may mean forgoing support from the mailing
lists, where a common refrain in response to bug reports is that "you
aren't using the latest and greatest version", without an appreciation
of the fact that there can be good reasons for not changing something
that you know works....).

Best,
   Kevin

Nicholas Lewin-Koh wrote:

Hi,
Kudos, nice exposure, but to make this more appropriate to R-devel I
would just like to make a small comment about the point made by the SAS
executive about getting on an airplane, yada yada ...

1) It would seem to me that R has certification documents
2) anyone designing airplanes, analyzing clinical trials, etc. had
  better be worried about a lot more than whether their software is
  proprietary.

So from that point of view it would seem that R has made great strides
over the last 5 years, especially in establishing a role for open source
software solutions in regulated/commercial environments. The question
now is how to meld the archaic notions of validation and verification
seen in industry with the very different model of open source
development. The issue is not the correctness of the software, in which
I think R is competitive, but how to deal with the rapid release cycles
of R and the contributed packages.

We pull our hair out in pharma trying to figure out how we would ever
reconcile CRAN and validation requirements. I have no brilliant
solution, just food for thought.

Nicholas
------------------------------

Message: 5
Date: Thu, 8 Jan 2009 13:02:55 +0000 (GMT)
From: Prof Brian Ripley <[email protected]>
Subject: Re: [Rd] NY Times article
To: Anand Patil <[email protected]>
Cc: [email protected]
Message-ID: <[email protected]>
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed

It has been all over R-help, in several threads.

https://stat.ethz.ch/pipermail/r-help/2009-January/184119.html
https://stat.ethz.ch/pipermail/r-help/2009-January/184170.html
https://stat.ethz.ch/pipermail/r-help/2009-January/184209.html
https://stat.ethz.ch/pipermail/r-help/2009-January/184232.html
https://stat.ethz.ch/pipermail/r-help/2009-January/184237.html

and more

On Thu, 8 Jan 2009, Anand Patil wrote:

Sorry if this is spam, but I couldn't see it having popped up on the list
yet.
http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?emc=eta1

Anand


______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

--
Brian D. Ripley,                  [email protected]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

