Re: [Rd] NY Times article

2009-01-10 Thread Nicholas Lewin-Koh
Hi,
Unfortunately, one of the cornerstones of the validation paradigm
in the clinical world (as I understand it) is that
consistency/repeatability, documentation (of the program's consistency
and maintenance), and adherence to regulatory requirements take
precedence, ironically, even over correctness. So you still get
ridiculous requirements from regulatory bodies, like Type III sums of
squares, last observation carried forward (LOCF), etc. These edifices
are very hard to change; I know of people who have worked their whole
careers just to get the FDA to allow other treatments of missing data.

So what does this have to do with R? It comes down to the point
you made below about the R development cycle: bug fixes are incorporated
into new releases, and old versions are not supported. I think this has
been rehashed many times and is not likely to change. So how do we
move R into the clinic? From a practical perspective, all the
development and interoperability features of R are very nice, but how do
we maintain things so that the tool or method does not change when the
underlying R platform does? And how do we manage this in a
cost-effective way, so that it can't be argued that it is cheaper to
pay for SAS?

These are not necessarily questions that R core has to answer,
as the burden of proof of validation is really in the hands of the 
company/organization doing the submission. We just like to pretend that
the large price we pay for our SAS support means we can shift liability
:)

Rambling again,
Nicholas


On Fri, 09 Jan 2009 17:07:31 -0600, "Kevin R. Coombes"
 said:
> Hi Nicholas,
> 
> You raise a very good point. As an R user (who develops a couple of 
> packages for our own local use), I sometimes find myself cringing in 
> anticipation of a new R (or BioConductor) release. In my perception 
> (which is almost certainly exaggerated, but that's why I emphasize that 
> it is only an opinion), clever theoretical arguments in favor of 
> structural changes have a tendency to outweigh practical considerations 
> of backwards compatibility.
> 
> One of my own interests is in "reproducible research", and I've been 
> pushing hard here at M.D. Anderson to get people to use Sweave to 
> enhance the reproducibility of their own analyses. But, more often than 
> I would like, I find that reports written in Sweave do not survive the 
> transition from one version of R to the next, because either the core 
> implementation or one of the packages they depend on has changed in some 
> small but meaningful way.
> 
> For our own packages, we have been adding extensive regression testing 
> to ensure that the same numbers come out of various computations, in 
> order to see the effects of either the changes that we make or the 
> changes in the packages we depend on.  But doing this in a nontrivial 
> way with real data leads to test suites that take a long time to run, 
> and so cannot be incorporated in the nightly builds used by CRAN.
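> 
> As a minimal sketch of the idea (the file name and the choice of model
> are illustrative, not our actual suite), such a test records results
> from a trusted run and compares every later run against them:
> 
> ```r
> ## Record reference results on the first (trusted) run; on later runs,
> ## under a new R or package version, require agreement with them.
> new <- coef(lm(dist ~ speed, data = cars))
> if (!file.exists("reference.RData")) {
>   reference <- new
>   save(reference, file = "reference.RData")  # first trusted run
> } else {
>   load("reference.RData")                    # restores 'reference'
>   stopifnot(all.equal(new, reference))       # fails loudly on any drift
> }
> ```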
> 
> We also encourage our analysts to include a "sessionInfo()" command in 
> an appendix to each report so we are certain to document what versions 
> of packages were used.
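> 
> In Sweave terms, that appendix amounts to a short chunk (a minimal
> sketch):
> 
> ```latex
> \section*{Appendix: Session information}
> <<sessionInfo, results=tex, echo=FALSE>>=
> toLatex(sessionInfo())
> @
> ```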
> 
> I suspect that the sort of validation you want will have to rely on an 
> extensive regression test suite to make certain that the things you need 
> remain stable from one release to another. That, and you'll have to be 
> slow about upgrading (which may mean forgoing support from the mailing 
> lists, where a common refrain in response to bug reports is that "you 
> aren't using the latest and greatest version", without an appreciation 
> of the fact that there can be good reasons for not changing something 
> that you know works).
> 
> Best,
>   Kevin
> 
> Nicholas Lewin-Koh wrote:
> > Hi,
> > Kudos, nice exposure, but to make this more appropriate to R-devel I
> > would just like to make a small comment about the point made by the
> > SAS executive about getting on an airplane, yada yada ...
> > 
> > 1) It would seem to me that R has certification documents.
> > 2) Anyone designing airplanes, analyzing clinical trials, etc. had
> >    better be worried about a lot more than whether their software is
> >    proprietary.
> > 
> > So from that point of view, it would seem that R has made great strides
> > over the last 5 years, especially in establishing a role for open-source
> > software solutions in regulated/commercial environments. The question
> > now is how to meld the archaic notions of validation and verification
> > seen in industry with the very different model of open-source
> > development. Rather than the correctness of the software, in which I
> > think R is competitive, the issue is how to deal with the rapid release
> > cycles of R and of the contributed packages. We pull our hair out in
> > pharma trying to figure out how we would ever reconcile CRAN and
> > validation requirements. I have no brilliant solution, just food for
> > thought.
> > 
> > Nicholas
> >  

Re: [Rd] NY Times article

2009-01-10 Thread Jb

Hi all,
One relatively easy solution is to include R and all relevant versions
of the packages used in (a) a reproduction archive and/or (b) a packaged
virtual machine. With storage space cheap and the sources of both R and
packages available (plus easy, free, cross-platform virtual machine
solutions and Linux), one can distribute not only one's own code and
data but also everything required to redo the analyses, down to the OS.
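
As a concrete sketch (the tarball names here are placeholders, not from
an actual project), such an archive can be as simple as a directory
holding the sources plus a version manifest:

```shell
# Build a reproduction archive (illustrative layout; the commented-out
# copies stand in for the exact R and package sources actually used).
mkdir -p repro-archive/src repro-archive/pkgs repro-archive/analysis
# cp R-2.8.1.tar.gz       repro-archive/src/    # exact R source used
# cp mypackage_1.0.tar.gz repro-archive/pkgs/   # exact package versions
printf 'R 2.8.1\nmypackage 1.0\n' > repro-archive/VERSIONS
tar czf repro-archive.tar.gz repro-archive
```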


So far in our own work we've just included the relevant package
versions, but we will probably start including R as well for future
projects.


Hope this brainstorm helps (and credit to Ben Hansen and Mark  
Fredrickson for these ideas).


Jake

Jake Bowers
http://jakebowers.org

On Jan 10, 2009, at 1:19 PM, "Nicholas Lewin-Koh"   
wrote:



[quoted messages trimmed]

[Rd] code validation (was Re: NY Times article)

2009-01-10 Thread Spencer Graves
Hi, All: 

 What support exists for 'regression testing' 
(http://en.wikipedia.org/wiki/Regression_testing) of R code, e.g., as 
part of the "R CMD check" process? 

 The "RUnit" package supports "unit testing" 
(http://en.wikipedia.org/wiki/Unit_testing). 
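
 For illustration (I have not used the package myself, so treat the 
details as a sketch), an RUnit-style check looks roughly like the 
following; the check* functions are from the RUnit package: 

```r
## Illustrative RUnit-style unit test (assumes the RUnit package is
## installed; checkEquals/checkTrue are its basic assertion functions).
library(RUnit)
test.crossprod <- function() {
  x <- matrix(1:6, 2, 3)
  checkEquals(t(x) %*% x, crossprod(x))  # same object, two ways
  checkTrue(is.matrix(crossprod(x)))
}
test.crossprod()   # signals an error if any check fails
```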

 Those concerned about the software quality of code they use regularly 
could easily develop their own "softwareChecks" package that runs unit 
tests in its "\examples" sections.  Then, each time a new version of the 
package and/or R is downloaded, you can run "R CMD check" on your 
"softwareChecks":  if it passes, you know that it passed all your checks. 



 I have not used "RUnit", but I've done similar things by computing 
the same object two ways and then doing "stopifnot(all.equal(obj1, 
obj2))".  I think the value of the help page is enhanced by showing the 
"all.equal" but not the "stopifnot".  I achieve this using "\dontshow", 
as follows: 



  obj1 <- ...
  obj2 <- ...
  \dontshow{stopifnot(}
  all.equal(obj1, obj2)
  \dontshow{)}



 Examples of this are contained, for example, in "fRegress.Rd" in 
the current "fda" package available from CRAN or R-Forge. 



 Best Wishes,
 Spencer

Jb wrote:

[quoted messages trimmed]

[Rd] Problem with compiling shared C/C++ library for loading into R (Linux)

2009-01-10 Thread Samsiddhi Bhattacharjee
Dear all,

I am using the .Call interface to call C++ code from R. For that, I am
trying to create a dynamic library (mylib.so) with "R CMD SHLIB",
linking my own C++ code against an external C++ library (blitz++).

The makefile works fine on my Mac and produces mylib.so, and I am able
to use .Call() from R, but on a Linux server (I think Debian) I got the
following error:

--
/usr/bin/ld: /somepath/blitz/lib/libblitz.a(globals.o):
relocation R_X86_64_32 against `a local symbol' can not be used when
making a shared object; recompile with -fPIC
/somepath/blitz/lib/libblitz.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
--

I tried recompiling the blitz++ library with the -fPIC flag and then
linking with -fPIC; that went through without error, producing a
"mylib.so" file.  But when I tried dyn.load("mylib.so") from R, I got
the error:

--
Error in dyn.load("mylib.so") :
 unable to load shared library '/somepath/mylib.so':
 /somepath/mylib.so: undefined symbol: _ZSt4cerr
-

I would really appreciate any help or advice on this problem.

--Samsiddhi Bhattacharjee

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel