[Rd] Replacing the Random Number Generator in Stand Alone Library

2013-10-10 Thread Nigel Delaney
Hi R-Developers,

I had a question about the random number generator used in the R StandAlone
Math Library.  The stand-alone library depends on the unif_rand() function
for most simulated values, and this function is provided in the sunif.c file
in the relevant directory.  At present, this program implements the
"Marsaglia-Multicarry" algorithm, which is described throughout the R
documentation as:

 "A multiply-with-carry RNG is used, as recommended by George Marsaglia in
his post to the mailing list 'sci.stat.math'. It has a period of more than
2^60 and has passed all tests (according to Marsaglia). The seed is two
integers (all values allowed)."

However, I do not think this RNG actually passes all tests.   For example,
the Handbook of Computational Econometrics (illegal web copy at link below),
shows that it fails the mtuple test and gives an explicit example where it
leads to problems because it failed this test.  The mtuple test was
introduced by Marsaglia in 1985, and I gather he wrote his mailing list
comment that it "passes all tests" sometime after this, so I am not sure
what explains this distinction (though I am not sure if the mtuple test is
included in the diehard tests, which he may have been what he was referring
to).  However, there are clearly some areas where this PRNG runs in to
trouble (although the books example is better, another problem is that it
can't seem to simulate a value above (1/2)^1+(1/4)^4) after simulating a
value below 1e-6.

Given that the Mersenne Twister seems to be the standard for simulation
these days (and used as the default in R), it seems like it might be useful
to change the stand alone library so it also uses this routine.  I gather
this would be pretty easy to do by pulling this function from the RNG.c file
and moving it into the sunif.c file, and have a prototype of this.

However, I thought I would ask, is there a reason this hasn't been done?  Or
is it just a historical carry-over (pun intended I suppose).

Warm wishes,
Nigel


Research Fellow
Massachusetts General Hospital / Broad Institute


Book link:
http://thenigerianprofessionalaccountant.files.wordpress.com/2013/04/handboo
k-of-computational-econometrics-belsley.pdf

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] FW: Replacing the Random Number Generator in Stand Alone Library

2013-10-15 Thread Nigel Delaney
Okay, so I am guessing everyone had the same response I initially did when
hearing that this RNG might not be so hot*...  as an alternate question, if
I submitted a patch to replace the current RNG with the twister, would it be
accepted?

-N

-Original Message-
From: Nigel Delaney [mailto:nigel.dela...@outlook.com] 
Sent: Thursday, October 10, 2013 5:04 PM
To: r-devel@r-project.org
Subject: Replacing the Random Number Generator in Stand Alone Library

Hi R-Developers,

I had a question about the random number generator used in the R StandAlone
Math Library.  The stand-alone library depends on the unif_rand() function
for most simulated values, and this function is provided in the sunif.c file
in the relevant directory.  At present, this program implements the
"Marsaglia-Multicarry" algorithm, which is described throughout the R
documentation as:

 "A multiply-with-carry RNG is used, as recommended by George Marsaglia in
his post to the mailing list 'sci.stat.math'. It has a period of more than
2^60 and has passed all tests (according to Marsaglia). The seed is two
integers (all values allowed)."

However, I do not think this RNG actually passes all tests.   For example,
the Handbook of Computational Econometrics (illegal web copy at link below),
shows that it fails the mtuple test and gives an explicit example where it
leads to problems because it failed this test.  The mtuple test was
introduced by Marsaglia in 1985, and I gather he wrote his mailing list
comment that it "passes all tests" sometime after this, so I am not sure
what explains this distinction (though I am not sure if the mtuple test is
included in the diehard tests, which he may have been what he was referring
to).  However, there are clearly some areas where this PRNG runs in to
trouble (although the books example is better, another problem is that it
can't seem to simulate a value above (1/2)^1+(1/4)^4) after simulating a
value below 1e-6.

Given that the Mersenne Twister seems to be the standard for simulation
these days (and used as the default in R), it seems like it might be useful
to change the stand alone library so it also uses this routine.  I gather
this would be pretty easy to do by pulling this function from the RNG.c file
and moving it into the sunif.c file, and have a prototype of this.

However, I thought I would ask, is there a reason this hasn't been done?  Or
is it just a historical carry-over (pun intended I suppose).

Warm wishes,
Nigel


Research Fellow
Massachusetts General Hospital / Broad Institute


Book link:
http://thenigerianprofessionalaccountant.files.wordpress.com/2013/04/handboo
k-of-computational-econometrics-belsley.pdf

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Embedded R Fails When Run on LSF Queue System

2013-11-10 Thread Nigel Delaney
Hi,

I had a quick question I was hoping someone might be able to answer, as my 
journey through the source code so far has not been very profitable. I have a 
command line program that embeds R, and the program works just fine when run 
from the command line. However, when I run the program as a job in the LSF 
cluster at my institute, the following error occurs:

Fatal error: you must specify '--save', '--no-save' or '--vanilla'

This is confusing becasue I believe I have set this by changing the startup 
parameters so that the no-save option is in use. I initialize R with the 
following collection of function calls:

1-Rf_initialize_R
2-R_SetParams
3-setup_Rmainloop

At step 2 I pass in a structure that specifies the no-save option (as well as 
interactive=true, verbose=false), so I would expect it to be okay, but I still 
receive this error. Again, not at the terminal but only when run as a separate 
process.

I gathered from this thread that someone else has experienced this 
(https://stat.ethz.ch/pipermail/r-help/2008-March/155935.html), but am not sure 
how best to rectify it as I believe I have already changed the setting.

Does anyone know what a good solution or work around might be?

Thanks for any help,
Nigel 
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Embedded R Fails When Run on LSF Queue System

2013-11-11 Thread Nigel Delaney
Thanks Dirk.  I'll confess I am mostly trying to follow the practices
outlined in the writing R-extensions manual and books by Chambers and
Gentleman, but  have found some conflicting advice on the internet in
various places, and am also trying to build off a previously existing code
base (that at least worked on some
operating-systems/architectures/scenarios, but seems to break when I move
the program around to any new type of environment).  Will definitely
re-write that bit.

Sincere thanks again.  Out of curiosity, can I ask one more question since I
gather you have much more expertise than myself?  The extension manual has a
somewhat obscure, to me at least, section on threading issues (8.1.5).  My
understanding is that because the R_CStackStart is a single value, if R is
called from multiple threads (even if never simultaneously), it is
essentially best to just disable the stack check rather than try to re-write
this value each time, which I gather might need to be done (and simply
increasing the stack size of the calling thread will not help).  Is this the
approach you have taken in your projects?  Or do you think this is also a
bad idea?

-Original Message-
From: Dirk Eddelbuettel [mailto:e...@debian.org] 
Sent: Sunday, November 10, 2013 8:50 PM
To: Nigel Delaney
Cc: r-devel@r-project.org
Subject: Re: [Rd] Embedded R Fails When Run on LSF Queue System


On 10 November 2013 at 16:32, Nigel Delaney wrote:
| I had a quick question I was hoping someone might be able to answer, as my
journey through the source code so far has not been very profitable. I have
a command line program that embeds R, and the program works just fine when
run from the command line. However, when I run the program as a job in the
LSF cluster at my institute, the following error occurs:
| 
| Fatal error: you must specify '--save', '--no-save' or '--vanilla'
| 
| This is confusing becasue I believe I have set this by changing the
startup parameters so that the no-save option is in use. I initialize R with
the following collection of function calls:
| 
| 1-Rf_initialize_R
| 2-R_SetParams
| 3-setup_Rmainloop
| 
| At step 2 I pass in a structure that specifies the no-save option (as well
as interactive=true, verbose=false), so I would expect it to be okay, but I
still receive this error. Again, not at the terminal but only when run as a
separate process.
| 
| I gathered from this thread that someone else has experienced this
(https://stat.ethz.ch/pipermail/r-help/2008-March/155935.html), but am not
sure how best to rectify it as I believe I have already changed the setting.
| 
| Does anyone know what a good solution or work around might be?

As a first approximation, when the manual tells you do something a certain
way, you are in fact better off doing it that way.

I stand behind two projects embedding R: littler (with Jeff Horner), and
RInside (with Romain Francois).  Both do this explicitly.  From littler.c:

char *R_argv[] = {(char*)programName, "--gui=none", "--no-restore",
"--no-save", "--no-readline", "--silent", "", ""};
char *R_argv_opt[] = {"--vanilla", "--slave"};
int R_argc = (sizeof(R_argv) - sizeof(R_argv_opt) ) /
sizeof(R_argv[0]);

[...]
/* some logic to add R_argv_opt parts to R_argv omitted (/
[...]

Rf_initEmbeddedR(R_argc, R_argv);   /* Initialize the embedded R
interpreter */

Hope this helps,  Dirk

--
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel