Re: [Rd] set.seed() in a package

Marvin Wright Wed, 06 Nov 2019 01:09:27 -0800

Thanks for your answers. I agree with Duncan and Henrik that we should have 
such a policy, even if we cannot test it.


> On 30. Oct 2019, at 17:59, Henrik Bengtsson <henrik.bengts...@gmail.com> 
> wrote:
> 
> Forgot to say: For,
> 
>  oseed <- base::getRandomSeed()
>  on.exit(base::setRandomSeed(oseed))
> 
> one could upgrade set.seed() to take this role, e.g.
> 
>  oseed <- set.seed(0xBEEF)
>  on.exit(set.seed(oseed))
> 
> Current, set.seed() always return NULL.
> 
> BTW, and my memory might be bad, I think I mentioned this in the past
> but was told that you cannot reset the RNG state for all types of RNG
> kinds.  That might complicate things, but on the other hand, that
> could be checked for at run-time by the above functions.
> 
> /Henrik
> 
> On Wed, Oct 30, 2019 at 9:50 AM Henrik Bengtsson
> <henrik.bengts...@gmail.com> wrote:
>> 
>>> On 30/10/2019 9:08 a.m., peter dalgaard wrote:
>>>> You can fairly easily work around that by saving and restoring 
>>>> .Random.seed.
>> 
>> This is actually quite tedious to get correct; it requires you to
>> under how and when .Random.seed is set, and what are valid values on
>> .Random.seed.   For instance, a common mistake (me too) is to reset to
>> .GlobalEnv$.Random.seed <- NULL in a fresh R session but this will
>> produce a warning on: ".Random.seed' is not an integer vector but of
>> type 'NULL', so ignored".  You end up having to do things such as:
>> 
>>  oseed <- .GlobalEnv$.Random.seed
>>  on.exit({
>>    if (is.null(oseed)) {
>>      rm(list=".Random.seed", envir= .GlobalEnv)
>>    } else {
>>      assign(".Random.seed", value=oseed, envir= .GlobalEnv)
>>    }
>>  })
>> 
>> to avoid that warning.  So, having support functions for this in base
>> R would be helpful, e.g.
>> 
>>  oseed <- base::getRandomSeed()
>>  on.exit(base::setRandomSeed(oseed))
>> 
>> Back to Marvin's point/question.  I think it would be useful if the
>> CRAN Policies would explicitly say that functions must not change the
>> random seed to a fixed one, without undoing it, unless the user
>> specifies it via an argument.  If not, there's a great risk it will
>> mess up statistical analysis.  Also, if there would a way to test
>> against such practices, which I think is really hard, I would be the
>> first one backing it up.
>> 
>> On a related note; there are packages that forward the .Random.seed
>> when loaded.  This is also an unfortunate behavior because you will
>> give different RNGs depending on that package was already loaded
>> before you called a function that depends on it or not.  For example,
>> if pkgA forwards the .Random.seed when loaded, the following is *not*
>> reproducible:
>> 
>> # User might or might not have loaded pkgA already
>> if (runif(1) < 0.5) loadNamespace(pkgA)
>> 
>> set.seed(0xBEEF)
>> loadNamespace(pkgA)
>> y <- runif(1)
>> 
>> I ran into this problem when doing some strict testing.  I argue this
>> also falls into the set of "bad" practices to be avoided.
>> 
>> /Henrik
>> 
>> On Wed, Oct 30, 2019 at 6:23 AM Duncan Murdoch <murdoch.dun...@gmail.com> 
>> wrote:
>>> 
>>> On 30/10/2019 9:08 a.m., peter dalgaard wrote:
>>>> We commit a similar sin in the help pages, e.g.
>>>> 
>>>> example(set.seed) ; runif(2)
>>>> example(set.seed) ; runif(2)
>>>> 
>>>> gives you the same random uniforms both times. (Of course it isn't that 
>>>> much of an issue, since you would rarely be running examples before any 
>>>> serious simulations.)
>>> 
>>> I think it's pretty common in example code, and that's justifiable.  But
>>> it could be avoided by using withr::with_seed() or something equivalent.
>>> 
>>> Duncan Murdoch
>>> 
>>>> 
>>>> You can fairly easily work around that by saving and restoring 
>>>> .Random.seed. I wonder if that isn't also true of the cases using 
>>>> set.seed() for other reasons?
>>>> 
>>>> -pd
>>>> 
>>>> 
>>>>> On 30 Oct 2019, at 13:46 , Duncan Murdoch <murdoch.dun...@gmail.com> 
>>>>> wrote:
>>>>> 
>>>>> On 30/10/2019 3:28 a.m., Marvin Wright wrote:
>>>>>> Hi all,
>>>>>> I recently found several calls of set.seed() in a CRAN package. These 
>>>>>> calls are in a plot function, which could lead to unexpected behaviour. 
>>>>>> See https://github.com/sammo3182/interplot/issues/33 
>>>>>> <https://github.com/sammo3182/interplot/issues/33> for a description of 
>>>>>> the problem.
>>>>>> I checked the CRAN repository policies and could not find anything about 
>>>>>> this. I would have expected a policy against setting fixed seeds 
>>>>>> somewhere in a package. Am I missing something?
>>>>> 
>>>>> set.seed() writes .Random.seed in the user's global environment, which 
>>>>> violates this policy:
>>>>> 
>>>>> - Packages should not modify the global environment (user’s workspace).
>>>>> 
>>>>> However, every call to a random number generator creates or modifies 
>>>>> .Random.seed as well, and most of those are expected and shouldn't be 
>>>>> flagged.  And interplot() is documented to do random simulations, so it 
>>>>> would be expected to change the seed:  the issue is that given the same 
>>>>> inputs it always changes it to the same thing.  I think that would be 
>>>>> quite hard for a test to detect.
>>>>> 
>>>>> Should it be a policy with no test?  Maybe, because I agree with you that 
>>>>> interplot()'s set.seed(324) is bad practice.
>>>>> 
>>>>> Duncan Murdoch
>>>>> 
>>>>> ______________________________________________
>>>>> R-devel@r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>> 
>>> 
>>> ______________________________________________
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] set.seed() in a package

Reply via email to