Duncan, Daniel,

Thanks, and indeed we intend to take the advice that Radford and Lukas have provided in this thread.

I do want to reiterate that the generating system itself cannot have any conception of the use of form IDs as seeds for a PRNG, *and* the system only generates a sequence of form IDs, which are then filtered and passed to our API depending on basic rules applied to the user inputs in that form.

Either a truly remarkable probability event has happened in our production system, or the Mersenne-Twister is very susceptible to the first draw in the sequence being correlated across closely related seeds. Both of these require understanding the Mersenne-Twister better.

The solution here, as has been suggested, is to use a different RNG with adequate burn-in (in which case even MT would work), or to look more carefully at our problem and understand whether we just need a hash function.

In either case, we will cease to question R's implementation of Mersenne-Twister (for the time being). :)

T

On Sun, Nov 5, 2017 at 7:47 PM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

> On 04/11/2017 10:20 PM, Daniel Nordlund wrote:
>
>> Tirthankar,
>>
>> "random number generators" do not produce random numbers.  Any given
>> generator produces a fixed sequence of numbers that appear to meet
>> various tests of randomness.  By picking a seed you enter that sequence
>> in a particular place and subsequent numbers in the sequence appear to
>> be unrelated.  There are no guarantees that if YOU pick a SET of seeds
>> they won't produce a set of values that are of a similar magnitude.
>>
>> You can likely solve your problem by following Radford Neal's advice of
>> not using the first number from each seed.  However, you don't need
>> to use anything more than the second number.  So, you can modify your
>> function as follows:
>>
>> function(x) {
>>    set.seed(x, kind = "default")
>>    y = runif(2, 17, 26)
>>    return(y[2])
>> }
>>
>> Hope this is helpful,
>>
>
> That's assuming that the chosen seeds are unrelated to the function
> output, which seems unlikely on the face of it.  You can certainly choose a
> set of seeds that give high values on the second draw just as easily as you
> can choose seeds that give high draws on the first draw.
>
> The interesting thing about this problem is that Tirthankar doesn't
> believe that the seed selection process is aware of the function output.  I
> would say that it must be, and he should be investigating how that happens
> if he is worried about the output; he shouldn't be worrying about R's RNG.
>
> Duncan Murdoch
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel