>>>>> "PaulG" == Paul Gilbert <[EMAIL PROTECTED]>
>>>>>     on Thu, 15 Sep 2005 12:07:48 -0400 writes:

PaulG> BTW, I think there is a problem with the way the
PaulG> argument "seed" is used in the new simulate in stats.
PaulG> The problem is that repeated calls to simulate using
PaulG> the default argument will introduce a new pattern
PaulG> into the RNG:

>> stats:::simulate
PaulG> function (object, nsim = 1, seed = as.integer(runif(1, 0,
PaulG>     .Machine$integer.max)), ...)
PaulG> UseMethod("simulate")
PaulG> <environment: namespace:stats>

>> stats:::simulate.lm
PaulG> function (object, nsim = 1, seed = as.integer(runif(1, 0,
PaulG>     .Machine$integer.max)), ...)
PaulG> {
PaulG>     if (!exists(".Random.seed", envir = .GlobalEnv))
PaulG>         runif(1)
PaulG>     RNGstate <- .Random.seed
PaulG>     set.seed(seed)
PaulG>     ...

PaulG> This should not be done, as the resulting RNG has not
PaulG> been studied or proven. A better mechanism is to have
PaulG> a default argument equal NULL, and not touch the seed
PaulG> in that case.

I agree so far.
I think the default seed should really be NULL
(with your semantic!) rather than a random number.

PaulG> There are several examples of this in
PaulG> the package dse1 (in bundle dse), see for example
PaulG> simulate.ARMA and simulate.SS. They also use the
PaulG> utilities in the setRNG package to save more of the
PaulG> information necessary to reproduce
PaulG> simulations. Roughly it is done like this:

PaulG> simulate.x <- function (model, rng = NULL, ...)
PaulG> {if (is.null(rng)) rng <- setRNG()
PaulG>     ## returns the RNG setting to be saved with the result
PaulG>  else {
PaulG>     old.rng <- setRNG(rng)
PaulG>     on.exit(setRNG(old.rng))
PaulG>     }
PaulG>  ...

as nobody has further delved into this in the mean time,
this is definitely too late for R 2.2.0, even if it was desired.

But I also think you should be able to live with interpreting
'seed' as 'rng' if you want, shouldn't you?
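
The "default NULL, and do not touch the seed in that case" semantic
discussed above could look roughly like the sketch below. This is only
an illustration: simulate_sketch() and its argument n are hypothetical
names, it uses base R only (not the setRNG utilities), and it is not
the code in stats.

```r
# Hypothetical helper, not the stats implementation: draw nsim normal
# samples and touch the RNG state only when the caller supplies a seed.
simulate_sketch <- function(n, nsim = 1, seed = NULL) {
  if (!is.null(seed)) {
    # Make sure .Random.seed exists before saving it.
    if (!exists(".Random.seed", envir = .GlobalEnv, inherits = FALSE))
      runif(1)
    old_state <- get(".Random.seed", envir = .GlobalEnv)
    # Restore the caller's RNG stream when the function exits.
    on.exit(assign(".Random.seed", old_state, envir = .GlobalEnv))
    set.seed(seed)
  }
  # With seed = NULL the draws simply continue the current stream.
  replicate(nsim, rnorm(n), simplify = FALSE)
}

a <- simulate_sketch(5, seed = 42)
b <- simulate_sketch(5, seed = 42)
identical(a, b)   # same seed, same draws
```

With a seed the result is reproducible and the caller's stream is put
back afterwards; without one, repeated calls just continue the existing
stream, so no new pattern is introduced into the RNG.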

PaulG> The seed by itself is not very useful if the purpose
PaulG> is to be able to reproduce things, and I think it
PaulG> would be a good idea to incorporate the few small
PaulG> functions setRNG into stats (especially if the
PaulG> simulate mechanism is being introduced).

maybe we should reopen this topic
{adopting ideas or even exact implementations from your 'setRNG'
 into stats}
in a few weeks, when R 2.2.0 is released.

PaulG> The argument "nsim" presumably alleviates to some
PaulG> extent the above concern about changing the RNG
PaulG> pattern. However, in my fairly extensive experience
PaulG> it is not very workable to produce all the
PaulG> simulations and then do the analysis of them.
PaulG> In a Monte Carlo experiment the generated data set is just
PaulG> too big.

I believe this depends very much on the topic.
The simulate() uses that we had envisaged with simulate()
don't save all the models and then analyze them.
But maybe I'm misunderstanding your point completely here.

PaulG> A better approach is to do the analysis and save
PaulG> only necessary information after each
PaulG> simulation. That is the approach, for example, in
PaulG> dse2:::EstEval.

PaulG> Paul

PaulG> Paul Gilbert wrote:

>> Can the arguments nsim and seed be passed as part of ... in the new
>> simulate generic in R-2.2.0alpha package stats?

>> This would potentially allow me to use the stats generic rather than
>> the one I define in dse. There are contexts where nsim and seed do not
>> make sense.

Well, the current specification for simulate() has been
different explicitly.
I agree that there are situations where both 'nsim' and 'seed'
(or a generalization, say 'RNGstate') wouldn't make sense and
one still would like to use something like "simulate" in the
function name.

>> I realize that the default arguments could be ignored, but
>> it does not really make sense to introduce a new generic with that in
>> mind.
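
For concreteness, "ignoring the default arguments" might look like the
sketch below. Everything in it is an assumption made for illustration:
the generic is called simulate2() so that it does not mask
stats::simulate, its seed default is NULL rather than the random
default quoted above, and the class "detmodel" is a made-up
deterministic model for which nsim and seed have no meaning.

```r
# Hypothetical generic with the same argument names as the one quoted above.
simulate2 <- function(object, nsim = 1, seed = NULL, ...)
  UseMethod("simulate2")

# Hypothetical method: the path is fully determined by the model, so
# nsim and seed are accepted for signature compatibility and ignored.
simulate2.detmodel <- function(object, nsim = 1, seed = NULL, ...) {
  object$start * object$rate^(0:object$periods)
}

m <- structure(list(start = 100, rate = 1.05, periods = 3),
               class = "detmodel")
out1 <- simulate2(m)
out2 <- simulate2(m, nsim = 10, seed = 1)
identical(out1, out2)   # the extra arguments make no difference
```

The method dispatches fine, but a caller passing nsim = 10 gets no hint
that the argument was dropped, which is the kind of mismatch the
question above is worried about.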

I think it would depend on the exact context if I would rather
use a (slightly) different function name, or just ignore the
ignorable arguments as you mention.

>> (I would also prefer that the "object" argument was called
>> "model" but this is less important.)

I'd personally agree with that; the argument was that 'object' is
very generally used in such situations.

Martin Maechler

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel