On Fri, Jul 15, 2016 at 2:53 AM, Pavlyk, Oleksandr <oleksandr.pav...@intel.com> wrote:
>
> Hi Robert,
>
> Thank you for the pointers.
>
> I think numpy.random should have a mechanism to choose between methods for generating the underlying randomness dynamically, at run time, as well as an extensible framework where developers could add more methods. The default would be MT19937 for backwards compatibility. It is important to be able to do this at run time, as it would allow one to use different algorithms in different threads (like different members of the parallel Mersenne twister family of generators, see MT2203).
>
> The framework should allow randomness to be defined as a bit stream, a stream of fixed-size integers, or a stream of uniform reals (32 or 64 bits). This is a lot like MKL's abstract method for basic pseudo-random number generation.
>
> Each method should provide routines to sample from uniform distributions over reals (in floats and doubles), as well as over integers.
>
> All remaining non-uniform distributions build on top of these uniform streams.
ng-numpy-randomstate does all of these.

> I think it is pretty important to refactor numpy.random to allow the underlying generators to produce a given number of independent variates at a time. There could be convenience wrapper functions to get one variate at a time for backwards compatibility, but this change in design would allow for better efficiency, as sampling a vector of random variates at once is often faster than repeated sampling of one at a time due to set-up cost, vectorization, etc.

The underlying C implementation is an implementation detail, so the refactoring that you suggest has no backwards compatibility constraints.

> Finally, methods to sample a particular distribution should uniformly support a method keyword argument. Because method names vary from distribution to distribution, it should ideally be programmatically discoverable which methods are supported for a given distribution. For instance, the standard normal distribution could support method='Inversion', method='Box-Muller', method='Ziggurat', method='Box-Muller-Marsaglia' (the one used in numpy.random right now), as well as a bunch of non-named methods based on the transformed rejection method (see http://statistik.wu-wien.ac.at/anuran/ )

That is one of the items under discussion. I personally prefer that one simply exposes named methods for each different scheme (e.g. ziggurat_normal(), etc.).

> It would also be good if one could dynamically register a new method to sample from a non-uniform distribution. This would allow one, for instance, to automatically add methods to sample certain non-uniform distributions by directly calling into MKL (or another library), when available, instead of building them from uniforms (which may remain a fall-through method).
>
> The linked project is a good start, but the choice of the underlying algorithm needs to be made at run time,

That's what happens. You instantiate the RandomState class that you want.
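
To make the "instantiate the class that you want" pattern concrete, here is a minimal sketch in plain Python. It is not the ng-numpy-randomstate API: the names UniformSource, MT19937Source, LCGSource, GENERATORS and make_source are all hypothetical, and the LCG is a toy for illustration only. It only shows the shape of the design under discussion: each generator class supplies a uniform stream, non-uniform variates are built on top of it as named methods, and the generator is picked at run time.

```python
import math
import random


class UniformSource:
    """Hypothetical base class: subclasses supply next_double() in [0, 1)."""

    def next_double(self):
        raise NotImplementedError

    def doubles(self, n):
        # Batch interface: return n uniform variates in one call.
        return [self.next_double() for _ in range(n)]

    def box_muller_normal(self, n):
        # Named method for one sampling scheme, built only on the uniform stream.
        out = []
        while len(out) < n:
            u1 = 1.0 - self.next_double()  # shift to (0, 1] so log() is defined
            u2 = self.next_double()
            r = math.sqrt(-2.0 * math.log(u1))
            out.append(r * math.cos(2.0 * math.pi * u2))
            out.append(r * math.sin(2.0 * math.pi * u2))
        return out[:n]


class MT19937Source(UniformSource):
    """Backed by the standard library's Mersenne Twister."""

    def __init__(self, seed):
        self._rng = random.Random(seed)

    def next_double(self):
        return self._rng.random()


class LCGSource(UniformSource):
    """Toy 64-bit linear congruential generator (illustration only)."""

    def __init__(self, seed):
        self._state = seed % 2**64

    def next_double(self):
        self._state = (6364136223846793005 * self._state
                       + 1442695040888963407) % 2**64
        # Keep the top 53 bits so the result is a double in [0, 1).
        return (self._state >> 11) / 2**53


# Hypothetical registry: the generator is chosen by name at run time.
GENERATORS = {"mt19937": MT19937Source, "lcg64": LCGSource}


def make_source(name, seed):
    return GENERATORS[name](seed)
```

With this layout, each thread could call make_source() with a different name or seed, and new generators are added by registering another subclass in GENERATORS.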
> as far as I understood, and the only provided interface to query random variates is one at a time, just like it is currently the case in numpy.random.

--
Robert Kern
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion