Re: Picky, but much more efficient arc4random_uniform!

Steffen Nurpmeso Sat, 14 May 2022 15:47:15 -0700

Stuart Henderson wrote in
 <yoauevrpnfb6u...@bamboo.spacehopper.org>:
 |On 2022/05/14 06:56, Luke Small wrote:
 |> If I use arc4random_uniform() repeatedly to create a random distribution \
 |> of
 |> say numbers less than 0x1000 or even something weird like 0x1300 will the
 |> random distribution be better with arc4random_uniform() or with mine?
 |
 |there's no point to have a choice of different arc4random_uniform_xyz
 |with different characteristics, how is somebody going to pick the
 |"right" one?
 |
 |the bottom of netbsd's version of the arc4random(3) manual says it well:
 |
 |     One may be tempted to create new APIs to accommodate different \
 |     security
 |     models and performance constraints without unpleasant surprises \
 |     on dif-
 |     ferent operating systems.  This should not be done lightly, though,
 |     because there are already too many different choices, and too \
 |     many oppor-
 |     tunities for programmers to reach for one and pick the wrong one.
 |
 |what's the perceived problem you're wanting to solve? and does that
 |problem actually exist in the first place?


The problem is that if have a low upper bound then modulo will
"remove a lot of randomization".  For example if you have
a program which generates Lotto numbers (1..49), then using
_uniform() as it is will generate many duplicates.

I used the approach i found in a TeX file over twenty years ago
("RANDOM.TEX, v.1 (Donald Arseneau) from TeTeX,
texmf/tex/generic/misc/random.tex"; and now that i look, he
published a .v2 not too long ago), it was named _clip() as it
produces something in between a minimum and a maximum:

        _max -= _min;
        ++_max;
        _min = Maximum::sir - 2; // m - 2 = 2^(32|64) - 3
        if (_max != 0)
                _min /= _max;

        for(;;) {
                uir i;

                i = _random;
                i += M1::uir;
                if(_min != 0)
                        i /= _min;

                if(i >= _max && (i != 0 || _max != 0)) {
                        _random *= 16807; // just MUL
                        if(_random == 0)
                                _random = 1;
                        continue;
                }
                _random = i;
                break;
        }

        _random += origmin;
        _Assert(1 && _random >= origmin && _random <= origmax);

That is the random number is multiplied with a, eh, non-prime, as
long as we do not fit.
However i found this became a terrible mess with 64-bit numbers,
so i removed the function, aka did not "port it over" when
i implemented a random number generator for my own use cases.
Maybe i should have increased the multiplicator.

It is really, really bad.  :-)

I only ever used it for a Lotto number program anyhow, i think,
and so i thought it would be better to simply read a one byte
random and skip those that do not fit, instead of making them
always fit with modulo 50.  That is, application specific
workaround.  Things are less harmful for large numbers, anyway.
In fact i have to say i was astonished to see someone using this
function, not too long ago i saw a FreeBSD commit fly by which
uses it!

I was a bit interested when i saw Luke's message, but i am no
mathematician and, like i said, i in fact never needed the
functionality.  And the question i would have is how small
upper_bound has to be for the modulo problem to really kick in.
And even if, whether it matters.  For example, if you have
a bitset of 1024 "in-flight things", and the small upper bound
hits a used slot, you could simply search forward until you find
a free one.  I mean, with 1024 possible values the number of
possibilities is anyhow restricted.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)

Re: Picky, but much more efficient arc4random_uniform!

Reply via email to