Stuart Henderson wrote in <yoauevrpnfb6u...@bamboo.spacehopper.org>: |On 2022/05/14 06:56, Luke Small wrote: |> If I use arc4random_uniform() repeatedly to create a random distribution \ |> of |> say numbers less than 0x1000 or even something weird like 0x1300 will the |> random distribution be better with arc4random_uniform() or with mine? | |there's no point to have a choice of different arc4random_uniform_xyz |with different characteristics, how is somebody going to pick the |"right" one? | |the bottom of netbsd's version of the arc4random(3) manual says it well: | | One may be tempted to create new APIs to accommodate different \ | security | models and performance constraints without unpleasant surprises \ | on dif- | ferent operating systems. This should not be done lightly, though, | because there are already too many different choices, and too \ | many oppor- | tunities for programmers to reach for one and pick the wrong one. | |what's the perceived problem you're wanting to solve? and does that |problem actually exist in the first place?
The problem is that if have a low upper bound then modulo will "remove a lot of randomization". For example if you have a program which generates Lotto numbers (1..49), then using _uniform() as it is will generate many duplicates. I used the approach i found in a TeX file over twenty years ago ("RANDOM.TEX, v.1 (Donald Arseneau) from TeTeX, texmf/tex/generic/misc/random.tex"; and now that i look, he published a .v2 not too long ago), it was named _clip() as it produces something in between a minimum and a maximum: _max -= _min; ++_max; _min = Maximum::sir - 2; // m - 2 = 2^(32|64) - 3 if (_max != 0) _min /= _max; for(;;) { uir i; i = _random; i += M1::uir; if(_min != 0) i /= _min; if(i >= _max && (i != 0 || _max != 0)) { _random *= 16807; // just MUL if(_random == 0) _random = 1; continue; } _random = i; break; } _random += origmin; _Assert(1 && _random >= origmin && _random <= origmax); That is the random number is multiplied with a, eh, non-prime, as long as we do not fit. However i found this became a terrible mess with 64-bit numbers, so i removed the function, aka did not "port it over" when i implemented a random number generator for my own use cases. Maybe i should have increased the multiplicator. It is really, really bad. :-) I only ever used it for a Lotto number program anyhow, i think, and so i thought it would be better to simply read a one byte random and skip those that do not fit, instead of making them always fit with modulo 50. That is, application specific workaround. Things are less harmful for large numbers, anyway. In fact i have to say i was astonished to see someone using this function, not too long ago i saw a FreeBSD commit fly by which uses it! I was a bit interested when i saw Luke's message, but i am no mathematician and, like i said, i in fact never needed the functionality. And the question i would have is how small upper_bound has to be for the modulo problem to really kick in. And even if, whether it matters. For example, if you have a bitset of 1024 "in-flight things", and the small upper bound hits a used slot, you could simply search forward until you find a free one. I mean, with 1024 possible values the number of possibilities is anyhow restricted. --steffen | |Der Kragenbaer, The moon bear, |der holt sich munter he cheerfully and one by one |einen nach dem anderen runter wa.ks himself off |(By Robert Gernhardt)