https://marc.info/?l=openbsd-tech&m=165259528425835&w=2
This one (which is strongly based upon my first of two versions) which I submitted after Guenther correctly trashed version 2, doesn’t reuse any part of the sample. It picks up a clean new bitfield upon failure. I think Guenther didn’t, perhaps like yourself, realize I submitted this later program. That’s why he said it wasn’t correct. It didn’t occur to me at the time of responding to him: “correct correct correct.” On Sun, May 15, 2022 at 7:47 PM Damien Miller <d...@mindrot.org> wrote: > On Sat, 14 May 2022, Luke Small wrote: > > > Look at my code. I don’t even use a modulus operator. I perform hit and > > miss with a random bitstream. > > > > How can I have a bias of something I don’t do? I return a bitstream which > > meets the parameters of being a value less than the upper bound. Much > like > > arc4random_buf(). > > > > If I use arc4random_uniform() repeatedly to create a random distribution > of > > say numbers less than 0x1000 or even something weird like 0x1300 will the > > random distribution be better with arc4random_uniform() or with mine? For > > 0x1000 mine will simply pluck 12 bits of random data straight from the > > arc4random() (and preserve the remaining 20 bits for later) on the first > > try, just like it’s arc4random_buf(). > > > > arc4random_uniform() will perform a modulus of a 32 bit number which adds > > data to the bitstream. Does it make it better? Perhaps it makes it harder > > to guess the source bits. > > > > I don’t know; and I’m not going to pretend to be a cryptologist. But I’m > > looking at modulo bias. > > > > I didn’t know what it was, before, but I basically “rejection sample”: > > > > > https://research.kudelskisecurity.com/2020/07/28/the-definitive-guide-to-modulo-bias-and-how-to-avoid-it/ > > No, you aren't: > > > for (;;) { > > if (rand_bits < bits) { > > rand_holder |= ((uint64_t)arc4random()) << > > rand_bits; > > > > /* > > * rand_bits will be a number between 0 and 31 > here > > * so the 0x20 bit will be empty > > * rand_bits += 32; > > */ > > rand_bits |= 32; > > } > > > > ret = rand_holder & uuu; > > rand_holder >>= bits; > > rand_bits -= bits; > > > > if (ret < upper_bound) > > return ret; > > } > > This isn't rejection sampling. This is reusing part of the rejected > sample. > > Think of it like this: you want to uniformly generate a number in the > range [2:10] by rolling 2x 6-sided dice. What do you do when you roll > 11 or 12? You can't just reroll one of the dice because the other dice > is constrained to be have rolled either 5 or 6, and so proceeding with > it would force the output to be in the range [6:11] for these ~5.6% > of initial rolls. Your output is no longer uniform. > > BTW the existing code already implements the prefered approach of the > article you quoted. > > -d -- -Luke