On Wed, May 18, 2022 at 02:50:28PM +1000, Damien Miller wrote: > On Tue, 17 May 2022, Raimo Niskanen wrote: > > > Why reinvent the wheel? > > > > Here is a pretty good walkthrough of established methods: > > > > https://www.pcg-random.org/posts/bounded-rands.html > > > > It sounds to me as if your suggested methor essentially is > > "Bitmask with Rejection -- Apple's Method" with the added twist > > to keep the unused bits of the base generator's word and append > > new, which might be an optimization for small ranges, but might > > be just overhead for large ones. > > > > In that post one can see that there might be other suggested smaller > > optimizations that can be applied to the OpenBSD method, and that > > the bitmask method is effective in many cases but not a clear winner. > > Oh nice, I wasn't aware that Apple had an improved algorithm. I always > thought the best avenue for a speedup was to consider only the bits that > could satisfy the constraint, but it never occurred to me how to actually > make use of this :) > > The toy benchmark below compares the existing implementation with > reimplementations of both their version as well as something more close > to Apple's actual method (which reuses the high bits). > > However, I don't see a speedup for either of the alternate approaches. > > [djm@djm ~]$ cc -O2 -Werror -Wall -Wextra -o rb rb.c && doas nice -n -20 ./rb > openbsd; elapsed = 8.327954 > bitmask; elapsed = 13.306228 > bitmask+reuse; elapsed = 11.567389
instrumenting the code to count the number of arc4random calls I see thsi: openbsd; elapsed = 2.835819; calls = 12340949 bitmask; elapsed = 4.335576; calls = 17836216 bitmask+reuse; elapsed = 3.710277; calls = 15245337 (this is a different number of trials on a slow machine). So the promise of less calls is not fulfilled. Sounds like a bug. -Otto