Hello OpenBSD developers,

Let me say up front that I'm not entirely sure I'm sending this to the right mailing list; based on the descriptions, though, I believe tech@ is the correct one rather than misc@. I'm not subscribed to any of the lists, so please CC me if you reply. Finally, I sent this message before from another account, but it never appeared in any of the archives; if it did reach the list after all, I apologize. In the meantime I've written up some extra things that may be of interest.
I've been looking at how several (BSD) operating systems implement random numbers, in the context of an online election system (what to do with an equal number of votes per seat, and so on). My current implementation reads a byte from /dev/random and converts it to the required range, possibly reading more bytes to avoid the systematic bias a naive implementation has for ranges that are not a power of 2. It works correctly, but some considerations (transparency and chroot jail compatibility) have led me to look at alternative implementations.

This is where arc4random_uniform comes in. It avoids biased results differently from my implementation (mine uses the high-order bits instead), but the main idea is exactly the same: throw away results that fall outside the largest range that is a whole multiple of the desired range. It might be a viable alternative if I can convince myself (and others) that the values it generates do not favor one party over another, however slightly. Obviously the criteria are different from those applied to cryptography, but I don't consider myself competent to state them formally.

That is why I compared the function in lib/libc/crypt/arc4random.c to the one in sys/dev/rnd.c. I was surprised to find that the 32-bit arithmetic in the libc version differs from that in the kernel version; apparently, back in 2008, a change was applied only to the kernel, not to libc.

This is what the kernel source code says:

	min = ((0xffffffff - upper_bound) + 1) % upper_bound;

This is what the libc source code says:

	min = ((0xffffffff - (upper_bound * 2)) + 1) % upper_bound;

Now, I'm no math geek, but it seems pretty obvious to me that the kernel version is to be preferred: it is simpler and does exactly what it has to do. I assume that is also why the change was made, and that's what the commit log seems to say as well. So I wonder why the libc source code was not changed too.
Was this an oversight?

Secondly, the kernel expression has the added advantage that it works for every possible value of upper_bound (you can check this for yourself if you don't believe me). That means the upper_bound > 0x80000000 check can be removed, making the common case faster (upper_bound <= 0x80000000 is definitely the common case in my book) and the code cleaner and easier to understand. Personally I don't see any real drawbacks, but feel free to disagree. ;)

There are also other possible implementations that are even shorter, although not necessarily easier to understand. In summary, the part between #else and #endif could be replaced by any of these lines:

	min = ((0xffffffff - upper_bound) + 1) % upper_bound;
	min = (1 + ~upper_bound) % upper_bound;
	min = (0 - upper_bound) % upper_bound;
	min = -upper_bound % upper_bound;

The latter two work correctly because the C standard defines modular arithmetic for unsigned integers: replace 0 by 2**32 and you'll see why the expression does what it's supposed to. I'm not entirely sure the last one would work on a machine that doesn't use two's complement; the standard is vague in its wording. Yet another option would be to replace the entire #if block with this:

	min = (0x100000000 - upper_bound) % upper_bound;

This will continue to work correctly even if int ever becomes 64 bits wide.

Kind regards,
Jorden Verwer