This is version 1, which I was pretty sure was secure. I revamped it with a few features and implanted the binary search for 'bit'
in most cases, which aren't intentionally worst-case, it's pretty darned fast! This is a sample run of my program with your piece of code included: 1 99319 100239 2 100353 99584 3 100032 99879 4 99704 100229 5 99530 99796 6 99617 100355 7 100490 100319 8 99793 100114 9 100426 99744 10 100315 100558 11 99279 99766 12 99572 99750 13 99955 100017 14 100413 100005 15 100190 100052 16 101071 100195 17 100322 100224 18 99637 99540 19 100323 99251 20 99841 100177 21 99948 99504 22 100065 100031 23 100026 99827 24 99836 99818 25 100245 99822 26 100088 99678 27 99957 99993 28 100065 99961 29 100701 100679 30 99756 99587 31 100220 100076 32 100067 100005 33 99547 99984 34 100124 100031 35 99547 100661 36 99801 99963 37 100189 100230 38 99878 99579 39 99864 99442 40 99683 100004 41 99907 100094 42 100291 99817 43 99908 99984 44 100044 100606 45 100065 100120 46 99358 100141 47 100152 100442 48 100000 100279 49 100486 99848 newdata time = 0.240530757 seconds newdata_fast time = 0.073868626 seconds The original takes 325.620% of the runtime of newdata_fast newdataTypableFilename time = 0.236748989 seconds newdataTypableFilename_fast time = 0.115842333 seconds The original takes 204.372% of the runtime of newdataTypableFilename_fast rand_ideal time = 0.235832820 seconds rand_ideal_fast time = 0.025300798 seconds The original takes 932.116% of the runtime of rand_ideal_fast rand_short_ideal time = 0.236828684 seconds rand_short_ideal_fast time = 0.124025922 seconds The original takes 190.951% of the runtime of rand_short_ideal_fast rand_short_worst time = 0.237142869 seconds rand_short_worst_fast time = 0.294156486 seconds The original takes 80.618% of the runtime of rand_short_worst_fast rand_worst time = 0.237002775 seconds rand_worst_fast time = 0.377148420 seconds The original takes 62.841% of the runtime of rand_worst_fast shuffle time = 0.003044472 seconds shuffle_fast time = 0.002590664 seconds The original takes 117.517% of the runtime of shuffle_fast If it crashed here, you are trying to profile. Turn off pledge at the beginning of main(). On Sat, May 14, 2022 at 7:03 PM Philip Guenther <guent...@gmail.com> wrote: > On Sun, 15 May 2022, Steffen Nurpmeso wrote: > > Stuart Henderson wrote in > ... > > |what's the perceived problem you're wanting to solve? and does that > > |problem actually exist in the first place? > > > > The problem is that if have a low upper bound then modulo will "remove a > > lot of randomization". For example if you have a program which > > generates Lotto numbers (1..49), then using _uniform() as it is will > > generate many duplicates. > > Wut. The *WHOLE POINT* of arc4random_uniform() is that it has uniform > distribution. Says right so in the manpage. If an implementation of that > API fails to do that, it's a broken implementation. > > The OpenBSD implementation achieves that. NetBSD's implementation has the > same core logic. Here's my quick lotto pick demo program, doing 4900000 > picks, so they should all be somewhere near 100000: > > #include <stdlib.h> > #include <stdio.h> > #define LIMIT 49 > int > main(void) > { > unsigned counts[LIMIT] = { 0 }; > int i; > for (i = 0; i < 100000 * LIMIT; i++) > counts[arc4random_uniform(LIMIT)]++; > for (i = 0; i < LIMIT; i++) > printf("%2d\t%7u\n", i+1, counts[i]); > return 0; > } > > And a sample run: > > : bleys; ./a.out > > 1 100100 > 2 100639 > 3 99965 > 4 99729 > 5 99641 > 6 99650 > 7 100299 > 8 100164 > 9 99791 > 10 100101 > 11 100657 > 12 100042 > 13 99661 > 14 99927 > 15 99426 > 16 99491 > 17 99646 > 18 100133 > 19 100013 > 20 99942 > 21 99873 > 22 99924 > 23 99567 > 24 100152 > 25 100688 > 26 100011 > 27 100481 > 28 99980 > 29 100406 > 30 99726 > 31 99808 > 32 99929 > 33 100050 > 34 99983 > 35 100048 > 36 99771 > 37 99906 > 38 100215 > 39 100261 > 40 100426 > 41 99847 > 42 99533 > 43 100368 > 44 99695 > 45 100041 > 46 100465 > 47 99875 > 48 100034 > 49 99920 > : bleys; > > Looks pretty good to me, with repeated runs showing the values bouncing > around. > > > ... > > I was a bit interested when i saw Luke's message, but i am no > > mathematician and, like i said, i in fact never needed the > > functionality. And the question i would have is how small > > upper_bound has to be for the modulo problem to really kick in. > > If the implementation isn't broken, it's not a problem, period. > > > > And even if, whether it matters. For example, if you have > > a bitset of 1024 "in-flight things", and the small upper bound > > hits a used slot, you could simply search forward until you find > > a free one. I mean, with 1024 possible values the number of > > possibilities is anyhow restricted. > > Well hey, let's give Luke's a try and see how uniform it is: > > > #include <stdlib.h> > #include <stdio.h> > #define LIMIT 49 > int > main(void) > { > unsigned counts[LIMIT] = { 0 }; > unsigned counts2[LIMIT] = { 0 }; > int i; > for (i = 0; i < 100000 * LIMIT; i++) { > counts[arc4random_uniform(LIMIT)]++; > counts2[arc4random_uniform_fast_simple(LIMIT)]++; > } > for (i = 0; i < LIMIT; i++) > printf("%2d\t%7u\t%7u\n", i+1, counts[i], counts2[i]); > return 0; > } > > : bleys; ./a.out > 1 100188 76502 > 2 99983 76602 > 3 100521 76522 > 4 100416 76682 > 5 100171 76387 > 6 100163 76759 > 7 100024 76336 > 8 100009 76703 > 9 99769 76237 > 10 99892 76532 > 11 100197 76730 > 12 100483 76398 > 13 99769 76310 > 14 100075 76474 > 15 99781 76599 > 16 99846 76439 > 17 99814 76430 > 18 100313 76648 > 19 100259 76813 > 20 99885 77068 > 21 100302 76546 > 22 99998 76698 > 23 99491 76678 > 24 100340 76324 > 25 99763 115263 > 26 99872 153008 > 27 100022 152979 > 28 99481 153793 > 29 100018 210714 > 30 99617 229286 > 31 100167 297003 > 32 100270 449664 > 33 100468 76790 > 34 99115 76452 > 35 99921 76392 > 36 99862 76140 > 37 100485 76607 > 38 100029 75885 > 39 99577 76498 > 40 99479 76727 > 41 100139 76746 > 42 100883 76698 > 43 100102 76474 > 44 99801 76592 > 45 100117 76124 > 46 99678 76417 > 47 99770 76639 > 48 99524 77034 > 49 100151 76658 > : bleys; > > Wow, that last column is *bad*. Repeated runs show 25-32 are consistently > high, 32 being *outrageously* off, and the others are low. > > Luke's implementation does not correctly implement the API. Doesn't > matter if it's a million times faster when it doesn't deliver the goal. > > > Philip Guenther >
arc4random_uniform_fast.c
Description: Binary data