On Thu, Jun 05, 2014 at 10:58:51PM -0500, Canek Peláez Valdés wrote: > On Thu, Jun 5, 2014 at 9:56 PM, <meino.cra...@gmx.de> wrote: > > Hi, > > > > I am experimenting with the C code of the ISAAC pseudo random number > > generator > > (http://burtleburtle.net/bob/rand/isaacafa.html). > > > > Currently the implementation creates (on my embedded linux) 32 bit > > hexadecimal output. > > So it's a 32 bit integer. > > > From this I want to create random numbers in the range of [a-Za-z0-9] > > *without violating randomness* and (if possible) without throwing > > away bits of the output. > > You mean *characters* int the range [A-Za-z0-9]?
Well this isn't as simple problem as it sounds. A random 32 bit integer has 32 bits of randomness. If you take a divison reminder of 62 from this integer you will get only 5,95419631039 bits of randomness (log(62)/log(2)). So you are wasting 81,4% of your random data. Which is quite much and usually random data is quite expensive. You can save your precious random data by taking only 6 bit from your 32 bit integer and dividing it by 62. Then you will be wasting only 0,8% of random data. Another problem is alignment, but that is about mathematical correctness. > > How can I do this mathemtically (in concern of the quality of output) > > correct? > > The easiest thing to do would be: The easiest is not mathematically correct though. Random data will stay random only if you select and modify it so that randomness is preserved. If you take devison reminder of 62 from 32 bit integer there are 69 273 667 possibilities of the reminder to be 3 or less. For the reminder to 4 or more the number of possibilities is 69 273 666. In mathematically ideal case the probability for every index of the list should be same: 1/62 = 1,61290322581%. But the modulo 62 modifies this probability: for index 0-3 the probability is 69 273 667/2^32 = 1,61290324759%. And for indexes 4-61 the probability will be 69 273 666/2^32 = 1,6129032243%. If you wish not to waste those random bits the probabilities will get worse. With 6 bits of random the probability for index 0-1 will be 2/64 and for 2-63 it will be 1/64. This is a very significant change because first and second index will appear twice as much as the rest. If you add 2 characters to your list you will perfect alignment and you can take 6 bits of data without it modifying probabilities. If you are looking a mathematically perfect solution there is a simple one even if your list is not in the power of 2! Take 6 bits at a time of the random data. If the result is 62 or 63 you will discard the data and get the next 6 bits. This selectively modifies the random data but keeps the probabilities in correct balance. Now the probability for index of 0-61 is 1/62 because the probability to get 62-63 out of 64 if 0. > ------------------------------------------------------------------------------- > #include <time.h> > #include <stdio.h> > #include <stdlib.h> > > #define N (26+26+10) > > static char S[] = { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', > 'K', 'L', 'M', > 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', > 'X', 'Y', 'Z', > 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', > 'k', 'l', 'm', > 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', > 'x', 'y', 'z', > '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' }; > > int > next_character() > { > // Use the correct call for ISAAC instead of rand() > unsigned int idx = rand() % N; > return S[idx]; > } so modify the next_char function: char next_character() { static unsigned int rand = 0; //(sizeof(int) = 32) static char bit_avail = 0; char result = 0; char move_bits = 0; char bits_moved = 0; do { if (!bits_avail) { // Use the correct call for ISAAC instead of rand() rand = rand(); bit_avail = 32; } move_bits = bits_avail >= 6 ? 6 : bits_avail; result <<= move_bits; result = (result | rand & (0xFF >> (8 - move_bits))) & 0x3F; bits_avail -= move_bits; bits_moved += move_bits; rand >>= move_bits; } while (bits_moved != 6 && result > 61); return result; } This function will give perfect distribution of 1/62 probability for every index. It will waste 6 bits with the probability of 1/32 (2/64). > int > main(int argc, char* argv[]) > { > // Use the correct call for initializing the ISAAC seed > srand((unsigned int)time(NULL)); > for (int i = 0; i < 20; i++) // --std=c99 > printf("%c\n", next_character()); > return 0; > } > ------------------------------------------------------------------------------- > > If the ISAAC RNG has a good distribution, then the next_character() > function will give a good distribution among the set [A-Za-z0-9]. > > Unless I missunderstood what you meant with "create random numbers in > the range of [a-Za-z0-9]". -- -Matti