On Aug 26, 2011, at 2:57 PM, Robert G. Brown wrote: > On Fri, 26 Aug 2011, Vincent Diepeveen wrote: > >> EVERY PROGRAMMER IS DOING THIS TO USE RANDOM NUMBERS IN THEIR >> PROGRAM. > > Bullshit. "Every programmer" isn't dumb as a post. Or wasn't my > argument clear enough? Do you need me to actually post the code > for how > the GSL -- written by at least some of these programmers -- do this? > Here, I'll try again. This time I'll use smaller numbers and make an > actual table of the outcomes: Imagine only two lousy random bits, > enough to make 00, 01, 10, 11 (or 0,1,2,3). Here is the probability > table: > > r = 0 1 2 3 > ------------------------ > p = 0.25 0.25 0.25 0.25 > > Let us generate N samples from this distribution. Our expected > frequency of the occurrence of all of these numbers is: > > r = 0 1 2 3 > --------------------------------- > Np = 0.25*N 0.25*N 0.25*N 0.25*N > > Is this all clear? If I generate 100 random numbers, the expected > number of 3's is 0.25*100 = 25. Now apply mod 3 the outcomes are > now: > > r = 0 1 2 3 > r%3 = 0 1 2 0 > --------------------------------- > Np = 0.25*N 0.25*N 0.25*N 0.25*N > > You now sum the number of outcomes for each element in the mod 3 > table, > since we have two values of r that make one value of r%3 and frequency > clearly aggregates as the outcomes are independent. > > r%3 = 0 1 2 > --------------------------------- > Np = 0.50*N 0.25*N 0.25*N 0.25*N > > It is therefore twice as likely that two random bits, modulus 3, will > produce a zero. >
If you have a domain of 0..3 where a generator generates and your modulo n is just n-1, obviously that means it'll map a tad more to 0. Basically the deviation one would be able to measure in such case is that if we have a generator that runs over a field of say size m and we want to map that onto n entries then we have the next formula : m = x * n + y; Now your theory is basically if i summarize it that in such case the entries 0..y-1 will have a tad higher hit than y.. m-1. However if x is large enough that shouldn't be a big problem. If we map now in the test i'm doing onto say a few million to a billion entries, the size of that x is a number of 40+ bits for most RNG's. So that means that the deviation of the effect you show above the order of magnitued of 1 / 2^40 in such case, which is rather small. Especially because the 'test' if you want to call it like that, is operating in the granularity O ( log n ), we can fully ignore then the expected deviation granularity O ( 2 ^ 40 ). >> >> Apologies for the caps. I hope how important this is. You're >> claiming all programmers >> use random numbers in a faulty manner? > > They don't. Only you do. Everybody else takes a uniform deviate and > scales it by the number of desired integer outcomes, checking to make > sure that they don't go out of bounds and thereby e.g. get an > incorrect > endpoint frequency. The gsl code is open source and it takes two > minutes to download it and check (I just timed it). Go on, look. the > file is rng/rng.c in the gsl distro directory, the function name is > gsl_rng_uniform_int. No modulus. > > The exception is (obviously) when the range is a power of 2. In that > case ONLY, r%n where r is a binary uint and n is a power of 2 will > (obviously) equally balance the table above. Personally I'd use >> > and > shift the bits because it is faster than mod, but suit yourself, after > you've learned what you are doing. > >> >> This is important enough to further discuss about it. >> >> As nearly always you need random numbers from within a given >> domain say 0.. n-1 >> So projecting a RNG onto that domain is pretty crucial. How would >> you want to do that in a correct manner? >> >> In the slot test in fact a simple AND is enough. > > No, as I've just proven algebraically. The correct manner for > general n > is the gsl code, but in rough terms it is n*r/r_max (with care used to > avoid roundoff errors at the ends as noted). If you've been using > modulus, all your results are crap. > > Look, the reason God invented the GSL and made it open source is so > numb-nuts and smart people alike wouldn't have to constantly reinvent > the wheel, badly. Use it. Don't question it -- you obviously aren't > competent to. Just use it. If you want a random integer from 0 to n, > use gsl_rn_uniform_int. If you want this for e.g. mt19937 don't write > the latter, set up the gsl to use it to generate your ints. Learn to > use it carefully, use it correctly, but use it. > >> It's not interesting to discuss - but yes this strategy makes >> money in casino's, >> you just get thrown out of the casino and end up at the blacklist >> if you do. > > You are clearly too stupid to be allowed out of the house without a > caretaker. I'm not going to walk you through the proof that this > isn't > so as it is openly published and I've already referenced a step my > step > analysis that you can't be bothered, apparently, to actually read. > I'll > just reiterate the previous offer -- I, too, am happy to buy a > roulette > wheel and you can come over and bet Martingale against me all day. > Just > one 0, no limits and no quitting, infinite credit on both sides, we > play > until it is obvious to you that you are losing, have lost, will always > lose, and the longer you play the more that you will lose. Loser buys > the winner a case of truly excellent beer. > > Look, why don't you fix your random number code and try again, since > your simulations are obviously trash. It isn't difficult to show this > with simulations, once you actually code them correctly, but I have to > go and don't have time to do it for you. > > rgb > > Robert G. Brown http://www.phy.duke.edu/~rgb/ > Duke University Dept. of Physics, Box 90305 > Durham, N.C. 27708-0305 > Phone: 1-919-660-2567 Fax: 919-660-2525 email:r...@phy.duke.edu > > _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf