I'm interested in understanding why the standard error grows with the square root of the sample size. For instance, flipping an honest coin L times, the expected number of HEADS is L/2, and we may define the error (relative to the expected number) to be

e = H - L/2,

where H is the number of heads we actually obtained. The absolute value of e grows as L grows, but by how much? It seems statistical theory claims it grows on the order of sqrt(L).
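Here is a small R simulation I wrote to convince myself (the function name is my own invention); it quadruples L a few times and watches the average |e|:

    set.seed(1)
    avg.abs.error <- function(L, reps = 10000) {
      H <- rbinom(reps, size = L, prob = 0.5)  # heads in each of 'reps' runs of L flips
      mean(abs(H - L/2))                       # average |e| across the runs
    }
    Ls <- c(100, 400, 1600, 6400)
    errs <- sapply(Ls, avg.abs.error)
    cbind(L = Ls, avg.abs.err = errs, ratio = errs / sqrt(Ls))
    # the 'ratio' column stays roughly constant (about 0.4 in my runs),
    # which is what "grows on the order of sqrt(L)" would predict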
To make things clearer to myself, I decided to play a game. Players A and B compete to see who gets closer to the error in the number of HEADS in random samples produced by flipping an honest coin. Both players know the error should be some constant times sqrt(L), but B guesses (1/3)*sqrt(L) while A guesses (1/2)*sqrt(L), and it seems A is usually better. (R sketches for the game and for the computations below are appended at the end of this message.)

It seems statistical theory says the constant should be the standard deviation of the phenomenon -- I may not have the proper terminology here. Defining TAILS to be zero and HEADS to be one, the standard deviation of a single flip of an honest coin can be taken to be sqrt[((-1/2)^2 + (1/2)^2)/2] = 1/2. (So that's why A is doing better.) The standard deviation giving the best constant seems plausible because the errors are approximately normally distributed, and that much is intuitive: the standard deviation measures how samples vary, so we can use it to estimate how far a guess will be from the expected value.

But the standard deviation is only one such measure. I could use the absolute deviation too, couldn't I? The absolute deviation of an honest coin also turns out to be 1/2, so by luck it gives the same answer here; I'd need a different example to see which measure turns out to be better.

Anyhow, it's not clear to me why the standard deviation is really the best choice for the constant (if it is that at all), and it's even less clear to me why the error grows with the square root of the number of coin flips, that is, of the sample size. I would like an intuitive understanding of this, but if that's too hard, I would at least like to see a mathematical argument in a good book, which you might point me to. Thank you!

PS. Is this off-topic? I'm not aware of any newsgroup on statistics at the moment. Please point me to the appropriate place if applicable.
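PS2. Since the posting guide asks for reproducible code, here are the sketches mentioned above. First, the game: rather than scoring rounds (since "closer to the error" can be measured in more than one way), this just compares both players' guesses with two summaries of the observed error:

    set.seed(2)
    compare.guesses <- function(L, rounds = 10000) {
      H <- rbinom(rounds, size = L, prob = 0.5)  # heads in each round of L flips
      e <- H - L/2                               # signed error in each round
      c(guess.A    = (1/2) * sqrt(L),
        guess.B    = (1/3) * sqrt(L),
        rms.e      = sqrt(mean(e^2)),            # root-mean-square of e
        mean.abs.e = mean(abs(e)))               # average absolute error
    }
    compare.guesses(L = 1000)

If my algebra is right, Var(H) = L/4 for an honest coin, so the root-mean-square of e should land right on A's guess of (1/2)*sqrt(L), while the average of |e| is a somewhat smaller constant times sqrt(L).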
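Second, the standard deviation computation, done numerically with TAILS = 0 and HEADS = 1:

    flips <- c(0, 1)                        # the two equally likely outcomes
    sqrt(mean((flips - mean(flips))^2))     # sqrt[((-1/2)^2 + (1/2)^2)/2] = 0.5
    # note: R's built-in sd() divides by n-1, so sd(c(0,1)) returns ~0.707 instead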
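Third, the absolute deviation, which indeed comes out the same for this example:

    flips <- c(0, 1)
    mean(abs(flips - mean(flips)))          # (1/2 + 1/2)/2 = 0.5 as well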