[R] on the growth of standard error

Wayne Harris wharris1 at protonmail.com
Fri Aug 21 22:25:06 CEST 2020


I'm interested in understanding why the standard error grows with the
square root of the sample size.  For instance, using an honest coin
and flipping it L times, the expected number of HEADS is L/2, and we
may define the error (relative to the expected number) to be

  e = H - L/2,

where H is the number of heads that we actually obtained.  The absolute
value of e grows as L grows, but by how much?  Statistical theory seems
to claim it grows on the order of the square root of L.
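
To convince myself of that claim, here is a small simulation in base R
that I can run (just a sketch; the replication count and the values of
L are arbitrary choices of mine).  If the square-root claim is right,
doubling L should multiply the typical |e| by about sqrt(2) ~ 1.41,
not by 2.

  ## Typical size of e = H - L/2 for a fair coin, doubling L each time.
  set.seed(1)
  reps <- 10000
  for (L in c(100, 200, 400, 800, 1600)) {
    H <- rbinom(reps, size = L, prob = 0.5)   # HEADS in L flips
    e <- H - L/2                              # error relative to the expectation
    cat(sprintf("L = %4d  mean|e| = %6.2f  mean|e|/sqrt(L) = %.3f\n",
                L, mean(abs(e)), mean(abs(e)) / sqrt(L)))
  }

The last column staying roughly constant is what the square-root
growth amounts to; what that constant should be is the next question.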

To try to make things clearer to myself, I decided to play a game.
Players A and B compete to see who gets closer to the error in the
number of HEADS in random samples produced by flipping an honest coin.
Both players know the error should be some multiple of the square root
of L, but B guesses (1/3) sqrt(L) while A guesses (1/2) sqrt(L), and it
seems A is usually better.

It seems statistical theory says the constant should be the standard
deviation of the phenomenon.  (I may not have the proper terminology
here.)  If we code TAILS as zero and HEADS as one, each flip of an
honest coin deviates from the mean 1/2 by either -1/2 or +1/2, so the
standard deviation of a single flip is sqrt[((-1/2)^2 + (1/2)^2)/2] =
1/2.  (So that's why A is doing better.)
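
For the record, the same little computation in R (note that R's sd()
divides by n - 1, so for the population-style standard deviation over
the two equally likely outcomes I spell it out by hand):

  x <- c(0, 1)                  # TAILS = 0, HEADS = 1
  sqrt(mean((x - mean(x))^2))   # population SD of a single flip: 0.5
  sd(x)                         # R's sd() divides by n - 1: about 0.707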

That the standard deviation should give the best constant seems
plausible, because the errors are approximately normally distributed,
and the standard deviation measures how much samples vary, so we can
use it to estimate how far an observation will fall from the expected
value.

But the standard deviation is only one such measure.  I could use the
absolute deviation too, couldn't I?  The absolute deviation of a single
flip of an honest coin turns out to be 1/2 as well, so by luck that
gives the same answer.  Maybe I'd need a different example to see, in
a particular case, which measure turns out to be better.
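
Here is the kind of check I have in mind, staying with the coin (again
only a sketch, with an arbitrary replication count): it estimates both
the standard deviation of e and the mean absolute value of e, each
scaled by sqrt(L).

  set.seed(1)
  reps <- 20000
  for (L in c(100, 1000, 10000)) {
    e <- rbinom(reps, size = L, prob = 0.5) - L/2   # error in HEADS count
    cat(sprintf("L = %5d  sd(e)/sqrt(L) = %.3f  mean|e|/sqrt(L) = %.3f\n",
                L, sd(e) / sqrt(L), mean(abs(e)) / sqrt(L)))
  }

In my runs the first ratio stays near 1/2 while the second stays near
0.4 (which looks like sqrt(2/pi)/2, if the errors are roughly normal),
so the two measures, although they agree at 1/2 for a single flip, do
not seem to describe the error of the whole sample by the same
constant.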

Anyhow, it's not clear to me why the standard deviation is really the
best guess for the constant (if it is that at all), and it's even less
clear to me why the error grows as the square root of the number of
coin flips, that is, of the sample size.
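
The closest I have come to an argument is the observation that, for
independent flips (coded 0 and 1), it is the variances that add, not
the standard deviations: Var(H) comes out close to L times the
single-flip variance 1/4, and taking the square root of that sum is
what turns L into sqrt(L).  A rough numerical check (sketch only; reps
and L are arbitrary):

  set.seed(1)
  reps  <- 5000
  L     <- 400
  flips <- matrix(rbinom(reps * L, size = 1, prob = 0.5), nrow = reps)
  H <- rowSums(flips)       # HEADS count in each of the reps experiments
  var(H)                    # close to 100
  L * var(flips[, 1])       # L times the variance of one flip, also ~100

Whether this is the right way to look at it is part of what I'm asking.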

I would like to have an intuitive understanding of this, but if that's
too much to ask, I would at least like to see a mathematical argument
in an interesting book, which you might point me to.

Thank you!

PS. Is this off-topic?  I'm not aware of any newsgroup on statistics at
the moment.  If so, please point me to a more appropriate place.


