[R-sig-ME] large data set implies rejection of null?
Daniel Ezra Johnson
danielezrajohnson at gmail.com
Sat Nov 27 20:22:07 CET 2010
On 11/24/10 07:59, Rolf Turner wrote:
> >>
> >> It is well known amongst statisticians that having a large enough data set will
> >> result in the rejection of *any* null hypothesis, i.e. will result in a small
> >> p-value.
This seems to be a widely accepted guideline, probably because in the
social sciences virtually no predictor has an effect size of exactly
zero.
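To illustrate why the guideline holds in that setting, here is a quick
sketch: with a true effect that is tiny but nonzero (the 0.001 SD
difference below is just an assumed value for illustration), a large
enough sample still drives the p-value down.

set.seed(1)
n <- 100000000  # 10^8
# assume a tiny but nonzero true difference of means: 0.001 SD
dat.1 <- rnorm(n/2, 0, 1)
dat.2 <- rnorm(n/2, 0.001, 1)
t.test(dat.1, dat.2, var.equal = TRUE)
# effect/SE is about 5 here, so power is essentially 1 and the
# test is all but certain to reject at this sample size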
However, unless I am misunderstanding it, the statement appears to be
false when taken as a fully general claim.
For example, in a t-test where the population difference of means is
exactly zero, very large sample sizes do not lead to small p-values.
# two samples from the same N(0,1) population, so the true
# difference of means is exactly zero; same test at three sizes
for (n in c(1e6, 1e7, 1e8)) {  # 10^6, 10^7, 10^8
    set.seed(1)
    dat.1 <- rnorm(n/2, 0, 1)
    dat.2 <- rnorm(n/2, 0, 1)
    print(t.test(dat.1, dat.2, var.equal = TRUE))
}
# n = 10^6: p = 0.60
# n = 10^7: p = 0.48
# n = 10^8: p = 0.80
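Indeed, under a true null the p-value is (approximately) uniform on
[0, 1] whatever the sample size, so values like 0.60, 0.48, and 0.80
are exactly what one expects. A minimal sketch of that claim, with an
arbitrary seed, replicate count, and per-group n:

set.seed(2)
p.vals <- replicate(1000,
    t.test(rnorm(500), rnorm(500), var.equal = TRUE)$p.value)
mean(p.vals < 0.05)  # should come out close to 0.05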
Such results, where the null hypothesis is NOT rejected, would
presumably also occur in any experimental situation where the null
hypothesis was literally true, regardless of the size of the data set.
No?
Daniel