[R-sig-ME] large data set implies rejection of null?
Daniel Ezra Johnson
danielezrajohnson at gmail.com
Sat Nov 27 20:22:07 CET 2010
On 11/24/10 07:59, Rolf Turner wrote:
> >>
> >> It is well known amongst statisticians that having a large enough data set will
> >> result in the rejection of *any* null hypothesis, i.e. will result in a small
> >> p-value.
This seems to be a widely accepted guideline, probably because in the
social sciences virtually no predictor has an effect size of exactly
zero.
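To illustrate why the guideline holds in that setting, here is a quick
sketch: with a true effect that is tiny but nonzero (the 0.001 SD
difference below is just an assumed value for illustration), a large
enough sample still drives the p-value down.

set.seed(1)
n <- 100000000  # 10^8
# assume a tiny but nonzero true difference of means: 0.001 SD
dat.1 <- rnorm(n/2, 0, 1)
dat.2 <- rnorm(n/2, 0.001, 1)
t.test(dat.1, dat.2, var.equal = TRUE)
# effect/SE is about 5 here, so power is essentially 1 and the
# test is all but certain to reject at this sample size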
However, unless I am misunderstanding it, the statement appears to be
false when taken as a fully general claim.
For example, in a t-test where the population difference of means is
exactly zero, very large sample sizes do not lead to small p-values.
# two samples from the same N(0,1) population, so the true
# difference of means is exactly zero; same test at three sizes
for (n in c(1e6, 1e7, 1e8)) {  # 10^6, 10^7, 10^8
    set.seed(1)
    dat.1 <- rnorm(n/2, 0, 1)
    dat.2 <- rnorm(n/2, 0, 1)
    print(t.test(dat.1, dat.2, var.equal = TRUE))
}
# n = 10^6: p = 0.60
# n = 10^7: p = 0.48
# n = 10^8: p = 0.80
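Indeed, under a true null the p-value is (approximately) uniform on
[0, 1] whatever the sample size, so values like 0.60, 0.48, and 0.80
are exactly what one expects. A minimal sketch of that claim, with an
arbitrary seed, replicate count, and per-group n:

set.seed(2)
p.vals <- replicate(1000,
    t.test(rnorm(500), rnorm(500), var.equal = TRUE)$p.value)
mean(p.vals < 0.05)  # should come out close to 0.05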
Such results, where the null hypothesis is NOT rejected, would
presumably also occur in any experimental situation where the null
hypothesis was literally true, regardless of the size of the data set.
No?
Daniel