[R] regression towards the mean, AS paper November 2007
Duncan Murdoch
murdoch at stats.uwo.ca
Mon Dec 17 19:32:32 CET 2007
On 12/17/2007 1:21 PM, Troels Ring wrote:
> Dear friends, regression towards the mean is interesting in medical
> circles, and a very recent paper (The American Statistician November
> 2007;61:302-307 by Krause and Pinheiro) treats it at length. An initial
> example specifies (p 303):
> "Consider the following example: we draw 100 samples from a bivariate
> Normal distribution with X0~N(0,1), X1~N(0,1) and cov(X0,X1)=0.7. We
> then calculate the p value for the null hypothesis that the means of X0
> and X1 are equal, using a paired Student's t test. The procedure is
> repeated 1000 times, producing 1000 simulated p values. Because X0 and
> X1 have identical marginal distributions, the simulated p values behave
> like independent Uniform(0,1) random variables." I did not understand
> this, and the simulation shown below produced p values that are far from
> Uniform(0,1), but I fail to see what is wrong with it. I contacted the
> authors of the paper, but they did not answer. So, please, doesn't the
> code below specify a bivariate N(0,1) with covariance 0.7? I get p
> values = 1 throughout, which is not interesting, but what is wrong?
> Best wishes
> Troels
>
> library(MASS)
> Sigma <- matrix(c(1,0.7,0.7,1),2,2)
> Sigma
> res <- numeric(1000)
> for (i in 1:1000) {
>   ff <- mvrnorm(n = 100, rep(0, 2), Sigma, empirical = TRUE)
>   res[i] <- t.test(ff[, 1], ff[, 2], paired = TRUE)$p.value
> }
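With empirical=TRUE, mvrnorm() treats mu and Sigma as the empirical
(sample) mean and covariance rather than the population ones. Your
sampled values are then not independent: both sample means are forced to
equal 0, so the mean difference is zero (up to floating-point rounding)
in every replicate. Thus all of the t statistics are zero, and the
p-values are 1.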
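This is easy to check on a single draw. A minimal sketch (the seed and the names x, d and p_emp are only for illustration, not part of the original code):

```r
library(MASS)

set.seed(1)  # arbitrary seed, only for reproducibility
Sigma <- matrix(c(1, 0.7, 0.7, 1), 2, 2)

# empirical = TRUE rescales the draw so that the *sample* mean and
# covariance match mu and Sigma
x <- mvrnorm(n = 100, mu = rep(0, 2), Sigma = Sigma, empirical = TRUE)

colMeans(x)  # both zero, up to floating-point rounding
cov(x)       # reproduces Sigma, up to rounding

d <- x[, 1] - x[, 2]
mean(d)      # zero up to rounding, so the paired t statistic is numerically zero

p_emp <- t.test(x[, 1], x[, 2], paired = TRUE)$p.value
p_emp        # essentially 1
```

Every replicate in the loop is constrained the same way, which is why the whole vector of p-values comes out as 1.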
Set empirical=FALSE (the default), and you'll see more reasonable results.
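For reference, the simulation with that one change might look like the sketch below. The seed, the name n_sim and the Kolmogorov-Smirnov check are additions for illustration; the rest follows the posted code.

```r
library(MASS)

set.seed(2007)  # arbitrary seed, only for reproducibility
Sigma <- matrix(c(1, 0.7, 0.7, 1), 2, 2)

n_sim <- 1000
res <- numeric(n_sim)  # preallocate the vector of p-values

for (i in seq_len(n_sim)) {
  # empirical = FALSE is the default: ordinary random draws whose
  # sample means vary from replicate to replicate
  ff <- mvrnorm(n = 100, mu = rep(0, 2), Sigma = Sigma)
  res[i] <- t.test(ff[, 1], ff[, 2], paired = TRUE)$p.value
}

summary(res)                 # should look roughly Uniform(0,1)
ks <- ks.test(res, "punif")  # informal check against Uniform(0,1)
ks$p.value
```

A call such as hist(res) gives a quick visual check; the bars should be roughly level across (0,1).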
Duncan Murdoch