Bug in rnorm. (PR#1664)
rolf@math.unb.ca
rolf@math.unb.ca
Thu, 13 Jun 2002 21:36:42 +0200 (MET DST)
There appears to be a mild bug, or at least a deficiency, in
rnorm. The bug becomes apparent when one looks at extremes
of the squares of the values generated by rnorm; rnorm is not
generating quite enough extreme values.
The R version that I am using is 1.4.1; I never got around to installing
1.5.0, and now since 1.5.1 is about to come out .... However, checking
the 1.5.0 release notes revealed no mention of fixing a bug in rnorm.
Version details:
platform sparc-sun-solaris2.7
arch sparc
os solaris2.7
system sparc, solaris2.7
status
major 1
minor 4.1
year 2002
month 01
day 30
language R
More detail regarding how the bug revealed itself:
==================================================
For n = 100, 200, ..., 1500
o I generated 1000 sequences of length n via rnorm(n),
o for each sequence x, I calculated m = the max of x^2
o I then calculated pval = 1 - pchisq(m,1)^n
o I then calculated s.hat.n = #{pval: pval < 0.05}/1000
I then plotted s.hat.n versus n. This ***should*** give a result
close to a horizontal straight line, at height 0.05 --- but it
didn't. For the larger values of n, the values of s.hat.n were
displaced significantly below 0.05.
After some discussion with colleagues, I replaced the calls to rnorm()
by calls to myrnorm() defined by
myrnorm <- function(n,mu=0,sigma=1){
mu + sigma*cos(2*pi*runif(n))*sqrt(-2*log(runif(n)))
}
which uses the ``(r,theta)'' method of generating random normals.
When I did so, the resulting values were indeed all ``close to'' 0.05,
as they should be.
I also tried the experiment using rchisq(n,1) instead of rnorm(n) (and
then of course taking m = max of x --- rather than max of x^2). Again
all the resulting values were close to 0.05 as ought to be the case.
(So rchisq() appears to be OK in this regard.)
Enclosed below is a script to demonstrate the bug.
cheers,
Rolf Turner
rolf@math.unb.ca
#==+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===
#
# Script to demonstrate the bug in rnorm.
#
myrnorm <- function(n,mu=0,sigma=1){
mu + sigma*cos(2*pi*runif(n))*sqrt(-2*log(runif(n)))
}
# If RFUN <- rnorm we get ``wrong'' answers; if RFUN <- myrnorm,
# we get ``right'' answers.
RFUN <- rnorm
NSER <- 1000
set.seed(350734)
rslt <- list()
for(K in 1:15) {
N <- 100*K
M <- matrix(RFUN(NSER*N),N,NSER)
T2 <- apply(M,2,function(x){max(x**2)})
PV <- 1 - pchisq(T2,1)**N
SZ <- sum(PV < 0.05)/NSER
rslt[[K]] <- SZ
cat(K,"\n")
}
rslt <- unlist(rslt)
plot(100*(1:15),rslt,ylim=c(0,0.1),xlab='n',ylab='s.hat.n')
abline(h=0.05)
error.bar(100*(1:15),rslt,lower=1.96*sqrt(0.05*0.95/1000),add=TRUE)
# Clean up:
rm(RFUN,NSER,K,N,M,T2,PV,SZ)
#==+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._