[R] Using R to illustrate the Central Limit Theorem
Kevin E. Thorpe
kevin.thorpe at utoronto.ca
Fri May 13 01:03:20 CEST 2005
This thread was very timely for me since I will be teaching an introductory
biostats course in the fall, including a session on CLT. I have shamelessly
taken Dr. Venables' slick piece of code and wrapped it in a function so that
I can vary the maximum sample size at will. I then created functions based
on the first to sample from the exponential and the chi-squaredistributions.
Lastly, I created a function to sample from a pdf with a parabolic shape
(confirming in the process just how rusty my calculus is :-) ). My code is
below for all to do with as they please.
Now, at the risk of asking a really obvious question ...
In my final function, I used the inverse probability integral transformation
to sample from my parabolic distribution. I create a local function rparab
shown here:
rparab <- function(nn) {
u <- 2*runif(nn) - 1
ifelse(u<0,-(abs(u)^(1/3)),u^(1/3))
}
It seems that in my version of R (2.0.1) on Linux, that calculating the cube
root of a negative number using ^(1/3) returns NaN. I looked at the help in
the arithmetic operators and did help.search("cube root"),
help.search("root")
and help.search("cube") and recognised no alternatives. So I used an
ifelse() to
deal with the negatives. Have I missed something really elementary?
Thanks
clt.unif <- function(n=20) {
N <- 10000
graphics.off()
par(mfrow = c(1,2), pty = "s")
for(k in 1:n) {
m <- (rowMeans(matrix(runif(N*k), N, k)) - 0.5)*sqrt(12*k)
hist(m, breaks = "FD", xlim = c(-4,4),
main = paste("Uniform(0,1), n = ",k,sep=""),
prob = TRUE, ylim = c(0,0.5), col = "lemonchiffon")
pu <- par("usr")[1:2]
x <- seq(pu[1], pu[2], len = 500)
lines(x, dnorm(x), col = "red")
qqnorm(m, ylim = c(-4,4), xlim = c(-4,4), pch = ".", col = "blue")
abline(0, 1, col = "red")
Sys.sleep(1)
}
}
clt.exp <- function(n=20,theta=1) {
N <- 10000
graphics.off()
par(mfrow = c(1,2), pty = "s")
for(k in 1:n) {
m <- (rowMeans(matrix(rexp(N*k,1/theta), N, k)) - theta)/sqrt(theta^2/k)
hist(m, breaks = "FD", xlim = c(-4,4),
main = paste("exp(",theta,"), n = ",k,sep=""),
prob = TRUE, ylim = c(0,0.5), col = "lemonchiffon")
pu <- par("usr")[1:2]
x <- seq(pu[1], pu[2], len = 500)
lines(x, dnorm(x), col = "red")
qqnorm(m, ylim = c(-4,4), xlim = c(-4,4), pch = ".", col = "blue")
abline(0, 1, col = "red")
Sys.sleep(1)
}
}
clt.chisq <- function(n=20,nu=1) {
N <- 10000
graphics.off()
par(mfrow = c(1,2), pty = "s")
for(k in 1:n) {
m <- (rowMeans(matrix(rchisq(N*k,nu), N, k)) - nu)/sqrt(2*nu/k)
hist(m, breaks = "FD", xlim = c(-4,4),
main = paste("Chi-Square(",nu,"), n = ",k,sep=""),
prob = TRUE, ylim = c(0,0.5), col = "lemonchiffon")
pu <- par("usr")[1:2]
x <- seq(pu[1], pu[2], len = 500)
lines(x, dnorm(x), col = "red")
qqnorm(m, ylim = c(-4,4), xlim = c(-4,4), pch = ".", col = "blue")
abline(0, 1, col = "red")
Sys.sleep(1)
}
}
clt.parab <- function(n=20) {
rparab <- function(nn) {
u <- 2*runif(nn) - 1
ifelse(u<0,-(abs(u)^(1/3)),u^(1/3))
}
N <- 10000
graphics.off()
par(mfrow = c(1,2), pty = "s")
for(k in 1:n) {
m <- (rowMeans(matrix(rparab(N*k), N, k)))/sqrt(3/(5*k))
hist(m, breaks = "FD", xlim = c(-4,4),
main = paste("n = ",k,sep=""),
prob = TRUE, ylim = c(0,0.5), col = "lemonchiffon")
pu <- par("usr")[1:2]
x <- seq(pu[1], pu[2], len = 500)
lines(x, dnorm(x), col = "red")
qqnorm(m, ylim = c(-4,4), xlim = c(-4,4), pch = ".", col = "blue")
abline(0, 1, col = "red")
Sys.sleep(1)
}
}
Bill.Venables at csiro.au wrote:
>Here's a bit of a refinement on Ted's first suggestion.
>
>
> N <- 10000
> graphics.off()
> par(mfrow = c(1,2), pty = "s")
> for(k in 1:20) {
> m <- (rowMeans(matrix(runif(M*k), N, k)) - 0.5)*sqrt(12*k)
> hist(m, breaks = "FD", xlim = c(-4,4), main = k,
> prob = TRUE, ylim = c(0,0.5), col = "lemonchiffon")
> pu <- par("usr")[1:2]
> x <- seq(pu[1], pu[2], len = 500)
> lines(x, dnorm(x), col = "red")
> qqnorm(m, ylim = c(-4,4), xlim = c(-4,4), pch = ".", col = "blue")
> abline(0, 1, col = "red")
> Sys.sleep(1)
> }
>
>
--
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Department of Public Health Sciences
Faculty of Medicine, University of Toronto
email: kevin.thorpe at utoronto.ca Tel: 416.946.8081 Fax: 416.971.2462
More information about the R-help
mailing list