[R] ANOVA question
Robert Latest
boblatest at gmail.com
Fri May 11 09:37:24 CEST 2012
Hello all,
I'm very satisfied to say that my grip on both R and statistics is
showing the first hints of firmness, on a very greenhorn level.
I'm faced with a problem that I intend to analyze using ANOVA, and to
test my understanding of a primitive, one-way ANOVA I've written the
self-contained practice script below. It works as expected.
But here's my question: How can I not only get the values of the
coefficients for the different levels of the explanatory factor(s),
but also the corresponding standard errors and confidence levels?
Below I have started doing that "on foot" by looping over the levels
of my single factor, but I suppose this gets complicated and messy
with more complex models. Any ideas?
Thanks,
robert
set.seed(0)
N <- 100 # sample size
MEAN <- c(10, 20, 30, 40, 50)
VAR <- c(20,20,1, 20, 20)
LABELS <- c("A", "B", "C", "D", "E")
# create a data frame with labels
df <- data.frame(Label=rep(LABELS, each=N))
df$Value <- NA
# fill in random data for each factor level
for (i in 1:length(MEAN)) {
df$Value[(1+N*(i-1)):(N*i)] <- rnorm(N, MEAN[i], sqrt(VAR[i]))
}
par(mfrow=c(2,2))
plot(df) # Box plot of the data
plot(df$Value) # scatter plot
mod_aov <- aov(Value ~ Label, data=df)
print(summary(mod_aov))
print(mod_aov$coefficients)
rsd <- mod_aov$residuals
plot(rsd)
# find and print mean() and var() for each level
for (l in levels(df$Label)) {
index <- df$Label == l
# Method 1: directly from data
smp <- df$Value[index] # extract sample for this label
ssq_smp <- var(smp)*(length(smp)-1) # sum of squares is variance
# times d.f.
# Method 2: from ANOVA residuals
rsd_grp <- rsd[index] # extract residuals
ssq_rsd <- sum(rsd_grp **2) # compute sum of squares
# print mean, variance, and difference between SSQs from the two
# methods.
write(sprintf("%s: mean=%5.1f var=%5.1f (%.2g)", l,
mean(smp), var(smp),
ssq_smp-ssq_rsd), "")
# ...and it works like expected! But is there a shortcut that would give me
# the same result in a one-liner?
}
More information about the R-help
mailing list