[R] How to read ANOVA output
Stephen Liu
satimis at yahoo.com
Thu Aug 19 11:45:32 CEST 2010
----- Original Message ----
From: "Ted.Harding at manchester.ac.uk" <Ted.Harding at manchester.ac.uk>
To: r-help at r-project.org
Cc: Stephen Liu <satimis at yahoo.com>
Sent: Wed, August 18, 2010 4:41:11 PM
Subject: RE: [R] How to read ANOVA output
Hi Ted,
Thanks for your advice.
- snip -
>You need to understand how that works (basic
>statistical theory) before even thinking of looking at the
>Tukey thing (omitted in this reply).
I have been googling for a while and found many documents, but I am not sure
where to start or which direction to take. Could you please give me some
hints? TIA
I found the following:
Basic Inferential Statistics: Theory and Application
http://owl.english.purdue.edu/owl/resource/672/05/
Basic Statistics-I
http://works.bepress.com/durgesh_chandra_pathak/10/
file download
basic_Statistics-I-fulltext.pdf
>The following is an explanation of your 1-way ANOVA written
>entirely in R (preceded by a duplicate of your ANOVA output):
I performed the following steps:
## anova(lm(values ~ ind, data = tablets))
## Analysis of Variance Table
## Response: values
## Df Sum Sq Mean Sq F value Pr(>F)
## ind 2 2.05787 1.02893 45.239 2.015e-05 ***
## Residuals 9 0.20470 0.02274
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
tabA = c(5.67, 5.67, 5.55, 5.57)
tabB = c(5.75, 5.47, 5.43, 5.45)
tabC = c(4.74, 4.45, 4.65, 4.94)
nA <- length(tabA) ; nB <- length(tabB) ; nC <- length(tabC)
nG <- nA + nB + nC
> nG
[1] 12
mG <- mean(c(tabA,tabB,tabC))
mA <- mean(tabA) ; mB <- mean(tabB) ; mC <- mean(tabC)
SSres <- sum((tabA-mA)^2) + sum((tabB-mB)^2) + sum((tabC-mC)^2)
SSres # = 0.2047
[1] 0.2047
(I suppose ^2 here means "raised to the power of 2"?)
(As I understand it, SSres is the residual sum of squares, sometimes called the
error sum of squares: the variation in the dependent variable that is not
predicted by the model. Adding SSreg to SSres gives SStotal, which represents
how much variation there is in the data overall. Is that right?)
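A minimal sketch of how the ^2 question could be checked, using only the tabA vector defined above (`dev` is a name introduced here for illustration):

```r
# Group A values, as defined earlier in this message
tabA <- c(5.67, 5.67, 5.55, 5.57)
mA <- mean(tabA)

dev <- tabA - mA   # deviation of each value from its group mean
dev^2              # `^` is applied elementwise: each deviation is squared
sum(dev^2)         # group A's contribution to SSres
```

The same pattern, repeated for tabB and tabC and summed, gives the SSres value above.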
SSeff <- nA*(mA-mG)^2 + nB*(mB-mG)^2 + nC*(mC-mG)^2
SSeff # = 2.057867
[1] 2.057867
(What does SSeff refer to here?)
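From the formula, SSeff looks like the between-group ("effect") sum of squares: each group mean's squared distance from the grand mean, weighted by group size. If so, it would play the role of SSreg, and SSeff + SSres should reproduce the total sum of squares. A minimal sketch to check that, where `all_values` and `SStotal` are names introduced here for illustration:

```r
tabA <- c(5.67, 5.67, 5.55, 5.57)
tabB <- c(5.75, 5.47, 5.43, 5.45)
tabC <- c(4.74, 4.45, 4.65, 4.94)

all_values <- c(tabA, tabB, tabC)
mG <- mean(all_values)                      # grand mean
mA <- mean(tabA); mB <- mean(tabB); mC <- mean(tabC)

# Within-group (residual) and between-group (effect) sums of squares
SSres <- sum((tabA - mA)^2) + sum((tabB - mB)^2) + sum((tabC - mC)^2)
SSeff <- length(tabA) * (mA - mG)^2 +
         length(tabB) * (mB - mG)^2 +
         length(tabC) * (mC - mG)^2

# Total variation of all 12 values around the grand mean
SStotal <- sum((all_values - mG)^2)

all.equal(SStotal, SSeff + SSres)           # compares the two sides
```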
## Number of groups = 3 hence df.groups = (3-1) = 2
(Looking at ?df I see:
Description:
Density, distribution function, quantile function and random
generation for the F distribution with ‘df1’ and ‘df2’ degrees of
freedom (and optional non-centrality parameter ‘ncp’).
What does df refer to here?)
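As far as I can tell, there are two different things here: "df" in the ANOVA table is an abbreviation for degrees of freedom, while `df()` in R is the density function of the F distribution (which takes degrees of freedom as arguments). A minimal sketch of the distinction, where `k` and `n_total` are names introduced here for illustration:

```r
k <- 3          # number of groups (tabA, tabB, tabC)
n_total <- 12   # total number of observations

# Degrees of freedom as they appear in the "Df" column of the ANOVA table
df.groups <- k - 1        # between groups
df.res <- n_total - k     # residuals: (4-1) + (4-1) + (4-1)

# df() is the F density; here it is evaluated at x = 1 for these
# degrees of freedom. It is unrelated to the variable names above.
df(1, df1 = df.groups, df2 = df.res)
```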
df.groups <- 2
meanSSeff <- SSeff/df.groups
meanSSeff # = 1.028933
[1] 1.028933
## df for residuals in each group = (n.group - 1):
df.res <- (nA-1) + (nB-1) + (nC-1) ## = 3 + 3 + 3 = 9
meanSSres <- SSres/df.res
meanSSres # = 0.02274444
[1] 0.02274444
## Fisher's F-ratio statistic = meanSSeff/meanSSres:
F <- meanSSeff/meanSSres
F # = 45.23889
[1] 45.23889
(Is Fisher's F-ratio the statistic used in the F-test?
http://en.wikipedia.org/wiki/F-test
)
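A minimal sketch for cross-checking the hand-computed F against the ANOVA table. It assumes the `tablets` data frame was built by stacking the three vectors (which would give the `values` and `ind` columns seen in the output above); `F_manual` and `F_table` are names introduced here for illustration:

```r
tabA <- c(5.67, 5.67, 5.55, 5.57)
tabB <- c(5.75, 5.47, 5.43, 5.45)
tabC <- c(4.74, 4.45, 4.65, 4.94)

# Assumption: tablets is the stacked form of the three vectors
tablets <- stack(list(tabA = tabA, tabB = tabB, tabC = tabC))

# F taken from the ANOVA table (first row is the "ind" effect)
tab <- anova(lm(values ~ ind, data = tablets))
F_table <- tab[["F value"]][1]

# F computed by hand, as in the steps above
mG <- mean(tablets$values)
SSres <- sum((tabA - mean(tabA))^2) + sum((tabB - mean(tabB))^2) +
         sum((tabC - mean(tabC))^2)
SSeff <- 4 * (mean(tabA) - mG)^2 + 4 * (mean(tabB) - mG)^2 +
         4 * (mean(tabC) - mG)^2
F_manual <- (SSeff / 2) / (SSres / 9)

all.equal(F_manual, F_table)   # compares the two values
```

The sketch avoids naming the result `F`, since `F` is also R's shorthand for FALSE.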
## P-value for F as test of difference between group means
## relative to within-group residuals (upper tail):
Pval <- pf(F, df.groups, df.res, lower.tail=FALSE)
Pval # = 2.015227e-05
[1] 2.015227e-05
(Is this the kind of P-value described in "The P-values for the Popular
Distributions"?
http://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/pvalues.htm
)
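A minimal sketch of what the `lower.tail=FALSE` argument does, using the F value printed above (`F_value`, `p_upper` and `p_complement` are names introduced here for illustration):

```r
F_value <- 45.23889   # F statistic from the output above
df.groups <- 2
df.res <- 9

# Upper-tail probability: P(F >= F_value) under the null hypothesis
p_upper <- pf(F_value, df.groups, df.res, lower.tail = FALSE)

# The same quantity as 1 minus the lower-tail (cumulative) probability
p_complement <- 1 - pf(F_value, df.groups, df.res)

all.equal(p_upper, p_complement)   # compares the two values
```

This upper-tail probability is what the ANOVA table reports in the Pr(>F) column.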
If I'm wrong, please correct me. TIA
B.R.
Stephen