[R] anova planned comparisons/contrasts
Greg Snow
Greg.Snow at intermountainmail.org
Tue Nov 27 00:26:18 CET 2007
Others have shown some approaches that work well for after you fit the
model. Here is another approach starting with the model fit itself:
tmp <- c("control", "glucose", "fructose",
"gluc+fruct", "sucrose")
treatment <- factor(rep( tmp, each=10 ), levels=tmp)
length <- c(75, 67, 70, 75, 65, 71, 67, 67, 76, 68,
57, 58, 60, 59, 62, 60, 60, 57, 59, 61,
58, 61, 56, 58, 57, 56, 61, 60, 57, 58,
58, 59, 58, 61, 57, 56, 58, 57, 57, 59,
62, 66, 65, 63, 64, 62, 65, 65, 62, 67)
cm1 <- rbind( int=1/5, cntrl = c(4, -1,-1,-1,-1)/4,
plus=c( 0, -1, -1, 3, -1 )/3,
sug1=c( 0, -1, 1, 0, 0),
sug2=c( 0, -1, -1, 0, 2)/2 )
cm2 <- zapsmall(solve(cm1))
contrasts(treatment) <- cm2[,-1]
treatment
sugars <- data.frame( length=length, treatment=treatment )
fit <- aov( length ~ treatment, data=sugars )
summary.lm(fit)
summary(fit)
summary(fit, split=list(treatment=list(
'Control vs. Sugars'=1, 'Gluc+Fruc vs. sugars'=2, 'Sugars'=3:4)))
Is this more what you were looking for?
--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Tyler Smith
> Sent: Thursday, November 22, 2007 12:38 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] anova planned comparisons/contrasts
>
> Hi,
>
> I'm trying to figure out how anova works in R by translating
> the examples in Sokal And Rohlf's (1995 3rd edition)
> Biometry. I've hit a snag with planned comparisons, their box
> 9.4 and section 9.6. It's a basic anova design:
>
> treatment <- factor(rep(c("control", "glucose", "fructose",
> "gluc+fruct", "sucrose"), each = 10))
>
> length <- c(75, 67, 70, 75, 65, 71, 67, 67, 76, 68,
> 57, 58, 60, 59, 62, 60, 60, 57, 59, 61,
> 58, 61, 56, 58, 57, 56, 61, 60, 57, 58,
> 58, 59, 58, 61, 57, 56, 58, 57, 57, 59,
> 62, 66, 65, 63, 64, 62, 65, 65, 62, 67)
>
> sugars <- data.frame(treatment, length)
>
> The basic analysis is easy enough:
>
> anova(lm(length ~ treatment, sugars))
>
> S&R proceed to calculate planned comparisons between control
> and all other groups, between gluc+fruct and the remaining
> sugars, and among the three pure sugars. I can get the first
> two comparisons using the
> contrasts() function:
>
> contrasts(sugars$treatment) <- matrix(c(-4, 1, 1, 1, 1,
> 0, -1, 3, -1, -1), 5, 2)
>
> summary(lm(length ~ treatment, sugars))
>
> The results appear to be the same, excepting that the book
> calculates an F value, whereas R produces an equivalent t
> value. However, I can't figure out how to calculate a
> contrast among three groups. For those without access to
> Sokal and Rohlf, I've written an R function that performs the
> analysis they use, copied below. Is there a better way to do
> this in R?
>
> Also, I don't know how to interpret the Estimate and Std.
> Error columns from the summary of the lm with contrasts. It's
> clear the intercept in this case is the grand mean, but what
> are the other values? They appear to be the difference
> between one of the contrast groups and the grand mean --
> wouldn't an estimate of the differences between the
> contrasted groups be more appropriate, or am I (likely)
> misinterpreting this?
>
> Thanks!
>
> Tyler
>
> MyContrast <- function(Var, Group, G1, G2, G3=NULL) {
> ## Var == data vector, Group == factor
> ## G1, G2, G3 == character vectors of factor levels to contrast
>
> nG1 = sum(Group %in% G1)
> nG2 = sum(Group %in% G2)
> GrandMean = mean(Var[Group %in% c(G1, G2, G3)])
> G1Mean = mean(Var[Group %in% G1])
> G2Mean = mean(Var[Group %in% G2])
>
> if(is.null(G3))
> MScontr = nG1 * ((G1Mean - GrandMean)^2) +
> nG2 * ((G2Mean - GrandMean)^2)
> else {
> nG3 = sum(Group %in% G3)
> G3Mean = mean(Var[Group %in% G3])
> MScontr = (nG1 * ((G1Mean - GrandMean)^2) +
> nG2 * ((G2Mean - GrandMean)^2) +
> nG3 * ((G3Mean - GrandMean)^2))/2
> }
>
> An <- anova(lm(Var ~ Group))
> MSwithin = An[3]['Residuals',]
> DegF = An$Df[length(An$Df)]
> Fval = MScontr / MSwithin
> Pval = 1 - pf(Fval, 1, DegF)
>
> return (list(MS_contrasts = MScontr, MS_within = MSwithin,
> F_value = Fval, P_value = Pval)) }
>
> ## The first two contrasts produce the same (+/- rounding
> error) ## p-values as obtained using contrasts()
> MyContrast(sugars$length, sugars$treatment, 'control',
> c("fructose", "gluc+fruct", "glucose",
> "sucrose"))
> MyContrast(sugars$length, sugars$treatment, 'gluc+fruct',
> c("fructose", "glucose", "sucrose"))
>
> ## How do you do this in standard R?
> MyContrast(sugars$length, sugars$treatment, "fructose", "glucose",
> "sucrose")
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
More information about the R-help
mailing list