[R] Understanding linear contrasts in Anova using R
Max Kuhn
mxkuhn at gmail.com
Thu Sep 30 20:31:16 CEST 2010
These two resources might also help:
http://cran.r-project.org/doc/contrib/Faraway-PRA.pdf
http://cran.r-project.org/web/packages/contrast/vignettes/contrast.pdf
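
For instance, a minimal sketch of the contrast package interface (the
vignette has the details; this assumes the dv and group data defined in
the thread below):

library(contrast)
fit <- lm(dv ~ group)
# estimate, t and p for the group 1 vs. group 3 comparison:
contrast(fit, list(group = "1"), list(group = "3"))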
Max
On Thu, Sep 30, 2010 at 1:33 PM, Ista Zahn <izahn at psych.rochester.edu> wrote:
> Hi Professor Howell,
> I think the issue here is simply in the assumption that the regression
> coefficients will always be equal to the product of the means and the
> contrast codes. I tend to think of a regression coefficient as the
> covariance of x and y divided by the variance of x, and this
> definition agrees with the coefficients calculated by lm().
> See below for a long-winded example.
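>
> As a minimal sketch of that definition with a toy predictor (made-up
> numbers, just to fix ideas):
>
> x <- 1:5
> y <- c(2.1, 3.9, 6.2, 8.1, 9.8)
> coef(lm(y ~ x))[2]  ## the slope from lm()
> cov(x, y)/var(x)    ## the same number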
>
> On Wed, Sep 29, 2010 at 3:42 PM, David Howell <David.Howell at uvm.edu> wrote:
>> # I am trying to understand how R fits models for contrasts in a
>> # simple one-way anova. This is an example; I am not stupid enough to
>> # want to simultaneously apply all of these contrasts to real data.
>> # With a few exceptions, the tests that I would compute by hand (or by
>> # other software) will give the same t or F statistics. It is the
>> # contrast estimates that R produces that I can't seem to understand.
>> #
>> # In searching for answers to this problem, I found a great PowerPoint
>> # slide (I think by John Fox). The slide pointed to the coefficients,
>> # said something like "these are coefficients that no one could love,"
>> # and then suggested looking at the means to understand where they came
>> # from. I have stared and stared at his means and then my means, but
>> # can't find a relationship.
>>
>> # The following code and output illustrates the problem.
>>
>> # Various examples of Anova using R
>>
>> dv <- c( 1.28,  1.35,  3.31,  3.06,  2.59,  3.25,  2.98,  1.53, -2.68,  2.64,  1.26,  1.06,  # group 1
>>         -1.18,  0.15,  1.36,  2.61,  0.66,  1.32,  0.73, -1.06,  0.24,  0.27,  0.72,  2.28,  # group 2
>>         -0.41, -1.25, -1.33, -0.47, -0.60, -1.72, -1.74, -0.77, -0.41, -1.20, -0.31, -0.74,  # group 3
>>         -0.45,  0.54, -0.98,  1.68,  2.25, -0.19, -0.90,  0.78,  0.05,  2.69,  0.15,  0.91,  # group 4
>>          2.01,  0.40,  2.34, -1.80,  5.00,  2.27,  6.47,  2.94,  0.47,  3.22,  0.01, -0.66)  # group 5
>>
>> group <- factor(rep(1:5, each = 12))
>>
>>
>> # Use treatment contrasts to compare each group to the first group.
>> options(contrasts = c("contr.treatment","contr.poly")) # The default
>> model2 <- lm(dv ~ group)
>> summary(model2)
>> # The omnibus anova table is the same under any coding--as it should be.
>> # The intercept is the Group 1 mean and the other coefficients are
>> # deviations from it. This is what I would expect.
>> # anova(model2) gives:
>> # Df Sum Sq Mean Sq F value Pr(>F)
>> # group 4 62.46 15.6151 6.9005 0.0001415 ***
>> # Residuals 55 124.46 2.2629
>> #Coefficients:
>> # Estimate Std. Error t value Pr(>|t|)
>> #(Intercept) 1.80250 0.43425 4.151 0.000116 ***
>> #group2 -1.12750 0.61412 -1.836 0.071772 .
>> #group3 -2.71500 0.61412 -4.421 4.67e-05 ***
>> #group4 -1.25833 0.61412 -2.049 0.045245 *
>> #group5 0.08667 0.61412 0.141 0.888288
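>>
>> # A quick sketch of the mapping: intercept plus deviations gives back
>> # the cell means.
>> means.by.group <- tapply(dv, group, mean)
>> rbind(means.by.group,
>>       c(coef(model2)[1], coef(model2)[1] + coef(model2)[-1]))
>> # both rows are 1.8025, 0.6750, -0.9125, 0.5442, 1.8892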
>>
>>
>> # Use sum contrasts to compare each group against grand mean.
>> options(contrasts = c("contr.sum","contr.poly"))
>> model3 <- lm(dv ~ group)
>> summary(model3)
>>
>> # Again, this is as expected. Intercept is the grand mean and the
>> # others are deviations from that.
>> #Coefficients:
>> # Estimate Std. Error t value Pr(>|t|)
>> # (Intercept) 0.7997 0.1942 4.118 0.000130 ***
>> # group1 1.0028 0.3884 2.582 0.012519 *
>> # group2 -0.1247 0.3884 -0.321 0.749449
>> # group3 -1.7122 0.3884 -4.408 4.88e-05 ***
>> # group4 -0.2555 0.3884 -0.658 0.513399
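>>
>> # Again a sketch: the estimates come straight from the group means.
>> M <- tapply(dv, group, mean)
>> mean(M)             # 0.7997, the intercept
>> (M - mean(M))[1:4]  # the four printed coefficients
>> # group 5's deviation is not printed; it is -sum((M - mean(M))[1:4])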
>>
>> #SO FAR, SO GOOD
>>
>> # IF I wanted polynomial contrasts BY HAND I would use
>> # a(j) = -2 -1 0 1 2 for the linear contrast (or some linear function of this)
>> # Effect = Sum(a(j)M(j))   # where M(j) = mean of group j
>> # Effect(linear) = -2(1.805) - 1(0.675) + 0(-.912) + 1(.544) + 2(1.889) = 0.043
>> # SS(linear) = n*Effect(linear)^2/Sum(a(j)^2) = 12(.043^2)/10 = .002
>> # F(linear) = SS(linear)/MS(error) = .002/2.263 = .001
>> # t(linear) = sqrt(.001) = .031
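>>
>> # The same hand calculation as an R sketch (n = 12 per group):
>> a <- c(-2, -1, 0, 1, 2)      # linear codes
>> M <- tapply(dv, group, mean)
>> eff <- sum(a * M)            # 0.0425
>> SS <- 12 * eff^2 / sum(a^2)  # 0.0022
>> SS / 2.2629                  # F = SS/MS(error), since the contrast has 1 df
>> sqrt(SS / 2.2629)            # t = 0.031, matching the lm() output below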
>>
>> # To do this in R I would use
>> order.group <- ordered(group)
>> model4 <- lm(dv~order.group)
>> summary(model4)
>> # This gives:
>> #Coefficients:
>> # Estimate Std. Error t value Pr(>|t|)
>> # (Intercept) 0.79967 0.19420 4.118 0.000130 ***
>> # order.group.L 0.01344 0.43425 0.031 0.975422
>> # order.group.Q 2.13519 0.43425 4.917 8.32e-06 ***
>> # order.group.C 0.11015 0.43425 0.254 0.800703
>> # order.group^4 -0.79602 0.43425 -1.833 0.072202 .
>>
>> # The t value for linear is the same as I got (as are the others), but
>> # I don't understand the estimates. The intercept is the grand mean,
>> # but I don't see the relationship of the other estimates to that or to
>> # the ones I get by hand. My estimates are the sums of (coefficients
>> # times means), i.e. 0 (intercept), .0425, 7.989, .3483, -6.66, and
>> # these are not a linear (or other nice pretty) function of the
>> # estimates from R.
>
> # OK, let's break it down
> Means <- tapply(dv, order.group, mean)
> coefs.by.hand.m4 <- c(mean(Means),
>                       sum(Means*contrasts(order.group)[,1]),
>                       sum(Means*contrasts(order.group)[,2]),
>                       sum(Means*contrasts(order.group)[,3]),
>                       sum(Means*contrasts(order.group)[,4]))
> (coefs.check.m4 <- rbind(coef(model4), coefs.by.hand.m4))
> ## OK, so these coefficients are in fact the sums of the products of the
> ## contrast codes and the group means. The only difference between this
> ## and what you got by hand calculation should be in the actual contrast
> ## codes used. You don't say what they are, but I'll assume you did the
> ## standard thing, like this:
> contrasts.by.hand <- matrix(c(-2, -1,  0,  1,  2,    # linear
>                                2, -1, -2, -1,  2,    # quadratic
>                               -1,  2,  0, -2,  1,    # cubic
>                                1, -4,  6, -4,  1),   # quartic
>                             byrow = FALSE, ncol = 4,
>                             dimnames = list(1:5, c("L", "Q", "C", "x4")))
> contrasts(order.group) <- contrasts.by.hand
>
> model5 <- lm(dv~order.group)
> summary(model5)
>
> coefs.by.hand.m5 <- c(mean(Means),
>                       sum(Means*contrasts(order.group)[,1]),
>                       sum(Means*contrasts(order.group)[,2]),
>                       sum(Means*contrasts(order.group)[,3]),
>                       sum(Means*contrasts(order.group)[,4]))
> (coefs.check.m5 <- rbind(coef(model5), coefs.by.hand.m5))
>
> ## Not the same, and not a linear function of one another
>
>
> ## OK, let's see what's going on here. Because the contrast columns are
> ## uncorrelated here (orthogonal codes, equal n), we can define the
> ## regression coefficient b_i as cov(x_i, y)/var(x_i). First check model 4:
> ## (check.names = FALSE keeps the "order.group^4" column name intact)
> mm4 <- data.frame(dv, model.matrix(model4)[, -1], check.names = FALSE)
>
> cov.L.m4 <- cov(mm4[,"dv"], mm4[,"order.group.L"])
> cov.Q.m4 <- cov(mm4[,"dv"], mm4[,"order.group.Q"])
> cov.C.m4 <- cov(mm4[,"dv"], mm4[,"order.group.C"])
> cov.x4.m4 <- cov(mm4[,"dv"], mm4[,"order.group^4"])
>
> var.L.m4 <- var(mm4[,"order.group.L"])
> var.Q.m4 <- var(mm4[,"order.group.Q"])
> var.C.m4 <- var(mm4[,"order.group.C"])
> var.x4.m4 <- var(mm4[,"order.group^4"])
>
> covs.m4 <- c(cov.L.m4, cov.Q.m4, cov.C.m4, cov.x4.m4)
> vars.m4 <- c(var.L.m4, var.Q.m4, var.C.m4, var.x4.m4)
> coefs.by.hand.m4.2 <- c(mean(dv), covs.m4/vars.m4)
>
> (coefs.check.m4.2 <- rbind(coef(model4), coefs.by.hand.m4.2))
> ## OK those are equal. Now try for model 5
>
> mm5 <- data.frame(dv, model.matrix(model5)[, -1], check.names = FALSE)
> names(mm5) <- names(mm4)
>
> cov.L.m5 <- cov(mm5[,"dv"], mm5[,"order.group.L"])
> cov.Q.m5 <- cov(mm5[,"dv"], mm5[,"order.group.Q"])
> cov.C.m5 <- cov(mm5[,"dv"], mm5[,"order.group.C"])
> cov.x4.m5 <- cov(mm5[,"dv"], mm5[,"order.group^4"])
>
> var.L.m5 <- var(mm5[,"order.group.L"])
> var.Q.m5 <- var(mm5[,"order.group.Q"])
> var.C.m5 <- var(mm5[,"order.group.C"])
> var.x4.m5 <- var(mm5[,"order.group^4"])
>
> covs.m5 <- c(cov.L.m5, cov.Q.m5, cov.C.m5, cov.x4.m5)
> vars.m5 <- c(var.L.m5, var.Q.m5, var.C.m5, var.x4.m5)
> coefs.by.hand.m5.2 <- c(mean(dv), covs.m5/vars.m5)
>
> (coefs.check.m5.2 <- rbind(coef(model5), coefs.by.hand.m5.2))
>
> ## Those are equal. OK, so we see that the coefficients are
> ## consistently equal to cov(x, y)/var(x), but only equal to
> ## sum(contrasts(x)*Means) under certain circumstances. What are those
> ## circumstances? Let's try scaling our contrasts so they have the same
> ## variance:
>
> contrasts.by.hand.scaled <- scale(contrasts.by.hand)
> contrasts(order.group) <- contrasts.by.hand.scaled
>
> model7 <- lm(dv~order.group)
> summary(model7)
>
> coefs.by.hand.m7 <- c(mean(Means),
>                       sum(Means*contrasts(order.group)[,1]),
>                       sum(Means*contrasts(order.group)[,2]),
>                       sum(Means*contrasts(order.group)[,3]),
>                       sum(Means*contrasts(order.group)[,4]))
> (coefs.check.m7 <- rbind(coef(model7), coefs.by.hand.m7))
>
> ## Not the same, but they are a linear function of one another:
> coefs.by.hand.m7 <- c(mean(Means),
>                       sum(Means*contrasts(order.group)[,1])/4,
>                       sum(Means*contrasts(order.group)[,2])/4,
>                       sum(Means*contrasts(order.group)[,3])/4,
>                       sum(Means*contrasts(order.group)[,4])/4)
> (coefs.check.m7 <- rbind(coef(model7), coefs.by.hand.m7))
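>
> ## The divisor 4 is no accident: with orthogonal codes and equal n,
> ## b_i = sum(codes_i * Means)/sum(codes_i^2), and after scale() every
> ## column's sum of squares is (5 - 1)*1 = 4. A sketch of the general rule:
> C <- contrasts(order.group)
> rbind(coef(model7)[-1],
>       colSums(C * Means) / colSums(C^2))
> ## the two rows match, and they match for the unscaled codes as well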
>
> ## Hopefully this clarifies what the coefficients calculated by the
> ## lm() function represent, i.e., cov(x, y)/var(x).
>
>>
>> # Just as a check, I forced the intercept through zero with deviation
>> # scores or with -1 in the model.
>> # First, force the intercept to 0 by using deviation scores.
>>
>> devdv <- dv-mean(dv)
>> model5 <- lm(devdv~order.group)
>> summary(model5)
>> #Same as above except intercept = 0
>>
>> # Now do it by removing the intercept
>> model6 <- lm(dv~order.group -1)
>> summary(model6)
>> # Weird because coeff = cell means
>> # Estimate Std. Error t value Pr(>|t|)
>> # order.group1 1.8025 0.4342 4.151 0.000116 ***
>> # order.group2 0.6750 0.4342 1.554 0.125824
>> # order.group3 -0.9125 0.4342 -2.101 0.040208 *
>> # order.group4 0.5442 0.4342 1.253 0.215464
>> # order.group5 1.8892 0.4342 4.350 5.94e-05 ***
>>
>> # BUT note that the row labels suggest we are no longer solving for
>> # polynomials, even though the contrast matrix is made up of polynomial
>> # coefficients.
>
> ## I think you're correct that by removing the intercept lm() is no
> ## longer using contrast coding. I've never really understood regression
> ## models without intercept terms, I'm afraid...
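>
> ## One way to see what lm() does here (a sketch): with -1 the factor is
> ## expanded into one 0/1 indicator column per level instead of contrast
> ## columns, so each coefficient is simply a cell mean.
> head(model.matrix(dv ~ order.group - 1))  ## five indicator columns
> all.equal(unname(coef(lm(dv ~ order.group - 1))),
>           as.vector(tapply(dv, order.group, mean)))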
>
>>
>> # contrasts(order.group)
>> # .L .Q .C ^4
>> #[1,] -6.324555e-01 0.5345225 -3.162278e-01 0.1195229
>> #[2,] -3.162278e-01 -0.2672612 6.324555e-01 -0.4780914
>> #[3,] -3.287978e-17 -0.5345225 1.595204e-16 0.7171372
>> #[4,] 3.162278e-01 -0.2672612 -6.324555e-01 -0.4780914
>> #[5,] 6.324555e-01 0.5345225 3.162278e-01 0.1195229
>>
>> # So, when I use polynomials (either through contr.poly or by supplying
>> # a matrix of coefficients), what do the estimates represent?
>
> I hope my examples have clarified this. They represent the increase in
> y for a one-unit increase in X. What that one-unit increase represents
> is of course a function of the contrast codes used.
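>
> A sketch of why the contr.poly codes look odd: they are the integer
> polynomial codes orthonormalized, so the columns of contr.poly(5) are
> orthogonal and each has unit length:
>
> round(crossprod(contr.poly(5)), 10)  ## the identity matrix
> ## e.g. the .L column is c(-2,-1,0,1,2)/sqrt(10), which is why your hand
> ## estimates rescale to the lm() ones: .0425/sqrt(10) = .0134 for the
> ## linear term, and -6.66/sqrt(70) = -.796 for the quartic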
>
>>
>> # My problem is not restricted to polynomials. If I try a set of
>> # orthogonal linear contrasts on the group means I get
>> # contrasts(group) <- cbind(c(1,1,0,-1,-1), c(1,-1,0,0,0),
>> #                           c(0,0,0,1,-1), c(1,1,4,-1,1))
> ## These are not orthogonal (and the fourth column is not even a
> ## contrast, since its codes don't sum to zero):
> cor(cbind(c(1,1,0,-1,-1), c(1,-1,0,0,0), c(0,0,0,1,-1), c(1,1,4,-1,1)))
> ## fixing that gives us
> contrasts(group) <- cbind(c(1,1,0,-1,-1), c(1,-1,0,0,0),
>                           c(0,0,0,1,-1), c(1,1,-4,1,1))
> model3 <- lm(dv ~ group)
> summary(model3)
>
> ## These coefficients are functions of the specified contrasts:
> coefs.by.hand.m3 <- c(mean(Means),
>                       (mean(Means[1:2]) - mean(Means[4:5]))/2,
>                       (Means[1] - Means[2])/2,
>                       (Means[4] - Means[5])/2,
>                       (mean(Means[c(1,2,4,5)]) - Means[3])/5)
> ## Each one is sum(codes * Means)/sum(codes^2), i.e. the mean contrast
> ## scaled by the sum of the squared contrast codes.
> (coef.check.m3 <- rbind(coef(model3), coefs.by.hand.m3))
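>
> ## And a fully general check (a sketch that works for any full-rank
> ## coding in a balanced one-way design): the cell means satisfy
> ## Means = cbind(1, C) %*% b, so inverting that 5 x 5 matrix recovers
> ## the coefficients; its rows are the weights lm() applies to the means.
> L <- cbind(1, contrasts(group))
> drop(solve(L) %*% Means)  ## reproduces coef(model3)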
>
> Hope it helps,
> Ista
>
>> #model3 <- lm(dv ~ group)
>> #summary(model3)
>> #Coefficients:
>> # Estimate Std. Error t value Pr(>|t|)
>> #(Intercept) 1.5335 0.2558 5.995 1.64e-07 ***
>> #group1 0.3168 0.2279 1.390 0.170185
>> #group2 0.5638 0.3071 1.836 0.071772 .
>> #group3 -1.2840 0.3369 -3.811 0.000352 ***
>> #group4 -0.6115 0.1387 -4.408 4.88e-05 ***
>> # These are not even close to what I would expect. By hand I would
>> # compute the contrasts as .0442, 1.1275, 1.3450, and 8.5608, with
>> # different t values.
>>
>> # Any help would be appreciated.
>>
>> Thanks,
>> Dave Howell
>>
>
>
>
> --
> Ista Zahn
> Graduate student
> University of Rochester
> Department of Clinical and Social Psychology
> http://yourpsyche.org
>
--
Max