[R] Understanding linear contrasts in Anova using R
Max Kuhn
mxkuhn at gmail.com
Thu Sep 30 20:31:16 CEST 2010
These two resources might also help:
http://cran.r-project.org/doc/contrib/Faraway-PRA.pdf
http://cran.r-project.org/web/packages/contrast/vignettes/contrast.pdf
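
For instance, a minimal sketch of the contrast package interface (the
vignette has the details; this assumes the dv and group data defined in
the thread below):

library(contrast)
fit <- lm(dv ~ group)
# estimate, t and p for the group 1 vs. group 3 comparison:
contrast(fit, list(group = "1"), list(group = "3"))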
Max
On Thu, Sep 30, 2010 at 1:33 PM, Ista Zahn <izahn at psych.rochester.edu> wrote:
> Hi Professor Howell,
> I think the issue here is simply in the assumption that the regression
> coefficients will always be equal to the product of the means and the
> contrast codes. I tend to think of a regression coefficient as the
> covariance of x and y divided by the variance of x, and this
> definition agrees with the coefficients calculated by lm().
> See below for a long-winded example.
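>
> As a minimal sketch of that definition with a toy predictor (made-up
> numbers, just to fix ideas):
>
> x <- 1:5
> y <- c(2.1, 3.9, 6.2, 8.1, 9.8)
> coef(lm(y ~ x))[2]  ## the slope from lm()
> cov(x, y)/var(x)    ## the same number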
>
> On Wed, Sep 29, 2010 at 3:42 PM, David Howell <David.Howell at uvm.edu> wrote:
>> # I am trying to understand how R fits models for contrasts in a
>> # simple one-way anova. This is an example; I am not stupid enough to
>> # want to simultaneously apply all of these contrasts to real data.
>> # With a few exceptions, the tests that I would compute by hand (or by
>> # other software) will give the same t or F statistics. It is the
>> # contrast estimates that R produces that I can't seem to understand.
>> #
>> # In searching for answers to this problem, I found a great PowerPoint
>> # slide (I think by John Fox). The slide pointed to the coefficients,
>> # said something like "these are coefficients that no one could love,"
>> # and then suggested looking at the means to understand where they came
>> # from. I have stared and stared at his means and then my means, but
>> # can't find a relationship.
>>
>> # The following code and output illustrates the problem.
>>
>> # Various examples of Anova using R
>>
>> dv <- c( 1.28,  1.35,  3.31,  3.06,  2.59,  3.25,  2.98,  1.53, -2.68,  2.64,  1.26,  1.06,  # group 1
>>         -1.18,  0.15,  1.36,  2.61,  0.66,  1.32,  0.73, -1.06,  0.24,  0.27,  0.72,  2.28,  # group 2
>>         -0.41, -1.25, -1.33, -0.47, -0.60, -1.72, -1.74, -0.77, -0.41, -1.20, -0.31, -0.74,  # group 3
>>         -0.45,  0.54, -0.98,  1.68,  2.25, -0.19, -0.90,  0.78,  0.05,  2.69,  0.15,  0.91,  # group 4
>>          2.01,  0.40,  2.34, -1.80,  5.00,  2.27,  6.47,  2.94,  0.47,  3.22,  0.01, -0.66)  # group 5
>>
>> group <- factor(rep(1:5, each = 12))
>>
>>
>> # Use treatment contrasts to compare each group to the first group.
>> options(contrasts = c("contr.treatment","contr.poly")) # The default
>> model2 <- lm(dv ~ group)
>> summary(model2)
>> # The omnibus anova table is the same under any coding--as it should be.
>> # The intercept is the Group 1 mean and the other coefficients are
>> # deviations from it. This is what I would expect.
>> # anova(model2) gives:
>> # Df Sum Sq Mean Sq F value Pr(>F)
>> # group 4 62.46 15.6151 6.9005 0.0001415 ***
>> # Residuals 55 124.46 2.2629
>> #Coefficients:
>> # Estimate Std. Error t value Pr(>|t|)
>> #(Intercept) 1.80250 0.43425 4.151 0.000116 ***
>> #group2 -1.12750 0.61412 -1.836 0.071772 .
>> #group3 -2.71500 0.61412 -4.421 4.67e-05 ***
>> #group4 -1.25833 0.61412 -2.049 0.045245 *
>> #group5 0.08667 0.61412 0.141 0.888288
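>>
>> # A quick sketch of the mapping: intercept plus deviations gives back
>> # the cell means.
>> means.by.group <- tapply(dv, group, mean)
>> rbind(means.by.group,
>>       c(coef(model2)[1], coef(model2)[1] + coef(model2)[-1]))
>> # both rows are 1.8025, 0.6750, -0.9125, 0.5442, 1.8892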
>>
>>
>> # Use sum contrasts to compare each group against grand mean.
>> options(contrasts = c("contr.sum","contr.poly"))
>> model3 <- lm(dv ~ group)
>> summary(model3)
>>
>> # Again, this is as expected. Intercept is the grand mean and the
>> # others are deviations from that.
>> #Coefficients:
>> # Estimate Std. Error t value Pr(>|t|)
>> # (Intercept) 0.7997 0.1942 4.118 0.000130 ***
>> # group1 1.0028 0.3884 2.582 0.012519 *
>> # group2 -0.1247 0.3884 -0.321 0.749449
>> # group3 -1.7122 0.3884 -4.408 4.88e-05 ***
>> # group4 -0.2555 0.3884 -0.658 0.513399
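>>
>> # Again a sketch: the estimates come straight from the group means.
>> M <- tapply(dv, group, mean)
>> mean(M)             # 0.7997, the intercept
>> (M - mean(M))[1:4]  # the four printed coefficients
>> # group 5's deviation is not printed; it is -sum((M - mean(M))[1:4])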
>>
>> #SO FAR, SO GOOD
>>
>> # IF I wanted polynomial contrasts BY HAND I would use
>> # a(j) = -2 -1 0 1 2 for the linear contrast (or some linear function of this)
>> # Effect = Sum(a(j)M(j))   # where M(j) = mean of group j
>> # Effect(linear) = -2(1.805) - 1(0.675) + 0(-.912) + 1(.544) + 2(1.889) = 0.043
>> # SS(linear) = n*Effect(linear)^2/Sum(a(j)^2) = 12(.043^2)/10 = .002
>> # F(linear) = SS(linear)/MS(error) = .002/2.263 = .001
>> # t(linear) = sqrt(.001) = .031
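>>
>> # The same hand calculation as an R sketch (n = 12 per group):
>> a <- c(-2, -1, 0, 1, 2)      # linear codes
>> M <- tapply(dv, group, mean)
>> eff <- sum(a * M)            # 0.0425
>> SS <- 12 * eff^2 / sum(a^2)  # 0.0022
>> SS / 2.2629                  # F = SS/MS(error), since the contrast has 1 df
>> sqrt(SS / 2.2629)            # t = 0.031, matching the lm() output below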
>>
>> # To do this in R I would use
>> order.group <- ordered(group)
>> model4 <- lm(dv~order.group)
>> summary(model4)
>> # This gives:
>> #Coefficients:
>> # Estimate Std. Error t value Pr(>|t|)
>> # (Intercept) 0.79967 0.19420 4.118 0.000130 ***
>> # order.group.L 0.01344 0.43425 0.031 0.975422
>> # order.group.Q 2.13519 0.43425 4.917 8.32e-06 ***
>> # order.group.C 0.11015 0.43425 0.254 0.800703
>> # order.group^4 -0.79602 0.43425 -1.833 0.072202 .
>>
>> # The t value for linear is the same as I got (as are the others), but
>> # I don't understand the estimates. The intercept is the grand mean,
>> # but I don't see the relationship of the other estimates to that or to
>> # the ones I get by hand. My estimates are the sums of (coefficients
>> # times means), i.e. 0 (intercept), .0425, 7.989, .3483, -6.66, and
>> # these are not a linear (or other nice pretty) function of the
>> # estimates from R.
>
> # OK, let's break it down
> Means <- tapply(dv, order.group, mean)
> coefs.by.hand.m4 <- c(mean(Means),
>                       sum(Means*contrasts(order.group)[,1]),
>                       sum(Means*contrasts(order.group)[,2]),
>                       sum(Means*contrasts(order.group)[,3]),
>                       sum(Means*contrasts(order.group)[,4]))
> (coefs.check.m4 <- rbind(coef(model4), coefs.by.hand.m4))
> ## OK, so these coefficients are in fact the sums of the products of the
> ## contrast codes and the group means. The only difference between this
> ## and what you got by hand calculation should be in the actual contrast
> ## codes used. You don't say what they are, but I'll assume you did the
> ## standard thing, like this:
> contrasts.by.hand <- matrix(c(-2, -1,  0,  1,  2,    # linear
>                                2, -1, -2, -1,  2,    # quadratic
>                               -1,  2,  0, -2,  1,    # cubic
>                                1, -4,  6, -4,  1),   # quartic
>                             byrow = FALSE, ncol = 4,
>                             dimnames = list(1:5, c("L", "Q", "C", "x4")))
> contrasts(order.group) <- contrasts.by.hand
>
> model5 <- lm(dv~order.group)
> summary(model5)
>
> coefs.by.hand.m5 <- c(mean(Means),
>                       sum(Means*contrasts(order.group)[,1]),
>                       sum(Means*contrasts(order.group)[,2]),
>                       sum(Means*contrasts(order.group)[,3]),
>                       sum(Means*contrasts(order.group)[,4]))
> (coefs.check.m5 <- rbind(coef(model5), coefs.by.hand.m5))
>
> ## Not the same, and not a linear function of one another
>
>
> ## OK, let's see what's going on here. Because the contrast columns are
> ## uncorrelated here (orthogonal codes, equal n), we can define the
> ## regression coefficient b_i as cov(x_i, y)/var(x_i). First check model 4:
> ## (check.names = FALSE keeps the "order.group^4" column name intact)
> mm4 <- data.frame(dv, model.matrix(model4)[, -1], check.names = FALSE)
>
> cov.L.m4 <- cov(mm4[,"dv"], mm4[,"order.group.L"])
> cov.Q.m4 <- cov(mm4[,"dv"], mm4[,"order.group.Q"])
> cov.C.m4 <- cov(mm4[,"dv"], mm4[,"order.group.C"])
> cov.x4.m4 <- cov(mm4[,"dv"], mm4[,"order.group^4"])
>
> var.L.m4 <- var(mm4[,"order.group.L"])
> var.Q.m4 <- var(mm4[,"order.group.Q"])
> var.C.m4 <- var(mm4[,"order.group.C"])
> var.x4.m4 <- var(mm4[,"order.group^4"])
>
> covs.m4 <- c(cov.L.m4, cov.Q.m4, cov.C.m4, cov.x4.m4)
> vars.m4 <- c(var.L.m4, var.Q.m4, var.C.m4, var.x4.m4)
> coefs.by.hand.m4.2 <- c(mean(dv), covs.m4/vars.m4)
>
> (coefs.check.m4.2 <- rbind(coef(model4), coefs.by.hand.m4.2))
> ## OK those are equal. Now try for model 5
>
> mm5 <- data.frame(dv, model.matrix(model5)[, -1], check.names = FALSE)
> names(mm5) <- names(mm4)
>
> cov.L.m5 <- cov(mm5[,"dv"], mm5[,"order.group.L"])
> cov.Q.m5 <- cov(mm5[,"dv"], mm5[,"order.group.Q"])
> cov.C.m5 <- cov(mm5[,"dv"], mm5[,"order.group.C"])
> cov.x4.m5 <- cov(mm5[,"dv"], mm5[,"order.group^4"])
>
> var.L.m5 <- var(mm5[,"order.group.L"])
> var.Q.m5 <- var(mm5[,"order.group.Q"])
> var.C.m5 <- var(mm5[,"order.group.C"])
> var.x4.m5 <- var(mm5[,"order.group^4"])
>
> covs.m5 <- c(cov.L.m5, cov.Q.m5, cov.C.m5, cov.x4.m5)
> vars.m5 <- c(var.L.m5, var.Q.m5, var.C.m5, var.x4.m5)
> coefs.by.hand.m5.2 <- c(mean(dv), covs.m5/vars.m5)
>
> (coefs.check.m5.2 <- rbind(coef(model5), coefs.by.hand.m5.2))
>
> ## Those are equal. OK, so we see that the coefficients are
> ## consistently equal to cov(x, y)/var(x), but only equal to
> ## sum(contrasts(x)*Means) under certain circumstances. What are those
> ## circumstances? Let's try scaling our contrasts so they have the same
> ## variance:
>
> contrasts.by.hand.scaled <- scale(contrasts.by.hand)
> contrasts(order.group) <- contrasts.by.hand.scaled
>
> model7 <- lm(dv~order.group)
> summary(model7)
>
> coefs.by.hand.m7 <- c(mean(Means),
>                       sum(Means*contrasts(order.group)[,1]),
>                       sum(Means*contrasts(order.group)[,2]),
>                       sum(Means*contrasts(order.group)[,3]),
>                       sum(Means*contrasts(order.group)[,4]))
> (coefs.check.m7 <- rbind(coef(model7), coefs.by.hand.m7))
>
> ## Not the same, but they are a linear function of one another:
> coefs.by.hand.m7 <- c(mean(Means),
>                       sum(Means*contrasts(order.group)[,1])/4,
>                       sum(Means*contrasts(order.group)[,2])/4,
>                       sum(Means*contrasts(order.group)[,3])/4,
>                       sum(Means*contrasts(order.group)[,4])/4)
> (coefs.check.m7 <- rbind(coef(model7), coefs.by.hand.m7))
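>
> ## The divisor 4 is no accident: with orthogonal codes and equal n,
> ## b_i = sum(codes_i * Means)/sum(codes_i^2), and after scale() every
> ## column's sum of squares is (5 - 1)*1 = 4. A sketch of the general rule:
> C <- contrasts(order.group)
> rbind(coef(model7)[-1],
>       colSums(C * Means) / colSums(C^2))
> ## the two rows match, and they match for the unscaled codes as well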
>
> ## Hopefully this clarifies what the coefficients calculated by the
> ## lm() function represent, i.e., cov(x, y)/var(x).
>
>>
>> # Just as a check, I forced the intercept through zero with deviation
>> # scores or with -1 in the model.
>> # First, force the intercept to 0 by using deviation scores.
>>
>> devdv <- dv-mean(dv)
>> model5 <- lm(devdv~order.group)
>> summary(model5)
>> #Same as above except intercept = 0
>>
>> # Now do it by removing the intercept
>> model6 <- lm(dv~order.group -1)
>> summary(model6)
>> # Weird because coeff = cell means
>> # Estimate Std. Error t value Pr(>|t|)
>> # order.group1 1.8025 0.4342 4.151 0.000116 ***
>> # order.group2 0.6750 0.4342 1.554 0.125824
>> # order.group3 -0.9125 0.4342 -2.101 0.040208 *
>> # order.group4 0.5442 0.4342 1.253 0.215464
>> # order.group5 1.8892 0.4342 4.350 5.94e-05 ***
>>
>> # BUT note that the row labels suggest we are no longer solving for
>> # polynomials, even though the contrast matrix is made up of polynomial
>> # coefficients.
>
> ## I think you're correct that by removing the intercept lm() is no
> ## longer using contrast coding. I've never really understood regression
> ## models without intercept terms, I'm afraid...
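>
> ## One way to see what lm() does here (a sketch): with -1 the factor is
> ## expanded into one 0/1 indicator column per level instead of contrast
> ## columns, so each coefficient is simply a cell mean.
> head(model.matrix(dv ~ order.group - 1))  ## five indicator columns
> all.equal(unname(coef(lm(dv ~ order.group - 1))),
>           as.vector(tapply(dv, order.group, mean)))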
>
>>
>> # contrasts(order.group)
>> # .L .Q .C ^4
>> #[1,] -6.324555e-01 0.5345225 -3.162278e-01 0.1195229
>> #[2,] -3.162278e-01 -0.2672612 6.324555e-01 -0.4780914
>> #[3,] -3.287978e-17 -0.5345225 1.595204e-16 0.7171372
>> #[4,] 3.162278e-01 -0.2672612 -6.324555e-01 -0.4780914
>> #[5,] 6.324555e-01 0.5345225 3.162278e-01 0.1195229
>>
>> # So, when I use polynomials (either through contr.poly or by supplying
>> # a matrix of coefficients), what do the estimates represent?
>
> I hope my examples have clarified this. They represent the increase in
> y for a one-unit increase in X. What that one-unit increase represents
> is of course a function of the contrast codes used.
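>
> A sketch of why the contr.poly codes look odd: they are the integer
> polynomial codes orthonormalized, so the columns of contr.poly(5) are
> orthogonal and each has unit length:
>
> round(crossprod(contr.poly(5)), 10)  ## the identity matrix
> ## e.g. the .L column is c(-2,-1,0,1,2)/sqrt(10), which is why your hand
> ## estimates rescale to the lm() ones: .0425/sqrt(10) = .0134 for the
> ## linear term, and -6.66/sqrt(70) = -.796 for the quartic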
>
>>
>> # My problem is not restricted to polynomials. If I try a set of
>> # orthogonal linear contrasts on the group means I get
>> # contrasts(group) <- cbind(c(1,1,0,-1,-1), c(1,-1,0,0,0),
>> #                           c(0,0,0,1,-1), c(1,1,4,-1,1))
> ## These are not orthogonal (and the fourth column is not even a
> ## contrast, since its codes don't sum to zero):
> cor(cbind(c(1,1,0,-1,-1), c(1,-1,0,0,0), c(0,0,0,1,-1), c(1,1,4,-1,1)))
> ## fixing that gives us
> contrasts(group) <- cbind(c(1,1,0,-1,-1), c(1,-1,0,0,0),
>                           c(0,0,0,1,-1), c(1,1,-4,1,1))
> model3 <- lm(dv ~ group)
> summary(model3)
>
> ## These coefficients are functions of the specified contrasts:
> coefs.by.hand.m3 <- c(mean(Means),
>                       (mean(Means[1:2]) - mean(Means[4:5]))/2,
>                       (Means[1] - Means[2])/2,
>                       (Means[4] - Means[5])/2,
>                       (mean(Means[c(1,2,4,5)]) - Means[3])/5)
> ## Each one is sum(codes * Means)/sum(codes^2), i.e. the mean contrast
> ## scaled by the sum of the squared contrast codes.
> (coef.check.m3 <- rbind(coef(model3), coefs.by.hand.m3))
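>
> ## And a fully general check (a sketch that works for any full-rank
> ## coding in a balanced one-way design): the cell means satisfy
> ## Means = cbind(1, C) %*% b, so inverting that 5 x 5 matrix recovers
> ## the coefficients; its rows are the weights lm() applies to the means.
> L <- cbind(1, contrasts(group))
> drop(solve(L) %*% Means)  ## reproduces coef(model3)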
>
> Hope it helps,
> Ista
>
>> #model3 <- lm(dv ~ group)
>> #summary(model3)
>> #Coefficients:
>> # Estimate Std. Error t value Pr(>|t|)
>> #(Intercept) 1.5335 0.2558 5.995 1.64e-07 ***
>> #group1 0.3168 0.2279 1.390 0.170185
>> #group2 0.5638 0.3071 1.836 0.071772 .
>> #group3 -1.2840 0.3369 -3.811 0.000352 ***
>> #group4 -0.6115 0.1387 -4.408 4.88e-05 ***
>> # These are not even close to what I would expect. By hand I would
>> # compute the contrasts as .0442, 1.1275, 1.3450, and 8.5608, with
>> # different t values.
>>
>> # Any help would be appreciated.
>>
>> Thanks,
>> Dave Howell
>>
>
>
>
> --
> Ista Zahn
> Graduate student
> University of Rochester
> Department of Clinical and Social Psychology
> http://yourpsyche.org
>
--
Max