[R] how to estimate treatment-interaction contrasts

Sat Jul 14 12:37:06 CEST 2007

szhan at uoguelph.ca wrote:
> Hello, Chuck,
> Thank you very much for your help! But the contrasts I want to do  
> simutaneously is
>   contrasts(B)
>     [,1] [,2] [,3] [,4]
> b1   -4   -3   -2   -1
> b2    1   -3   -2   -1
> b3    1    2   -2   -1
> b4    1    2    3   -1
> b5    1    2    3    4
> 
> Could you please show me how to calculate estimates for ALL  
> intearaction constrasts using THESE contrasts? Say C2: c(-3, -3, 2, 2,  
> 2) as an example. I used the ortholognal constrasts as you suggest,  
> estimate for interaction contrast C2 is still -24.1.
> Joshua

Joshua:
  I use a variation on my contrasts to show how you might get the
estimates.  I then confirm the estimates by calculating the differences
of differences in means "by hand".

score <- c(7.2,6.5,6.9,6.4,6.9,6.1,6.9,5.3,7.2,5.7,5.1,5.9,7.6,6.9,6.8,
           7.2,6.6,6.9,6.4,6.0,6.0,6.9,6.9,6.4,7.5,7.7,7.0,8.6,8.8,8.3)

A <- gl(2, 15, labels=c("a1", "a2"))
B <- rep(gl(5, 3, labels=c("b1", "b2", "b3", "b4", "b5")), 2)

contrasts(B) <- matrix(c(3/5,  1/2,  0,  0,
                         3/5, -1/2,  0,  0,
                        -2/5,  0,  2/3,  0,
                        -2/5,  0, -1/3,  1/2,
                        -2/5,  0, -1/3, -1/2), ncol=4, byrow=TRUE)

fit1 <- aov(score ~ A*B)

summary(fit1, split=list(B=1:4), expand.split = TRUE)
            Df Sum Sq Mean Sq F value    Pr(>F)
A            1 3.2013  3.2013 15.1483 0.0009054 ***
B            4 8.7780  2.1945 10.3841 0.0001019 ***
  B: C1      1 1.0427  1.0427  4.9340 0.0380408 *
  B: C2      1 1.0208  1.0208  4.8304 0.0399049 *
  B: C3      1 1.2469  1.2469  5.9004 0.0246876 *
  B: C4      1 5.4675  5.4675 25.8715 5.637e-05 ***
A:B          4 5.3420  1.3355  6.3194 0.0018616 **
  A:B: C1    1 3.2267  3.2267 15.2684 0.0008734 ***
  A:B: C2    1 0.1008  0.1008  0.4771 0.4976647
  A:B: C3    1 1.9136  1.9136  9.0549 0.0069317 **
  A:B: C4    1 0.1008  0.1008  0.4771 0.4976647
Residuals   20 4.2267  0.2113
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Estimates with tests that match those above

summary(lm(score ~ A*B))

Call:
lm(formula = score ~ A * B)

Residuals:
     Min       1Q   Median       3Q      Max
-1.16667 -0.29167  0.03333  0.29167  0.73333

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   6.4933     0.1187  54.705  < 2e-16 ***
Aa2           0.6533     0.1679   3.892 0.000905 ***
B1            0.2889     0.2423   1.192 0.247086
B2            0.4000     0.3754   1.066 0.299271
B3            0.1333     0.3251   0.410 0.686038
B4           -1.5333     0.3754  -4.085 0.000577 ***
Aa2:B1       -1.3389     0.3426  -3.907 0.000873 ***
Aa2:B2        0.3667     0.5308   0.691 0.497665
Aa2:B3       -1.3833     0.4597  -3.009 0.006932 **
Aa2:B4        0.3667     0.5308   0.691 0.497665
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4597 on 20 degrees of freedom
Multiple R-Squared: 0.8038,     Adjusted R-squared: 0.7156
F-statistic: 9.107 on 9 and 20 DF,  p-value: 2.324e-05

# Confirm that estimates match differences of differences in means

diff(rowMeans(tapply(score, list(A,B), mean)[,1:2]) -
     rowMeans(tapply(score, list(A,B), mean)[,3:5]))
       a2
-1.338889

diff(tapply(score, list(A,B), mean)[,1] -
     tapply(score, list(A,B), mean)[,2])
       a2
0.3666667

diff(tapply(score, list(A,B), mean)[,3] -
     rowMeans(tapply(score, list(A,B), mean)[,4:5]))
       a2
-1.383333

diff(tapply(score, list(A,B), mean)[,4] -
     tapply(score, list(A,B), mean)[,5])
       a2
0.3666667

  So, applying the same strategy to your contrasts would give the following:

contrasts(B) <- matrix(c(-4/5, -3/5, -2/5, -1/5,
                          1/5, -3/5, -2/5, -1/5,
                          1/5,  2/5, -2/5, -1/5,
                          1/5,  2/5,  3/5, -1/5,
                          1/5,  2/5,  3/5,  4/5), ncol=4, byrow=TRUE)

fit1 <- aov(score ~ A*B)

summary(fit1, split=list(B=1:4), expand.split = TRUE)
            Df Sum Sq Mean Sq F value    Pr(>F)
A            1 3.2013  3.2013 15.1483 0.0009054 ***
B            4 8.7780  2.1945 10.3841 0.0001019 ***
  B: C1      1 0.0301  0.0301  0.1424 0.7099296
  B: C2      1 2.0335  2.0335  9.6221 0.0056199 **
  B: C3      1 1.2469  1.2469  5.9004 0.0246876 *
  B: C4      1 5.4675  5.4675 25.8715 5.637e-05 ***
A:B          4 5.3420  1.3355  6.3194 0.0018616 **
  A:B: C1    1 0.7207  0.7207  3.4105 0.0796342 .
  A:B: C2    1 2.6068  2.6068 12.3350 0.0021927 **
  A:B: C3    1 1.9136  1.9136  9.0549 0.0069317 **
  A:B: C4    1 0.1008  0.1008  0.4771 0.4976647
Residuals   20 4.2267  0.2113
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

summary(lm(score ~ A*B))

Call:
lm(formula = score ~ A * B)

Residuals:
     Min       1Q   Median       3Q      Max
-1.16667 -0.29167  0.03333  0.29167  0.73333

Coefficients:
              Estimate Std. Error   t value Pr(>|t|)
(Intercept)  6.493e+00  1.187e-01    54.705  < 2e-16 ***
Aa2          6.533e-01  1.679e-01     3.892 0.000905 ***
B1          -4.000e-01  3.754e-01    -1.066 0.299271
B2          -3.815e-16  3.754e-01 -1.02e-15 1.000000
B3          -9.000e-01  3.754e-01    -2.398 0.026373 *
B4           1.533e+00  3.754e-01     4.085 0.000577 ***
Aa2:B1      -3.667e-01  5.308e-01    -0.691 0.497665
Aa2:B2       6.000e-01  5.308e-01     1.130 0.271719
Aa2:B3       1.567e+00  5.308e-01     2.951 0.007893 **
Aa2:B4      -3.667e-01  5.308e-01    -0.691 0.497665
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4597 on 20 degrees of freedom
Multiple R-Squared: 0.8038,     Adjusted R-squared: 0.7156
F-statistic: 9.107 on 9 and 20 DF,  p-value: 2.324e-05

  Because your contrasts are correlated, the order in which they are
entered makes a difference.  Thus, the two summaries only agree for the
last interaction contrast (Aa2:B4).
  Perhaps more intuitively, you also can see that the tests are
sensitive to order for your contrasts as follows (no output shown):

# Second Contrast First

anova(lm(score ~
         A * I(B %in% c("b1","b2")) +
                 A * I(B %in% "b1") +
    A * I(B %in% c("b1","b2","b3")) +
 A * I(B %in% c("b1","b2","b3","b4"))), test="F")

# Second Contrast Last

anova(lm(score ~ A * I(B %in% "b1") +
    A * I(B %in% c("b1","b2","b3")) +
A * I(B %in% c("b1","b2","b3","b4")) +
          A * I(B %in% c("b1","b2"))), test="F")

  So if you really do want to evaluate your correlated contrasts
simultaneously, what makes the most sense to me is:

summary(lm(score ~ A*B))

  I would be interested in other people's thoughts on this, particularly
how to use something like estimable() in the gmodels package to achieve
these contrasts.

hope this helps,

Chuck

> Quoting Chuck Cleland <ccleland at optonline.net>:
> 
>> szhan at uoguelph.ca wrote:
>>> Hello, R experts,
>>> Sorry for asking this question again again since I really want a help!
>>>
>>> I have a two-factor experiment data and like to calculate estimates of
>>> interation contrasts say factor A has levels of a1, a2, and B has
>>> levels of b1, b2, b3, b4, and b5 with 3 replicates. I am not sure the
>>> constrast estimate I got is right using the script below:
>>>
>>> score<-c(7.2,6.5,6.9,6.4,6.9,6.1,6.9,5.3,7.2,5.7,5.1,5.9,7.6,6.9,6.8,
>>> 7.2,6.6,6.9,6.4,6.0,6.0,6.9,6.9,6.4,7.5,7.7,7.0,8.6,8.8,8.3)
>>>
>>> A <- gl(2, 15, labels=c("a1", "a2"))
>>> B <- rep(gl(5, 3, labels=c("b1", "b2", "b3", "b4", "b5")), 2)
>>>
>>> contrasts(B)<-cbind(c(-4,rep(1,4)),c(rep(-3,2),rep(2,3)),
>>> +  c(rep(-2,3),rep(3,2)),c(rep(-1,4), rep(4,1)))
>>> fit1 <- aov(score ~ A*B)
>>> summary(fit1, split=list(B=1:4), expand.split = TRUE)
>>>                Df Sum Sq Mean Sq F value    Pr(>F)
>>> A            1 3.2013  3.2013 15.1483 0.0009054 ***
>>> B            4 8.7780  2.1945 10.3841 0.0001019 ***
>>>      B: C1      1 0.0301  0.0301  0.1424 0.7099296
>>>      B: C2      1 2.0335  2.0335  9.6221 0.0056199 **
>>>      B: C3      1 1.2469  1.2469  5.9004 0.0246876 *
>>>      B: C4      1 5.4675  5.4675 25.8715 5.637e-05 ***
>>> A:B          4 5.3420  1.3355  6.3194 0.0018616 **
>>>      A:B: C1    1 0.7207  0.7207  3.4105 0.0796342 .
>>>      A:B: C2    1 2.6068  2.6068 12.3350 0.0021927 **
>>>      A:B: C3    1 1.9136  1.9136  9.0549 0.0069317 **
>>>      A:B: C4    1 0.1008  0.1008  0.4771 0.4976647
>>> Residuals   20 4.2267  0.2113
>>> ---
>>> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>>
>>> Now I like to get interaction contrast estimate for b1 and b2 vs   
>>> b3, b4 and b5
>>> cont <- c(1, -1)[A] * c(-3, -3, 2, 2, 2)[B]
>>>
>>> estimat<-sum(cont*score) # value of the contrast estimate for A:B C2
>>>
>>>> estimat
>>> [1] -24.1
>>>
>>> I am not sure the estimate for A:B C2 contrast  (-24.1) is correct
>>> because the F value given the output above(12.3350) does not equal to
>>> those I calculate below (15.2684):
>>>
>>> t.stat <- sum(cont*score)/se.contrast(fit1, as.matrix(cont))
>>>> t.stat^2
>>> Contrast 1
>>>      15.2684
>>>
>>> Could you please help me calculate the correct the estimate of
>>> interaction contrast and corresponding F value?
>>> Thanks in advance!
>>> Joshua
>>   If the contrasts for B are orthogonal, then you get the result you
>> expected:
>>
>> score <- c(7.2,6.5,6.9,6.4,6.9,6.1,6.9,5.3,7.2,5.7,5.1,5.9,7.6,6.9,6.8,
>>            7.2,6.6,6.9,6.4,6.0,6.0,6.9,6.9,6.4,7.5,7.7,7.0,8.6,8.8,8.3)
>>
>> A <- gl(2, 15, labels=c("a1", "a2"))
>> B <- rep(gl(5, 3, labels=c("b1", "b2", "b3", "b4", "b5")), 2)
>>
>> contrasts(B) <- matrix(c(3, -1,  0,  0,
>>                          3,  1,  0,  0,
>>                         -2,  0,  2,  0,
>>                         -2,  0, -1,  1,
>>                         -2,  0, -1, -1), ncol=4, byrow=TRUE)
>>
>> fit1 <- aov(score ~ A*B)
>>
>> summary(fit1, split=list(B=1:4), expand.split = TRUE)
>>
>>             Df Sum Sq Mean Sq F value    Pr(>F)
>> A            1 3.2013  3.2013 15.1483 0.0009054 ***
>> B            4 8.7780  2.1945 10.3841 0.0001019 ***
>>   B: C1      1 1.0427  1.0427  4.9340 0.0380408 *
>>   B: C2      1 1.0208  1.0208  4.8304 0.0399049 *
>>   B: C3      1 1.2469  1.2469  5.9004 0.0246876 *
>>   B: C4      1 5.4675  5.4675 25.8715 5.637e-05 ***
>> A:B          4 5.3420  1.3355  6.3194 0.0018616 **
>>   A:B: C1    1 3.2267  3.2267 15.2684 0.0008734 ***
>>   A:B: C2    1 0.1008  0.1008  0.4771 0.4976647
>>   A:B: C3    1 1.9136  1.9136  9.0549 0.0069317 **
>>   A:B: C4    1 0.1008  0.1008  0.4771 0.4976647
>> Residuals   20 4.2267  0.2113
>> ---
>> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>
>>   Note that I put the contrast of interest for B in the first column of
>> the contrast matrix.
>>
>> hope this helps,
>>
>> Chuck
>>
>>> ______________________________________________
>>> R-help at stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>> --
>> Chuck Cleland, Ph.D.
>> NDRI, Inc.
>> 71 West 23rd Street, 8th floor
>> New York, NY 10010
>> tel: (212) 845-4495 (Tu, Th)
>> tel: (732) 512-0171 (M, W, F)
>> fax: (917) 438-0894
>>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894