[R] Understanding custom contrasts
peter dalgaard
pdalgd at gmail.com
Sun May 6 19:52:46 CEST 2012
On May 6, 2012, at 17:59 , Sandy Miller wrote:
> I have a question regarding customising contrasts for linear models I
> read the section in Fox/Weisberg's CAR (2nd ed.) and was
> thinking--apparently erroneously--that the following two snippets
> would do the same:
>
> # approach 1: default treatment contrasts
> # generate data
> set.seed(1111)
> y <- c(rnorm(1000, -1), rnorm(1000, 0), rnorm(1000, 1))
> x <- factor(rep(letters[1:3], each=1000))
>
> # trying to get tests of b against a and b against c
> x <- relevel(x, "b"); contrasts(x)
> tapply(y, x, mean)
> summary(lm(y~x))
>
>
> # approach 2: custom contrasts
> # generate same data
> set.seed(1111)
> y <- c(rnorm(1000, -1), rnorm(1000, 0), rnorm(1000, 1))
> x <- factor(rep(letters[1:3], each=1000))
>
> con <- matrix(c(1, -1, 0, 0, 1, -1), ncol=2); colnames(con) <- c("a vs
> b", "b vs c")
> contrasts(x) <- con
> tapply(y, x, mean)
> summary(lm(y~x))
>
> Why is the result not the same, what am I doing wrong? Any help would
> be much appreciated.
Because of linear algebra, basically... You are confusing contrast parametrization with contrast transformation (and you're not alone in falling into that trap). The matrix you set up transforms two coefficients to three means as follows
a = A1 + 0
b = -A1 + A2
c = 0 - A2
(and of course there's an intercept to add as well). Now, the contrast transformations that you want are
a - b = 2A1 - A2
b - c = 2A2 - A1
which you will surely notice are not the same as A1 and A2. Had you instead chosen this matrix
1 0
0 0
0 -1
then you will have a-b=A1 and b-c=A2.
The reason people often get this wrong is that the literature, originating in the analysis of balanced designs, tends to (over-)emphasize "orthogonal contrast" matrices, which satisfy C'C = I. In a balanced design, this leads to nice statistical properties and interpretation, but not in the general case. In a suitable sense, the orthogonal contrast matrices are their own inverses. However, for the generic case, you need to find a solution to the equation A'C=I given A or C.
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
More information about the R-help
mailing list