[R] Trouble about the interpretation of intercept in lm models
Stefano Leonardi
stefano.leonardi at unipr.it
Tue Jan 13 16:32:13 CET 2009
Hallo,
yesterday I was puzzled when I discovered that I
probabliy miss something in the interepretation of intercept
in two-way lm models.
I thought that the intercept, using the default contr.treatment
contrasts, represents the mean of the group of observations
having zero in all column of the model.matrix.
It turns out not to be case
To be more more clear I am attaching a short example:
R>#this is to generate the dataset
R>set.seed(1) #to have the same results I have
R>B <-rep(c("a","b","c"),c(6,6,6))
R>nB <- 1:3
R>names(nB)<-c("a","b","c")
R>B <- as.factor(B)
R>A<-rep(rep(c("0","1"),c(3,3)),3)
R>A <- as.factor(A)
R>Y <- 5+ 10*as.numeric(as.character(A)) + 20*nB[as.character(B)]
+ rnorm(18,0,5)
R>#this is the generated dataset
R>data.frame(Y,A,B)
Y A B
1 21.86773 0 a
2 25.91822 0 a
3 20.82186 0 a
4 42.97640 1 a
5 36.64754 1 a
6 30.89766 1 a
7 47.43715 0 b
8 48.69162 0 b
9 47.87891 0 b
10 53.47306 1 b
11 62.55891 1 b
12 56.94922 1 b
13 61.89380 0 c
14 53.92650 0 c
15 70.62465 0 c
16 74.77533 1 c
17 74.91905 1 c
18 79.71918 1 c
R>#table with means
R>M<-tapply(Y,list(A,B),mean)
R>print(M)
a b c
0 22.86927 48.00256 62.14832
1 36.84053 57.66039 76.47119
R>lm(Y ~ A + B)
Call:
lm(formula = Y ~ A + B)
Coefficients:
(Intercept) A1 Bb Bc
23.53 12.65 22.98 39.45
I was expecting that the intercept in the lm output would be equal
to the top left corner (0-a) of my previous M table (22.86927)
which is the mean of the first three observations in the
dataset.
So how do I interpret the intercept 23.53?
I could not relate it to any other mean.
R>mean(Y[A=="0" & B=="a"])
[1] 22.86927
R>apply(M,1,mean)
0 1
44.34005 56.99070
R>apply(M,2,mean)
a b c
29.85490 52.83148 69.30975
The following of course gave me the same results as the lm call
R>X<- model.matrix(~A+B)
R>b <- solve(t(X) %*% X) %*% t(X) %*% Y
R>b
[,1]
(Intercept) 23.52957
A1 12.65066
Bb 22.97658
Bc 39.45485
Thank you in advance for any suggestion.
Stefano
--
======================================================================
Stefano Leonardi
Dipartimento di Scienze Ambientali
Universita` di Parma stefano.leonardi at unipr.it
Viale Usberti 11a Phone : +39-0521-905659
43100 PARMA (Italy) Fax : +39-0521-905402
More information about the R-help
mailing list