[R] explanation of lm's coefficients
Justin Fay
jfay at genetics.wustl.edu
Sun Aug 24 00:44:19 CEST 2003
I don't understand the coefficients returned by the lm function. I
expected these to be the mean values for each level of the factor in the
model. Given this data and model:
data <- c(rnorm(10, mean = 0, sd = 1),
          rnorm(10, mean = 1, sd = 1),
          rnorm(10, mean = -0.5, sd = 1))
ftr <- as.factor(rep(1:3, each = 10))
fit <- lm(data ~ ftr)
the mean values of the three factor levels from the data:
c(mean(data[1:10]), mean(data[11:20]), mean(data[21:30]))
[1] -0.3589049 0.6034931 -0.7256897
are not the same as the coefficients returned from fit:
fit$coef
(Intercept)        ftr2        ftr3
 -0.3589049   0.9623980  -0.3667847
ftr2 and ftr3 are offset by the value of the intercept.
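To make the offset concrete, here is a minimal sketch of the check: adding the intercept back to ftr2 and ftr3 should reproduce the level means. The set.seed(1) call and the names group_means and from_coef are additions for illustration and are not part of the session shown here, so the numbers it produces will differ from those in this post.

```r
# Reproducible variant of the example; set.seed() is an assumption added here
set.seed(1)
data <- c(rnorm(10, mean = 0, sd = 1),
          rnorm(10, mean = 1, sd = 1),
          rnorm(10, mean = -0.5, sd = 1))
ftr <- as.factor(rep(1:3, each = 10))
fit <- lm(data ~ ftr)

# Means of the three levels, computed directly from the data
group_means <- tapply(data, ftr, mean)

# The intercept alone, then the intercept plus each offset
b <- coef(fit)
from_coef <- c(b[["(Intercept)"]],
               b[["(Intercept)"]] + b[["ftr2"]],
               b[["(Intercept)"]] + b[["ftr3"]])

# Compare the two up to floating-point tolerance
print(all.equal(as.numeric(group_means), from_coef))
```

With the numbers above the same arithmetic holds: -0.3589049 + 0.9623980 = 0.6034931, the mean of the second level.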
The fitted values are as I expected:
fit$fitted.values
         1          2          3          4          5          6          7
-0.3589049 -0.3589049 -0.3589049 -0.3589049 -0.3589049 -0.3589049 -0.3589049
         8          9         10         11         12         13         14
-0.3589049 -0.3589049 -0.3589049  0.6034931  0.6034931  0.6034931  0.6034931
        15         16         17         18         19         20         21
 0.6034931  0.6034931  0.6034931  0.6034931  0.6034931  0.6034931 -0.7256897
        22         23         24         25         26         27         28
-0.7256897 -0.7256897 -0.7256897 -0.7256897 -0.7256897 -0.7256897 -0.7256897
        29         30
-0.7256897 -0.7256897
My goal is to get the mean values of the factor levels. Although this is
easily done, I don't understand why ftr2 and ftr3 are offset by the value
of the intercept. Any explanations would be appreciated.
Justin
________________________________________
Justin Fay
Assistant Professor of Genetics
Washington University School of Medicine
4566 Scott Ave, St. Louis, MO 63110
PH: 314.747.1808 Fax: 314.362.7855