[R] Trouble about the interpretation of intercept in lm models
Marc Schwartz
marc_schwartz at comcast.net
Tue Jan 13 18:34:20 CET 2009
on 01/13/2009 11:25 AM Peter Dalgaard wrote:
> Marc Schwartz wrote:
>
>>> DF.fitted
>>           Y A B     F.lm
>> 1  21.86773 0 a 23.52957
>> 2  25.91822 0 a 23.52957
>> 3  20.82186 0 a 23.52957
>> 4  42.97640 1 a 36.18023
>> 5  36.64754 1 a 36.18023
>> 6  30.89766 1 a 36.18023
>> 7  47.43715 0 b 46.50615
>> 8  48.69162 0 b 46.50615
>> 9  47.87891 0 b 46.50615
>> 10 53.47306 1 b 59.15681
>> 11 62.55891 1 b 59.15681
>> 12 56.94922 1 b 59.15681
>> 13 61.89380 0 c 62.98442
>> 14 53.92650 0 c 62.98442
>> 15 70.62465 0 c 62.98442
>> 16 74.77533 1 c 75.63508
>> 17 74.91905 1 c 75.63508
>> 18 79.71918 1 c 75.63508
>>
>>
>> # Now get the means of the fitted values across
>> # the combinations of A and B
>> M <- with(DF.fitted, tapply(F.lm, list(A = A, B = B), mean))
>>
>>> M
>>    B
>> A          a        b        c
>>   0 23.52957 46.50615 62.98442
>>   1 36.18023 59.15681 75.63508
>>
>>
>> Thus:
>>
>> # Intercept = *fitted* mean at A = 0; B = "a"
>>> M["0", "a"]
>> [1] 23.52957
>
> Actually, notice that you are averaging identical values, so the "mean"
> in the tapply is slightly misleading.
>
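A minimal sketch of that first point, assuming only the A, B and F.lm columns are rebuilt by hand from the DF.fitted listing above (the Y column is not needed here). The name n.distinct is made up for illustration. It checks that every A x B cell holds a single distinct fitted value, so taking the first element per cell gives the same table as taking the mean:

```r
# Rebuild A, B and F.lm from the DF.fitted listing (values copied by hand)
DF.fitted <- data.frame(
  A = rep(rep(0:1, each = 3), times = 3),
  B = rep(c("a", "b", "c"), each = 6),
  F.lm = rep(c(23.52957, 36.18023, 46.50615,
               59.15681, 62.98442, 75.63508), each = 3)
)

# Number of distinct fitted values in each A x B cell
n.distinct <- with(DF.fitted,
                   tapply(F.lm, list(A = A, B = B),
                          function(x) length(unique(x))))
n.distinct

# Since each cell has one value, picking the first is enough
M <- with(DF.fitted,
          tapply(F.lm, list(A = A, B = B), function(x) x[1]))
M["0", "a"]
```

In an additive model like this one, the fitted value depends only on the (A, B) combination, which is why nothing is really being averaged.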
> Notice also that the intercept may be defined even when _no_
> observations have zero entries in the design matrix. This is the usual
> case in linear regression, for instance, but it can happen in factorial
> designs (unbalanced, or using other than treatment contrasts) as well.
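A rough sketch of that second point, using simulated data that is not from this thread (x, y, g, z, fit and fit.sum are all made-up names). The first part is a simple regression in which no observation has x equal to 0. The second part is a one-way layout with sum-to-zero contrasts, where no row of the design matrix has all non-intercept columns equal to zero:

```r
set.seed(1)

# Simple regression: x never equals 0, yet the intercept is defined
x <- 10:20
y <- 5 + 2 * x + rnorm(length(x))
fit <- lm(y ~ x)
any(x == 0)                                # FALSE
coef(fit)["(Intercept)"]                   # fitted line extrapolated to x = 0
predict(fit, newdata = data.frame(x = 0))  # same value

# One-way layout with sum-to-zero contrasts instead of treatment contrasts
g <- factor(rep(c("a", "b", "c"), each = 4))
z <- c(10, 20, 30)[g] + rnorm(length(g))
fit.sum <- lm(z ~ g, contrasts = list(g = "contr.sum"))
X <- model.matrix(fit.sum)

# No observation has zeros in all non-intercept columns
any(rowSums(X[, -1] != 0) == 0)            # FALSE

# Here the intercept is the unweighted mean of the group means
coef(fit.sum)["(Intercept)"]
mean(tapply(z, g, mean))
```

In neither fit does the intercept correspond to the fitted mean of any observed group or point; it is just a parameter of the model under the chosen coding.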
Good points on both counts, Peter.
Thanks,
Marc