[R] Trouble about the interpretation of intercept in lm models
Peter Dalgaard
P.Dalgaard at biostat.ku.dk
Tue Jan 13 18:25:57 CET 2009
Marc Schwartz wrote:
>
>> DF.fitted
>           Y A B     F.lm
> 1  21.86773 0 a 23.52957
> 2  25.91822 0 a 23.52957
> 3  20.82186 0 a 23.52957
> 4  42.97640 1 a 36.18023
> 5  36.64754 1 a 36.18023
> 6  30.89766 1 a 36.18023
> 7  47.43715 0 b 46.50615
> 8  48.69162 0 b 46.50615
> 9  47.87891 0 b 46.50615
> 10 53.47306 1 b 59.15681
> 11 62.55891 1 b 59.15681
> 12 56.94922 1 b 59.15681
> 13 61.89380 0 c 62.98442
> 14 53.92650 0 c 62.98442
> 15 70.62465 0 c 62.98442
> 16 74.77533 1 c 75.63508
> 17 74.91905 1 c 75.63508
> 18 79.71918 1 c 75.63508
>
>
> # Now get the means of the fitted values across
> # the combinations of A and B
> M <- with(DF.fitted, tapply(F.lm, list(A = A, B = B), mean))
>
>> M
>    B
> A          a        b        c
>   0 23.52957 46.50615 62.98442
>   1 36.18023 59.15681 75.63508
>
>
> Thus:
>
> # Intercept = *fitted* mean at A = 0; B = "a"
>> M["0", "a"]
> [1] 23.52957
Actually, notice that you are averaging identical values, so the "mean"
in the tapply is slightly misleading.
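
A minimal sketch of this point, using the data from the quoted table. The
model formula Y ~ A + B is an assumption (the original lm() call is not
shown in this message, though the quoted fitted values are consistent with
an additive fit), and the names DF, fit and cell.range are made up for
illustration.

```r
# Data copied from the quoted DF.fitted table.
DF <- data.frame(
  Y = c(21.86773, 25.91822, 20.82186, 42.97640, 36.64754, 30.89766,
        47.43715, 48.69162, 47.87891, 53.47306, 62.55891, 56.94922,
        61.89380, 53.92650, 70.62465, 74.77533, 74.91905, 79.71918),
  A = rep(rep(0:1, each = 3), times = 3),
  B = factor(rep(c("a", "b", "c"), each = 6))
)

# ASSUMPTION: additive model; the original formula is not given here.
fit <- lm(Y ~ A + B, data = DF)
DF$F.lm <- fitted(fit)

# Spread (max - min) of the fitted values within each A x B cell.
# It is zero up to rounding, so mean() in tapply() averages nothing:
# it just picks out the single fitted value of each cell.
cell.range <- with(DF, tapply(F.lm, list(A = A, B = B),
                              function(f) diff(range(f))))
cell.range

# The intercept can therefore be compared with any one fitted value
# from the A = 0, B = "a" cell, e.g. the first row.
all.equal(coef(fit)[["(Intercept)"]], DF$F.lm[[1]])
```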
Notice also that the intercept may be defined even when _no_
observation has all-zero entries in the non-intercept columns of the
design matrix. This is the usual case in linear regression, for
instance, but it can also happen in factorial designs (unbalanced ones,
or ones using contrasts other than treatment contrasts).
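
Both situations can be sketched with small made-up data sets. All data
and object names below (x, y, g, z, fit.reg, fit.sum) are hypothetical
and serve only as illustration.

```r
# Linear regression: x is never 0 (hypothetical toy data).
x <- c(10, 12, 14, 16, 18, 20)
y <- c(31, 35, 42, 44, 51, 55)
fit.reg <- lm(y ~ x)
X <- model.matrix(fit.reg)

any(X[, "x"] == 0)                      # FALSE: no observation at x = 0
coef(fit.reg)[["(Intercept)"]]          # still estimated ...
predict(fit.reg, newdata = data.frame(x = 0))  # ... as the line extrapolated to x = 0

# One-way layout with sum-to-zero contrasts (hypothetical toy data).
g <- factor(rep(c("a", "b", "c"), each = 2))
z <- c(1, 3, 4, 6, 8, 10)
fit.sum <- lm(z ~ g, contrasts = list(g = "contr.sum"))
Xs <- model.matrix(fit.sum)

# Is there a row whose non-intercept columns are all zero?
any(rowSums(Xs[, -1] != 0) == 0)        # FALSE: no such row

# Here the intercept is the mean of the group means, not the mean of
# any single group.
coef(fit.sum)[["(Intercept)"]]
mean(tapply(z, g, mean))
```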
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907