[R] Interaction factor and numeric variable versus separate regressions
Sven Garbade
Sven.Garbade at med.uni-heidelberg.de
Tue Aug 7 16:58:44 CEST 2007
Dear list members,
I have trouble interpreting the coefficients from an lm model involving
the interaction of a numeric and a factor variable, compared with separate
lm models fitted to each level of the factor.
## data:
y1 <- rnorm(20) + 6.8
y2 <- rnorm(20) + (1:20*1.7 + 1)
y3 <- rnorm(20) + (1:20*6.7 + 3.7)
y <- c(y1,y2,y3)
x <- rep(1:20,3)
f <- gl(3,20, labels=paste("lev", 1:3, sep=""))
d <- data.frame(x=x,y=y, f=f)
## plot
# library(lattice); xyplot(y ~ x | f, data = d)
## lm model with interaction
summary(lm(y~x:f, data=d))
Call:
lm(formula = y ~ x:f, data = d)
Residuals:
    Min      1Q  Median      3Q     Max
-2.8109 -0.8302  0.2542  0.6737  3.5383

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  3.68799    0.41045   8.985 1.91e-12 ***
x:flev1      0.20885    0.04145   5.039 5.21e-06 ***
x:flev2      1.49670    0.04145  36.109  < 2e-16 ***
x:flev3      6.70815    0.04145 161.838  < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.53 on 56 degrees of freedom
Multiple R-Squared: 0.9984, Adjusted R-squared: 0.9984
F-statistic: 1.191e+04 on 3 and 56 DF, p-value: < 2.2e-16
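For reference, the formula y ~ x:f can be written out as follows. This is a sketch of what the formula expands to; the symbols (beta_0 for the intercept, beta_k for the slope in level k, f(i) for the level of observation i) are notation introduced here and do not appear in the output above. The fit has one shared intercept and three slopes, which matches the four coefficients and the 56 residual degrees of freedom (60 observations minus 4 parameters).

```latex
% Model fitted by lm(y ~ x:f): a single intercept, one slope per level
y_i = \beta_0 + \beta_{f(i)}\, x_i + \varepsilon_i,
\qquad f(i) \in \{\mathrm{lev1}, \mathrm{lev2}, \mathrm{lev3}\}
```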
## separate lm fits
## 'sub' is the subset of d for one level of f
lapply(by(d, d$f, function(sub) lm(y ~ x, data=sub)), coef)
$lev1
(Intercept)           x
 6.77022860 -0.01667528

$lev2
(Intercept)           x
   1.019078    1.691982

$lev3
(Intercept)           x
   3.274656    6.738396
Can anybody give me a hint as to why the slope coefficients (especially
for lev1) are so different, and how the coefficients from the lm model
with interaction relate to those from the separate fits?
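One way to check this, offered as a sketch and not as a definitive answer: the formula y ~ x:f estimates a single intercept shared by all three levels, whereas each separate fit has its own intercept. If that is the source of the difference, a model with one intercept and one slope per level, written here as y ~ f/x - 1, should reproduce the separate fits. The seed and the names nested and separate are assumptions introduced for this example, so the numbers will not match the output shown above.

```r
## Simulated data in the same layout as above; the seed is an assumption
## added only so that this sketch is repeatable.
set.seed(1)
d <- data.frame(
  x = rep(1:20, 3),
  y = c(rnorm(20) + 6.8,
        rnorm(20) + (1:20 * 1.7 + 1),
        rnorm(20) + (1:20 * 6.7 + 3.7)),
  f = gl(3, 20, labels = paste("lev", 1:3, sep = ""))
)

## One intercept and one slope per level of f.
## Coefficient order: flev1, flev2, flev3, flev1:x, flev2:x, flev3:x
nested <- coef(lm(y ~ f/x - 1, data = d))

## Separate fits, one per level; the result is a 2 x 3 matrix with
## rows "(Intercept)" and "x" and one column per level
separate <- sapply(split(d, d$f), function(sub) coef(lm(y ~ x, data = sub)))

## Side-by-side comparison
rbind(intercept = nested[1:3], slope = nested[4:6])
separate
```

With per-level intercepts the point estimates agree with the separate regressions. The standard errors can still differ, because the single model pools the residual variance across all three levels.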
Thanks, Sven