[R] Interaction factor and numeric variable versus separate regressions

Sven Garbade Sven.Garbade at med.uni-heidelberg.de
Tue Aug 7 16:58:44 CEST 2007


Dear list members,

I am having trouble interpreting the coefficients from an lm model
involving the interaction of a numeric and a factor variable, compared to
separate lm models for each level of the factor variable.

## data:
y1 <- rnorm(20) + 6.8
y2 <- rnorm(20) + (1:20*1.7 + 1)
y3 <- rnorm(20) + (1:20*6.7 + 3.7)
y <- c(y1,y2,y3)
x <- rep(1:20,3)
f <- gl(3,20, labels=paste("lev", 1:3, sep=""))
d <- data.frame(x=x,y=y, f=f)

## plot
# library(lattice); xyplot(y~x|f, data=d)

## lm model with interaction
summary(lm(y~x:f, data=d))

Call:
lm(formula = y ~ x:f, data = d)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.8109 -0.8302  0.2542  0.6737  3.5383 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  3.68799    0.41045   8.985 1.91e-12 ***
x:flev1      0.20885    0.04145   5.039 5.21e-06 ***
x:flev2      1.49670    0.04145  36.109  < 2e-16 ***
x:flev3      6.70815    0.04145 161.838  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 1.53 on 56 degrees of freedom
Multiple R-Squared: 0.9984,	Adjusted R-squared: 0.9984 
F-statistic: 1.191e+04 on 3 and 56 DF,  p-value: < 2.2e-16 

## separate lm fits
## (argument named "sub" so it does not shadow the column x)
lapply(by(d, d$f, function(sub) lm(y ~ x, data=sub)), coef)
$lev1
(Intercept)           x 
 6.77022860 -0.01667528 

$lev2
(Intercept)           x 
   1.019078    1.691982 

$lev3
(Intercept)           x 
   3.274656    6.738396 


Can anybody give me a hint as to why the slope coefficients (especially
for lev1) are so different, and how the coefficients from the lm model
with interaction relate to the separate fits?
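
For comparison, here is a minimal sketch of how the two approaches might be lined up. The formula y ~ x:f estimates a single intercept shared by all three levels plus one slope per level, whereas each separate fit has its own intercept. A formula such as y ~ 0 + f + x:f gives every level its own intercept and slope and should reproduce the separate-fit coefficients. The seed and the names fit_common, fit_full, sep and full are introduced only for this illustration; the numbers will differ from the output shown above because the data are redrawn.

```r
## Sketch only: regenerate data of the same form as in the post.
set.seed(1)  # assumption: seed added so the sketch is reproducible
x <- rep(1:20, 3)
f <- gl(3, 20, labels = paste("lev", 1:3, sep = ""))
y <- c(rnorm(20) + 6.8,
       rnorm(20) + (1:20 * 1.7 + 1),
       rnorm(20) + (1:20 * 6.7 + 3.7))
d <- data.frame(x = x, y = y, f = f)

## One common intercept, one slope per level (the model from the post).
fit_common <- lm(y ~ x:f, data = d)

## One intercept AND one slope per level.
fit_full <- lm(y ~ 0 + f + x:f, data = d)

## Separate fits: 2 x 3 matrix, rows = (Intercept), x; columns = levels.
sep <- sapply(split(d, d$f), function(sub) coef(lm(y ~ x, data = sub)))

## coef(fit_full) holds the three intercepts first, then the three slopes;
## reshape it to the same 2 x 3 layout for a direct comparison.
full <- matrix(coef(fit_full), nrow = 2, byrow = TRUE,
               dimnames = dimnames(sep))

print(coef(fit_common))
print(full)
print(sep)
print(all.equal(full, sep))
```

If this reading is right, the slopes from y ~ x:f differ from the separate fits because all three lines are forced through the same intercept, which matters most for lev1, whose true intercept (6.8) is far from the pooled one.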

Thanks, Sven


