[R] Interaction term in multiple regression
kfortino at email.unc.edu
Tue Jul 14 03:31:41 CEST 2009
Hello All,

Thank you for taking my question. I am looking for information on how R
handles interaction terms in a multiple regression fitted with lm(). I
first noticed something unusual when my R output did not match the
output from JMP for an identical analysis run previously. Both programs
give identical results for the overall model test, and when the models
do not contain the interaction term the output is identical. However,
the results of the partial tests for the individual terms differ
dramatically when the interaction term is included.

Here are the results from R of the test with the interaction:
> summary(lm(TD[Year==2007]~Kd[Year==2007]*area[Year==2007], data=boon_tot))

Call:
lm(formula = TD[Year == 2007] ~ Kd[Year == 2007] * area[Year ==
    2007], data = boon_tot)

Residuals:
     Min       1Q   Median       3Q      Max
-0.42696 -0.25648 -0.11960  0.03151  1.27957

Coefficients:
                                    Estimate Std. Error t value Pr(>|t|)
(Intercept)                           5.5714     1.7995   3.096   0.0148 *
Kd[Year == 2007]                      0.2867     4.0696   0.070   0.9456
area[Year == 2007]                    0.8192     0.2874   2.851   0.0215 *
Kd[Year == 2007]:area[Year == 2007]  -1.8074     0.6320  -2.860   0.0211 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.5238 on 8 degrees of freedom
Multiple R-squared: 0.6826, Adjusted R-squared: 0.5636
F-statistic: 5.736 on 3 and 8 DF, p-value: 0.02155

Here are the results from JMP for the same model:

Source     df   SS           MS           F            p
Model       3   4.72157318   1.57385773   5.73591141   0.02155127
Error       8   2.19509349   0.27438669
C. Total   11   6.91666667

Source                           Est.         Std Error    t value      p > t
Intercept                        10.4933505   1.24016642    8.46124381  0.00002911
Kd                              -11.213166    2.95096414   -3.7998315   0.00523792
area (ha)                         0.04560254  0.03069489    1.48567197  0.17567049
(Kd-0.428)*(area (ha)-6.3625)    -1.8074455   0.63195669   -2.860078    0.02114887

As you can see, although the results of the overall test and of the
interaction term are identical, the estimates and standard errors of the
other terms are very different.

Additionally, if I remove the interaction term from the model, the two
programs give identical results.

Any thoughts as to why they differ would be appreciated.

Sincerely
Ken
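
One possible reading of the two outputs, offered as a sketch rather than a confirmed description of what JMP does internally: the JMP interaction row is labelled (Kd-0.428)*(area (ha)-6.3625), which suggests the interaction was built from centered predictors, whereas the R formula Kd*area multiplies the raw variables. Expanding b3*(Kd - c1)*(area - c2) and collecting terms gives the uncentered coefficients. The short R check below uses only the numbers printed in the JMP table; the names jmp, kd_center, area_center and r_equiv are made up for illustration, and treating 0.428 and 6.3625 as the sample means of Kd and area is an assumption.

```r
# Coefficients as printed in the JMP table above
jmp <- c(intercept = 10.4933505, Kd = -11.213166,
         area = 0.04560254, interaction = -1.8074455)

# Centering constants read off the JMP interaction label
# (assumed, not confirmed, to be the means of Kd and area)
kd_center   <- 0.428
area_center <- 6.3625

# Expand b3 * (Kd - c1) * (area - c2) and collect terms to get the
# coefficients of the uncentered model TD ~ Kd * area
r_equiv <- c(
  intercept   = jmp[["intercept"]] +
                jmp[["interaction"]] * kd_center * area_center,
  Kd          = jmp[["Kd"]]   - jmp[["interaction"]] * area_center,
  area        = jmp[["area"]] - jmp[["interaction"]] * kd_center,
  interaction = jmp[["interaction"]]
)

# Compare with the Estimate column of the R summary above
print(round(r_equiv, 4))
```

The rounded values come out as 5.5714, 0.2867, 0.8192 and -1.8074, which agree with the R estimates, so under this reading the two programs fit the same model in different parameterizations. If that is right, a formula along the lines of TD ~ Kd + area + I((Kd - mean(Kd)) * (area - mean(area))), fitted to the 2007 subset, would be expected to reproduce the JMP rows; that has not been run here.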