[R] How are interaction terms computed in lm's result / problems with interaction terms in lm?
David Winsemius
dwinsemius at comcast.net
Sun Sep 18 21:50:44 CEST 2016
> On Sep 18, 2016, at 12:39 PM, mviljamaa <mviljamaa at kapsi.fi> wrote:
>
>> On Sep 18, 2016, at 11:01 AM, mviljamaa <mviljamaa at kapsi.fi> wrote:
>> Also if you, rather than doing what's done below, do:
>> fit3 <- lm(kidmomhsage$kid_score ~ kidmomhsage$mom_age + kidmomhsage$mom_hs + kidmomhsage$mom_age * kidmomhsage$mom_hs)
>> Then this gives the result:
>> Call:
>> lm(formula = kidmomhsage$kid_score ~ kidmomhsage$mom_age + kidmomhsage$mom_hs +
>> kidmomhsage$mom_age * kidmomhsage$mom_hs)
>> Coefficients:
>> (Intercept)                              110.542
>> kidmomhsage$mom_age                       -1.522
>> kidmomhsage$mom_hs                       -41.287
>> kidmomhsage$mom_age:kidmomhsage$mom_hs     2.391
>> Where the interaction term now seems properly interpretable. So perhaps this is the way to use interaction terms with lm.
>
> But why does
>
> fit3 <- lm(kidmomhsage$kid_score ~ kidmomhsage$mom_age * kidmomhsage$mom_hs)
>
> also give exactly the same result:
>
> Call:
> lm(formula = kidmomhsage$kid_score ~ kidmomhsage$mom_age * kidmomhsage$mom_hs)
>
> Coefficients:
> (Intercept)                              110.542
> kidmomhsage$mom_age                       -1.522
> kidmomhsage$mom_hs                       -41.287
> kidmomhsage$mom_age:kidmomhsage$mom_hs     2.391
>
> It's as if lm assumes that separate "independent" mom_age and mom_hs terms must also be present, even when only the interaction term is given. Why does it work this way?
kidmomhsage$mom_age * kidmomhsage$mom_hs
... is expanded by the formula-engine so that it is exactly:
kidmomhsage$mom_age + kidmomhsage$mom_hs + kidmomhsage$mom_age:kidmomhsage$mom_hs
(That's essentially the definition of the `*`-operator in the formula-world.)
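
The expansion can be checked directly with terms(). What follows is only a minimal sketch: the data frame `d` is simulated stand-in data (the real kidmomhsage data is not reproduced here), and the variables are passed through `data =` rather than with `$` purely to keep the formulas short.

```r
# Hypothetical stand-in data; NOT the original kidmomhsage data frame.
set.seed(1)
d <- data.frame(
  mom_age = sample(17:29, 50, replace = TRUE),
  mom_hs  = rbinom(50, 1, 0.8)
)
d$kid_score <- rnorm(50, mean = 85, sd = 20)

# Term labels after the formula has been expanded.
labels_star  <- attr(terms(kid_score ~ mom_age * mom_hs), "term.labels")
labels_long  <- attr(terms(kid_score ~ mom_age + mom_hs + mom_age:mom_hs),
                     "term.labels")
# Spelling out the main effects AND using `*` adds nothing new:
# duplicated terms are dropped during expansion.
labels_extra <- attr(terms(kid_score ~ mom_age + mom_hs + mom_age * mom_hs),
                     "term.labels")
print(labels_star)
print(identical(labels_star, labels_long))
print(identical(labels_star, labels_extra))

# The fits therefore have the same coefficients.
fit_star <- lm(kid_score ~ mom_age * mom_hs, data = d)
fit_long <- lm(kid_score ~ mom_age + mom_hs + mom_age:mom_hs, data = d)
print(all.equal(coef(fit_star), coef(fit_long)))

# A model with ONLY the interaction term has to be requested with `:`.
fit_int <- lm(kid_score ~ mom_age:mom_hs, data = d)
print(names(coef(fit_int)))
```

So if a model with only the product term (plus intercept) is really what is wanted, the `:` operator is the one to use; `*` always brings the main effects along.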
David Winsemius
Alameda, CA, USA