[R] How are interaction terms computed in lm's result / problems with interaction terms in lm?

David Winsemius dwinsemius at comcast.net
Sun Sep 18 21:50:44 CEST 2016


> On Sep 18, 2016, at 12:39 PM, mviljamaa <mviljamaa at kapsi.fi> wrote:
> 
>> On Sep 18, 2016, at 11:01 AM, mviljamaa <mviljamaa at kapsi.fi> wrote:
>> Also if you, rather than doing what's done below, do:
>> fit3 <- lm(kidmomhsage$kid_score ~ kidmomhsage$mom_age + kidmomhsage$mom_hs + kidmomhsage$mom_age * kidmomhsage$mom_hs)
>> Then this gives the result:
>> Call:
>> lm(formula = kidmomhsage$kid_score ~ kidmomhsage$mom_age + kidmomhsage$mom_hs +
>>   kidmomhsage$mom_age * kidmomhsage$mom_hs)
>> Coefficients:
>>                          (Intercept)
>>                              110.542
>>                  kidmomhsage$mom_age
>>                               -1.522
>>                   kidmomhsage$mom_hs
>>                              -41.287
>> kidmomhsage$mom_age:kidmomhsage$mom_hs
>>                                2.391
>> Where the interaction term now seems properly interpretable. So perhaps this is the way to use interaction terms with lm.
> 
> But why does
> 
> fit3 <- lm(kidmomhsage$kid_score ~ kidmomhsage$mom_age * kidmomhsage$mom_hs)
> 
> also give exactly the same result:
> 
> Call:
> lm(formula = kidmomhsage$kid_score ~ kidmomhsage$mom_age * kidmomhsage$mom_hs)
> 
> Coefficients:
>                           (Intercept)
>                               110.542
>                   kidmomhsage$mom_age
>                                -1.522
>                    kidmomhsage$mom_hs
>                               -41.287
> kidmomhsage$mom_age:kidmomhsage$mom_hs
>                                 2.391
> 
> It's as if lm assumes that separate "independent" mom_age and mom_hs terms must also be present, even when only the interaction term is given. Why does it work this way?

kidmomhsage$mom_age * kidmomhsage$mom_hs 

... is expanded by the formula-engine so that it is exactly:

 kidmomhsage$mom_age + kidmomhsage$mom_hs + kidmomhsage$mom_age:kidmomhsage$mom_hs

(That's essentially the definition of the `*`-operator in the formula-world.)
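
One way to check the expansion yourself is to ask terms() for the term labels of the formula, and to compare the fitted coefficients of the different spellings. The sketch below is only an illustration: the data frame `d` is simulated stand-in data (the kidmomhsage data are not reproduced here), and its column names and the object names fit_star, fit_long, fit_dup and fit_only are made up for the example.

```r
# Simulated stand-in data (an assumption; NOT the kidmomhsage data from the thread).
set.seed(1)
d <- data.frame(
  mom_age = round(runif(50, 18, 30)),
  mom_hs  = rbinom(50, 1, 0.7)
)
d$kid_score <- 80 + 0.5 * d$mom_age + 5 * d$mom_hs + rnorm(50, sd = 10)

# terms() shows what the formula machinery makes of `*`:
# two main effects plus their interaction.
labels_star <- attr(terms(kid_score ~ mom_age * mom_hs), "term.labels")
print(labels_star)

# Three spellings of the same model. Duplicated terms are dropped,
# so the redundant form yields the same set of terms.
fit_star <- lm(kid_score ~ mom_age * mom_hs, data = d)
fit_long <- lm(kid_score ~ mom_age + mom_hs + mom_age:mom_hs, data = d)
fit_dup  <- lm(kid_score ~ mom_age + mom_hs + mom_age * mom_hs, data = d)

stopifnot(isTRUE(all.equal(coef(fit_star), coef(fit_long))))
stopifnot(isTRUE(all.equal(coef(fit_star), coef(fit_dup))))

# To fit ONLY the interaction (no main effects), use `:` on its own.
fit_only <- lm(kid_score ~ mom_age:mom_hs, data = d)
print(names(coef(fit_only)))
```

Passing the data frame through the data= argument, rather than writing kidmomhsage$... in every term, also gives shorter coefficient names in the printed output.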



David Winsemius
Alameda, CA, USA


