# [R] Understanding the intercept value in a multiple linear regression with categorical values

Joao Azevedo joao.c.azevedo at gmail.com
Fri Jul 27 14:16:10 CEST 2012

```
Hi!

I'm able to understand how the coding schemes are applied in the supplied
examples, but they only use a single explanatory variable. My problem
is with understanding the model when there are multiple categorical
explanatory variables.

--
Joao.

On Fri, Jul 27, 2012 at 1:04 PM, Jean V Adams <jvadams at usgs.gov> wrote:
> Joao,
>
> There's a very thorough explanation at
> http://www.ats.ucla.edu/stat/r/library/contrast_coding.htm
>
> Jean
>
>
> Joao Azevedo <joao.c.azevedo at gmail.com> wrote on 07/27/2012 06:32:31 AM:
>
>>
>> Hi!
>>
>> I'm failing to understand the value of the intercept in a
>> multiple linear regression with categorical values. Taking the
>> "warpbreaks" data set as an example, when I do:
>>
>> > lm(breaks ~ wool, data=warpbreaks)
>>
>> Call:
>> lm(formula = breaks ~ wool, data = warpbreaks)
>>
>> Coefficients:
>> (Intercept)        woolB
>>      31.037       -5.778
>>
>> I'm able to understand that the intercept is the mean value of
>> breaks when wool equals "A", and that adding the "woolB" coefficient
>> to the intercept gives the mean value of breaks when wool equals "B".
>> However, if I also consider the tension variable in the model, I'm
>> unable to figure out the meaning of the intercept value:
>>
>> > lm(breaks ~ wool + tension, data=warpbreaks)
>>
>> Call:
>> lm(formula = breaks ~ wool + tension, data = warpbreaks)
>>
>> Coefficients:
>> (Intercept)        woolB     tensionM     tensionH
>>      39.278       -5.778      -10.000      -14.722
>>
>> I thought it would be the mean value of breaks when either wool equals
>> "A" or tension equals "L", but that isn't true for this dataset.
>>
>> Any clues on interpreting the value of intercept?
>>
>> Thanks!
>>
>> --
>> Joao.

```
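
One way to probe the intercept question raised in this thread is sketched below. It is not taken from either poster. It assumes R's default treatment contrasts, under which the intercept of `breaks ~ wool + tension` is the model's fitted value for the reference cell (wool "A" and tension "L" together, not either one alone). Because `warpbreaks` has the same number of observations in every wool/tension cell, that fitted value should also equal mean(wool A) + mean(tension L) - overall mean. The variable names (`ref_cell`, `additive_ref`, `cell_AL`) are illustrative only.

```r
# warpbreaks ships with base R in the 'datasets' package.
data(warpbreaks)

# Additive model from the thread, using R's default treatment contrasts.
fit <- lm(breaks ~ wool + tension, data = warpbreaks)
intercept <- unname(coef(fit)["(Intercept)"])

# Reading 1: the intercept is the fitted value for the reference cell,
# i.e. the first level of each factor (wool "A" and tension "L").
ref_cell <- data.frame(
  wool    = factor("A", levels = levels(warpbreaks$wool)),
  tension = factor("L", levels = levels(warpbreaks$tension))
)
fitted_ref <- unname(predict(fit, newdata = ref_cell))

# Reading 2 (relies on the balanced design): without an interaction term,
# the fitted cell value is built from the marginal means.
mean_A       <- mean(warpbreaks$breaks[warpbreaks$wool == "A"])
mean_L       <- mean(warpbreaks$breaks[warpbreaks$tension == "L"])
mean_all     <- mean(warpbreaks$breaks)
additive_ref <- mean_A + mean_L - mean_all

# Observed mean of the A/L cell, for comparison. The additive model is not
# forced to reproduce it; a model with wool * tension would be.
cell_AL <- mean(with(warpbreaks, breaks[wool == "A" & tension == "L"]))

print(c(intercept = intercept, fitted_ref = fitted_ref,
        additive_ref = additive_ref, cell_AL = cell_AL))
```

The first three printed values should agree with the 39.278 shown in the quoted output, while the observed A/L cell mean differs from it, which is why neither "mean when wool is A" nor "mean when tension is L" matches the intercept on its own.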