# [R] Intercept in Model Matrix (Parameters not what I expected)

Bert Gunter bgunter.4567 at gmail.com
Mon Aug 22 17:14:24 CEST 2016

```
Justin:

This sort of statistical question is typically off topic here. Briefly,
you do seem confused about [...], and so I may be of little help. However....

Note that in your little 8-run example design, the response lives in 8
dimensions, and so your model matrix can have at most 8 independent columns.
~(A+B) has only 4, which, using contr.treatment contrasts, could be
Intercept, A2, B2, B3 (since (B3+B4) - (B2+B1) is confounded with (A2 -
A1), where these are "dummy" encodings of 0 and 1). Adding all
pairwise products of the non-intercept columns would not give you any
more, as each product is either all 0's or a copy of a column that is
already there (A2*B3 is just B3). I do not know the algorithm that lm/aov
uses to choose which of the contrasts to estimate, but it makes no
difference: there can only be 3 beyond the intercept, and all others
are linear combinations of these.
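
# A minimal sketch (not part of the original message) for checking the
# rank argument above in R. X_add and X_nested are illustrative names
# only; the default contr.treatment contrasts are assumed.
A <- gl(2, 4)   # factor with 2 levels: 1 1 1 1 2 2 2 2
B <- gl(4, 2)   # factor with 4 levels: 1 1 2 2 3 3 4 4

# Additive model: 5 columns (Intercept, A2, B2, B3, B4), but only 4 are
# independent, because A2 equals B3 + B4 in this design.
X_add <- model.matrix(~ A + B)
qr(X_add)$rank                                        # 4
all(X_add[, "A2"] == X_add[, "B3"] + X_add[, "B4"])   # TRUE

# Nested model: 8 columns, but the rank is still 4. Three columns are
# all 0's, and A2 equals A2:B3 + A2:B4.
X_nested <- model.matrix(~ A + A:B)
qr(X_nested)$rank                                     # 4
colnames(X_nested)[colSums(X_nested) == 0]            # the empty A-by-B cells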

If this is not useful to you, either:

1. Hope for a response here that is more helpful;
2. Consult a local statistical expert;
3. Read up on linear models (there are multiple books and internet sources);
4. Post on stats.stackexchange.com again.

Cheers,
Bert

## Note to others. If I have erred in any of the above, PLEASE CORRECT.

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Sun, Aug 21, 2016 at 6:44 PM, Justin Thong <justinthong93 at gmail.com> wrote:
> I have something which has been bugging me, and I have even asked about it
> on Cross Validated but did not get a response.  Let's construct a simple
> example. Below is the code.
>
> A<-gl(2,4) #factor of 2 levels
> B<-gl(4,2) #factor of 4 levels
> y<-rnorm(8) #response; not defined in the post as archived, any 8 numeric values will do
> df<-data.frame(y,A,B)
>
> As you can see, B is nested within A.
> The peculiar result I am interested in is the output of the model matrix
> when I fit a nested model. *How does R decide what is included inside the
> intercept?* Since we are using dummy coding, the coefficients of the model
> are interpreted as the difference between a particular level and the
> reference level/the intercept for a single-factor model. I understand that
> for model ~A, A1 becomes the intercept, and that for model ~A+B, A1 and B1
> (both) become the intercept.
>
> *I do not get why, when we use a nested model, A1:B2 appears as a column
> inside the model matrix. Why isn't the first parameter of the interaction
> subspace A1:B1 or A2:B1?* I think I am missing the concept. I think the
> intercept is A1. *Hence, why do we not compare the levels of A1:B1 and
> A1 (intercept), or A2:B1 and A1 (intercept)?*
>
> #nested model
>> mod<-aov(y~A+A:B)
>> model.matrix(mod)
>   (Intercept) A2 A1:B2 A2:B2 A1:B3 A2:B3 A1:B4 A2:B4
> 1           1  0     0     0     0     0     0     0
> 2           1  0     0     0     0     0     0     0
> 3           1  0     1     0     0     0     0     0
> 4           1  0     1     0     0     0     0     0
> 5           1  1     0     0     0     1     0     0
> 6           1  1     0     0     0     1     0     0
> 7           1  1     0     0     0     0     0     1
> 8           1  1     0     0     0     0     0     1
>
>
> --
> Yours sincerely,
> Justin