[R-pkg-devel] Problem in stats::model.matrix when omitting two-way interactions

Paul Buerkner paul.buerkner at gmail.com
Thu Mar 30 12:54:09 CEST 2017


Hi all,

recently I stumbled upon a problem in stats::model.matrix that I think is
worth reporting.

When I run:

> dat <- data.frame(
>   y = rnorm(8),
>   x1 = factor(rep(0:1, each = 4)),
>   x2 = factor(rep(rep(0:1, each = 2), 2)),
>   x3 = factor(rep(0:1, 4))
> )
>
> stats::model.matrix(y ~ x1+x2+x3 + x1:x2:x3, dat)

I get a matrix with 12 columns, which are linearly dependent, so the
coefficients are not all identified in a linear model:

> summary(lm(y ~  x1+x2+x3 + x1:x2:x3, dat))
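To make the dependence concrete, here is a small self-contained check of the
column count and the rank. It is only a sketch: the names mm, full and fit are
introduced here for illustration, and set.seed() is added just so that y is
reproducible (the design matrix does not depend on y).

```r
set.seed(1)  # assumption: any seed works, y does not affect the design matrix
dat <- data.frame(
  y = rnorm(8),
  x1 = factor(rep(0:1, each = 4)),
  x2 = factor(rep(rep(0:1, each = 2), 2)),
  x3 = factor(rep(0:1, 4))
)

# Design matrix for the formula that omits the two-way interactions
mm <- stats::model.matrix(y ~ x1 + x2 + x3 + x1:x2:x3, dat)
ncol(mm)     # 12 columns
qr(mm)$rank  # rank 8: there are only 8 rows / 8 distinct cells

# For comparison, the full factorial formula
full <- stats::model.matrix(y ~ x1 * x2 * x3, dat)
ncol(full)   # 8 columns

# lm() reports the aliased coefficients as NA
fit <- lm(y ~ x1 + x2 + x3 + x1:x2:x3, dat)
sum(is.na(coef(fit)))  # 4 coefficients cannot be estimated
```

The full factorial formula gives a full-rank matrix with 8 columns for the same
data, so the extra 4 columns come only from how the three-way term is coded
when the two-way terms are left out.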

Of course, there is usually no need for such a formula that ignores the
two-way interactions, but from my point of view, model.matrix should still
return only 8 columns (or fewer) in order to produce identified models.

I wonder if this is some sort of intended behavior or just a side effect
of the way model.matrix handles factors.
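
One place to look, as far as I can tell from ?terms.object, is the "factors"
attribute of the terms object. There, an entry of 1 is documented to mean that
a variable is coded by contrasts within a term, and 2 that it is coded by dummy
variables for all levels. The following is only a sketch of how to inspect it
(tt and f are names made up here), not a statement about why the rule is what
it is:

```r
# Build the terms object for the formula; no data are needed for this
tt <- terms(y ~ x1 + x2 + x3 + x1:x2:x3)

# Variables-by-terms matrix of coding indicators (0, 1 or 2)
f <- attr(tt, "factors")
f

# Coding used for each factor inside the three-way term
f[c("x1", "x2", "x3"), "x1:x2:x3"]
```

If all three entries for x1:x2:x3 are 2, each factor contributes both of its
levels to that term, which would give 2 * 2 * 2 = 8 columns for the three-way
term and hence 1 + 3 + 8 = 12 columns overall.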

Many thanks in advance.

Paul



