[R] Order of formula terms in model.matrix

Charles C. Berry ccberry at ucsd.edu
Sun Jan 17 19:34:40 CET 2016

On Sun, 17 Jan 2016, Lars Bishop wrote:

> I’d appreciate your help on understanding the following. 

> It is not very clear to me from the model.matrix documentation, why 
> simply changing the order of terms in the formula may change the number 
> of resulting columns. Please note I’m purposely not including main 
> effects in the model formula in this case.

IIRC, there are some heuristics involved harking back to the White Book. I 
recall there have been discussions of whether and how this could be fixed 
before on this list and/or R-devel, but I cannot seem to lay my browser on 
them right now.

> set.seed(1)
> x1 <- rnorm(100)
> f1 <- factor(sample(letters[1:3], 100, replace = TRUE))
> trt <- sample(c(-1,1), 100, replace = TRUE)
> df <- data.frame(x1=x1, f1=f1, trt=trt)
> dim(model.matrix( ~ x1:trt + f1:trt, data = df))
> [1] 100 4
> dim(model.matrix(~ f1:trt + x1:trt, data = df))
> [1] 100 5
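
To see which columns actually differ between the two orderings, it may 
help to print the column names and the "factors" attribute of the terms 
object. Per ?terms.object, a 1 there means the variable is coded by 
contrasts in that term and a 2 means it gets a dummy column for every 
level. A rough sketch reusing your data (the names m1 and m2 are mine):

```r
# Rebuild the data from the quoted example.
set.seed(1)
x1  <- rnorm(100)
f1  <- factor(sample(letters[1:3], 100, replace = TRUE))
trt <- sample(c(-1, 1), 100, replace = TRUE)
df  <- data.frame(x1 = x1, f1 = f1, trt = trt)

# Same two terms, written in the two orders.
m1 <- model.matrix(~ x1:trt + f1:trt, data = df)
m2 <- model.matrix(~ f1:trt + x1:trt, data = df)

# Compare which columns each ordering produces.
print(colnames(m1))
print(colnames(m2))

# Show how each variable is coded within each term (0, 1 or 2).
print(attr(terms(~ x1:trt + f1:trt), "factors"))
print(attr(terms(~ f1:trt + x1:trt), "factors"))
```

Comparing the two "factors" matrices should show where the coding of f1 
changes between the two orderings.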

By `x1:trt' I guess you mean the same thing as `I(x1*trt)'.

If you use the latter form, the issue you raise goes away: both orderings 
of the terms then give the same number of columns.

Note that `I(some.expr)' gives you the ability to force the behavior of 
model.matrix to be exactly what you want by suitably crafting `some.expr', 
heuristics notwithstanding.
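
A minimal sketch of what I mean, assuming your x1 and trt are both numeric 
as in your example (the names a and b are just for illustration):

```r
# Rebuild the data from the quoted example.
set.seed(1)
x1  <- rnorm(100)
f1  <- factor(sample(letters[1:3], 100, replace = TRUE))
trt <- sample(c(-1, 1), 100, replace = TRUE)
df  <- data.frame(x1 = x1, f1 = f1, trt = trt)

# I() makes the product a single numeric variable in its own right,
# so it is no longer handled as an interaction term.
a <- model.matrix(~ I(x1 * trt) + f1:trt, data = df)
b <- model.matrix(~ f1:trt + I(x1 * trt), data = df)

print(dim(a))
print(dim(b))

# The order of the terms no longer changes the dimensions.
stopifnot(identical(dim(a), dim(b)))

# The I() column is just the elementwise product of x1 and trt.
stopifnot(isTRUE(all.equal(unname(a[, "I(x1 * trt)"]), df$x1 * df$trt)))
```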


