[R] Different behavior of model.matrix between R 3.2 and R3.1.1

Therneau, Terry M., Ph.D. therneau at mayo.edu
Mon Jun 15 16:05:39 CEST 2015


Frank,
   I don't think there is any way to "fix" your problem except the way that I did it.

library(survival)
tdata <- data.frame(y=c(1,3,3,5, 5,7, 7,9, 9,13),
                     x1=factor(letters[c(1,1,1,1,1,2,2,2,2,2)]),
                     x2= c(1,2,1,2,1,2,1,2,1,2))

fit1 <- lm( y ~ x1 * strata(x2) - strata(x2), tdata)
coef(fit1)
        (Intercept)                x1b x1a:strata(x2)x2=2 x1b:strata(x2)x2=2
           3.000000           5.000000           1.000000           1.666667

Your code is calling model.matrix with the same model frame and terms structure as the lm 
call above (I checked).  In your case you know that the underlying model has 2 intercepts 
(strata), one for the group with x2=1 and another for the group with x2=2, but how is the 
model.matrix routine supposed to guess that?  It can't, so model.matrix returns the proper 
result for the lm call.  As seen above the result is not singular, while for the Cox model 
it is singular due to the extra intercept.

This is simply an extension of leaving the "intercept" term in the model and then removing 
that column from the returned X matrix, which is necessary to have the correct coding for 
ordinary factor variables, something we've both done since day 1.  In order for 
model.matrix to do the right thing with interactions, it has to know how many intercepts 
there actually are.

I've come to the conclusion that the entire thrust of 'contrasts' in S was wrong headed, 
i.e., the "remove redundant columns from the X matrix ahead of time" logic.  It is simply 
not possible for the model.matrix routine to guess correctly for all y and x combinations, 
something that been acknowledged in R by changing the default for "singular.ok" to TRUE. 
Dealing with this after the fact via a good contrast function (a la SAS -- heresy!) would 
have been a much better design choice.  But as long as I'm in R the coxph routine tries to 
be a good citizen.

Terry T.



More information about the R-help mailing list