[Rd] Different behavior of model.matrix between R 3.2 and R3.1.1
Frank Harrell
f.harrell at Vanderbilt.Edu
Wed Jun 17 01:10:03 CEST 2015
Terry Therneau has been very helpful on r-help, but we can't figure out
what change in R in the past few months made extra columns appear in
model.matrix output when the terms object is subsetted to remove
stratification factors in a Cox model. Terry has changed the logic in the
survival package to avoid this issue, but his workaround requires generating
a larger design matrix and then dropping columns.
A simple example is below.
strat <- function(x) x    # stand-in for a stratification marker
d <- expand.grid(a=c('a1','a2'), b=c('b1','b2'))
d$y <- c(1,3,2,4)
f <- y ~ a * strat(b)
m <- model.frame(f, data=d)
# drop the 2nd term, i.e. the strat(b) main effect
Terms <- drop.terms(terms(f, data=d), 2)
model.matrix(Terms, m)
  (Intercept) aa2 aa1:strat(b)b2 aa2:strat(b)b2
1           1   0              0              0
2           1   1              0              0
3           1   0              1              0
4           1   1              0              1
. . .
The column aa1:strat(b)b2, corresponding to a='a1' and b='b2', should not
be there.
This does seem to be a change in R. Any help appreciated.
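
For reference, the "larger design matrix, then drop columns" approach mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual code in the survival package: the names X, strat_term, keep and X2 are made up for the example, and the stratification term is located by its label.

```r
strat <- function(x) x
d <- expand.grid(a = c('a1', 'a2'), b = c('b1', 'b2'))
d$y <- c(1, 3, 2, 4)
f <- y ~ a * strat(b)
m <- model.frame(f, data = d)

# Build the design matrix from the full, unsubsetted terms object
Terms <- terms(f, data = d)
X <- model.matrix(Terms, m)

# "assign" maps each column to the index of the term that produced it
# (0 is the intercept)
strat_term <- which(attr(Terms, "term.labels") == "strat(b)")
keep <- attr(X, "assign") != strat_term

# Drop only the columns that belong to the strat(b) main effect
X2 <- X[, keep, drop = FALSE]
colnames(X2)
```

Because the interaction columns are generated while the strat(b) main effect is still in the model, a is coded by contrasts inside a:strat(b), and no aa1:strat(b)b2 column appears.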
The factors and term.labels attributes of Terms are:
attr(,"factors")
         a a:strat(b)
y        0          0
a        1          2
strat(b) 0          1
attr(,"term.labels")
[1] "a"          "a:strat(b)"
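
As documented in ?terms.object, an entry of 1 in the factors matrix means the variable is coded by contrasts within that term, while 2 means it is coded by dummy variables for all levels. The 2 for a in the a:strat(b) column is therefore consistent with the extra aa1:strat(b)b2 column. A minimal sketch for comparing the coding before and after subsetting (the names full, sub, full_code and sub_code are illustrative only, and the value seen for the subsetted object may depend on the R version):

```r
strat <- function(x) x
d <- expand.grid(a = c('a1', 'a2'), b = c('b1', 'b2'))
d$y <- c(1, 3, 2, 4)
f <- y ~ a * strat(b)

full <- terms(f, data = d)     # all three terms
sub  <- drop.terms(full, 2)    # strat(b) main effect removed

# Coding of variable a inside the interaction term:
# 1 = contrasts, 2 = indicator columns for every level
full_code <- attr(full, "factors")["a", "a:strat(b)"]
sub_code  <- attr(sub,  "factors")["a", "a:strat(b)"]
full_code
sub_code
```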
Frank