[R] Column names of model.matrix's output with contrast.arg

Christophe Dutang
Fri Jun 14 08:12:23 CEST 2024


Dear list,

Changing the default contrasts used in glm() made me aware of how model.matrix() sets column names.

With the default contrasts, model.matrix() uses the level values to name the columns. With other contrasts, however, model.matrix() uses the level indexes. I don’t see anything in the documentation about this. Is it intended? It does not seem natural to have such behavior.

Any comment is welcome.

An example is below.

Kind regards, Christophe  


#example from ?glm
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- paste0("O", gl(3,1,9))
treatment <- paste0("T", gl(3,3))

X3 <- model.matrix(counts ~ outcome + treatment)
X4 <- model.matrix(counts ~ outcome + treatment, contrasts.arg = list("outcome" = "contr.sum"))
X5 <- model.matrix(counts ~ outcome + treatment, contrasts.arg = list("outcome" = "contr.helmert"))

#check with original factor
cbind.data.frame(X3, outcome)
cbind.data.frame(X4, outcome)
cbind.data.frame(X5, outcome)
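
A possible way to see where the names in X3, X4 and X5 come from is to look at the contrast matrices themselves. This is only a sketch. It assumes (I have not found this stated in the documentation) that model.matrix() appends the column names of the contrast matrix to the factor name, and falls back to the column index when the contrast matrix has no column names. The name `lev` is introduced here for illustration.

```r
# Levels of the 'outcome' factor used in the example
lev <- c("O1", "O2", "O3")

# contr.treatment() labels its columns with the non-reference levels
colnames(contr.treatment(lev))

# contr.sum() and contr.helmert() keep the levels as row names only,
# so there are no column names for model.matrix() to reuse
colnames(contr.sum(lev))
colnames(contr.helmert(lev))
```

If this assumption holds, the index-based names would be a consequence of how the contr.*() functions build their matrices rather than of model.matrix() itself.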

#same issue with glm
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
glm.D94 <- glm(counts ~ outcome + treatment, family = poisson(), contrasts = list("outcome"="contr.sum"))
glm.D95 <- glm(counts ~ outcome + treatment, family = poisson(), contrasts = list("outcome"="contr.helmert"))

coef(glm.D93)
coef(glm.D94)
coef(glm.D95)

#check linear predictor
cbind(X3 %*% coef(glm.D93), predict(glm.D93))
cbind(X4 %*% coef(glm.D94), predict(glm.D94))
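
If level-based names are wanted with sum contrasts, one possible workaround is to pass a contrast matrix with hand-written column names instead of the string "contr.sum". This is a sketch under the assumption that naming each column after the level that carries the +1 entry is an acceptable convention. The objects `cs`, `X4named` and `glm.D94named` are hypothetical names used only for this illustration.

```r
# Data from the example above
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- paste0("O", gl(3, 1, 9))
treatment <- paste0("T", gl(3, 3))

# Sum-to-zero contrast matrix with columns labelled by hand:
# column j is named after level j (the last level has no column)
lev <- levels(factor(outcome))
cs <- contr.sum(lev)
colnames(cs) <- lev[-length(lev)]

# Pass the labelled matrix instead of the name "contr.sum"
X4named <- model.matrix(counts ~ outcome + treatment,
                        contrasts.arg = list(outcome = cs))
colnames(X4named)

# The same matrix can be given to glm() through its 'contrasts' argument
glm.D94named <- glm(counts ~ outcome + treatment, family = poisson(),
                    contrasts = list(outcome = cs))
names(coef(glm.D94named))
```

The fit itself should be unchanged by this; only the labels of the outcome columns and coefficients differ from those in X4 and glm.D94.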

-------------------------------------------------
Christophe DUTANG
LJK, Ensimag, Grenoble INP, UGA, France
ILB research fellow
Web: http://dutangc.free.fr


