[R] interpreting "not defined because of singularities" in lm

Duncan Murdoch murdoch at stats.uwo.ca
Mon Mar 30 12:40:07 CEST 2009

jiblerize22 at yahoo.com wrote:
> I run lm to fit an OLS model where one of the covariates is a factor with 30 levels. I use contr.treatment() to set the base level of the factor, so when I run lm() no coefficients are estimated for that level. But in addition (and regardless of which level I choose to be the base), lm also gives a vector of NA coefficients for another level of my factor.
> The output says that these coefficients were "not defined because of singularities", suggesting maybe that the 28 estimated coefficients are sufficient to pin down the 29th... but why is this the case? Why am I going from 30 levels to 28 coefficients? Am I misunderstanding the way factors/levels are supposed to work?
The usual cause is that one of the levels does not occur in the
data set.  Another possibility is collinearity with some other covariate
in your model.
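Both possibilities can be checked directly. What follows is only a minimal sketch on made-up data, not a reconstruction of the original poster's model: the data frame d and the variables f, z and y are hypothetical. Here z is completely determined by f, so one factor coefficient cannot be estimated. table() shows how many observations each level has, and alias() reports which terms are linear combinations of others.

```r
# Hypothetical data: a 3-level factor plus a covariate z that is
# entirely determined by the factor (z is 1 exactly when f is "c").
set.seed(1)
d <- data.frame(f = factor(rep(c("a", "b", "c"), each = 4)))
d$z <- ifelse(d$f == "c", 1, 0)
d$y <- rnorm(nrow(d))

# Check 1: does any level have zero observations?
table(d$f)

# Check 2: is some term a linear combination of the others?
# z enters the model before f, so it is the dummy column for
# level "c" that gets dropped as redundant.
fit <- lm(y ~ z + f, data = d)
coef(fit)    # the coefficient named "fc" is NA
alias(fit)   # lists "fc" as aliased with the remaining terms
```

If alias() names a factor level, the covariate it is aliased with is the one to look at. Which of the redundant terms gets the NA depends on the order of the terms in the formula.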

Duncan Murdoch

