# [R] hypothesis testing for rank-deficient linear models

dhinds@sonic.net dhinds at sonic.net
Wed Jan 11 02:53:30 CET 2006

```Take the following example:

a <- rnorm(100)
b <- trunc(3*runif(100))
g <- factor(trunc(4*runif(100)),labels=c('A','B','C','D'))
y <- rnorm(100) + a + (b+1) * (unclass(g)+2)
m <- lm(y~a+b*g)
summary(m)

Here b is discrete but not treated as a factor.  I am interested in
computing the effect of b within groups defined by the factor g.  One
way to do that is with the estimable() function from gmodels:

ct <- cbind(1,contr.treatment(4))
dimnames(ct)[[2]] <- c('b','b:gB','b:gC','b:gD')
estimable(m, ct, conf.int=0.95)

Another way is the contrast() function from the Design library:

o <- ols(y~a+b*g)
contrast(o, list(b=1,g=levels(g)), list(b=0,g=levels(g)))

Now take the situation where there are empty cells in the b x g table,
so that the model is rank deficient, i.e.:

m <- lm(y~a+b*g, subset=(b==0 | g!='B'))
summary(m)

Now there's trouble.  The estimable() function and Design library do
not seem to handle rank deficient models well.  Here is my current
best attempt to get what I want:

my.coef <- function(n)
{
ct <- contr.treatment(levels(g), base=n)
u <- update(m, contrasts=list(g=ct))
uc <- coef(summary(u))['b',]
if (any(is.na(coef(u))) && any(!is.na(alias(u)\$Complete)))
uc[1:4] <- NA
uc
}
t(sapply(1:4, my.coef))

This seems adequate for my immediate purposes and I can clean it up to
be more general.  But I'm wondering if there is an easier/better way
using existing R facilities?  Here, I'm doing more model fitting than
necessary, but (for now) speed is not a factor and I was having
trouble writing a concise solution that worked on just the original
model object.

-- David Hinds

```