[R] hypothesis testing for rank-deficient linear models
dhinds@sonic.net
dhinds at sonic.net
Wed Jan 11 02:53:30 CET 2006
Take the following example:
a <- rnorm(100)
b <- trunc(3*runif(100))
g <- factor(trunc(4*runif(100)),labels=c('A','B','C','D'))
y <- rnorm(100) + a + (b+1) * (unclass(g)+2)
m <- lm(y~a+b*g)
summary(m)
Here b is discrete but not treated as a factor. I am interested in
computing the effect of b within groups defined by the factor g. One
way to do that is with the estimable() function from gmodels:
ct <- cbind(1,contr.treatment(4))
dimnames(ct)[[2]] <- c('b','b:gB','b:gC','b:gD')
estimable(m, ct, conf.int=0.95)
Another way is the contrast() function from the Design library:
dd <- datadist(a,b,g)
options(datadist='dd')
o <- ols(y~a+b*g)
contrast(o, list(b=1,g=levels(g)), list(b=0,g=levels(g)))
Now take the situation where there are empty cells in the b x g table,
so that the model is rank deficient, i.e.:
m <- lm(y~a+b*g, subset=(b==0 | g!='B'))
summary(m)
Now there's trouble. The estimable() function and Design library do
not seem to handle rank deficient models well. Here is my current
best attempt to get what I want:
my.coef <- function(n)
{
ct <- contr.treatment(levels(g), base=n)
u <- update(m, contrasts=list(g=ct))
uc <- coef(summary(u))['b',]
if (any(is.na(coef(u))) && any(!is.na(alias(u)$Complete)))
uc[1:4] <- NA
uc
}
t(sapply(1:4, my.coef))
This seems adequate for my immediate purposes and I can clean it up to
be more general. But I'm wondering if there is an easier/better way
using existing R facilities? Here, I'm doing more model fitting than
necessary, but (for now) speed is not a factor and I was having
trouble writing a concise solution that worked on just the original
model object.
-- David Hinds
More information about the R-help
mailing list