[R] hypothesis testing for rank-deficient linear models

Fri Jan 13 02:22:04 CET 2006

dhinds at sonic.net wrote:
> Take the following example:

>     a <- rnorm(100)
>     b <- trunc(3*runif(100))
>     g <- factor(trunc(4*runif(100)),labels=c('A','B','C','D'))
>     y <- rnorm(100) + a + (b+1) * (unclass(g)+2)
...

Here's a cleaned-up function to compute estimable within-group effects
for rank deficient models.  For the above data, it could be invoked
as:

    > m <- lm(y~a+b*g, subset=(b==0 | g!='B'))
    > subgroup.effects(m, 'b',  g=c('A','B','C','D'))
      g Estimate  StdError  t.value      p.value
    1 A 2.779167 0.4190213  6.63252 4.722978e-09
    2 B       NA        NA       NA           NA
    3 C 4.572431 0.3074402 14.87258 6.226445e-24
    4 D 5.920809 0.3502251 16.90572 3.995266e-27

Again, I'm not sure whether this is a good approach, or whether there
is an easier way using existing R functions.  One problem is figuring
exactly which terms are not estimable from the available data.  My
hack using alias() is not satisfactory and I've already run into cases
where it fails.  But I'm having trouble coming up with a more general,
correct test?

-- David Hinds

--------------------

subgroup.effects <- function(model, term, ...)
{
    my.coef <- function(n)
    {
        contr <- lapply(names(args), function(i)
                        contr.treatment(args[[i]], unclass(gr[n,i])))
        names(contr) <- names(args)
        u <- update(model, formula=model$formula,
                    data=model$data, contrasts=contr)
        uc <- coef(summary(u))[term,]
        if (any(is.na(coef(u))) &&
            any(!is.na(alias(u)$Complete)))
            uc[1:4] <- NA
        uc
    }
    args <- list(...)
    gr <- expand.grid(...)
    d <- data.frame(gr, t(sapply(1:nrow(gr), my.coef)))
    names(d) <- c(names(gr),'Estimate','StdError','t.value','p.value')
    d
}