[R] Regression with factor having1 level
Robert McGehee
rmcgehee at gmail.com
Fri Mar 11 16:03:01 CET 2016
Hi,
In case this is helpful for anyone, I think I've coded a satisfactory
function answering my problem (of handling formulas containing 1-level
factors) by hacking liberally at the model.matrix code to remove any
model terms for which the contrast fails. As it's a problem I've come
across a lot (since my data frames have factors and lots of missing
values), adding support for 1-level factors might be a nice item for
the R Wishlist. I suppose a key question is, does anyone ever _want_
to see the error "contrasts can be applied only to factors with 2 or
more levels", or should the contrasts function just add a column of
all zeros (or ones) to the design matrix and let the modelling
functions handle that the same way it does any other zero-variance
term?
Anyway, my function below:
lmresid <- function(formula, data) {
mf <- model.frame(formula, data=data, na.action=na.exclude)
omit <- attr(mf, "na.action")
t <- terms(mf)
contr.funs <- as.character(getOption("contrasts"))
namD <- names(mf)
for (i in namD) if (is.character(mf[[i]]))
mf[[i]] <- factor(mf[[i]])
isF <- vapply(mf, function(x) is.factor(x) || is.logical(x), NA)
isF[1] <- FALSE
isOF <- vapply(mf, is.ordered, NA)
for (nn in namD[isF])
if (is.null(attr(mf[[nn]], "contrasts"))) {
noCntr <- try(contrasts(mf[[nn]]) <- contr.funs[1 +
isOF[nn]], silent=TRUE)
if (inherits(noCntr, "try-error")) { # Remove term
from model on error
mf[[nn]] <- NULL
t <- terms(update(t, as.formula(paste("~ . -", nn))), data=mf)
}
}
ans <- .External2(stats:::C_modelmatrix, t, mf)
r <- .lm.fit(ans, mf[[1]])$residual
stats:::naresid.exclude(omit, r)
}
## Note that lmresid now returns the same values as resid with the
## 1-level factor removed.
df <- data.frame(y=c(0,2,4,6,8), x1=c(1,1,2,2,NA),
x2=factor(c("A","A","A","A","B")))
lmresid(y~x1+x2, data=df)
resid(lm(y~x1, data=df, na.action=na.exclude))
--Robert
PS, Peter, wasn't sure if you also meant to add comments, but they
didn't come through.
On Fri, Mar 11, 2016 at 3:40 AM, peter dalgaard <pdalgd at gmail.com> wrote:
>
>> On 11 Mar 2016, at 02:03 , Robert McGehee <rmcgehee at gmail.com> wrote:
>>
>>> df <- data.frame(y=c(0,2,4,6,8), x1=c(1,1,2,2,NA),
>> x2=factor(c("A","A","A","A","B")))
>>> resid(lm(y~x1+x2, data=df, na.action=na.exclude)
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
>
>
>
>
>
>
>
>
>
More information about the R-help
mailing list