[Rd] step.lm() fails to drop {many empty 2-way factor cells}
(PR#3527)
ripley at stats.ox.ac.uk
ripley at stats.ox.ac.uk
Mon Jul 21 13:12:36 MEST 2003
Your example works on 2 of the 3 systems I tried (and in fact the warnings
you got on stepAIC are spurious). It's all down to rounding errors.
I've added a fuzz in the termination test.
B
On Wed, 16 Jul 2003 maechler at stat.math.ethz.ch wrote:
> Exec. Summary:
> step() basically ``fails'' whereas MASS' stepAIC() does work
>
> This may not be a bug in the strictest sense, but at least
> something for the wish list. Unfortunately I have no time
> currently to investigate further myself but want to be sure this
> won't be forgotten:
>
> The example is using a real data set with 216 observations on 9
> variables -- where we have anonymized variable and factor level names.
> The 8 explanatory variables are 4 factors (with 3 / 2 / 4 / 4 levels)
> and 4 continuous covariates. The factor combinations observed
> are very far from balanced, and have actually quite a few 0 cells.
>
> The following is a reproducible R script
> (you need internet connection and our FTP server working) :
>
>
> ## step.lm() "fails" when 2-way factor interactions have (many) empty cells
> ## whereas stepAIC() works fine
>
> Dat <- read.table("ftp://stat.ethz.ch/U/maechler/R/step-ex.tab", header=TRUE)
> Dat$f3 <- ordered(Dat$f3)
>
> ## Full Model : y ~ (f1+f2+f3+f4 + C1+C2+C3+C4)
> ## ---------- where f1..f4 are FACTORS and C1..C4 are continuous covariates
>
> ## "Look at data"
> str(Dat)
> summary(Dat)
>
> ## several 2-way interactions have empty cells, e.g.
> with(Dat, table(f1,f4)) #!
> with(Dat, table(f1,f3))
>
> ## see the full design:
> with(Dat, ftable(table(f1,f2,f3,f4))) ##--> many many 0
> pairs(Dat, col = 1+as.integer(Dat$f1), pch = 1+as.integer(Dat$sex))
>
> ## Fit a "full model" with all 2-way interactions :
> lmAll <- lm(formula = y ~ .^2, data = Dat)
> summary(lmAll) # many insignificant ones {BTW: why is it bad to have "*" here?}
>
> drop1(lmAll)
> ## clearly shows interaction terms that should be dropped (would decrease AIC)
>
> ## BUT:
> sr <- step(lmAll)
> ## nothing eliminated ?step claims it would work when drop1() does
>
> library(MASS)
> sAr <- stepAIC(lmAll) ## <<< stepAIC() *does* work
> ## but with instructive
> ## There were 19 warnings (use warnings() to see them)
> warnings()
> ##- Warning messages:
> ##- 1: 0 df terms are changing AIC in: stepAIC(lmAll)
> ##- 2: 0 df terms are changing AIC in: stepAIC(lmAll)
> ##- ...
> ##- ...
> summary(sAr)
>
> ______________________________________________
> R-devel at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
More information about the R-devel
mailing list