[R] StepAIC
Christian Hennig
fm3a004 at math.uni-hamburg.de
Mon Mar 29 18:07:37 CEST 2004
Dear list,
here is an example of stepAIC that I do not understand.
The data set has n=42 observations; Lage is the only factor, and the four
other variables are treated as continuous.
First you see the stepAIC-forward solution (fs7). The strange thing here
is that apparently not all interactions are tried for inclusion, but only
WQ:Lage. In particular, I think that WFL:Lage should be tried
in the last two steps, where WFL and Lage are already in the fit.
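One way to check this by hand is to ask add1() for the interaction directly at the step in question. The sketch below is only an illustration: the wohnung data are not reproduced in this post, so it builds a synthetic stand-in with the same column names (all values are invented), and the object names step3 and res are hypothetical.

```r
# Synthetic stand-in for the wohnung data (values are invented; only the
# column names and n = 42 follow the post).
set.seed(1)
n <- 42
wohnung <- data.frame(
  RW1  = rnorm(n),
  WFL  = rnorm(n, mean = 80, sd = 20),
  WQ   = rnorm(n),
  VD   = rnorm(n),
  Lage = factor(rep(1:2, length.out = n))
)
wohnung$Preis <- 1000 * wohnung$WFL + rnorm(n, sd = 5000)

# Model formula reached after the third forward step of fs7.
step3 <- lm(Preis ~ WQ + Lage + WFL, data = wohnung)

# Ask explicitly what adding the WFL-by-Lage interaction would do.
# R may print the interaction label with the variables in a different
# order than it was typed.
res <- add1(step3, scope = ~ . + WFL:Lage)
print(res)
```

With the real data, the AIC in the interaction row could then be compared with the candidates that stepAIC lists at the same step.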
After fs7, I give the output of fs6 (backward), where all interactions are
tried, as I expected. (regsubsets works properly both forward and
backward.)
Do I misunderstand something or is something strange going on in the
forward fit?
(I don't want to discuss here if the forward fit is a good thing to do
from a data analytic viewpoint. I agree that I should presumably not
choose it. However, I want to understand what the algorithm does.)
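To see which terms are eligible at a given step, the base function add.scope() can be applied to the formulas alone, without any data. This is a sketch under an assumption: that stepAIC builds its candidate list with the same marginality rules as add.scope(), which has not been checked here against the MASS source for the version used in this post. The names upper, current and candidates are chosen only for illustration, and the result may differ between R versions.

```r
# Upper scope exactly as passed to stepAIC in this post.
upper <- ~ RW1 + WFL + WQ + VD + Lage + Lage*WFL + Lage*WQ + Lage*VD

# Term labels as R stores them for the upper scope.
upper_labels <- attr(terms(upper), "term.labels")
print(upper_labels)

# Model formula at the step in question (WQ, Lage and WFL have entered).
current <- Preis ~ WQ + Lage + WFL

# Terms from the upper scope that are not yet in the model and whose
# lower-order margins are not themselves still waiting to be added.
candidates <- add.scope(current, upper)
print(candidates)
```

Comparing candidates with the rows that stepAIC actually prints at that step shows whether a term was skipped by the scope logic or never made it into the list at all.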
Thank you,
Christian
> w6 <- lm(Preis~RW1+WFL+WQ+VD+Lage+Lage*WFL+Lage*WQ+Lage*VD,
+ data=wohnung)
> w7 <- lm(Preis~1, data=wohnung)
> fs7 <- stepAIC(w7,
+   scope=list(upper=~RW1+WFL+WQ+VD+Lage+Lage*WFL+Lage*WQ+Lage*VD,
+              lower=~1), direction="forward")
Start: AIC= 623.57
Preis ~ 1
           Df Sum of Sq       RSS AIC
+ WQ        1  37219390  75101315 609
+ Lage      1  19029749  93290956 618
+ WFL       1  12506022  99814682 621
+ RW1       1   7299347 105021358 623
<none>                  112320704 624
+ VD        1   5170556 107150149 624
Step: AIC= 608.66
Preis ~ WQ
           Df Sum of Sq       RSS AIC
+ Lage      1   4736613  70364702 608
<none>                   75101315 609
+ WFL       1   1863992  73237323 610
+ VD        1    555800  74545515 610
+ RW1       1    462284  74639030 610
Step: AIC= 607.92
Preis ~ WQ + Lage
           Df Sum of Sq       RSS AIC
+ WFL       1   4721973  65642729 607
<none>                   70364702 608
+ WQ:Lage   1   2829768  67534934 608
+ RW1       1   2567408  67797294 608
+ VD        1    678458  69686244 610
Step: AIC= 607.01
Preis ~ WQ + Lage + WFL
           Df Sum of Sq       RSS AIC
+ WQ:Lage   1   5610596  60032132 605
+ RW1       1   3404796  62237933 607
<none>                   65642729 607
+ VD        1    925528  64717201 608
Step: AIC= 605.25
Preis ~ WQ + Lage + WFL + WQ:Lage
           Df Sum of Sq       RSS AIC
+ RW1       1   3492210  56539923 605
<none>                   60032132 605
+ VD        1    355353  59676779 607
Step: AIC= 604.74
Preis ~ WQ + Lage + WFL + RW1 + WQ:Lage
           Df Sum of Sq       RSS AIC
<none>                   56539923 605
+ VD        1     94023  56445900 607
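As a side check on how to read the AIC values in these tables: they are consistent with the quantity n*log(RSS/n) + 2*edf that extractAIC() reports for linear models (AIC up to an additive constant). The small sketch below recomputes two of them from the RSS values printed above; the vector names rss, edf and aic are only illustrative.

```r
n <- 42                      # number of observations, from the post

# RSS and number of estimated coefficients (edf) for two fits in the
# forward output: the intercept-only model and Preis ~ WQ.
rss <- c(null = 112320704, WQ = 75101315)
edf <- c(null = 1, WQ = 2)

# AIC up to an additive constant: n * log(RSS / n) + 2 * edf
aic <- n * log(rss / n) + 2 * edf
print(round(aic, 2))
```

The two results should agree with the "Start: AIC= 623.57" and "Step: AIC= 608.66" lines of the forward output.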
Backward fit:
> stepAIC(w6)
Start: AIC= 596.53
Preis ~ RW1 + WFL + WQ + VD + Lage + Lage * WFL + Lage * WQ +
    Lage * VD
           Df Sum of Sq       RSS AIC
- WQ:Lage   1    190953  40507327 595
- RW1       1    865788  41182162 595
<none>                   40316374 597
- WFL:Lage  1   6491181  46807556 601
- VD:Lage   1  12307855  52624230 606
Step: AIC= 594.73
Preis ~ RW1 + WFL + WQ + VD + Lage + WFL:Lage + VD:Lage
           Df Sum of Sq       RSS AIC
- RW1       1    756790  41264117 594
- WQ        1   1910020  42417348 595
<none>                   40507327 595
- WFL:Lage  1  10302360  50809687 602
- VD:Lage   1  13222644  53729971 605
Step: AIC= 593.51
Preis ~ WFL + WQ + VD + Lage + WFL:Lage + VD:Lage
           Df Sum of Sq       RSS AIC
- WQ        1   1793962  43058080 593
<none>                   41264117 594
- WFL:Lage  1  12069383  53333500 602
- VD:Lage   1  13657842  54921959 604
Step: AIC= 593.3
Preis ~ WFL + VD + Lage + WFL:Lage + VD:Lage
           Df Sum of Sq       RSS AIC
<none>                   43058080 593
- WFL:Lage  1  14241342  57299422 603
- VD:Lage   1  19078878  62136957 607
Call:
lm(formula = Preis ~ WFL + VD + Lage + WFL:Lage + VD:Lage, data = wohnung)
Coefficients:
(Intercept)          WFL           VD        Lage2    WFL:Lage2     VD:Lage2
  -53269.15        55.92      8025.62     59259.63       -46.71     -8233.36
***********************************************************************
Christian Hennig
Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
hennig at math.uni-hamburg.de, http://www.math.uni-hamburg.de/home/hennig/
#######################################################################
I recommend www.boag-online.de