[R] [example code] RE: AIC() vs. mle.aic() vs. step()?
Alexandra Thorn
alexandra.thorn at tufts.edu
Thu Jun 23 15:29:48 CEST 2011
Ok, here's some example code showing how I get different output for AIC
vs. mle.aic(). Now that I've taken another look at the independent
variables, I'm wondering whether missing values in one of the variables
might be what is messing me up. I'm going to see if the behavior
changes when I remove those...
Alexandra
#Code with outputs
R> require(wle)
R> xA # 1st independent variable (categorical)
[1] Diffuse Diffuse Diffuse Diffuse Diffuse Diffuse Diffuse Diffuse
Diffuse
[10] Diffuse Diffuse Ring Diffuse Diffuse Ring Diffuse Diffuse
Diffuse
[19] Diffuse Diffuse Ring Diffuse Diffuse Diffuse Diffuse Diffuse
Diffuse
[28] Diffuse Diffuse Ring Ring Diffuse Diffuse Ring Ring
Ring
[37] Diffuse Diffuse Ring Ring Ring Ring Ring Ring
Diffuse
[46] Other Ring Ring Ring Ring Ring Other Ring
Ring
[55] Ring Ring Ring Ring Ring Ring Ring Diffuse
Diffuse
[64] Diffuse Ring Ring Ring Diffuse Diffuse Diffuse Diffuse
Diffuse
[73] Diffuse Diffuse Diffuse Other Ring Ring Ring Ring
Diffuse
[82] Diffuse Diffuse Diffuse Ring Ring Ring Ring Ring
Diffuse
[91] Other Other Ring Ring Ring Other Ring Other
Diffuse
[100] Diffuse Diffuse Ring Ring
Levels: Diffuse Other Ring
R> x5 # 2nd independent variable
[1] 35.1890163 22.8565556 15.2969944 9.6002241 25.0393843
21.1797882
[7] 9.2677660 14.5228280 6.6982274 5.7889657 21.4854297
20.5942436
[13] 20.2180106 0.4442017 5.0414991 26.9849474 14.7613970
10.3045834
[19] 13.4192478 13.9074085 6.7219989 13.2569404 18.1492698
8.9814628
[25] 14.2575003 21.8982503 8.5661574 15.3434996 7.4060632
10.2824613
[31] 23.4777018 35.3389594 51.5448186 6.9571801 23.3166747
35.2280399
[37] 53.3812646 44.7933630 25.5658796 9.6980968 2.9003139
4.8073814
[43] 6.9274067 8.6178642 43.9578503 0.0000000 44.1995269
14.6878355
[49] 5.6385462 0.0000000 21.1687124 20.5669418 0.0000000
0.0000000
[55] 28.4924849 8.7184163 18.8744437 20.9748315 21.3849539
163.1436925
[61] 10.8565582 9.9297861 0.0000000 0.0000000 41.9369100
121.7625948
[67] 13.5709398 20.1040412 14.1449650 8.2172524 10.1649988
19.5981176
[73] 20.3028117 17.0104638 12.6129991 8.2051932 6.4293587
22.1598564
[79] 13.9703385 23.0206302 15.2590230 14.4778824 2.4819054
21.8293460
[85] 25.1515167 32.1050850 12.5154914 11.6927538 9.4048632
38.4559899
[91] 53.1959167 14.4917170 10.2548528 8.8227194 12.8573515
10.0589965
[97] 12.8868929 9.6626724 5.9826061 3.2581190 13.4467376
8.8065840
[103] 17.7734493
> x15 # 3rd independent variable
[1] 1.69924629 -1.63414400 0.71415169 4.17480342 1.52512663
1.73541068
[7] -5.47498002 0.95681283 -1.48092555 1.51101949 -2.25838766
2.12958863
[13] 1.43795703 -4.48003373 -3.65963009 -0.76346388 -2.44019863
1.32552847
[19] 1.89863804 1.80655970 -0.74175682 1.30112633 -1.06424643
-1.47852202
[25] 0.09035915 NA NA 1.82385292 -0.15308708
1.04685322
[31] 2.45599032 1.36474093 -2.39863477 -0.21220447 -2.50255033
-1.92296430
[37] -0.24577578 -1.96756216 0.43349997 0.88459859 -0.12755905
2.31771322
[43] -1.21846731 1.75082992 -3.02346893 -4.15582445 1.09946460
4.30008522
[49] 4.37542383 NA -1.93641862 -0.01919492 -2.39609318
-3.12228102
[55] 0.48804606 -1.42886437 -3.52078266 3.22115286 0.87942540
-0.29385365
[61] 0.40030867 0.84382607 -0.14445408 -0.61903527 NA
1.53158894
[67] -1.01595045 0.18857375 -1.24703875 -0.53766035 -0.43305094
1.30035414
[73] 0.08256647 -0.01008154 -1.89151834 0.60161181 1.38339048
1.70782208
[79] 0.48995599 NA 0.71774340 NA 0.35578308
-1.30038021
[85] 0.18170942 -0.76999772 -0.52860127 -0.58713905 2.45770818
-3.79345760
[91] -0.73700348 1.85916858 0.48523489 -2.24404921 -3.71691741
-0.80525820
[97] 0.20768561 -0.05588210 NA -0.50332833 0.70407465
-0.57391160
[103] -1.11740646
> y1 #response variable
[1] 0.11736407 0.12793015 0.06627390 0.03385292 0.05111586 0.12896867
[7] 0.21030113 0.10661115 0.02321079 0.06035170 0.17966075 0.22120809
[13] 0.16367033 0.07062699 0.11563063 0.62809888 0.13571557 0.14366535
[19] 0.16453117 0.04030618 0.29904079 0.13865458 0.25814464 0.09636693
[25] 0.14262893 0.12619897 0.15919200 0.10713175 0.18137740 0.37961763
[31] 0.16831734 0.02425770 0.12793015 0.23174790 0.16384251 0.41976893
[37] 0.12498691 0.18960957 0.33873792 0.19594614 0.44510411 0.45554491
[43] 0.70821663 0.20739951 0.07828510 0.07393444 0.12290867 0.22614130
[49] 0.49742825 0.04013179 0.58127117 0.05216166 0.27597288 0.14090123
[55] 0.22120809 0.49090375 0.33216113 0.03437621 0.12031011 0.10261893
[61] 0.58141318 0.06244214 0.03594604 0.17966075 0.09345085 0.43887815
[67] 0.51929244 0.20501885 0.04663966 0.33104604 0.28841287 0.26924687
[73] 0.29495726 0.23675230 0.33385065 0.02814909 0.25281753 0.21240608
[79] 0.15204576 0.18288671 0.32867804 0.19813360 0.17379109 0.20755404
[85] 0.10898273 0.10303441 0.19145080 0.38541988 0.29372153 0.19337137
[91] 0.06810569 0.06357357 0.15778877 0.21364239 0.33999760 0.13670444
[97] 0.11900238 0.01315180 0.30599263 0.05201595 0.30131938 0.22017956
[103] 0.23811364
R> summary(mle.aic(lm(y1~xP+x5+x15)),max.num=30) # mle.aic output
Call:
mle.aic(formula = lm(y1 ~ xP + x5 + x15))
Akaike Information Criterion (AIC):
(Intercept) xPNA xPRing x5 x15 aic
[1,] 1 1 1 0 0 -113.60
[2,] 1 1 1 0 1 -112.80
[3,] 1 1 1 1 0 -112.20
[4,] 1 0 1 0 0 -112.10
[5,] 1 1 1 1 1 -111.30
[6,] 1 0 1 0 1 -111.20
[7,] 1 0 1 1 0 -110.60
[8,] 1 0 1 1 1 -109.60
[9,] 1 1 0 0 0 -98.05
[10,] 1 1 0 0 1 -96.66
[11,] 1 1 0 1 0 -96.28
[12,] 1 1 0 1 1 -94.86
[13,] 1 0 0 0 0 -90.92
[14,] 1 0 0 0 1 -89.32
[15,] 1 0 0 1 0 -89.06
[16,] 1 0 0 1 1 -87.45
[17,] 0 0 1 1 1 -59.09
[18,] 0 0 1 1 0 -57.98
[19,] 0 1 1 1 1 -57.34
[20,] 0 1 1 1 0 -56.35
Printed the first 20 best models
R> AIC(lm(y1~xA)) # Model 1 above
[1] -120.3801
R> AIC(lm(y1~xA+x15)) # Model 2 above
[1] -110.8642
R> AIC(lm(y1~xA+x5)) # Model 3 above
[1] -118.9906
On Thu, 2011-06-23 at 09:05 -0400, Alexandra Thorn wrote:
> The packages is wle.
>
> I'll put together some code that shows the behavior I'm talking about,
> and send it to the list.
>
> Alexandra
>
> On Thu, 2011-06-23 at 13:51 +0200, Rubén Roa wrote:
> > I don't find the mle.aic function. Thus it does not ship with R and it's in some contributed package.
> > What package is that?
> > If you had asked for help providing minimal, self-contained, reproducible code, you'd have realized that you need to tell people what package you are using.
> >
> > ___________________________________________________________________________________
> >
> > Dr. Rubén Roa-Ureta
> > AZTI - Tecnalia / Marine Research Unit
> > Txatxarramendi Ugartea z/g
> > 48395 Sukarrieta (Bizkaia)
> > SPAIN
> >
> >
> >
> > > -----Mensaje original-----
> > > De: r-help-bounces at r-project.org
> > > [mailto:r-help-bounces at r-project.org] En nombre de Alexandra Thorn
> > > Enviado el: miércoles, 22 de junio de 2011 22:38
> > > Para: r-help at r-project.org
> > > Asunto: [R] AIC() vs. mle.aic() vs. step()?
> > >
> > > I know this a newbie question, but I've only just started
> > > using AIC for model comparison and after a bunch of different
> > > keyword searches I've failed to find a page laying out what
> > > the differences are between the AIC scores assigned by AIC()
> > > and mle.aic() using default settings.
> > >
> > > I started by using mle.aic() to find the best submodels, but
> > > then I wanted to also be able to make comparisons with a
> > > couple of submodels that were nowhere near the top, so I
> > > started calculating AIC values using AIC(). What I found was
> > > that not only the scores, but also the ranking of the models
> > > was different. I'm not sure if this has to do with the fact
> > > that mle.aic() scores are based on the full model, or some
> > > sort of difference in penalties, or something else.
> > >
> > > Could anybody enlighten me as to the differences between
> > > these functions, or how I can use the same scoring system to
> > > find the best models and also compare to far inferior models?
> > >
> > > Failing that, could someone point me to an appropriate
> > > resource that might help me understand?
> > >
> > > Thanks in advance,
> > > Alexandra
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
>
More information about the R-help
mailing list