[R] Regsubsets model selection

Maximilian Lklweryc maxlklweryc at gmail.com
Tue Sep 25 17:05:06 CEST 2012


Hi,
I have 12 independent variables and one dependent variable. I want to
select the model with the best adjusted R squared using the regsubsets
command (from the leaps package), so I run:

> library(leaps)  # provides regsubsets()
> plot(regsubsets(Gesamt ~ CommunistSocialist + CountrySize + GNI + Lifeexp +
+     Schoolyears + ExpMilitary + Mortality + PopPoverty + PopTotal +
+     ExpEdu + ExpHealth, data=olympiadaten, nbest=1, nvmax=12),
+     scale='adjr2')

This gives the plot I attached. The problem is that the best model it
reports has an adjusted R squared of 0.49. But if I regress my y on the
single variable PopTotal, I already get an adjusted R squared of 0.779! So
this simple model is much better, yet regsubsets does not pick it up. Why
does R do this, and how can I change it?
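
The comparison described above can be reproduced roughly as follows. This is only a sketch on simulated data: the data frame `sim` and the columns `y`, `x1`, `x2`, `x3` are made up for illustration and stand in for `olympiadaten` and its variables. It reads the adjusted R squared values directly from `summary()` instead of from the plot.

```r
library(leaps)  # provides regsubsets()

# Simulated stand-in for the real data (hypothetical names, no missing values)
set.seed(1)
n <- 100
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- 2 * sim$x1 + 0.5 * sim$x2 + rnorm(n)

# Best subset of each size, as in the post
fit_all <- regsubsets(y ~ x1 + x2 + x3, data = sim, nbest = 1, nvmax = 3)
subset_adjr2 <- summary(fit_all)$adjr2   # one adjusted R squared per model size

# Single-predictor regression fitted separately with lm()
fit_one <- lm(y ~ x1, data = sim)
single_adjr2 <- summary(fit_one)$adj.r.squared

# Number of rows each approach actually uses
rows_subsets <- nrow(na.omit(sim[, c("y", "x1", "x2", "x3")]))
rows_single  <- nobs(fit_one)

print(subset_adjr2)
print(single_adjr2)
print(c(rows_subsets, rows_single))
```

On complete data, the exhaustive search cannot report a best adjusted R squared below that of any single-variable model. One thing that might be worth checking in the real data, as a guess that the post does not confirm, is whether the two row counts differ: each call drops incomplete cases only for the variables in its own formula, so the fits may not be using the same observations.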

And a general question: if I take the best model by AIC, does this model
also have the highest (best) adjusted R squared? Should I select my models
by information criteria or by R squared? And what exactly is the
difference? Both take into account the fit and the number of variables,
right?
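
For reference, both quantities can be computed by hand for a linear model, which shows that they penalize the number of variables in different ways and so need not rank models identically. The sketch below uses simulated data with hypothetical names (`sim`, `y`, `x1`, `x2`) and assumes a Gaussian linear model with an intercept; it checks the hand-computed values against R's built-in ones.

```r
# Simulated data (hypothetical names), base R only
set.seed(42)
n <- 60
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$y <- 1 + 2 * sim$x1 + rnorm(n)

fit <- lm(y ~ x1 + x2, data = sim)
p   <- length(coef(fit)) - 1            # number of predictors (without intercept)
rss <- sum(resid(fit)^2)                # residual sum of squares
tss <- sum((sim$y - mean(sim$y))^2)     # total sum of squares

# Adjusted R squared: 1 - (RSS / (n - p - 1)) / (TSS / (n - 1))
adjr2_manual <- 1 - (rss / (n - p - 1)) / (tss / (n - 1))

# AIC = -2 * logLik + 2 * k, where k counts the p + 1 coefficients
# plus the error variance, so k = p + 2 for lm()
loglik     <- -n / 2 * (log(2 * pi) + log(rss / n) + 1)
aic_manual <- -2 * loglik + 2 * (p + 2)

adjr2_builtin <- summary(fit)$adj.r.squared
aic_builtin   <- AIC(fit)

print(c(adjr2_manual, adjr2_builtin))
print(c(aic_manual, aic_builtin))
```

Adjusted R squared rescales the RSS by the residual degrees of freedom, while AIC adds a fixed penalty of 2 per estimated parameter to a log-likelihood term, so the two can disagree about whether an extra variable is worth including.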


Thanks a lot for your help!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: subsets.png
Type: image/png
Size: 8196 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20120925/95425045/attachment-0002.png>

