[R] Regsubsets model selection
Maximilian Lklweryc
maxlklweryc at gmail.com
Tue Sep 25 17:05:06 CEST 2012
Hi,
I have 12 independent variables and one dependent variable. I want to
select the model with the highest adjusted R squared using the regsubsets
command from the leaps package, so I run:
library(leaps)
plot(regsubsets(Gesamt ~ CommunistSocialist + CountrySize + GNI + Lifeexp +
                  Schoolyears + ExpMilitary + Mortality + PopPoverty +
                  PopTotal + ExpEdu + ExpHealth,
                data = olympiadaten, nbest = 1, nvmax = 12),
     scale = "adjr2")
Then I get the picture I attached. The problem is that the best model has
an adjusted R squared of 0.49. But if I regress my y on, for example, only
the variable PopTotal, I already get an adjusted R squared of 0.779! So
this simple model is much better, yet the regsubsets command does not
pick it up. Why does R do this, and how can I change it?
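One thing that may be worth checking is whether both fits use the same rows. This is only a guess about the cause, not a confirmed diagnosis. With the default na.action (na.omit), a formula containing all predictors keeps only the rows that are complete in every variable, whereas a single-predictor lm() keeps every row where just those two variables are present. A minimal sketch in base R, with a simulated data frame `dat` standing in for olympiadaten (the data, the missing-value pattern, and the object names are all hypothetical):

```r
# Simulated stand-in for olympiadaten (hypothetical data, not the real set)
set.seed(1)
n <- 100
dat <- data.frame(PopTotal = rnorm(n), GNI = rnorm(n))
dat$Gesamt <- 2 * dat$PopTotal + rnorm(n)
dat$GNI[1:40] <- NA  # assumption: GNI is missing for 40 rows

# Single-predictor fit: uses every row where Gesamt and PopTotal are present
fit_small <- lm(Gesamt ~ PopTotal, data = dat)
n_small <- nobs(fit_small)

# A formula with all predictors keeps only rows complete in every variable
# (the default na.action, na.omit, drops the rest)
mf_full <- model.frame(Gesamt ~ PopTotal + GNI, data = dat)
n_full <- nrow(mf_full)

# Refit the single-predictor model on the reduced sample for a fair comparison
fit_small_cc <- lm(Gesamt ~ PopTotal, data = mf_full)

print(c(n_small = n_small, n_full = n_full))
print(summary(fit_small)$adj.r.squared)
print(summary(fit_small_cc)$adj.r.squared)
```

If the two row counts differ on the real data, the adjusted R squared values from the two fits are not directly comparable, because they come from different samples.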
And a general question: if I take the best model by AIC, does this model
also have the highest (best) adjusted R squared? Should I select my models
by information criteria or by adjusted R squared? And what exactly is the
difference? I mean, both take into account the fit and the number of
variables, right?
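For reference, the two criteria in the question can be written side by side. This assumes an ordinary least squares model with Gaussian errors, n observations, p predictors (not counting the intercept), k estimated parameters, residual sum of squares RSS and total sum of squares TSS. The symbols are introduced here for illustration and do not appear in the output above.

```latex
\[
R^2_{\mathrm{adj}} = 1 - \frac{\mathrm{RSS}/(n-p-1)}{\mathrm{TSS}/(n-1)},
\qquad
\mathrm{AIC} = n \ln\!\left(\frac{\mathrm{RSS}}{n}\right) + 2k + \text{const}
\]
```

Both reward a small RSS and penalize model size, but the penalties differ, so the two rankings need not agree across models of different size. Among models with the same number of predictors fitted to the same data, both simply order the models by RSS.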
Thanks a lot for your help!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: subsets.png
Type: image/png
Size: 8196 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20120925/95425045/attachment-0002.png>