[R] all possible subsets, with AIC
NutterB at ccf.org
Mon Feb 15 14:01:28 CET 2010
I've dabbled in this a little bit, and the result of my dabbling is
attached. I'll give you fair warning, however. The attached function
can take a long time to run, and if your model has 10 or more
predictors, you may be retired before it finishes running.
In any case, it will fit models for all possible subsets of predictors in
lm, glm, or coxph. If requested, it will also plot the R-squared,
Adjusted R-squared, AIC, or BIC of those models (when the values are
applicable to the model). It might give you a good starting point.
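
For readers without the attachment, here is a minimal sketch of the general idea, restricted to lm. It is not the attached function: the helper name all_subsets_aic and the mtcars variables are assumptions chosen purely for illustration. With p predictors there are 2^p - 1 non-empty subsets (127 for p = 7, 1023 for p = 10), which is why the run time grows so quickly.

```r
# Hypothetical helper (not the function attached to the post): fit lm() on
# every non-empty subset of the candidate predictors and tabulate AIC.
all_subsets_aic <- function(response, predictors, data) {
  # Enumerate subsets of size 1..p as a flat list of character vectors
  subsets <- unlist(
    lapply(seq_along(predictors),
           function(k) combn(predictors, k, simplify = FALSE)),
    recursive = FALSE
  )
  # Fit one linear model per subset
  fits <- lapply(subsets, function(vars) {
    lm(reformulate(vars, response = response), data = data)
  })
  out <- data.frame(
    model = vapply(subsets, paste, character(1), collapse = " + "),
    k     = lengths(subsets),              # number of predictors in the model
    AIC   = vapply(fits, AIC, numeric(1)),
    stringsAsFactors = FALSE
  )
  out[order(out$AIC), ]                    # smallest AIC first
}

# Illustration on a built-in data set; the variable choice is arbitrary
res <- all_subsets_aic("mpg", c("wt", "hp", "qsec"), mtcars)
print(res)
```

The same loop should work for glm or coxph by swapping the fitting call, as long as AIC() has a method for the fitted object.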
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of kcleary2
Sent: Friday, February 12, 2010 3:19 PM
To: r-help at r-project.org
Subject: [R] all possible subsets, with AIC
I have a question about doing ALL possible subsets regression with a
general linear model. My goal is to produce cumulative Akaike weights
for each of 7 predictor variables. To obtain this I need R to:
1. Show me ALL possible subsets, not just the best possible subsets
2. Give me an AIC value for each model (instead of a BIC value).
I have tried to do this in library(RcmdrPlugin.HH), and using the
"leaps" code below. With the leaps code my problem is that my response
is not a vector, it's a single value (density of a species).
Any help would be greatly appreciated. Thanks a lot,
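
One way the cumulative Akaike weights described in the question could be computed is sketched below. It assumes the usual definition, w_i = exp(-delta_i / 2) / sum_j exp(-delta_j / 2) with delta_i = AIC_i - min(AIC), and sums the weights of all models that contain a given predictor. The mtcars data and the three predictors are stand-ins, not the poster's data, and the intercept-only model is included in the model set as an assumption.

```r
# Stand-in predictors and response from a built-in data set (assumption)
predictors <- c("wt", "hp", "qsec")

# Inclusion matrix: one row per model, TRUE where a predictor is in the model.
# The all-FALSE row is the intercept-only model.
incl <- as.matrix(expand.grid(rep(list(c(FALSE, TRUE)), length(predictors))))
colnames(incl) <- predictors

# AIC for every model in the set
aic <- apply(incl, 1, function(use) {
  rhs <- if (any(use)) predictors[use] else "1"
  AIC(lm(reformulate(rhs, response = "mpg"), data = mtcars))
})

# Akaike weights: relative likelihood of each model, normalised to sum to 1
delta   <- aic - min(aic)
weights <- exp(-delta / 2) / sum(exp(-delta / 2))

# Cumulative weight per predictor: sum of weights over models containing it
cum_weights <- colSums(incl * weights)
print(round(cum_weights, 3))
```

Because every predictor appears in the same number of models here (4 of 8), the summed weights are directly comparable across predictors.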
leaps() performs an exhaustive search for the best subsets of the
variables in x for predicting y in linear regression, using an efficient
branch-and-bound algorithm. It is a compatibility wrapper for regsubsets,
which does the same thing better.
Since the algorithm returns a best model of each size, the results do
not depend on a penalty model for model size: it doesn't make any
difference whether you want to use AIC, BIC, CIC, DIC, ...
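
The point about penalties can be checked with base R alone. The sketch below (mtcars and the four predictors are arbitrary stand-ins) fits every two-predictor model and shows that RSS, AIC and BIC rank them identically, since the penalty term is the same for all models of one size. The ranking across different sizes does depend on the penalty, which is why a best-of-each-size search is not enough when an AIC value is wanted for every model.

```r
# Arbitrary stand-in predictors from a built-in data set (assumption)
predictors <- c("wt", "hp", "qsec", "drat")

# All models with exactly two predictors
pairs <- combn(predictors, 2, simplify = FALSE)
fits  <- lapply(pairs, function(v) {
  lm(reformulate(v, response = "mpg"), data = mtcars)
})

crit <- data.frame(
  model = vapply(pairs, paste, character(1), collapse = " + "),
  RSS   = vapply(fits, deviance, numeric(1)),  # residual sum of squares
  AIC   = vapply(fits, AIC, numeric(1)),
  BIC   = vapply(fits, BIC, numeric(1))
)

# Within one model size the three criteria give the same ordering
same_order <- identical(order(crit$RSS), order(crit$AIC)) &&
  identical(order(crit$AIC), order(crit$BIC))
print(crit[order(crit$RSS), ])
print(same_order)
```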
leaps(x=, y=, wt=rep(1, NROW(x)), int=TRUE, method=c("Cp", "adjr2",
"r2"), nbest=10, names=NULL, df=NROW(x),
x       A matrix of predictors
y       A response vector
wt      Optional weight vector
int     Add an intercept to the model
method  Calculate Cp, adjusted R-squared or R-squared
nbest   Number of subsets of each size to report
names   Vector of names for columns of x
df      Total degrees of freedom to use instead of nrow(x) in
        calculating Cp and adjusted R-squared
Implement misfeatures of leaps() in S
Department of Fish, Wildlife, and Conservation Biology Colorado State
University Fort Collins, CO