[R] variable selection in logistic

Frank E Harrell Jr f.harrell at vanderbilt.edu
Thu Sep 3 06:06:49 CEST 2009


annie Zhang wrote:
> Hi, Frank,
>  
> You mean backward and forward stepwise selection are bad? And you 
> suggest that penalized logistic regression is the best choice? Is there 
> any function to do it, as well as to select the best penalty?
>  
> Annie

All variable selection is bad unless it's done in the context of 
penalization. You'll need penalized logistic regression, not necessarily 
with variable selection: for example, a quadratic penalty as in a case 
study in my book, or an L1 penalty (lasso) using other packages.
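
A minimal sketch of both kinds of penalty, with the penalty chosen by 
cross-validation. Everything in it is an assumption made for illustration: 
the thread names no package, so glmnet is used here as one of the "other 
packages"; the data are simulated with as many variables as samples; and 
this is not the case study from the book. In glmnet, alpha = 0 gives the 
quadratic (ridge) penalty and alpha = 1 gives the L1 (lasso) penalty.

```r
# Sketch only: simulated data; assumes the glmnet package is installed.
library(glmnet)

set.seed(1)
n <- 100; p <- 100                          # as many candidate variables as samples
x <- matrix(rnorm(n * p), n, p)
y <- rbinom(n, 1, plogis(x[, 1] - x[, 2]))  # only two variables matter in this toy setup

# Quadratic (ridge) penalty: keeps every variable, shrinks all coefficients.
ridge <- cv.glmnet(x, y, family = "binomial", alpha = 0)

# L1 (lasso) penalty: shrinks some coefficients exactly to zero.
lasso <- cv.glmnet(x, y, family = "binomial", alpha = 1)

# Penalty values that minimize the cross-validated deviance.
ridge$lambda.min
lasso$lambda.min

# Coefficients and predicted probabilities at the chosen penalty.
coef(lasso, s = "lambda.min")
predict(ridge, newx = x[1:5, ], s = "lambda.min", type = "response")
```

The variables the lasso keeps can change from one resample to the next, so 
the nonzero set is better read as a by-product of shrinkage than as a list 
of "the best variables".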

Frank

> 
> On Wed, Sep 2, 2009 at 7:41 PM, Frank E Harrell Jr 
> <f.harrell at vanderbilt.edu> wrote:
> 
>     David Winsemius wrote:
> 
> 
>         On Sep 2, 2009, at 9:36 PM, annie Zhang wrote:
> 
>             Hi, R users,
> 
>             What may be the best function in R to do variable selection
>             in logistic
>             regression?
> 
> 
>         PhD theses and books by famous statisticians have been pursuing
>         the answer to that question for decades.
> 
>             I have the same number of variables as the number of samples,
>             and I want to select the best variables for prediction. Is
>             there any function
>             doing forward selection followed by backward elimination in
>             stepwise
>             logistic regression?
> 
> 
>         You should probably be reading up on penalized regression
>         methods. The stepwise procedures reporting unadjusted
>         "significance" made available by SAS and SPSS to the unwary
>         neophyte user have very poor statistical properties.
> 
>         -- 
> 
>         David Winsemius, MD
> 
> 
>     Amen to that.
> 
>     Annie, resist the temptation.  These methods bite.
> 
>     Frank
> 
> 
>         Heritage Laboratories
>         West Hartford, CT
> 
> 
> 
> 
>     -- 
>     Frank E Harrell Jr   Professor and Chair           School of Medicine
>                         Department of Biostatistics   Vanderbilt University
> 
> 


-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University



