[R] Stepwise Regression and PLS

Frank E Harrell Jr feh3k at spamcop.net
Mon Feb 2 12:43:44 CET 2004


On Sun, 1 Feb 2004 20:03:36 -0800 (PST)
Jinsong Zhao <jinsong_zh at yahoo.com> wrote:

> 
> --- Frank E Harrell Jr <feh3k at spamcop.net> wrote:
> > > 
> > > For the case of stepwise regression, I have found that
> > > the subsets I got using regsubsets() are collinear.
> > > However, the variables in SPSS's result are not
> > > collinear. I wonder what I should do to get a same or
> > > better linear model.
> > 
> > I think you missed the point.  None of the variable selection
> > procedures will provide results that have a fair probability of
> > replicating in another sample.
> > 
> > FH
> > ---
> > Frank E Harrell Jr   Professor and Chair           School of Medicine
> >                      Department of Biostatistics   Vanderbilt University
> 
> Do you mean different procedures will provide
> different results? Maybe I don't understand your email
> correctly. Now, I just hope I could get a reasonable
> linear model using stepwise method in R, but I don't
> know how to deal with collinear problem.
> 
> =====
> (Mr.) Jinsong Zhao

No, I mean that the SAME procedure will provide different results.  Use the
bootstrap, or use simulation to repeatedly sample from the same population
and the same true regression model.  You will see dramatically different
"final models" selected by the same algorithm.  The algorithm is inherently
unstable unless perhaps you have a sample an order of magnitude larger
than the one you have.  See
http://www.pitt.edu/~wpilib/statfaq/regrfaq.html, which contains some good
references.
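
A minimal sketch of the bootstrap check described above, using only base R.
Everything in it is an illustrative assumption rather than anything from
your data: the simulated predictors, the sample size n, the true
coefficients in beta, the number of resamples B, and the choice of
backward selection by AIC via step().  The helper select_vars() is a
hypothetical name.  The idea is simply to rerun the same selection
algorithm on each bootstrap resample and tabulate what it picks.

```r
# Sketch: how stable is stepwise selection under bootstrap resampling?
# All settings below (n, p, beta, B) are illustrative assumptions.
set.seed(1)

n <- 100                                       # sample size
p <- 10                                        # number of candidate predictors
beta <- c(0.5, 0.5, 0.3, 0.3, rep(0, p - 4))   # true model: only x1..x4 matter

# Correlated predictors: a shared component z induces collinearity
z <- rnorm(n)
X <- sapply(seq_len(p), function(j) 0.7 * z + rnorm(n))
colnames(X) <- paste0("x", seq_len(p))
dat <- data.frame(y = drop(X %*% beta) + rnorm(n), X)

# Hypothetical helper: backward stepwise selection by AIC,
# returning the names of the variables left in the final model
select_vars <- function(d) {
  fit <- step(lm(y ~ ., data = d), direction = "backward", trace = 0)
  sort(attr(terms(fit), "term.labels"))
}

# Repeat the SAME procedure on B bootstrap resamples of the rows
B <- 200
selected <- replicate(B, {
  idx <- sample(nrow(dat), replace = TRUE)
  paste(select_vars(dat[idx, ]), collapse = " + ")
})

# Number of distinct "final models" chosen across resamples
length(unique(selected))

# Proportion of resamples in which each candidate variable was selected
vars <- unlist(strsplit(selected, " + ", fixed = TRUE))
sort(table(factor(vars, levels = colnames(X))) / B, decreasing = TRUE)
```

To try it on real data, replace dat with your own data frame and adjust
the formula.  If the count of distinct models is close to B, or the
selection proportions of most variables sit well away from 0 and 1, the
"final model" from a single run should not be trusted to replicate.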

---
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University



