[R] Stepwise Regression and PLS

Chris Lawrence cnlawren at olemiss.edu
Mon Feb 2 06:03:45 CET 2004


Jinsong Zhao wrote:

> Do you mean different procedures will provide different results? Maybe 
> I don't understand your email correctly. Now, I just hope I can get 
> a reasonable linear model using the stepwise method in R, but I don't know 
> how to deal with the collinearity problem.

What Dr. Harrell means (in part) is that stepwise regression often leads 
to models that "overfit" the observed data, i.e. models that fit the 
sample at hand but do not generalize to new data.  More elaboration can 
be found here (including comments from Dr. Harrell):

http://www.gseis.ucla.edu/courses/ed230bc1/notes4/swprobs.html

Key quote: "Personally, I would no more let an automatic routine select 
my model than I would let some best-fit procedure pack my suitcase."  
The bottom-line advice here: don't use stepwise regression.
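
To make the overfitting point concrete, here is a minimal sketch (my own 
illustration, not taken from the page above) that runs step() on a 
response that is pure noise.  The sample size, number of predictors, seed 
and all variable names are arbitrary assumptions.  Because selection is by 
AIC, step() will often retain a few of the noise variables anyway, and the 
selected model then tends to predict fresh data no better than the 
intercept-only model.

```r
# Sketch: stepwise selection applied to pure noise (all names are illustrative).
set.seed(1)                      # arbitrary seed, only for reproducibility
n <- 50; p <- 20
X <- as.data.frame(matrix(rnorm(n * p), n, p))   # columns V1..V20
X$y <- rnorm(n)                  # response unrelated to every predictor

null_fit <- lm(y ~ 1, data = X)  # intercept-only model
full_fit <- lm(y ~ ., data = X)  # all 20 noise predictors

# Stepwise search by AIC, starting from the intercept-only model;
# the full model's formula is the upper limit of the search.
sel <- step(null_fit, scope = formula(full_fit), direction = "both", trace = 0)

kept  <- attr(terms(sel), "term.labels")  # predictors that survived selection
in_r2 <- summary(sel)$r.squared           # apparent (in-sample) fit

# Fresh data from the same process, to compare out-of-sample error.
new <- as.data.frame(matrix(rnorm(n * p), n, p))
new$y <- rnorm(n)
out_mse  <- mean((new$y - predict(sel, newdata = new))^2)
null_mse <- mean((new$y - predict(null_fit, newdata = new))^2)

print(kept)
print(in_r2)
print(c(selected = out_mse, null = null_mse))
```

Comparing in_r2 with the two out-of-sample errors over a few different 
seeds gives a feel for how optimistic the apparent fit of a selected model 
can be.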

Peter Kennedy, in "A Guide to Econometrics" (pp. 187-89), suggests the 
following options for dealing with collinearity:

1. "Do nothing."  The main problem in OLS when variables are collinear 
is that the estimated variances of the parameters are often inflated; 
the estimates themselves remain unbiased.
2. Obtain more data.
3. Formalize relationships among regressors (for example, in a 
simultaneous equation model).
4. Specify a relationship among the *parameters*.
5. Drop one or more variables.  (In essence, a special case of #4 in 
which the coefficients are set to zero.)
6. Incorporate estimates from other studies.  (A Bayesian might consider 
using a strong prior.)
7. Form a principal component from the variables, and use that instead.
8. Shrink the OLS estimates using the ridge or Stein estimators.
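
A rough sketch of how options 1, 7 and 8 might look in base R, on 
simulated data where x2 is nearly a copy of x1.  Nothing here comes from 
Kennedy's book: the data, the variable names and the ridge penalty 
lambda = 1 are assumptions chosen for illustration, and the ridge step 
uses the textbook closed form rather than any particular package (the 
Stein estimator is not shown).

```r
# Sketch of options 1, 7 and 8 on simulated collinear data (names illustrative).
set.seed(2)
n  <- 100
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)    # nearly a copy of x1 -> strong collinearity
x3 <- rnorm(n)
y  <- 1 + x1 + x2 + x3 + rnorm(n)
d  <- data.frame(y, x1, x2, x3)

# Option 1: plain OLS.  Standard errors for x1 and x2 are inflated
# relative to the one for x3.
ols <- lm(y ~ x1 + x2 + x3, data = d)
se  <- coef(summary(ols))[, "Std. Error"]

# Variance inflation factor for each regressor: 1 / (1 - R^2_j), where
# R^2_j comes from regressing regressor j on the other regressors.
vars <- c("x1", "x2", "x3")
vif <- sapply(vars, function(v) {
  others <- setdiff(vars, v)
  r2 <- summary(lm(reformulate(others, response = v), data = d))$r.squared
  1 / (1 - r2)
})

# Option 7: replace x1 and x2 by their first principal component.
pc <- prcomp(d[, c("x1", "x2")], center = TRUE, scale. = TRUE)
d$pc1 <- pc$x[, 1]
pcr <- lm(y ~ pc1 + x3, data = d)

# Option 8: ridge estimate (X'X + lambda I)^{-1} X'y on standardized
# regressors and a centered response; lambda = 1 is an arbitrary choice.
Xs <- scale(as.matrix(d[, vars]))
yc <- d$y - mean(d$y)
lambda <- 1
ridge   <- solve(crossprod(Xs) + lambda * diag(ncol(Xs)), crossprod(Xs, yc))
ols_std <- solve(crossprod(Xs), crossprod(Xs, yc))  # same scale, lambda = 0

print(round(se, 3))
print(round(vif, 1))
print(coef(summary(pcr)))
print(cbind(ols = drop(ols_std), ridge = drop(ridge)))
```

In practice lambda would be chosen by something like cross-validation 
rather than fixed in advance, and lm.ridge() in the MASS package is one 
existing implementation to look at.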

Hope this helps.


Chris

-- 
Dr. Chris Lawrence <cnlawren at olemiss.edu> - http://blog.lordsutch.com/



