[R] how do remove those predictor which have p value greater than 0.05 in GLM?

Greg Snow Greg.Snow at imail.org
Tue Nov 23 21:48:38 CET 2010


What Frank was trying to tell you is that p-values don't have much meaning if you do stepwise regression (sometimes they are worse than useless).  The p-values are computed under certain assumptions, one of which is that the model was chosen before looking at the data.  Once you remove a variable because it is "not significant" and then refit, those assumptions no longer hold, so the p-values are not answering the question that you are asking.
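
A small simulation can make this concrete.  The sketch below is only an illustration, under assumptions of my own choosing: the data are pure noise (the outcome is unrelated to all 20 predictors), and `backward_by_p` is a made-up helper name, not a function from R or any package.  It drops the least significant term and refits until every remaining term has p <= 0.05.

```r
set.seed(1)
n <- 200   # observations
p <- 20    # candidate predictors, all pure noise

x <- as.data.frame(matrix(rnorm(n * p), n, p))
names(x) <- paste0("x", seq_len(p))
dat <- data.frame(y = rbinom(n, 1, 0.5), x)   # y is unrelated to every predictor

# Hypothetical helper: backward elimination on Wald p-values.
backward_by_p <- function(fit, alpha = 0.05) {
  repeat {
    pv <- coef(summary(fit))[-1, "Pr(>|z|)", drop = FALSE]  # skip the intercept
    if (nrow(pv) == 0 || max(pv) <= alpha) break            # nothing left to drop
    worst <- rownames(pv)[which.max(pv)]                    # least significant term
    fit <- update(fit, as.formula(paste(". ~ . -", worst)))
  }
  fit
}

full  <- glm(y ~ ., data = dat, family = binomial)
final <- backward_by_p(full)
coef(summary(final))   # every surviving predictor is reported with p <= 0.05
```

Any predictor that survives is there purely by chance, yet the final summary shows it with p <= 0.05, because the elimination rule guarantees that.  The reported p-values describe the selection procedure, not the predictors.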

I remember the first time I read about this and had the knee-jerk reaction that stepwise regression was useful, based mainly on having learned it from a textbook and used it several times to get something that looked good.  But my personal epiphany came when I asked myself the question "What question does stepwise regression answer?".  I still have not found the answer (question) to that question, but I have determined that none of the questions that I am interested in answering fit.

With modern software (R being one example) there are better, and actually correct, tools for answering the questions that stepwise regression used to be used for, and it is better to use those tools.  Which tool is best depends on what question you are actually interested in answering.  Stepwise procedures continue to be taught, mostly out of historical inertia ("well, I learned it when I took regression"), but things are shifting away from them now.  They should probably still be mentioned, so that new graduates can still get jobs when asked about them in interviews, and as history not to be repeated.
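
As one possible illustration of matching the tool to the question (an example picked for this sketch, not a recommendation for any particular data set): if the question is "do x2 and x3, taken together, add anything beyond x1?", that is a single pre-planned comparison of two nested models, which base R answers with a likelihood-ratio test.  The data and variable names below are simulated and hypothetical.

```r
set.seed(2)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(0.8 * dat$x1))   # assumption: only x1 matters

# Both models are written down before looking at any p-values.
reduced <- glm(y ~ x1,           data = dat, family = binomial)
full    <- glm(y ~ x1 + x2 + x3, data = dat, family = binomial)

# One likelihood-ratio test of x2 and x3 jointly (2 degrees of freedom).
lrt <- anova(reduced, full, test = "Chisq")
print(lrt)

# AIC for the same two pre-specified models.
AIC(reduced, full)
```

Because both models are fixed in advance and only one test is run, the p-value here answers the question that was actually asked.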

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of shubha
> Sent: Monday, November 22, 2010 3:10 PM
> To: r-help at r-project.org
> Subject: Re: [R] how do remove those predictor which have p value greater than 0.05 in GLM?
> 
> 
> Thanks for the response, Frank.
> I am not saying that I want to delete variables just because p>0.05. My
> concern is this: I am using backward stepwise logistic regression, which
> keeps a variable in the final model if it contributes significantly to
> the model; otherwise, the variable should not be in the final model.
> Other software gives the correct results, but R did not. I want to keep
> those variables with p<0.05 and exclude the rest from the model. If you
> include those variables, it will affect the log-likelihood ratio and AIC.
> I want to set the p-value criterion to <=0.05 in the model. Any suggestions?
> Thanks
> 


