Gavin Simpson
gavin.simpson at ucl.ac.uk
Mon Sep 29 11:02:08 CEST 2008
On Sun, 2008-09-28 at 21:23 -0500, Frank E Harrell Jr wrote:
> Darin Brooks wrote:
> > I certainly appreciate your comments, Bert. It is abundantly clear that I
> Darin,
> I think the point is that the confidence you can assign to the "best
> available variables" is zero. That is the probability that stepwise
> variable selection will select the correct variables.
> It is probably better to build a model based on the knowledge in the
> field you alluded to, rather than to use P-values to decide.
> Frank Harrell
Hi Frank, et al.,
I don't have Darin's original email to hand just now, but IIRC he turned
on testing by p-values, which add1 and drop1 do not do by default.
Venables and Ripley's MASS includes stepAIC, and in the regression
chapters they make use of drop1 (apologies if I have made sweeping
statements that are just plain wrong here; I'm at home this morning and
don't seem to have either of my two copies of MASS with me).
Would the same criticisms that you, Bert and others have made in this
thread be levelled at simplifying models using AIC rather than
p-values? Part of the issue with stepwise procedures is that they don't
control the overall Type I error rate: even if you use 0.05 as the
cut-off for each test, the overall error rate can be much larger. Does
AIC allow one to get out of this part of the problem with stepwise
methods?
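To put a rough number on how quickly the overall rate can grow, here is
a minimal sketch in Python (not from the thread). It assumes, purely for
illustration, that the k tests are independent. The tests within a
stepwise search are not independent, so the figures are indicative only.
The function name familywise_error_rate is made up for this example.

```python
def familywise_error_rate(alpha: float, k: int) -> float:
    """Chance of at least one false positive across k tests at level alpha.

    Assumption (for illustration only): the k tests are independent.
    """
    # Probability that no test rejects a true null is (1 - alpha)^k,
    # so the chance of one or more false rejections is the complement.
    return 1.0 - (1.0 - alpha) ** k


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        rate = familywise_error_rate(0.05, k)
        print(f"{k:2d} tests at 0.05 each: overall rate {rate:.3f}")
```

Under that independence assumption, ten tests at 0.05 each already give
an overall rate of about 0.40.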
I'd appreciate any thoughts you or others on the list may have on this.
All the best, and thanks for an interesting discussion thus far.
G
> > It's more a statement that it expresses a statistical perspective very
> > succinctly, somewhat like a Zen koan. Frank's book, "Regression Modeling
> > Strategies", has entire chapters on reasoned approaches to your question.
> > His website also has quite a bit of material free for the taking.
