[R] model simplification using Crawley as a guide

Frank E Harrell Jr f.harrell at vanderbilt.edu
Thu Jun 12 00:53:06 CEST 2008


Ben Bolker wrote:
> Lucke, Joseph F <Joseph.F.Lucke <at> uth.tmc.edu> writes:
> 
>> And to follow FH and HW
>>
>> What level of significance are you using? .05 is excessively liberal.
>> Are you adjusting your p-values for the number of possible models? Do
>> you realize the p-values for dropping a term, being selected as the
>> maximum of a set of p-values, do not follow their usual distributions?
>> How are you compensating for sample size, as a p-value's being
>> significant is a function of sample size?  How are you compensating for
>> the fact that the current model choice is dependent on the previous
>> model choices? How do you know your tree of model choices is the optimal
>> one?  Have you considered cross-validation?  Are you looking for a model
>> that truly describes a phenomenon or a predictive model that can be used
>> for practical purposes?
>>
> 
>    Ouch.  While Frank Harrell and Joseph Lucke are raising
> serious issues about model selection, maybe we could keep in mind that
> we don't want to scare off all the students who ever try to use R
> to figure out basic statistics.  I would follow Peter Dalgaard's advice
> (about "drop1") and Hadley Wickham's (about graphical diagnostics), 
> and if possible bring up the other issues about
> model selection with others around you -- if you're a student, ask
> your prof. or someone in the stats department.  It can be tough
> to try to do things right if those around you are still
> doing them wrong ...  If you tell us what field you're in we
> may be able to point you to more subject-specific references
> (e.g. Whittingham, Mark J., Philip A. Stephens, Richard B. Bradbury, and Robert
> P. Freckleton. 2006. Why do we still use stepwise modelling in ecology and
> behaviour? Journal of Animal Ecology 75, no. 5: 1182-1189)
> 
>    Ben Bolker

Good points Ben.  For now I'd recommend simply that the allergic 
reaction to insignificant statistical tests be treated with an 
antihistamine :-)

Frank



-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University


