[R] model simplification using Crawley as a guide
Marc Schwartz
marc_schwartz at comcast.net
Thu Jun 12 03:10:10 CEST 2008
on 06/11/2008 05:53 PM Frank E Harrell Jr wrote:
> Ben Bolker wrote:
>> Lucke, Joseph F <Joseph.F.Lucke <at> uth.tmc.edu> writes:
>>
>>> And to follow FH and HW
>>>
>>> What level of significance are you using? .05 is excessively liberal.
>>> Are you adjusting your p-values for the number of possible models? Do
>>> you realize the p-values for dropping a term, being selected as the
>>> maximum of a set of p-values, do not follow their usual distributions?
>>> How are you compensating for sample size, as a p-value's being
>>> significant is a function of sample size? How are you compensating for
>>> the fact that the current model choice is dependent on the previous
>>> model choices? How do you know your tree of model choices is the optimal
>>> one? Have you considered cross-validation? Are you looking for a model
>>> that truly describes a phenomenon or a predictive model that can be used
>>> for practical purposes?
>>>
>>
>> Ouch. While Frank Harrell and Joseph Lucke are raising
>> serious issues about model selection, maybe we could keep in mind that
>> we don't want to scare off all the students who ever try to use R
>> to figure out basic statistics. I would follow Peter Dalgaard's advice
>> (about "drop1") and Hadley Wickham's (about graphical diagnostics),
>> and if possible bring up the other issues about
>> model selection with others around you -- if you're a student, ask
>> your prof. or someone in the stats department. It can be tough
>> to try to do things right if those around you are still
>> doing them wrong ... If you tell us what field you're in we
>> may be able to point you to more subject-specific references
>> (e.g. Whittingham, Mark J., Philip A. Stephens, Richard B. Bradbury,
>> and Robert P. Freckleton. 2006. Why do we still use stepwise modelling
>> in ecology and behaviour? Journal of Animal Ecology 75, no. 5:
>> 1182-1189).
>>
>> Ben Bolker
>
> Good points Ben. For now I'd recommend simply that the allergic
> reaction to insignificant statistical tests be treated with an
> antihistamine :-)
A vote for Frank's comment to be added to the 'fortunes' package.
:-)
Regards,
Marc