[R] Off Topic: Statistical "philosophy" rant
Dan Bolser
dmb at mrc-dunn.cam.ac.uk
Thu Jan 13 01:38:10 CET 2005
On Wed, 12 Jan 2005, Berton Gunter wrote:
>R-Listers.
>
>The following is a rant originally sent privately to Frank Harrell in
>response to remarks he made on this list. The ideas are not new or original,
>but he suggested I share it with the list, as he felt that it might be of
>wider interest, nonetheless. I have real doubts about this, and I apologize
>in advance to those who agree that I should have kept my remarks private.
>In view of this, if you wish to criticize my remarks on list, that's fine,
>but I won't respond (I've said enough already!). I would be happy to discuss
>issues (a little) further off list with anyone who wishes to bother, but not
>on list.
>
>Also, Frank sent me a relevant reference for those who might wish to read a
>more thoughtful consideration of the issues:
>
>@ARTICLE{far92cos,
>  author = {Faraway, J. J.},
>  year = 1992,
>  title = {The cost of data analysis},
>  journal = {J Comp Graphical Stat},
>  volume = 1,
>  pages = {213-229},
>  annote = {bootstrap; validation; predictive accuracy; modeling strategy;
>            regression diagnostics; model uncertainty}
>}
>
>I welcome further relevant references, pro or con!
>
>Finally, I need to emphasize that these are clearly my very personal views
>and do not reflect those of my company or colleagues.
>
>Cheers to all ...
>-----------
>
>The relevant portion of Frank's original comment was in a thread about K-S
>tests for the goodness of fit of a parametric distribution:
>
>...
>> If you use the empirical CDF to select a parametric
>> distribution, the final estimate of the distribution will inherit the
>> variance of the ECDF. The main reason statisticians think that
>> parametric curve fits are far more efficient than nonparametric ones is
>> that they don't account for model uncertainty in their final
>> confidence intervals.
>>
>> -- Frank Harrell
>
>My reply:
>
>That's a perceptive remark, but I would go further... You mentioned
>**model** uncertainty. In fact, in any data analysis in which we explore the
>data first to choose a model, fit the model (parametric or not), and then
>use whatever (pivots from parametric analysis, bootstrapping, ...) to say
>something about "model uncertainty," we're always kidding ourselves and our
>colleagues because we fail to take into account the considerable variability
>introduced by our initial subjective exploration and subsequent choice of
>modeling strategy. One can only say (at best) that the stated model
>uncertainty is an underestimate of the true uncertainty, and very likely a
>considerable one, because the choice of model is itself subjective.
>
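A minimal sketch of the point above, not anything from the original thread: a small simulation in which the "exploration" step is reduced to picking, out of several candidate predictors, the one most correlated with the response, and then reporting the usual 95% confidence interval for its slope as if it had been chosen in advance. Everything here is an assumption made for illustration: the names (slope_ci, abs_corr, simulate), the sample size of 50, the 10 candidates, and the fact that all predictors are pure noise, so the true slope is always 0.

```python
import math
import random

N = 50              # observations per simulated data set (assumed for illustration)
T_CRIT_48 = 2.0106  # two-sided 95% t critical value for N - 2 = 48 degrees of freedom


def slope_ci(x, y):
    """Nominal 95% confidence interval for the slope of y on x (simple OLS)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope - T_CRIT_48 * se, slope + T_CRIT_48 * se


def abs_corr(x, y):
    """Absolute sample correlation, used here as the 'exploration' criterion."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return abs(sxy) / math.sqrt(sxx * syy)


def simulate(reps=1000, k=10, seed=1):
    """Return (coverage with a prespecified predictor, coverage after selection)."""
    rng = random.Random(seed)
    hits_fixed = 0
    hits_selected = 0
    for _ in range(reps):
        # The response and all k candidate predictors are independent noise,
        # so the true slope is 0 for every candidate.
        y = [rng.gauss(0, 1) for _ in range(N)]
        xs = [[rng.gauss(0, 1) for _ in range(N)] for _ in range(k)]

        # Analysis fixed in advance: always use the first predictor.
        lo, hi = slope_ci(xs[0], y)
        hits_fixed += lo <= 0.0 <= hi

        # Analysis chosen after looking: use the best-correlated predictor,
        # then report the same nominal interval as if no selection happened.
        best = max(xs, key=lambda x: abs_corr(x, y))
        lo, hi = slope_ci(best, y)
        hits_selected += lo <= 0.0 <= hi
    return hits_fixed / reps, hits_selected / reps


if __name__ == "__main__":
    fixed, selected = simulate()
    print(f"coverage, prespecified predictor: {fixed:.3f}")
    print(f"coverage, selected predictor:     {selected:.3f}")
```

Under these assumptions the prespecified interval should cover the true slope about 95% of the time, while the interval reported after selection should cover it only about 0.95^10, or roughly 60%, of the time. Real exploratory analysis is far less tidy than "pick the best of 10", so this only illustrates the direction of the effect, not its size in practice.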
>Now I in no way wish to discourage or abridge data exploration; only to
>point out that we statisticians have promulgated a self-serving and
>unrealistic view of the value of formal inference in quantifying true
>scientific uncertainty when we do such exploration -- and that there is
>therefore something fundamentally contradictory in our own rhetoric and
>methods. Taking a larger view, I think this remark is part of the deeper
>epistemological issue of characterizing what can be scientifically "known"
>or, indeed, defining the difference between science and art, say. My own
>view is that scientific certainty is a fruitless concept: we build models
>that we benchmark against our subjective measurements (as the measurements
>themselves depend on earlier scientific models) of "reality." Insofar as
>data can limit or support our flights of modeling fancy, they do; but in the
>end, it is neither an objective process nor one whose "uncertainty" can be
>strictly quantified.

I totally agree with the above and I am totally unqualified to comment on
the below.

You (and others) might find these papers interesting...

http://www.santafe.edu/~chaos/chaos/pubs.htm

Specifically papers like...

Synchronizing to the Environment: Information Theoretic Constraints on
Agent Learning.
http://www.santafe.edu/~cmg/papers/stte.pdf

Is Anything Ever New? Considering Emergence.
http://www.santafe.edu/~cmg/papers/EverNew.pdf

Observing Complexity and The Complexity of Observation.
http://www.santafe.edu/~cmg/papers/OCACO.pdf

What Lies Between Order and Chaos?
http://www.santafe.edu/~cmg/papers/wlboac.pdf

And probably many more.

>In creating the illusion that "statistical methods" can
>overcome these limitations, I think we have both done science a disservice
>and relegated ourselves to an isolated, fringe role in scientific inquiry.
>
>Needless to say, opposing viewpoints to such iconoclastic remarks are
>cheerfully welcomed.
Does it make any difference to the mass of Saturn?
Dan.
>
>Best regards,
>
>Bert Gunter
>
>______________________________________________
>R-help at stat.math.ethz.ch mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>