[R] Off Topic: Statistical "philosophy" rant

Dan Bolser dmb at mrc-dunn.cam.ac.uk
Thu Jan 13 01:38:10 CET 2005


On Wed, 12 Jan 2005, Berton Gunter wrote:

>R-Listers.
>
>The following is a rant originally sent privately to Frank Harrell in
>response to remarks he made on this list. The ideas are not new or original,
>but he suggested I share it with the list, as he felt it might nonetheless be
>of wider interest. I have real doubts about this, and I apologize
>in advance to those who agree that I should have kept my remarks private.
>In view of this, if you wish to criticize my remarks on list, that's fine,
>but I won't respond (I've said enough already!). I would be happy to discuss
>issues (a little) further off list with anyone who wishes to bother, but not
>on list. 
>
>Also, Frank sent me a relevant reference for those who might wish to read a
>more thoughtful consideration of the issues:
>
>@ARTICLE{far92cos,
>   author  = {Faraway, J. J.},
>   year    = {1992},
>   title   = {The cost of data analysis},
>   journal = {Journal of Computational and Graphical Statistics},
>   volume  = {1},
>   pages   = {213--229},
>   annote  = {bootstrap; validation; predictive accuracy; modeling strategy;
>              regression diagnostics; model uncertainty}
>}
>
>I welcome further relevant references, pro or con!
>
>Finally, I need to emphasize that these are clearly my very personal views
>and do not reflect those of my company or colleagues. 
>
>Cheers to all ...
>-----------
>
>The relevant portion of Frank's original comment was in a thread about K-S
>tests for the goodness of fit of a parametric distribution:
>
>...
>> If you use the empirical CDF to select a parametric 
>> distribution, the final estimate of the distribution will inherit the 
>> variance of the ECDF.
>> The main reason statisticians think that parametric curve fits are far
>> more efficient than nonparametric ones is that they don't account for
>> model uncertainty in their final confidence intervals.
>> 
>> -- Frank Harrell
>
>My reply:
>
>That's a perceptive remark, but I would go further... You mentioned
>**model** uncertainty. In fact, in any data analysis in which we explore the
>data first to choose a model, fit the model (parametric or nonparametric), and
>then use whatever machinery we like (pivots from a parametric analysis; bootstrapping; ...) to say
>something about "model uncertainty," we're always kidding ourselves and our
>colleagues because we fail to take into account the considerable variability
>introduced by our initial subjective exploration and subsequent choice of
>modeling strategy. One can only say (at best) that the stated model
>uncertainty is an underestimate of the true uncertainty. And very likely a
>considerable underestimate because of the model choice subjectivity.
>
>Now I in no way wish to discourage or abridge data exploration; only to
>point out that we statisticians have promulgated a self-serving and
>unrealistic view of the value of formal inference in quantifying true
>scientific uncertainty when we do such exploration -- and that there is
>therefore something fundamentally contradictory in our own rhetoric and
>methods. Taking a larger view, I think this remark is part of the deeper
>epistemological issue of characterizing what can be scientifically "known"
>or, indeed, defining the difference between science and art, say. My own
>view is that scientific certainty is a fruitless concept: we build models
>that we benchmark against our subjective measurements (as the measurements
>themselves depend on earlier scientific models) of "reality." Insofar as
>data can limit or support our flights of modeling fancy, they do; but in the
>end, it is neither an objective process nor one whose "uncertainty" can be
>strictly quantified. 
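
As a quick, back-of-the-envelope illustration of the point, here is a small
R sketch (only a sketch: the analyse() helper, the candidate families, the
sample size, the seed and the target quantile are all arbitrary choices made
up for the example). The parametric family is picked by comparing
Kolmogorov-Smirnov distances to the ECDF, the winner is fitted by maximum
likelihood, and the 90th percentile is estimated. A bootstrap that conditions
on the family chosen from the full sample is then compared with one that
repeats the whole select-and-fit strategy on every resample, in the spirit of
the Faraway reference above.

library(MASS)   # for fitdistr()

set.seed(1)

## One complete "analysis": pick gamma vs. lognormal by Kolmogorov-Smirnov
## distance to the ECDF, fit the winner by maximum likelihood, and report
## the estimated 90th percentile.
analyse <- function(x) {
  fg <- fitdistr(x, "gamma")
  fl <- fitdistr(x, "lognormal")
  ks.g <- ks.test(x, "pgamma", shape = fg$estimate["shape"],
                  rate = fg$estimate["rate"])$statistic
  ks.l <- ks.test(x, "plnorm", meanlog = fl$estimate["meanlog"],
                  sdlog = fl$estimate["sdlog"])$statistic
  if (ks.g < ks.l)
    qgamma(0.9, shape = fg$estimate["shape"], rate = fg$estimate["rate"])
  else
    qlnorm(0.9, meanlog = fl$estimate["meanlog"], sdlog = fl$estimate["sdlog"])
}

n <- 60
x <- rgamma(n, shape = 2, rate = 1)   # the "truth" happens to be gamma
B <- 500

## (a) The usual practice: condition on the family chosen once from the full
## sample, then bootstrap only the parameter estimates (parametric bootstrap).
fg    <- fitdistr(x, "gamma")
naive <- replicate(B, {
  xb <- rgamma(n, shape = fg$estimate["shape"], rate = fg$estimate["rate"])
  fb <- fitdistr(xb, "gamma")
  qgamma(0.9, shape = fb$estimate["shape"], rate = fb$estimate["rate"])
})

## (b) Repeat the *entire* select-and-fit strategy on each nonparametric
## resample, so the variability of the model choice is carried along.
honest <- replicate(B, analyse(sample(x, replace = TRUE)))

## Compare the two spreads; the second one is typically wider.
c(naive.sd = sd(naive), honest.sd = sd(honest))

On most runs the second spread comes out noticeably wider, and that extra
width is exactly the part of the uncertainty a selection-blind interval
never sees.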

I totally agree with the above and I am totally unqualified to comment on
the below.


You (and others) might find these papers interesting...

http://www.santafe.edu/~chaos/chaos/pubs.htm


Specifically papers like...

Synchronizing to the Environment: Information Theoretic Constraints on
Agent Learning.
http://www.santafe.edu/~cmg/papers/stte.pdf

Is Anything Ever New? Considering Emergence.
http://www.santafe.edu/~cmg/papers/EverNew.pdf


Observing Complexity and The Complexity of Observation
http://www.santafe.edu/~cmg/papers/OCACO.pdf


What Lies Between Order and Chaos?
http://www.santafe.edu/~cmg/papers/wlboac.pdf



And probably many more.


>In creating the illusion that "statistical methods" can
>overcome these limitations, I think we have both done science a disservice
>and relegated ourselves to an isolated, fringe role in scientific inquiry.
>
>Needless to say, opposing viewpoints to such iconoclastic remarks are
>cheerfully welcomed.

Does it make any difference to the mass of Saturn?

Dan.

>
>Best regards,
>
>Bert Gunter
>
>______________________________________________
>R-help at stat.math.ethz.ch mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>



