[R] Inference for R Spam
Rolf Turner
r.turner at auckland.ac.nz
Thu Mar 5 00:43:45 CET 2009
On 5/03/2009, at 12:13 PM, Bert Gunter wrote:
>
> "The purpose of the subject or discipline ``statistics'' is in essence
> to answer the question ``could the phenomenon we observed have arisen
> simply by chance?'', or to quantify the *uncertainty* in any estimate
> that we make of a quantity."
>
>
> May I take strong issue with this characterization? It is far too narrow and
> constraining. We are scientists first and foremost. The most important and
> useful thing I do is to collaborate with other scientists to frame good
> questions, design good experiments and studies, and gain insight into the
> results of those experiments and studies (usually via graphical displays,
> for which R is superbly suited). Blessing data with P-values is rarely of
> much importance, and is often frankly irrelevant and even misleading (but
> that's another rant).
>
> George Box said this much better than I: "The business of the statistician
> is to catalyze the scientific learning process."
>
> This is much much more than you intimate.

I must respectfully disagree. Far be it from me to argue with George Box,
but nevertheless ... it may be the statistician's *business* to catalyze the
scientific learning process, but that is the business of *any* scientist.
What we bring to the process is our understanding of the essentials of
statistics, just as the chemist brings her understanding of the essentials
of chemistry and the biologist her understanding of the essentials of
biology.

The essentials of statistics consist in answering the question of ``could
this phenomenon have arisen by chance?'' This is where we contribute in a
way that other scientists do not. They don't understand variability, the
poor dears. (Unless they have been well taught and thereby have become
in part statisticians themselves.) They have a devastating tendency to treat
an estimated regression line as *the* regression line, the truth.
And so on.

The *way* we address the question of ``could it have happened by chance''
and the way we address the problem of quantifying variability is indeed open
to a broad range of techniques including graphics.

Note that I did not say word one about p-values. The example I gave was
a scientific question: is there a difference in the home field advantage
between the English Premier Division and the equivalent division or league
in Italy? How much of a difference? You may wish to throw in a p-value,
or you may not. You will probably wish to look at a confidence interval.
You may wish to look at the question from the point of view of the
distribution of (home) - (away) differences, in which case graphics will most
certainly help. But it comes down to answering the basic question. If you
have no ability to answer such questions you are not, or might as well not
be, a statistician.

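
To make the confidence-interval idea concrete, here is a minimal sketch in
Python. It assumes each match is reduced to a (home goals) - (away goals)
difference and uses a large-sample normal approximation for the difference
in mean home advantage between two leagues. The function name `diff_ci` and
the numbers are made up for illustration: the data are simulated, not real
league results, and a real analysis might well call for a different model.

```python
import random
from statistics import NormalDist, mean, variance


def diff_ci(x, y, level=0.95):
    """Normal-approximation CI for mean(x) - mean(y), unequal variances allowed."""
    est = mean(x) - mean(y)
    # Standard error of the difference between two independent sample means.
    se = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return est, est - z * se, est + z * se


# Simulated (home goals - away goals) per match; NOT real league results.
rng = random.Random(1)
england = [rng.gauss(0.4, 1.7) for _ in range(380)]
italy = [rng.gauss(0.5, 1.6) for _ in range(380)]

est, lo, hi = diff_ci(england, italy)
print(f"estimated difference: {est:.2f}, 95% CI: ({lo:.2f}, {hi:.2f})")
```

If the interval comfortably covers zero, the data give little reason to
think the two leagues differ in home advantage; either way, its width says
how well the difference has been pinned down.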
cheers,
Rolf Turner