[R] p-values

Tue Apr 27 23:25:22 CEST 2004

On 27-Apr-04 Greg Tarpinian wrote:
> I apologize if this question is not completely 
> appropriate for this list.

Never mind! (I'm only hoping that my response is ... )

> [...]
> This week I have been reading "Testing Precise
> Hypotheses" by J.O. Berger & Mohan Delampady,
> Statistical Science, Vol. 2, No. 3, 317-355 and
> "Bayesian Analysis: A Look at Today and Thoughts of
> Tomorrow" by J.O. Berger, JASA, Vol. 95, No. 452, p.
> 1269 - 1276, both as supplements to my Math Stat.
> course.
> 
> It appears, based on these articles, that p-values are
> more or less useless.

I don't have these articles available, but I'm guessing
that they stress the Bayesian approach to inference.
Saying "p-values are more or less useless" is controversial.
Bayesians consider p-values to be approximately irrelevant
to the real question, which is what you can say about
the probability that a hypothesis is true/false, or
what is the probability that a parameter lies in a
particular range (sometimes the same question); and the
"probability" they refer to is a posterior probability
distribution on hypotheses, or over parameter values.
The "P-value" which is emitted at the end of standard
analysis is not such a probability. It is instead the
probability, calculated assuming the hypothesis is true, of
that part of the sample space which is defined by a "cut-off"
value of a test statistic calculated from the data. So they
are different entities. Numerically they may
coincide; indeed, for statistical problems with a certain
structure the P-value is equal to the Bayesian posterior
probability when a particular prior distribution is
adopted.
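
As a minimal sketch of one such structure (my choice of
illustration, not taken from the articles): a normal mean with
known standard deviation, a one-sided hypothesis mu <= mu0, and
a flat (improper uniform) prior on mu. The function names and
the numbers below are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_p_value(xbar, mu0, sigma, n):
    """P(sample mean >= xbar) when the true mean is mu0 (sigma known)."""
    se = sigma / sqrt(n)
    z = (xbar - mu0) / se
    return 1.0 - NormalDist().cdf(z)


def posterior_prob_null(xbar, mu0, sigma, n):
    """P(mu <= mu0 | data) under a flat prior on mu.

    With a flat prior and known sigma, the posterior for mu is
    Normal(xbar, sigma / sqrt(n)).
    """
    se = sigma / sqrt(n)
    posterior = NormalDist(mu=xbar, sigma=se)
    return posterior.cdf(mu0)


# Hypothetical data summary: 25 observations, sample mean 0.4, sigma = 1.
p = one_sided_p_value(xbar=0.4, mu0=0.0, sigma=1.0, n=25)
post = posterior_prob_null(xbar=0.4, mu0=0.0, sigma=1.0, n=25)
print(p, post)  # both are about 0.0228
```

The two numbers agree here only because of the flat prior and
the one-sided hypothesis; with an informative prior, or a point
null hypothesis, they would generally differ.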

> If this is indeed the case,
> then why is a p-value typically given as a default
> output?  For example, I know that PROC MIXED and 
> lme( ) both yield p-values for fixed effects terms.

P-values are not as useless as sometimes claimed. They
at least offer a measure of discrepancy between data and
hypothesis (the smaller the P-value, the more discrepant
the data), and they offer this measure on a standard scale,
the "probability scale" -- the chance of getting something
at least as discrepant, if the hypothesis being tested is
true. What "discrepant" objectively means is defined by
the test statistic used in calculating the P-value: larger
values of the test statistic correspond to more discrepant
data.
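
To make this concrete, here is a small sketch using an exact
binomial test, with the number of successes as the test
statistic. The setting (10 trials, hypothesised success
probability 0.5) and the function name are my own assumptions
for illustration.

```python
from math import comb


def binomial_upper_tail_p(k, n, p0):
    """P(X >= k) for X ~ Binomial(n, p0): the chance of a result at
    least as discrepant as k successes, if the hypothesis p = p0 is true."""
    return sum(comb(n, j) * p0**j * (1 - p0) ** (n - j) for j in range(k, n + 1))


# Larger values of the test statistic (more successes than p0 = 0.5
# would suggest) give smaller P-values.
for k in (5, 7, 9):
    print(k, binomial_upper_tail_p(k, n=10, p0=0.5))

# 9 successes out of 10 gives 11/1024, about 0.0107.
```

A different test statistic would define "discrepant" differently
and so could give a different P-value from the same data.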

Confidence intervals are essentially aggregates of hypotheses
(parameter values) which have not been rejected at a
significance level equal to 1 minus the confidence level:
a 95% interval collects the values not rejected at the 5% level.
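
A rough sketch of this "collect the values a test does not
reject" construction, again assuming a normal mean with known
standard deviation. The grid search, the function names and the
numbers are illustrative assumptions, not a recommended way to
compute an interval.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(xbar, mu0, sigma, n):
    """Two-sided P-value for the hypothesis mu = mu0 (sigma known)."""
    z = abs(xbar - mu0) / (sigma / sqrt(n))
    return 2.0 * (1.0 - NormalDist().cdf(z))


def interval_by_test_inversion(xbar, sigma, n, alpha=0.05, step=0.001):
    """Scan a grid of candidate means and keep those not rejected at level alpha."""
    se = sigma / sqrt(n)
    steps = int(round(10 * se / step))  # search 10 standard errors each side
    candidates = [xbar + i * step for i in range(-steps, steps + 1)]
    kept = [m for m in candidates if two_sided_p_value(xbar, m, sigma, n) > alpha]
    return min(kept), max(kept)


def textbook_interval(xbar, sigma, n, alpha=0.05):
    """The usual xbar +/- z * se interval, for comparison."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma / sqrt(n)
    return xbar - z * se, xbar + z * se


# Hypothetical data summary: 25 observations, sample mean 0.4, sigma = 1.
print(interval_by_test_inversion(0.4, 1.0, 25))
print(textbook_interval(0.4, 1.0, 25))
```

The two intervals should agree to within the grid step, since
the usual formula is just the closed-form solution of the same
inversion.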

The P-value/confidence-interval approach (often called the
"frequentist approach") gives results which do not depend
on assuming any prior distribution on the parameters/hypotheses,
and therefore could be called "objective" in that they
avoid being accused of importing "subjective" information
into the inference in the form of a Bayesian prior distribution.
This can have the consequence that your confidence interval
may include values in a range which, a priori, you do not
accept as plausible; or exclude a range of values in which
you are a priori confident that the real value lies.
The Bayesian comment on this situation is that the frequentist
approach is "incoherent", to which the frequentist might
respond "well, I just got an unlucky experiment this time"
(which is bound to occur with due frequency).

> The theory I am learning does not seem to match what
> is commonly available in the software, and I am just
> wondering why.

The standard ritual for evaluating statistical estimates
and hypothesis tests is frequentist (as above). Rightly
interpreted, it is by no means useless. For complex
historical reasons, it has become the norm in "research
methodology", and this is essentially why it is provided
by the standard software packages (otherwise pharmaceutical
companies would never buy the software, since they need
this in order to get past the FDA or other regulatory
authority). However, because this is the "norm", such
results often have more meaning attributed to them than
they can support, by people disinclined to delve into
what "rightly interpreted" might mean.

This is not a really clean answer to your question; but
then your question touches on complex and conflicting
issues!

Hoping this helps (and hoping that I am not poking a
hornets' nest here)!
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 167 1972
Date: 27-Apr-04                                       Time: 22:25:22
------------------------------ XFMail ------------------------------