[R] P values

Duncan Murdoch murdoch.duncan at gmail.com
Sun May 9 03:38:41 CEST 2010

On 08/05/2010 9:14 PM, Joris Meys wrote:
> On Sat, May 8, 2010 at 7:02 PM, Bak Kuss <bakkuss at gmail.com> wrote:
>> Just wondering.
>> The smaller the p-value, the closer to 'reality' (the more accurate)
>> the model is supposed to (not) be (?).
>> How realistic is it to be that (un-) real?
> That's a common misconception. A p-value expresses no more than the chance
> of obtaining the dataset you observe, given that your null hypothesis _and
> your assumptions_ are true. 

I'd say it expresses even less than that.  A p-value is simply a 
transformation of the test statistic to a standard scale.  In the nicer 
situations, if the null hypothesis is true, it'll have a uniform 
distribution on [0,1].  If H0 is false but the truth lies in the 
direction of the alternative hypothesis, the p-value should have a 
distribution that usually gives smaller values.  So an unusually small 
value is a sign that H0 is false:  you don't see values like 1e-6 from a 
U(0,1) distribution very often, but that could be a common outcome under 
the alternative hypothesis.   (The not so nice situations make things a 
bit more complicated, because the p-value might have a discrete 
distribution, or a distribution that tends towards large values, or the 
U(0,1) null distribution might be a limiting approximation.) 
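
A quick simulation makes the point concrete.  This is only a sketch 
under assumptions of my choosing (normal data, a two-sample t-test, 
20 observations per group, a true difference of one standard deviation 
under the alternative); sim_p is a made-up helper name, not a 
function from any package.

```r
# Simulate p-values from a two-sample t-test, once with H0 true and
# once with H0 false.  All numbers below are arbitrary illustration values.
set.seed(1)      # fixed seed only so the run is reproducible
n_sim <- 2000    # number of simulated datasets
n     <- 20      # observations per group

# Hypothetical helper: p-values for n_sim datasets whose true
# difference in means is delta.
sim_p <- function(delta) {
  replicate(n_sim, t.test(rnorm(n), rnorm(n, mean = delta))$p.value)
}

p_null <- sim_p(0)   # H0 true: no difference in means
p_alt  <- sim_p(1)   # H0 false: true difference of 1

# Under H0 the p-values should look roughly U(0,1):
# about 5% fall below 0.05 and the quartiles sit near 0.25, 0.5, 0.75.
mean(p_null < 0.05)
quantile(p_null, c(0.25, 0.5, 0.75))

# Under the alternative they pile up near zero.
mean(p_alt < 0.05)
quantile(p_alt, c(0.25, 0.5, 0.75))
```

Calling hist(p_null) and hist(p_alt) afterwards shows the same thing 
graphically: a roughly flat histogram in the first case, a spike near 
zero in the second.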

So to answer Bak:  yes, a well-designed statistic will give p-values 
that tend to be smaller the further the true model gets from the 
hypothesized one, i.e. smaller p-values are probably associated with 
larger departures from the null.  But the p-value also depends on other 
things, such as the sample size, so it is not a good way to estimate 
that distance.  Use a parameter estimate instead.

Duncan Murdoch

> Essentially, a p-value is as "real" as your
> assumptions. In that way I can understand what Robert wants to say. But with
> large enough datasets, bootstrapping or permutation tests often give about
> the same p-value as the asymptotic approximation. At that moment, the
> central limit theorem comes into play, which says that when the sample size
> is big enough, the mean is -close to- normally distributed. In those cases,
> the test statistic also follows the proposed distribution and your p-value
> is closer to "reality". Mind you, the "sample size" for a specific statistic
> is not always merely the number of observations, especially in more advanced
> methods. Plus, violations of other assumptions, like independence of the
> observations, change the picture again.
> The point is : what is reality? As Duncan said, a small p-value indicates
> that your null hypothesis is not true. That's exactly what you look for,
> because that is the proof that the relation you're looking at in your
> dataset did not emerge merely by chance. You're not out to calculate the exact
> chance. Robert is right, reporting an exact p-value of 1.23 e-7 doesn't make
> sense at all. But the rejection of your null-hypothesis is as real as life.
> The trick is to test the correct null hypothesis, and that's where it most
> often goes wrong...
> Cheers
> Joris
>> bak
>> p.s. I am no statistician
