[R] P values
murdoch.duncan at gmail.com
Sun May 9 03:38:41 CEST 2010
On 08/05/2010 9:14 PM, Joris Meys wrote:
> On Sat, May 8, 2010 at 7:02 PM, Bak Kuss <bakkuss at gmail.com> wrote:
>> Just wondering.
>> The smaller the p-value, the closer to 'reality' (the more accurate)
>> the model is supposed to (not) be (?).
>> How realistic is it to be that (un-) real?
> That's a common misconception. A p-value expresses no more than the chance
> of obtaining a result at least as extreme as the one you observe, given that
> your null hypothesis _and your assumptions_ are true.
I'd say it expresses even less than that. A p-value is simply a
transformation of the test statistic to a standard scale. In the nicer
situations, if the null hypothesis is true, it'll have a uniform
distribution on [0,1]. If H0 is false but the truth lies in the
direction of the alternative hypothesis, the p-value should have a
distribution that usually gives smaller values. So an unusually small
value is a sign that H0 is false: you don't see values like 1e-6 from a
U(0,1) distribution very often, but that could be a common outcome under
the alternative hypothesis. (The not so nice situations make things a
bit more complicated, because the p-value might have a discrete
distribution, or a distribution that tends towards large values, or the
U(0,1) null distribution might be a limiting approximation.)
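A small simulation can make the "nicer situation" concrete. This is only a sketch under assumptions of my choosing: a one-sided z-test with known standard deviation 1, samples of size 25, a true mean of 0.5 under the alternative, and 10,000 replications. The function names are made up for the illustration.

```python
import math
import random
from statistics import fmean

def z_test_p_value(sample, sigma=1.0):
    """One-sided p-value for H0: mu = 0 against H1: mu > 0, sigma known."""
    z = fmean(sample) * math.sqrt(len(sample)) / sigma
    # Upper tail of the standard normal: P(Z > z) = 0.5 * erfc(z / sqrt(2))
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def simulate_p_values(true_mean, n=25, reps=10_000, seed=1):
    """Draw `reps` samples of size n and return the p-value of each."""
    rng = random.Random(seed)
    return [
        z_test_p_value([rng.gauss(true_mean, 1.0) for _ in range(n)])
        for _ in range(reps)
    ]

def fraction_below(p_values, cutoff):
    """Empirical P(p < cutoff); equals cutoff for a U(0,1) variable."""
    return sum(p < cutoff for p in p_values) / len(p_values)

p_null = simulate_p_values(true_mean=0.0)  # H0 is true
p_alt = simulate_p_values(true_mean=0.5)   # truth lies in the direction of H1

for cutoff in (0.05, 0.5):
    print(cutoff, fraction_below(p_null, cutoff), fraction_below(p_alt, cutoff))
```

Under H0 the fraction below each cutoff should sit close to the cutoff itself, which is what a U(0,1) distribution looks like. Under the alternative the p-values pile up near zero.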
So to answer Bak: yes, a well-designed statistic will
give p-values that tend to be smaller the further the true model gets
from the hypothesized one, i.e. smaller p-values are probably associated
with larger departures from the null. But the p-value is not a good way
to estimate that distance. Use a parameter estimate instead.
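To see why the p-value is a poor measure of that distance, here is a rough sketch in which the true departure from the null is held fixed and only the sample size changes. The true mean of 0.3, the known standard deviation of 1, the sample sizes and the seed are all arbitrary choices for the illustration, not anything from this thread.

```python
import math
import random
from statistics import fmean

def estimate_and_p_value(n, true_mean, rng, sigma=1.0):
    """Return (sample mean, one-sided z-test p-value for H0: mu = 0)."""
    sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
    estimate = fmean(sample)
    z = estimate * math.sqrt(n) / sigma
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z > z)
    return estimate, p_value

rng = random.Random(2010)
true_mean = 0.3  # the same departure from H0 at every sample size
results = [(n, *estimate_and_p_value(n, true_mean, rng)) for n in (25, 400, 6400)]

for n, estimate, p_value in results:
    print(f"n={n:5d}  estimate={estimate:.3f}  p={p_value:.3g}")
```

The estimate should hover around 0.3 at every sample size, while the p-value drops by many orders of magnitude as n grows. The p-value mixes the size of the departure with the amount of data; the parameter estimate does not.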
> Essentially, a p-value is as "real" as your
> assumptions. In that way I can understand what Robert wants to say. But with
> large enough datasets, bootstrapping or permutation tests often give about
> the same p-value as the asymptotic approximation. At that moment, the
> central limit theorem comes into play, which says that when the sample size
> is big enough, the mean is -close to- normally distributed. In those cases,
> the test statistic also follows the proposed distribution and your p-value
> is closer to "reality". Mind you, the "sample size" for a specific statistic
> is not always merely the number of observations, especially in more advanced
> methods. Plus, violations of other assumptions, like independence of the
> observations, change the picture again.
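The claim that a permutation test and the asymptotic approximation roughly agree on larger samples can be checked with a sketch like the one below. Everything specific in it is an assumption made for the example: two independent normal groups of 50 observations each, a true difference in means of 0.4, 5,000 random permutations, and a normal approximation with unpooled variances for the asymptotic p-value.

```python
import math
import random
from statistics import fmean, variance

def mean_difference(x, y):
    return fmean(x) - fmean(y)

def asymptotic_p_value(x, y):
    """Two-sided p-value from the normal approximation to the mean difference."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    z = mean_difference(x, y) / se
    return math.erfc(abs(z) / math.sqrt(2.0))  # 2 * P(Z > |z|)

def permutation_p_value(x, y, n_perm=5000, seed=0):
    """Two-sided Monte Carlo permutation p-value for the mean difference."""
    rng = random.Random(seed)
    observed = abs(mean_difference(x, y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        permuted = mean_difference(pooled[:len(x)], pooled[len(x):])
        if abs(permuted) >= observed:
            hits += 1
    # Count the observed arrangement itself so the p-value is never exactly 0
    return (hits + 1) / (n_perm + 1)

rng = random.Random(42)
group_a = [rng.gauss(0.4, 1.0) for _ in range(50)]
group_b = [rng.gauss(0.0, 1.0) for _ in range(50)]

p_perm = permutation_p_value(group_a, group_b)
p_asym = asymptotic_p_value(group_a, group_b)
print(f"permutation p = {p_perm:.4f}, asymptotic p = {p_asym:.4f}")
```

With well-behaved data of this size the two numbers should be close. The agreement says nothing about dependent observations or other violated assumptions, where both can be wrong together.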
> The point is: what is reality? As Duncan said, a small p-value indicates
> that your null hypothesis is not true. That's exactly what you look for,
> because that is the proof that the relation you're looking at in your
> dataset did not emerge merely by chance. You're not out to calculate the exact
> chance. Robert is right, reporting an exact p-value of 1.23e-7 doesn't make
> sense at all. But the rejection of your null hypothesis is as real as life.
> The trick is to test the correct null hypothesis, and that's where it most
> often goes wrong...
>> p.s. I am no statistician