[R] P values
David Winsemius
dwinsemius at comcast.net
Sun May 9 05:11:41 CEST 2010
On May 8, 2010, at 9:38 PM, Duncan Murdoch wrote:
> On 08/05/2010 9:14 PM, Joris Meys wrote:
>> On Sat, May 8, 2010 at 7:02 PM, Bak Kuss <bakkuss at gmail.com> wrote:
>>
>>
>>> Just wondering.
>>>
>>> The smaller the p-value, the closer to 'reality' (the more accurate)
>>> the model is supposed to (not) be (?).
>>>
>>> How realistic is it to be that (un-) real?
>>>
>>>
>>
>> That's a common misconception. A p-value expresses no more than the
>> chance of obtaining a dataset at least as extreme as the one you observe,
>> given that your null hypothesis _and your assumptions_ are true.
>
>
> I'd say it expresses even less than that. A p-value is simply a
> transformation of the test statistic to a standard scale. In the
> nicer situations, if the null hypothesis is true, it'll have a
> uniform distribution on [0,1]. If H0 is false but the truth lies in
> the direction of the alternative hypothesis, the p-value should have
> a distribution that usually gives smaller values. So an unusually
> small value is a sign that H0 is false: you don't see values like
> 1e-6 from a U(0,1) distribution very often, but that could be a
> common outcome under the alternative hypothesis. (The not so nice
> situations make things a bit more complicated, because the p-value
> might have a discrete distribution, or a distribution that tends
> towards large values, or the U(0,1) null distribution might be a
> limiting approximation.)
> So to answer Bak: yes, a well-designed statistic will give p-values
> that tend to be smaller the further the true model gets from the
> hypothesized one, i.e. smaller p-values are probably associated with
> larger departures from the null. But the p-value is not a good way
> to estimate that distance. Use a parameter estimate instead.
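Duncan's description of the null and alternative distributions can be checked by simulation. What follows is only a minimal sketch in Python (standard library only), not anything posted in the thread. It assumes a two-sided one-sample z-test with known sigma = 1, n = 25 observations, and a true mean of 0.5 under the alternative. The function names and all of those settings are illustrative choices.

```python
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()  # standard normal, used for the reference distribution


def z_test_p_value(sample, mu0=0.0, sigma=1.0):
    """Two-sided p-value for H0: mean == mu0, with sigma treated as known."""
    z = (fmean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))


def simulate_p_values(true_mean, n=25, reps=5000, seed=1):
    """Draw `reps` normal samples of size n and return one p-value per sample."""
    rng = random.Random(seed)
    return [
        z_test_p_value([rng.gauss(true_mean, 1.0) for _ in range(n)])
        for _ in range(reps)
    ]


def fraction_below(p_values, cutoff):
    """Share of simulated p-values that fall below the cutoff."""
    return sum(p < cutoff for p in p_values) / len(p_values)


p_null = simulate_p_values(true_mean=0.0)  # H0 is true
p_alt = simulate_p_values(true_mean=0.5)   # H0 is false (assumed effect size)

for cutoff in (0.01, 0.05, 0.5):
    print(cutoff, fraction_below(p_null, cutoff), fraction_below(p_alt, cutoff))
```

Under H0 the fraction below each cutoff should sit close to the cutoff itself, which is what a U(0,1) distribution implies. Under the alternative, small p-values become much more common. Neither column says how far the true mean is from zero; the sample mean is the estimate of that.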
And thank you for this paper. As a non-statistician I found it most
instructive:
http://pubs.amstat.org/doi/pdfplus/10.1198/000313008X332421
--
David.
>
> Duncan Murdoch
>
>
>> Essentially, a p-value is as "real" as your assumptions. In that way I
>> can understand what Robert wants to say. But with large enough datasets,
>> bootstrapping or permutation tests often give about the same p-value as
>> the asymptotic approximation. At that point, the central limit theorem
>> comes into play, which says that when the sample size is big enough, the
>> mean is close to normally distributed. In those cases, the test statistic
>> also follows the proposed distribution and your p-value is closer to
>> "reality". Mind you, the "sample size" for a specific statistic is not
>> always merely the number of observations, especially in more advanced
>> methods. Plus, violations of other assumptions, like independence of the
>> observations, change the picture again.
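Joris's remark about permutation tests agreeing with the asymptotic approximation can be tried on simulated data. This is a rough sketch in Python (standard library only), not code from the thread. It assumes two independent groups of 100 skewed (exponential) observations, a location shift of 0.2, a difference-in-means statistic, and 2000 random permutations. All names and settings are illustrative.

```python
import random
from statistics import NormalDist, fmean, variance


def asymptotic_p_value(a, b):
    """Two-sided p-value from the normal approximation to the mean difference."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (fmean(a) - fmean(b)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def permutation_p_value(a, b, n_perm=2000, seed=2):
    """Two-sided Monte Carlo permutation p-value for the mean difference."""
    rng = random.Random(seed)
    observed = abs(fmean(a) - fmean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(fmean(pooled[:len(a)]) - fmean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Count the observed labelling as one permutation so p is never exactly 0.
    return (hits + 1) / (n_perm + 1)


rng = random.Random(1)
group_a = [rng.expovariate(1.0) for _ in range(100)]
group_b = [rng.expovariate(1.0) + 0.2 for _ in range(100)]  # assumed shift

p_asym = asymptotic_p_value(group_a, group_b)
p_perm = permutation_p_value(group_a, group_b)
print(p_asym, p_perm)
```

With groups of this size the two p-values should land close together even though the data are far from normal. Shrinking the groups to a handful of observations each is one way to see the agreement weaken. Both calculations still assume independent observations.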
>>
>> The point is: what is reality? As Duncan said, a small p-value indicates
>> that your null hypothesis is not true. That's exactly what you look for,
>> because that is the evidence that the relation you're looking at in your
>> dataset did not emerge merely by chance. You're not out to calculate the
>> exact chance. Robert is right, reporting an exact p-value of 1.23e-7
>> doesn't make sense at all. But the rejection of your null hypothesis is
>> as real as life.
>>
>> The trick is to test the correct null hypothesis, and that's where it
>> most often goes wrong...
>>
>> Cheers
>> Joris
>>
>>
>>> bak
>>>
>>> p.s. I am no statistician
>>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT