[R] P values

Sun May 9 05:11:41 CEST 2010

On May 8, 2010, at 9:38 PM, Duncan Murdoch wrote:

> On 08/05/2010 9:14 PM, Joris Meys wrote:
>> On Sat, May 8, 2010 at 7:02 PM, Bak Kuss <bakkuss at gmail.com> wrote:
>>
>>
>>> Just wondering.
>>>
>>> The smaller the p-value, the closer to 'reality' (the more accurate)
>>> the model is supposed to (not) be (?).
>>>
>>> How realistic is it to be that (un-) real?
>>>
>>>
>>
>> That's a common misconception. A p-value expresses no more than the
>> chance of obtaining data at least as extreme as the dataset you
>> observe, given that your null hypothesis _and your assumptions_ are
>> true.
>
>
> I'd say it expresses even less than that.  A p-value is simply a  
> transformation of the test statistic to a standard scale.  In the  
> nicer situations, if the null hypothesis is true, it'll have a  
> uniform distribution on [0,1].  If H0 is false but the truth lies in  
> the direction of the alternative hypothesis, the p-value should have  
> a distribution that usually gives smaller values.  So an unusually  
> small value is a sign that H0 is false:  you don't see values like  
> 1e-6 from a U(0,1) distribution very often, but that could be a  
> common outcome under the alternative hypothesis.   (The not so nice  
> situations make things a bit more complicated, because the p-value  
> might have a discrete distribution, or a distribution that tends  
> towards large values, or the U(0,1) null distribution might be a  
> limiting approximation.)
> So to answer Bak: yes, a well-designed statistic will give p-values
> that tend to be smaller the further the true model gets from the
> hypothesized one, i.e. smaller p-values are probably associated with
> larger departures from the null.  But the p-value is not a good way
> to estimate that distance.  Use a parameter estimate instead.

And thank you for this paper. As a non-statistician I found it most
instructive:

http://pubs.amstat.org/doi/pdfplus/10.1198/000313008X332421

-- 
David.
>
> Duncan Murdoch
>
>
>> Essentially, a p-value is as "real" as your assumptions. In that way I
>> can understand what Robert wants to say. But with large enough
>> datasets, bootstrapping or permutation tests often give about the same
>> p-value as the asymptotic approximation. At that point the central
>> limit theorem comes into play, which says that when the sample size is
>> big enough, the mean is close to normally distributed. In those cases
>> the test statistic also follows the proposed distribution, and your
>> p-value is closer to "reality". Mind you, the "sample size" for a
>> specific statistic is not always merely the number of observations,
>> especially in more advanced methods. Plus, violations of other
>> assumptions, like independence of the observations, change the picture
>> again.
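
Joris's remark about permutation tests and the asymptotic approximation
can be checked on toy data. The following is a rough sketch, not anyone's
actual analysis: it assumes two independent groups of 200 skewed
(exponential) observations, with the second group shifted by 0.2, and
compares a two-sided permutation p-value for the difference in means with
a normal-approximation p-value. All names, sizes, and the shift are
hypothetical choices for illustration.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def asymptotic_p(x, y):
    """Two-sided p-value from a normal approximation to the difference in means."""
    mx, my = mean(x), mean(y)
    vx = sum((v - mx) ** 2 for v in x) / (len(x) - 1)
    vy = sum((v - my) ** 2 for v in y) / (len(y) - 1)
    z = (mx - my) / math.sqrt(vx / len(x) + vy / len(y))
    return math.erfc(abs(z) / math.sqrt(2))


def permutation_p(x, y, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation p-value for the difference in means."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    nx = len(x)
    observed = abs(mean(x) - mean(y))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        if abs(mean(pooled[:nx]) - mean(pooled[nx:])) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return (hits + 1) / (n_perm + 1)


data_rng = random.Random(42)
x = [data_rng.expovariate(1.0) for _ in range(200)]
y = [data_rng.expovariate(1.0) + 0.2 for _ in range(200)]  # assumed shift

p_asym = asymptotic_p(x, y)
p_perm = permutation_p(x, y)
print("asymptotic:", p_asym, "permutation:", p_perm)
```

With groups this large the two p-values should land close together;
shrinking the groups to a handful of observations is a quick way to see
the approximation start to drift.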
>>
>> The point is: what is reality? As Duncan said, a small p-value
>> indicates that your null hypothesis is not true. That's exactly what
>> you look for, because it is the evidence that the relation you're
>> looking at in your dataset did not emerge merely by chance. You're not
>> out to calculate the exact chance. Robert is right, reporting an exact
>> p-value of 1.23e-7 doesn't make sense at all. But the rejection of
>> your null hypothesis is as real as life.
>>
>> The trick is to test the correct null hypothesis, and that's where it
>> most often goes wrong...
>>
>> Cheers
>> Joris
>>
>>
>>> bak
>>>
>>> p.s. I am no statistician
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>

David Winsemius, MD
West Hartford, CT