[R] P values
Duncan Murdoch
murdoch.duncan at gmail.com
Fri May 7 19:31:50 CEST 2010
Robert A LaBudde wrote:
> At 07:10 AM 5/7/2010, Duncan Murdoch wrote:
>
>> Robert A LaBudde wrote:
>>
>>> At 01:40 PM 5/6/2010, Joris Meys wrote:
>>>
>>>
>>>> On Thu, May 6, 2010 at 6:09 PM, Greg Snow <Greg.Snow at imail.org> wrote:
>>>>
>>>>
>>>>
>>>>> Because if you use the sample standard deviation then it is a t test not a
>>>>> z test.
>>>>>
>>>>>
>>>>>
>>>> I'm doubting that seriously...
>>>>
>>>> You calculate normalized Z-values by subtracting the sample mean and
>>>> dividing by the sample sd. So Thomas is correct. It becomes a Z-test since
>>>> you compare these normalized Z-values with the Z distribution, instead of
>>>> the (more appropriate) T-distribution. The T-distribution is essentially a
>>>> Z-distribution that is corrected for the finite sample size. In Asymptopia,
>>>> the Z and T distributions are identical.
>>>>
>>>>
>>> And it is only in Utopia that any P-value less than 0.01 actually
>>> corresponds to reality.
>>>
>>>
>>>
>> I'm not sure what you mean by this. P-values are simply statistics
>> calculated from the data; why wouldn't they be real if they are small?
>>
>
> Do you truly believe an actual real-life distribution is accurately
> fit by a normal distribution at quantiles of 0.001, 0.0001 or beyond?
>
Not often, but I don't see how that is relevant. I would normally
conclude that a P-value of 0.01, 0.001, or especially 0.0001 didn't come
from the null distribution.
In other words, my model for the null distribution and the distribution
that actually generated the data (and hence the P-value) differ by
*a lot*, not just a little bit. (This is somewhat obvious with samples
that aren't too large. With really large samples, "a lot" may need to be
interpreted carefully.)
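
The parenthetical about large samples can be made concrete with a
minimal sketch. It assumes a two-sided z-test with known standard
deviation and plugs in the expected value of the test statistic rather
than simulated data. The shift of 0.01 standard deviations and the
sample sizes are made-up illustrative values, not anything from this
thread.

```python
import math


def two_sided_p(z):
    """Two-sided normal P-value, 2 * (1 - Phi(|z|)).

    Computed with erfc so it stays accurate far out in the tail.
    """
    return math.erfc(abs(z) / math.sqrt(2.0))


def expected_z(shift_in_sd, n):
    """Expected z statistic when the true mean sits shift_in_sd
    standard deviations away from the null mean (known SD assumed)."""
    return shift_in_sd * math.sqrt(n)


tiny_shift = 0.01  # hypothetical: true mean is off by 1% of an SD
for n in (100, 10_000, 1_000_000):
    z = expected_z(tiny_shift, n)
    print(f"n = {n:>9,d}  z = {z:5.2f}  P = {two_sided_p(z):.3g}")
```

With the same tiny shift, the P-value goes from unremarkable at n = 100
to extremely small at n = 1,000,000, so at that sample size a difference
that is "a lot" in the P-value sense can still be negligible in
practical terms.
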
> "The map is not the territory", and just because you can calculate
> something from a model doesn't mean it's true.
>
> The real world is composed of mixture distributions, not pure ones.
>
> The P-value may be real, but its reality is subordinate to the
> distributional assumption involved, which always fails at some level.
> I'm simply asserting that level is in the tails at probabilities of
> 0.01 or less.
>
> Statisticians, even eminent ones such as yourself and lesser lights
> such as myself, frequently fail to keep this in mind. We accept such
> assumptions as "normality", "equal variances", etc., on an
> "eyeballometric" basis, without any quantitative understanding of
> what this means about limitations on inference, including P-values.
>
> Inference in statistics is much cruder and more judgmental than we
> like to portray. We should at least be honest among ourselves about
> the degree to which our hand-waving assumptions work.
>
I think I agree with you that I would have a hard time arguing against a
test based on a slightly different null distribution, and that test
would likely give a P-value quite different from the one I calculated
based on my assumption. But my conclusion would be the same: P <
0.0001 means there's likely something wrong with the assumptions in the
null distribution.
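
A rough sketch of how far apart two such P-values can be, assuming the
"slightly different" null is a contaminated normal (a mixture, in the
spirit of the remark above about mixture distributions) and treating the
test statistic itself as a draw from the null. The 5% contamination
fraction, the wider standard deviation of 3, and the observed statistic
of 4 are all hypothetical values picked for illustration.

```python
import math


def p_normal(z):
    """Two-sided P-value under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def p_contaminated(z, eps=0.05, wide_sd=3.0):
    """Two-sided P-value under the mixture null
    (1 - eps) * N(0, 1) + eps * N(0, wide_sd^2).

    eps and wide_sd are hypothetical, chosen only for illustration.
    """
    return (1.0 - eps) * p_normal(z) + eps * p_normal(z / wide_sd)


z = 4.0  # hypothetical observed test statistic
print(f"pure normal null:  P = {p_normal(z):.3g}")
print(f"contaminated null: P = {p_contaminated(z):.3g}")
print(f"ratio:             {p_contaminated(z) / p_normal(z):.0f}")
```

Under these assumptions the two P-values differ by roughly two orders of
magnitude, yet both remain small in this particular example. Other
choices of eps and wide_sd will give different ratios.
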
> I remember at the O. J. Simpson trial, the DNA expert asserted that a
> match would occur only once in 7 billion people. I wondered at the
> time how you could evaluate such an assertion, given there were fewer
> than 7 billion people on earth at the time.
>
So that's clear evidence that the null model he was using was not the
truth. It would have been just as clear if he'd said 1 in a million, or
1 in a trillion.
> When I was at a conference on optical disk memories when they were
> being developed, I heard a talk about validating disk specifications
> against production. One statement was that the company would also
> validate the "undetectable error rate" specification of 1 in 10^16
> bits. I amusingly asked how they planned to validate the
> "undetectable" error rate. The response was handwaving and "Just as
> we do everything else". The audience laughed, and the speaker didn't
> seem to know what the joke was.
>
That's not a P-value, that's a probability of an error, which is quite a
different thing. There the number does matter: an error rate of 1 in
10^6 is quite different from an error rate of 1 in 10^16.
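
To give a feel for how different those two rates are, here is a minimal
sketch of the zero-failure calculation: how many error-free bits one
would need to observe before a one-sided 95% upper confidence bound on
the per-bit error rate drops to the claimed value. It assumes errors are
independent and can actually be counted, which is precisely what was in
doubt for an "undetectable" error rate. The function name and the 95%
level are illustrative choices.

```python
import math


def bits_needed(error_rate, confidence=0.95):
    """Number of consecutive error-free bits required so that the
    one-sided upper confidence bound on the per-bit error rate is at
    most error_rate.

    Solves (1 - error_rate)^n = 1 - confidence for n, assuming
    independent bit errors.
    """
    return math.log(1.0 - confidence) / math.log1p(-error_rate)


for rate in (1e-6, 1e-16):
    print(f"rate {rate:g}: about {bits_needed(rate):.3g} error-free bits")
```

At the 95% level this is the familiar "rule of three" (about 3 divided
by the rate), so the second specification needs on the order of ten
billion times more error-free data than the first.
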
Duncan Murdoch
> In both these cases the values were calculable, but that didn't mean
> that they applied to reality.
>
> ================================================================
> Robert A. LaBudde, PhD, PAS, Dpl. ACAFS e-mail: ral at lcfltd.com
> Least Cost Formulations, Ltd. URL: http://lcfltd.com/
> 824 Timberlake Drive Tel: 757-467-0954
> Virginia Beach, VA 23464-3239 Fax: 757-467-2947
>
> "Vere scire est per causas scire"
> ================================================================
>
>