[R] wilcox.test p-value = 0
Bryan Keller
bskeller at wisc.edu
Wed Sep 16 16:54:36 CEST 2009
That's right: if the test is exact, it is not possible to get a p-value of zero. wilcox.test does not provide an exact p-value in the presence of ties, so if there are any ties in your data you are getting a normal approximation. Incidentally, if your data contain ties, I would strongly recommend computing the *exact* p-value, because using the normal approximation on tied data will either inflate the type I error rate or reduce power, depending on how the ties are distributed. Depending on the pattern of ties, this can result in gross under- or overestimation of the p-value.
I guess this is all by way of saying that you should always compute the exact p-value if possible.
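To make the two situations above concrete, here is a small sketch in base R. The data are made up for illustration and are not the original poster's. It shows wilcox.test() warning and falling back to the normal approximation when ties are present, and a normal-approximation p-value underflowing to exactly 0 for a large sample (by default the exact calculation is only attempted for fewer than 50 observations).

```r
# Made-up data containing ties: wilcox.test() cannot compute an exact
# p-value here, so it warns and uses the normal approximation instead.
x <- c(1, 2, 2, 3, 4)
y <- c(2, 3, 5, 5, 6)
res <- withCallingHandlers(
  wilcox.test(x, y),
  warning = function(w) {
    message("warning: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
res$p.value

# Large one-sample case with every value positive: with n >= 50 the
# default is the normal approximation, and the tail probability is so
# small that it underflows to zero in double precision.
big <- seq_len(5000)
p_big <- wilcox.test(big)$p.value
p_big == 0
```

A p-value that compares equal to 0 is therefore a sign that an approximation was used, not that the true probability is zero.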
The package exactRankTests uses the algorithm by Mehta, Patel, and Tsiatis (1984). If your sample sizes are larger, there is a freely available .exe by Cheung and Klotz (1995) that will compute exact p-values for sample sizes larger than 100 in each group!
You can find it at http://pages.cs.wisc.edu/~klotz/
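For small samples, the exact p-value can also be obtained in base R by brute force, which may help to see what "exact" means here. The function below is only a sketch: the name exact_ranksum_p is hypothetical, it enumerates every possible group assignment (so it is not the algorithm used by exactRankTests or by the Cheung and Klotz program), and it takes "two-sided" to mean "at least as far from the null mean as observed", which is one convention among several when ties make the distribution asymmetric.

```r
# Exact two-sided permutation p-value for the rank-sum statistic,
# computed by enumerating every way of assigning length(x) of the
# pooled midranks to the first group. Feasible only for small samples.
exact_ranksum_p <- function(x, y) {
  r <- rank(c(x, y))                    # midranks, so ties are handled
  n_x <- length(x)
  obs <- sum(r[seq_len(n_x)])           # observed rank sum of x
  all_sums <- combn(length(r), n_x, function(idx) sum(r[idx]))
  centre <- n_x * (length(r) + 1) / 2   # null mean of the rank sum
  # Proportion of assignments at least as extreme as the observed one
  # (small tolerance guards against floating-point comparison issues).
  mean(abs(all_sums - centre) >= abs(obs - centre) - 1e-9)
}

# Made-up data with ties, where wilcox.test() would only approximate:
exact_ranksum_p(c(1, 2, 2, 3, 4), c(2, 3, 5, 5, 6))
```

Without ties this should agree with the exact p-value from wilcox.test(); with ties, the wilcox.exact() function in exactRankTests is the more practical route.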
Bryan
> Hi Murat,
> I am not an expert in either statistics or R, but I can imagine that since the
> default is exact=TRUE, it numerically computes the probability, and it may
> indeed be 0. If you use wilcox.test(x, y, exact=FALSE) it will give you a
> normal approximation, which will most likely be different from zero.
No, the exact p-value can't be zero for a discrete distribution. The smallest possible value in this case would, I think, be 1/choose(length(x)+length(y),length(x)), or perhaps twice that.
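A quick check of that lower bound, using made-up, completely separated samples without ties so that wilcox.test() computes the exact p-value. The two-sided value comes out as twice 1/choose(n + m, n); the one-sample (signed-rank) analogue, which is the case in the original question, is 2/2^n.

```r
# Two-sample case: every x is below every y, the most extreme outcome.
x <- 1:5
y <- 6:10
p_two <- wilcox.test(x, y)$p.value          # exact: n < 50, no ties
bound_two <- 2 / choose(length(x) + length(y), length(x))
c(p_two, bound_two)                         # both 2/252

# One-sample case: all values positive, the most extreme outcome.
z <- 1:10
p_one <- wilcox.test(z)$p.value             # exact: n < 50, no ties
bound_one <- 2 / 2^length(z)
c(p_one, bound_one)                         # both 2/1024
```

With larger samples these bounds become tiny but remain strictly positive, so a reported value can always be bounded below.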
More generally, the approach used by format.pval() is to display very small p-values as <2e-16, where 2e-16 is (approximately) machine epsilon. I wouldn't want to claim optimality for this choice, but it seems a reasonable way to represent "very small".
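For reference, a short sketch of how this looks in practice. The p-values are made up; the exact number of digits shown depends on the digits argument and on options("digits"), so no particular output is claimed here.

```r
# Made-up p-values, including one that is numerically zero.
p <- c(0.04, 1e-5, 1e-20, 0)

# Values below eps (default .Machine$double.eps) are shown as "<eps".
shown <- format.pval(p)
shown

# A larger cutoff can be chosen for reporting, e.g. "p < 1e-04".
format.pval(p, eps = 1e-4)

# The two machine constants mentioned in this thread:
.Machine$double.eps    # spacing of doubles near 1, about 2.2e-16
.Machine$double.xmin   # smallest normalized positive double
```

Reporting "p < some threshold" in this way avoids printing a literal zero.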
-thomas
> Hope this helps.
> Keo.
>
> Murat Tasan wrote:
>> hi, folks,
>>
>> how have you gone about reporting a p-value from a test when the
>> returned value from a test (in this case a rank-sum test) is
>> numerically equal to 0 according to the machine?
>>
>> the next lowest value greater than zero that is distinct from zero on
>> the machine is likely algorithm-dependent (the algorithm of the test
>> itself), but without knowing the explicit steps of the algorithm
>> implementation, it is difficult to provide any non-zero value. i
>> initially thought to look at .Machine$double.xmin, but i'm not
>> comfortable with reporting p < .Machine$double.xmin, since without
>> knowing the specifics of the implementation, this may not be true!
>>
>> to be clear, if i have data x, and i run the following line, the
>> returned value is TRUE.
>>
>> wilcox.test(x)$p.value == 0
>>
>> thanks for any help on this!
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle
------------------------------
-------------
Bryan Keller, Doctoral Student/Project Assistant
Educational Psychology - Quantitative Methods
The University of Wisconsin - Madison