[R] trouble with wilcox.test

P Ehlers ehlers at math.ucalgary.ca
Thu Aug 18 10:42:58 CEST 2005



P Ehlers wrote:
> 
> Prof Brian Ripley wrote:
> 
>> On Wed, 17 Aug 2005, Greg Hather wrote:
>>
>>> I'm having trouble with the wilcox.test command in R.
>>
>> Are you sure it is not the concepts that are giving 'trouble'?
>> What real problem are you trying to solve here?
>>
>>> To demonstrate the anomalous behavior of wilcox.test, consider
>>>
>>>> wilcox.test(c(1.5,5.5), c(1:10000), exact = F)$p.value
>>> [1] 0.01438390
>>>
>>>> wilcox.test(c(1.5,5.5), c(1:10000), exact = T)$p.value
>>> [1] 6.39808e-07 (this calculation takes noticeably longer).
>>>
>>>> wilcox.test(c(1.5,5.5), c(1:20000), exact = T)$p.value
>>> (R closes/crashes)
>>>
>>> I believe that wilcox.test(c(1.5,5.5), c(1:10000), exact = F)$p.value 
>>> yields a bad result because of the normal approximation which R uses 
>>> when exact = F.
>>
>> Expecting an approximation to be good in the tail for m=2 is pretty 
>> unrealistic.  But then so is believing the null hypothesis of a common 
>> *continuous* distribution.  Why worry about the distribution under a 
>> hypothesis that is patently false?
>>
>> People often refer to this class of tests as `distribution-free', but 
>> they are not.  The Wilcoxon test is designed for power against shift 
>> alternatives, but here there appears to be a very large difference in 
>> spread.  So
>>
>>> wilcox.test(5000+c(1.5,5.5), c(1:10000), exact = T)$p.value
>> [1] 0.9989005
>>
>> even though the two samples differ in important ways.
>>
>>> Any suggestions for how to compute wilcox.test(c(1.5,5.5),
>>> c(1:20000), exact = T)$p.value?
>>
>> I get (current R 2.1.1 on Linux)
>>
>>> wilcox.test(c(1.5,5.5), c(1:20000), exact = T)$p.value
>> [1] 1.59976e-07
>>
>> and no crash.  So the suggestion is to use a machine adequate to the 
>> task, and that probably means an OS with adequate stack size.
>>
>>>     [[alternative HTML version deleted]]
>>
>>> PLEASE do read the posting guide!
>>> http://www.R-project.org/posting-guide.html
>>
>> Please do heed it.  What version of R and what machine is this?  And
>> do take note of the request about HTML mail.
>>
> 
> One could also try wilcox.exact() in package exactRankTests (0.8-11)
> which also gives (with suitable patience)
> 
> [1] 1.59976e-07
> 
> even on my puny 256M Windows laptop.
> 
> Still, it might be worthwhile adding a "don't do something this silly"
> error message to wilcox.test() rather than having it crash R. Low
> priority, IMHO.
> 
> Windows XP SP2
> "R version 2.1.1, 2005-08-11"
> 
> Peter Ehlers
> 

I should also mention package coin's wilcox_test() which does the
job in about a quarter of the time used by exactRankTests.
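
For this particular example, both p-values quoted above can also be
checked by hand, without calling wilcox.test() at all. With only two x
observations and no ties, the exact null distribution is just the
choose(n + 2, 2) equally likely pairs of ranks, so the lower tail can be
counted directly. The sketch below is an illustration only: the function
name wilcox_m2_check is made up for this note, and the shortcut assumes
m = 2, y = 1:n, no ties, and a statistic far in the lower tail. It is not
how wilcox.test() computes its result internally.

```r
# Hand check of the p-values quoted in the thread for
# x = c(1.5, 5.5) against y = 1:n (m = 2, no ties).
# wilcox_m2_check is a hypothetical helper, not part of any package.
wilcox_m2_check <- function(x, n) {
  stopifnot(length(x) == 2, n >= 1)
  m <- 2
  y <- seq_len(n)

  # Mann-Whitney statistic: number of (x, y) pairs with x > y.
  W <- sum(outer(x, y, ">"))

  # The exact shortcut only enumerates the lower tail, so W must be
  # below its null mean and small enough that all ranks fit in 1..(n + 2).
  stopifnot(W < m * n / 2, W <= n)

  # Normal approximation with continuity correction (two-sided, no ties).
  sigma <- sqrt(m * n * (m + n + 1) / 12)
  d <- W - m * n / 2
  z <- (d - sign(d) * 0.5) / sigma
  p_normal <- 2 * min(pnorm(z), pnorm(z, lower.tail = FALSE))

  # Exact lower tail: if the x values have ranks r1 < r2 in the pooled
  # sample, then W = (r1 - 1) + (r2 - 2) = r1 + r2 - 3.
  # Only rank pairs inside 1..(W + 2) can give a statistic <= W.
  ranks <- combn(W + 2, 2)
  lower_count <- sum(colSums(ranks) - 3 <= W)
  p_exact <- min(1, 2 * lower_count / choose(n + m, m))

  list(W = W, p_normal = p_normal, p_exact = p_exact)
}

res   <- wilcox_m2_check(c(1.5, 5.5), 10000)
res20 <- wilcox_m2_check(c(1.5, 5.5), 20000)
print(res$p_normal)
print(res$p_exact)
print(res20$p_exact)
```

Here W = 6 and 16 rank pairs give a statistic that small, so the exact
two-sided values are 32/choose(10002, 2) and 32/choose(20002, 2). These
should agree with the exact results quoted above to the digits shown,
and the normal-approximation value with the exact = F result.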

Peter Ehlers



