[R] wilcox.test - difference between p-values of R and online calculators

peter dalgaard pdalgd at gmail.com
Wed Sep 3 23:20:04 CEST 2014

```
Notice that correct=TRUE for wilcox.test refers to the continuity correction, not the correction for ties.

You can fairly easily simulate from the exact distribution of W:

x <- c(359,359,359,359,359,359,335,359,359,359,359,
359,359,359,359,359,359,359,359,359,359,303,359,359,359)
y <- c(332,85,359,359,359,220,231,300,359,237,359,183,286,
355,250,105,359,359,298,359,359,359,28.6,359,359,128)
R <- rank(c(x,y))
sim <- replicate(1e6,sum(sample(R,25))) - 325

# With no ties, the ranks would be a permutation of 1:51, and we could do
sim2 <- replicate(1e6,sum(sample(1:51,25))) - 325

In either case, the p-value is the probability that W >= 485 or W <= 165, and

> mean(sim >= 485 | sim <= 165)
[1] 0.000151
> mean(sim2 >= 485 | sim2 <= 165)
[1] 0.002182
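
# A minimal cross-check of the untied simulation (a sketch, assuming only the
# sample sizes 25 and 26 and the observed W = 485 from above): without ties,
# W follows the distribution implemented by pwilcox(), and by symmetry about
# its mean of 325 the two-sided p-value is twice the lower tail at 165.
# The name p_exact_untied is made up for illustration.
p_exact_untied <- 2 * pwilcox(165, 25, 26)
p_exact_untied   # should be close to mean(sim2 >= 485 | sim2 <= 165)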

Also, try

plot(density(sim))
lines(density(sim2))

and notice that the distribution of sim is narrower than that of sim2 (hence the smaller p-value with tie correction), but also that the normal approximation is not nearly as good as for the untied case. The "clumpiness" is due to the fact that 35 of the ranks have the maximum value of 34 (corresponding to the original 359's).
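
# A minimal sketch of where the factor of ten comes from, assuming the online
# calculators use the normal approximation WITHOUT a tie adjustment (they do
# not document this; it is inferred from the p-values they report). The names
# tie_sizes, var_untied, var_tied and p_two_sided are made up for illustration.
x <- c(359,359,359,359,359,359,335,359,359,359,359,
359,359,359,359,359,359,359,359,359,359,303,359,359,359)
y <- c(332,85,359,359,359,220,231,300,359,237,359,183,286,
355,250,105,359,359,298,359,359,359,28.6,359,359,128)
n1 <- length(x); n2 <- length(y); N <- n1 + n2

# W as reported by wilcox.test: rank sum of x minus its smallest possible value
W <- sum(rank(c(x, y))[seq_len(n1)]) - n1 * (n1 + 1) / 2

# Null variance of W, without and with the adjustment for ties
tie_sizes  <- as.vector(table(c(x, y)))
var_untied <- n1 * n2 * (N + 1) / 12
var_tied   <- n1 * n2 / 12 * ((N + 1) - sum(tie_sizes^3 - tie_sizes) / (N * (N - 1)))

# Two-sided normal-approximation p-value, no continuity correction
p_two_sided <- function(v) 2 * pnorm(-abs(W - n1 * n2 / 2) / sqrt(v))
signif(c(untied = p_two_sided(var_untied), tied = p_two_sided(var_tied)), 3)
# The tied value should agree with wilcox.test(x, y, correct = FALSE), reported
# in this thread as 0.0002481; the untied value is roughly ten times larger.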

-pd

On 03 Sep 2014, at 19:13 , David L Carlson <dcarlson at tamu.edu> wrote:

> Since they all have the same W/U value, it seems likely that the difference is how the different versions adjust the standard error for ties. Here are a couple of posts addressing the issues of ties:
>
> http://tolstoy.newcastle.edu.au/R/e8/help/09/12/9200.html
>
> David C
>
> Sent: Wednesday, September 3, 2014 9:20 AM
> To: David L Carlson
> Cc: Tal Galili; r-help at r-project.org
> Subject: Re: [R] wilcox.test - difference between p-values of R and online calculators
>
> Tal and David, thanks for your messages.
>
> I should have added that I tried all variations of true/false values for the exact and correct parameters. Running with correct=FALSE makes only a tiny change, resulting in W = 485, p-value = 0.0002481.
>
> At one point, I also thought that the discrepancy between R and these online calculators might come from how ties are handled, but the fact that R and two of the online calculators reach the same U/W values seems to indicate that ties aren't the issue, since (I believe) the U or W values contain all of the information needed to calculate the p-value, assuming the number of samples is also known for each condition. (However, it's been a while since I looked into how MWU tests work, so maybe now's the time to refresh.) If that's correct, the discrepancy seems to be based in what R does with the W value that is identical to the U values of two of the online calculators. (I'm also assuming that U and W have the same meaning, which seems likely.)
>
>
> ____________________
>
> On Wed, Sep 3, 2014 at 9:10 AM, David L Carlson <dcarlson at tamu.edu> wrote:
> That does not change the results. The problem is likely to be the way ties are handled. The first sample has 25 values of which 23 are identical (359). The second sample has 26 values of which 12 are identical (359). The difference between the implementations may be a result of the way the ties are ranked. For example the R function rank() offers 5 different ways of handling the rank on tied observations. With so many ties, that could make a substantial difference.
>
> Package coin has wilcox_test() which uses Monte Carlo simulation to estimate the confidence limits.
>
> -------------------------------------
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Tal Galili
> Sent: Wednesday, September 3, 2014 5:24 AM
> Cc: r-help at r-project.org
> Subject: Re: [R] wilcox.test - difference between p-values of R and online calculators
>
> It seems your numbers have ties. What happens if you run wilcox.test with
> correct=FALSE? Will the results be the same as the online calculators?
>
>
>
> ----------------Contact Details:-------------------------------------------------------
> Contact me: Tal.Galili at gmail.com |
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
> www.r-statistics.com (English)
> ----------------------------------------------------------------------------------------------
>
>
>
> On Wed, Sep 3, 2014 at 3:54 AM, W Bradley Knox <bradknox at mit.edu> wrote:
>
>> Hi.
>>
>> I'm taking the long-overdue step of moving from using online calculators to
>> compute results for Mann-Whitney U tests to a more streamlined system
>> involving R.
>>
>> However, I'm finding that R computes a different result than the 3 online
>> calculators that I've used before (all of which approximately agree). These
>> calculators are here:
>>
>> http://elegans.som.vcu.edu/~leon/stats/utest.cgi
>> http://vassarstats.net/utest.html
>> http://www.socscistatistics.com/tests/mannwhitney/
>>
>> An example calculation is
>>
>>
>> wilcox.test(c(359,359,359,359,359,359,335,359,359,359,359,359,359,
>>               359,359,359,359,359,359,359,359,303,359,359,359),
>>             c(332,85,359,359,359,220,231,300,359,237,359,183,286,
>>               355,250,105,359,359,298,359,359,359,28.6,359,359,128))
>>
>> which prints
>>
>>     Wilcoxon rank sum test with continuity correction
>>
>> data:  c(359, 359, 359,
>> 359, 359, 359, 335, 359, 359, 359, 359, 359, and c(332, 85, 359, 359, 359,
>> 220, 231, 300, 359, 237, 359, 183, 359, 359, 359, 359, 359, 359, 359, 359,
>> 359, 303, 359, 359, and 286, 355, 250, 105, 359, 359, 298, 359, 359, 359,
>> 28.6, 359, 359) and 359, 128)
>> W = 485, p-value = 0.0002594
>> alternative hypothesis: true location shift is not equal to 0
>>
>> Warning message:
>> In wilcox.test.default(c(359, 359, 359, 359, 359, 359, 335, 359, :
>>   cannot compute exact p-value with ties
>>
>>
>> However, all of the online calculators find p-values close to 0.0025, 10x
>> the value output by R. All results are for a two-tailed case. Importantly,
>> the W value computed by R *does agree* with the U values output by the
>> first two online calculators listed above, yet it has a different p-value.
>>
>> Can anyone shed some light on how and why R's calculation differs from that
>> of these online calculators? Thanks for your time.
>>
>> ____________________
>>
>>        [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help