[R] wilcox.test - difference between p-values of R and online calculators

W Bradley Knox bradknox at mit.edu
Wed Sep 3 16:20:21 CEST 2014


Tal and David, thanks for your messages.

I should have added that I tried all variations of true/false values for
the exact and correct parameters. Running with correct=FALSE makes only a
tiny change, resulting in W = 485, p-value = 0.0002481.
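
For reference, a minimal sketch of the two calls being compared. The names x, y, with_cc and without_cc are introduced here for illustration only, and exact = FALSE is set explicitly as an assumption, so that the normal approximation is used without the ties warning:

```r
# Data from the original post (x: 25 values, y: 26 values)
x <- c(359,359,359,359,359,359,335,359,359,359,359,359,359,
       359,359,359,359,359,359,359,359,303,359,359,359)
y <- c(332,85,359,359,359,220,231,300,359,237,359,183,286,
       355,250,105,359,359,298,359,359,359,28.6,359,359,128)

# Normal approximation with and without the continuity correction
with_cc    <- wilcox.test(x, y, exact = FALSE, correct = TRUE)
without_cc <- wilcox.test(x, y, exact = FALSE, correct = FALSE)

# Both calls report the same W; only the p-value shifts slightly
print(c(W = unname(with_cc$statistic),
        p_with_cc = with_cc$p.value,
        p_without_cc = without_cc$p.value))
```

The two p-values should correspond to the 0.0002594 and 0.0002481 quoted in this thread, so the continuity correction alone cannot account for a tenfold difference.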

At one point, I also thought that the discrepancy between R and these
online calculators might come from how ties are handled, but the fact that
R and two of the online calculators reach the same U/W values seems to
indicate that ties aren't the issue, since (I believe) the U or W values
contain all of the information needed to calculate the p-value, assuming
the number of samples is also known for each condition. (However, it's been
a while since I looked into how MWU tests work, so maybe now's the time to
refresh.) If that's correct, the discrepancy seems to be based in what R
does with the W value that is identical to the U values of two of the
online calculators. (I'm also assuming that U and W have the same meaning,
which seems likely.)
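
One way to test the idea that W plus the sample sizes determines the p-value is to compute the normal approximation by hand. The sketch below does that twice: once with the textbook variance n1*n2*(N+1)/12, and once with the standard tie-corrected variance. It is an assumption, not something confirmed in this thread, that the online calculators use the uncorrected variance; the variable names and the helper p_from() are hypothetical.

```r
# Data from the original post
x <- c(359,359,359,359,359,359,335,359,359,359,359,359,359,
       359,359,359,359,359,359,359,359,303,359,359,359)
y <- c(332,85,359,359,359,220,231,300,359,237,359,183,286,
       355,250,105,359,359,298,359,359,359,28.6,359,359,128)
n1 <- length(x); n2 <- length(y); N <- n1 + n2

# W: rank sum of x (midranks for ties) minus n1 * (n1 + 1) / 2
r <- rank(c(x, y))
W <- sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2

# Sizes of the groups of tied ranks (singletons contribute zero below)
tie_sizes <- table(r)

# Variance of W under the null, without and with the tie correction
var_plain <- n1 * n2 * (N + 1) / 12
var_ties  <- n1 * n2 / 12 *
  ((N + 1) - sum(tie_sizes^3 - tie_sizes) / (N * (N - 1)))

# Two-sided p-value from the normal approximation,
# with a continuity correction of 0.5 toward the mean
p_from <- function(v) {
  d <- W - n1 * n2 / 2
  z <- (d - sign(d) * 0.5) / sqrt(v)
  2 * pnorm(-abs(z))
}

p_plain <- p_from(var_plain)  # no tie correction (assumed calculator behaviour)
p_ties  <- p_from(var_ties)   # tie-corrected variance
print(c(W = W, p_plain = p_plain, p_ties = p_ties))
```

If this sketch is right, the same W = 485 yields two quite different p-values: the tie-corrected one should agree with the 0.0002594 reported by R, while the uncorrected one lands near the calculators' 0.0025. So W and the sample sizes alone would not be enough when there are many ties; the sizes of the tied groups also enter through the variance.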

- Brad

____________________
W. Bradley Knox, PhD
http://bradknox.net
bradknox at mit.edu


On Wed, Sep 3, 2014 at 9:10 AM, David L Carlson <dcarlson at tamu.edu> wrote:

> That does not change the results. The problem is likely to be the way ties
> are handled. The first sample has 25 values of which 23 are identical
> (359). The second sample has 26 values of which 12 are identical (359). The
> difference between the implementations may be a result of the way the ties
> are ranked. For example the R function rank() offers 5 different ways of
> handling the rank on tied observations. With so many ties, that could make
> a substantial difference.
>
> Package coin has wilcox_test(), which uses Monte Carlo simulation to
> estimate the confidence limits.
>
> -------------------------------------
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Tal Galili
> Sent: Wednesday, September 3, 2014 5:24 AM
> To: W Bradley Knox
> Cc: r-help at r-project.org
> Subject: Re: [R] wilcox.test - difference between p-values of R and online
> calculators
>
> It seems your numbers have ties. What happens if you run wilcox.test with
> correct=FALSE? Will the results be the same as the online calculators?
>
>
>
> ----------------Contact Details:----------------
> Contact me: Tal.Galili at gmail.com |
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
> www.r-statistics.com (English)
> ------------------------------------------------
>
>
>
> On Wed, Sep 3, 2014 at 3:54 AM, W Bradley Knox <bradknox at mit.edu> wrote:
>
> > Hi.
> >
> > I'm taking the long-overdue step of moving from using online calculators to
> > compute results for Mann-Whitney U tests to a more streamlined system
> > involving R.
> >
> > However, I'm finding that R computes a different result than the 3 online
> > calculators that I've used before (all of which approximately agree). These
> > calculators are here:
> >
> > http://elegans.som.vcu.edu/~leon/stats/utest.cgi
> > http://vassarstats.net/utest.html
> > http://www.socscistatistics.com/tests/mannwhitney/
> >
> > An example calculation is
> >
> > wilcox.test(c(359,359,359,359,359,359,335,359,359,359,359,359,359,
> >               359,359,359,359,359,359,359,359,303,359,359,359),
> >             c(332,85,359,359,359,220,231,300,359,237,359,183,286,
> >               355,250,105,359,359,298,359,359,359,28.6,359,359,128))
> >
> > which prints
> >
> >         Wilcoxon rank sum test with continuity correction
> >
> > data:  c(359, 359, 359, 359, 359, 359, 335, 359, 359, 359, 359, 359, and
> > c(332, 85, 359, 359, 359, 220, 231, 300, 359, 237, 359, 183, 359, 359,
> > 359, 359, 359, 359, 359, 359, 359, 303, 359, 359, and 286, 355, 250, 105,
> > 359, 359, 298, 359, 359, 359, 28.6, 359, 359) and 359, 128)
> > W = 485, p-value = 0.0002594
> > alternative hypothesis: true location shift is not equal to 0
> >
> > Warning message:
> > In wilcox.test.default(c(359, 359, 359, 359, 359, 359, 335, 359, :
> >   cannot compute exact p-value with ties
> >
> >
> > However, all of the online calculators find p-values close to 0.0025, 10x
> > the value output by R. All results are for a two-tailed case. Importantly,
> > the W value computed by R *does agree* with the U values output by the
> > first two online calculators listed above, yet it has a different p-value.
> >
> > Can anyone shed some light on how and why R's calculation differs from that
> > of these online calculators? Thanks for your time.
> >
> > ____________________
> > W. Bradley Knox, PhD
> > http://bradknox.net
> > bradknox at mit.edu
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >