Prof Brian Ripley ripley at stats.ox.ac.uk
Sun Jan 30 17:10:52 CET 2011

Where did you get the idea that the location estimate in a 2-sample
Wilcoxon test is the difference in medians?  (It is a common
misconception, but not one, I believe, to be found in R.  The estimate is
the median of the differences, not the difference of the medians; and the
test is not a test of a difference in population medians either, unless
the two populations differ only in location.)
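
As a minimal sketch using the data quoted below (the names x, y, 
diff_of_medians and median_of_diffs are mine, and the NA is dropped by
hand), the two quantities can be computed directly:

```r
# Data as posted in the quoted message; x is pipwtCount, y is pipwdCount.
x <- c(532, 298, 215, 1588, 38, 180, 284, 376, 5349, 1024, 650, 605,
       1307, 6147, 21, 453, 23, 1983, 1048, 464, 2183, 1028, 1361, 163,
       175, 5944, 569, 622, 793, 70, 67, 1188, 248, 3010, 19, 2179,
       1339, 408, 113, 739, 2615, 4619)
y <- c(89, 384, 12, 703, 2, 138, 189, 383, 314, 482, 96, 907, 90, 1193,
       154, 305, 61, 414, 4764, 1066, 121, 143, 102, 174, 44, 2896, NA,
       1103, 161, 199)
y <- y[!is.na(y)]                      # drop the single missing value

# Difference of the two sample medians (what the poster computed).
diff_of_medians <- median(x) - median(y)

# Median of all pairwise differences x[i] - y[j].
median_of_diffs <- median(outer(x, y, "-"))

print(c(diff_of_medians = diff_of_medians,
        median_of_diffs = median_of_diffs))
```

On these data the first value is the 424.5 computed in the original post,
and the second should agree with the 291.5 that wilcox.test reports as the
"difference in location".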

On Sun, 30 Jan 2011, Graham Smith wrote:

> I am sure I am opening myself up to looking stupid, but I have two samples
> with medians of 613.5 and 189 (a difference in location of 424.5, compared
> to the difference of 291.5 suggested by the Wilcoxon test).
>
>> wilcox.test(pipwtCount,pipwdCount, conf.int=TRUE, na.rm=TRUE)
>
>    Wilcoxon rank sum test
>
> data:  pipwtCount and pipwdCount
> W = 822, p-value = 0.01227
> alternative hypothesis: true location shift is not equal to 0
> 95 percent confidence interval:
>  58 639
> sample estimates:
> difference in location
>                 291.5
>
> The data is here
>
>> pipwtCount
> [1]  532  298  215 1588   38  180  284  376 5349 1024  650  605 1307 6147   21
> [16]  453   23 1983 1048  464 2183 1028 1361  163  175 5944  569  622  793   70
> [31]   67 1188  248 3010   19 2179 1339  408  113  739 2615 4619
>
>> pipwdCount
> [1]   89  384   12  703    2  138  189  383  314  482   96  907   90 1193  154
> [16]  305   61  414 4764 1066  121  143  102  174   44 2896   NA 1103  161  199
>
>> median(pipwtCount)
> [1] 613.5
>> median(pipwdCount,na.rm=T)
> [1] 189
>> 613.5-189
> [1] 424.5
>
>
> I would appreciate it if someone could point out the obvious to me, and
> explain why there is such a large discrepancy in the differences in location.
>
>
>
> Many thanks,
>
> Graham

