[R] Wilcoxon signed-rank test

Tue Aug 22 19:08:14 CEST 2017

This query is off-topic for this list, as it is about statistics, not R
programming. stats.stackexchange.com is a good venue for statistics
questions.

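However, you are confused. The Wilcoxon test does NOT test for differences
in population means. For example, consider the 2 samples:

A: 5, 6, 7
B: 1, 2, 50

B has the much larger mean (about 17.7 versus 6), yet two of its three
values are smaller than every value in A, so a comparison based on ranks
does not follow the means.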
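
A minimal sketch in R of what these two samples show. It assumes that
position i in A is paired with position i in B (the message does not say
so), and the variable names are illustrative only. It computes the means,
the pooled ranks, and the signed-rank statistic by hand rather than
calling wilcox.test, which would warn about ties on data this small.

```r
# The two samples from the message
A <- c(5, 6, 7)
B <- c(1, 2, 50)

# B has the larger mean, driven entirely by the single value 50
mean(A)  # 6
mean(B)  # 17.66667

# Rank all six values together: A holds ranks 3, 4, 5 and B holds 1, 2, 6
r <- rank(c(A, B))
mean_rank_A <- mean(r[1:3])  # 4
mean_rank_B <- mean(r[4:6])  # 3

# Paired view (assumed pairing): rank the absolute differences B - A
d <- B - A                   # -4 -4 43
abs_ranks <- rank(abs(d))    # 1.5 1.5 3.0 (ties get the average rank)

# V is the sum of the ranks belonging to positive differences
V <- sum(abs_ranks[d > 0])   # 3

# Under the null hypothesis the expected value of V is n(n+1)/4
n <- length(d)
expected_V <- n * (n + 1) / 4  # 3
```

Here V equals its null expectation even though the mean difference is
about 11.7: the one large difference contributes only its rank (3), not
its size (43).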

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Tue, Aug 22, 2017 at 9:20 AM, Karolis Uziela
<karolis.uziela at gmail.com> wrote:
> Hi,
>
> I am using the wilcox.test function to test the difference between the
> means of two samples. The data points are paired, so I am using a paired
> test.
>
> There is one strange case. Sample A has a higher mean than sample B.
> However, the wilcox.test function says that sample B has a significantly
> higher "mean rank" than sample A. How is this possible?
>
> Here is the code (data file is attached):
> df <- read.table("wilcox_data.txt", header=TRUE)
> mean(df$A)
> [1] 0.7987849
> mean(df$B)
> [1] 0.7977966
> mean(df$C)
> [1] 0.6350737
>
> wilcox.test(df$B, df$A, paired=TRUE, alternative="greater")
>         Wilcoxon signed rank test with continuity correction
>
> data:  df$B and df$A
> V = 134300, p-value = 3.299e-05
> alternative hypothesis: true location shift is greater than 0
>
> wilcox.test(df$C, df$A, paired=TRUE, alternative="greater")
>         Wilcoxon signed rank test with continuity correction
>
> data:  df$C and df$A
> V = 41423, p-value = 1
> alternative hypothesis: true location shift is greater than 0
>
> The p-value of the first test is rather low (3.299e-05), which indicates
> that the alternative hypothesis is true - sample B has a higher "mean rank"
> than sample A. Just to make sure I am not making a dumb mistake, I added a
> third variable C to this example, which is much smaller than A or B. As
> expected, the second test has p-value = 1, which means that "mean rank" of
> C is lower than A (null hypothesis is true).
>
> I am afraid I am not very strong in statistics, but I would very much
> appreciate it if someone could explain to me in simple words:
> 1) Wikipedia says that Wilcoxon signed-rank test is used to test whether
> population "mean ranks" differ. What is exactly the definition of "mean
> rank"? https://en.wikipedia.org/wiki/Wilcoxon_signed-rank_test
> 2) How can the mean of variable A be bigger than the mean of variable B
> while the "mean rank" of variable B is significantly bigger than the
> "mean rank" of variable A?
>
> There is a small chance that this is because of a bug in the wilcox.test
> function, but it is probably more likely that this paradox is due to
> some statistical phenomenon that I don't understand.
>
> Best regards,
> Karolis Uziela
>
> P.S. I have another strange example, where the difference between A and B
> is much smaller than the difference between A and C, but the significance
> of the "mean rank" difference between A and B is much larger than the
> significance of the "mean rank" difference between A and C. For
> simplicity, I didn't add that example here, but I guess that the answer
> to the above question will be related to this one.