[Rd] Inconsistencies in wilcox.test
Martin Maechler
m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Sat Dec 7 21:59:01 CET 2019
>>>>> Karolis Koncevičius
>>>>> on Sat, 7 Dec 2019 20:55:36 +0200 writes:
> Hello,
> Writing to share some things I've found about wilcox.test() that seem a
> a bit inconsistent.
> 1. Inf values are not removed if paired=TRUE
> # returns different results (Inf is removed):
> wilcox.test(c(1,2,3,4), c(0,9,8,7))
> wilcox.test(c(1,2,3,4), c(0,9,8,Inf))
> # returns the same result (Inf is left as value with highest rank):
> wilcox.test(c(1,2,3,4), c(0,9,8,7), paired=TRUE)
> wilcox.test(c(1,2,3,4), c(0,9,8,Inf), paired=TRUE)
Now which of the two cases do you consider correct ?
IHMO, the 2nd one is correct: it is exactly one property of the
*robustness* of the wilcoxon test and very desirable that any
(positive) outlier is treated the same as just "the largest
value" and the test statistic (and hence the p-value) should not
change.
So I think the first case is wrong, notably if modified, (such
that the last y is the overall maximal value (slightly larger sample):
> wilcox.test(1:7, 1/8+ c(9:4, 12))
Wilcoxon rank sum test
data: 1:7 and 1/8 + c(9:4, 12)
W = 6, p-value = 0.01748
alternative hypothesis: true location shift is not equal to 0
> wilcox.test(1:7, 1/8+ c(9:4, 10000))
Wilcoxon rank sum test
data: 1:7 and 1/8 + c(9:4, 10000)
W = 6, p-value = 0.01748
alternative hypothesis: true location shift is not equal to 0
> wilcox.test(1:7, 1/8+ c(9:4, Inf))
Wilcoxon rank sum test
data: 1:7 and 1/8 + c(9:4, Inf)
W = 6, p-value = 0.03497
alternative hypothesis: true location shift is not equal to 0
The Inf case should definitely give the same as the 10'000 case.
That's exactly one property of a robust statistic.
Thank you, Karolis, this is pretty embarrassing to only be detected now after 25+ years of R in use ...
The correct fix starts with replacing the is.finite() by !is.na() and keep the 'Inf' in the rank computations...
(but then probably also deal with the case of more than one Inf, notably the Inf - Inf "exception" which is not triggered by your example...)
---
Ben addressed the "rounding" / numerical issues unavoidable for
the other problems.
> 2. tolerance issues with paired=TRUE.
> wilcox.test(c(4, 3, 2), c(3, 2, 1), paired=TRUE)
> # ...
> # Warning: cannot compute exact p-value with ties
> wilcox.test(c(0.4,0.3,0.2), c(0.3,0.2,0.1), paired=TRUE)
> # ...
> # no warning
> 3. Always 'x observations are missing' when paired=TRUE
> wilcox.test(c(1,2), c(NA_integer_,NA_integer_), paired=TRUE)
> # ...
> # Error: not enough (finite) 'x' observations
> 4. No indication if normal approximation was used:
> # different numbers, but same "method" name
> wilcox.test(rnorm(10), exact=FALSE, correct=FALSE)
> wilcox.test(rnorm(10), exact=TRUE, correct=FALSE)
> From all of these I am pretty sure the 1st one is likely unintended,
> so attaching a small patch to adjust it. Can also try patching others if
> consensus is reached that the behavioiur has to be modified.
> Kind regards,
> Karolis Koncevičius.
> ---
> Index: wilcox.test.R
> ===================================================================
> --- wilcox.test.R (revision 77540)
> +++ wilcox.test.R (working copy)
> @@ -42,7 +42,7 @@
> if(paired) {
> if(length(x) != length(y))
> stop("'x' and 'y' must have the same length")
> - OK <- complete.cases(x, y)
> + OK <- is.finite(x) & is.finite(y)
> x <- x[OK] - y[OK]
> y <- NULL
> }
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
More information about the R-devel
mailing list