[Rd] [Suggested patch] to fligner.test - constant values can produce significant results
Karolis K
k@ro||@@koncev|c|u@ @end|ng |rom gm@||@com
Fri Jun 21 17:00:36 CEST 2019
In specific cases fligner.test() can produce a very small p-value even when
the values within each group are constant (zero within-group variance).
Here is an illustration:
fligner.test(c(1,1,2,2), c("a","a","b","b"))
# p-value = NA
But:
fligner.test(c(1,1,1,2,2,2), c("a","a","a","b","b","b"))
# p-value < 2.2e-16
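To show where this comes from, here is a minimal sketch that reproduces only the first two steps of the default method (centering by group median, then computing the normal scores). The variable names `centered` and `scores` are mine, not those used in the R sources. As far as I can tell, the later steps divide by the variance of these scores, which is zero up to rounding, so the result seems to depend on floating-point noise rather than on the data.

```r
# Sketch of the first steps of fligner.test() on constant groups.
# 'centered' and 'scores' are illustrative names, not from the R sources.
x <- c(1, 1, 1, 2, 2, 2)
g <- factor(c("a", "a", "a", "b", "b", "b"))
n <- length(x)

# Subtract each group's median from its observations.
centered <- x - tapply(x, g, median)[g]

# Normal scores of the ranked absolute deviations.
scores <- qnorm((1 + rank(abs(centered)) / (n + 1)) / 2)

print(all(centered == 0))           # every deviation is exactly zero
print(length(unique(scores)) == 1)  # all ranks tie, so all scores are identical
```

With every score identical there is no variability left for the test to compare between groups.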
This can be dangerous when people run many tests of this type in parallel
(e.g. one test per gene in genomic studies).
I am submitting a proposed patch that should solve the issue by raising the
error "data are essentially constant", which is the same error message that
t.test() gives under similar conditions.
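Until something like this is applied, the same check can be done on the user side. Below is a rough sketch of a wrapper; `fligner_checked` is a hypothetical helper name (not part of R), and it assumes the two-argument form fligner.test(x, g) with numeric x and a grouping vector g.

```r
# Hypothetical user-side guard mirroring the proposed check.
# 'fligner_checked' is an illustrative name, not an existing R function.
fligner_checked <- function(x, g) {
    g <- factor(g)
    # Same centering step as in fligner.test(): subtract group medians.
    centered <- x - tapply(x, g, median)[g]
    if (all(centered == 0))
        stop("data are essentially constant")
    fligner.test(x, g)
}

# Constant groups now raise an error instead of returning a tiny p-value.
res <- tryCatch(
    fligner_checked(c(1, 1, 1, 2, 2, 2), c("a", "a", "a", "b", "b", "b")),
    error = function(e) conditionMessage(e)
)
print(res)
```

When running one test per gene, wrapping the call in tryCatch() as above lets the constant cases be recorded and skipped without stopping the whole loop.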
P.S. This is my first time writing to this list. I have read the posting
guides, but apologies in advance if I still missed any rules.
svn.diff:
Index: src/library/stats/R/fligner.test.R
===================================================================
--- src/library/stats/R/fligner.test.R (revision 76710)
+++ src/library/stats/R/fligner.test.R (working copy)
@@ -55,6 +55,8 @@
     ## Careful. This assumes that g is a factor:
     x <- x - tapply(x,g,median)[g]
+    if (all(x == 0))
+        stop("data are essentially constant")
     a <- qnorm((1 + rank(abs(x)) / (n + 1)) / 2)
     STATISTIC <- sum(tapply(a, g, "sum")^2 / tapply(a, g, "length"))
---
Karolis Koncevičius