[Rd] [Suggested patch] to fligner.test - constant values can produce significant results

Karolis K k@ro||@@koncev|c|u@ @end|ng |rom gm@||@com
Fri Jun 21 17:00:36 CEST 2019


In specific cases fligner.test() can produce a small p-value even when the
values within each group are constant (zero within-group variance).

Here is an illustration:

    fligner.test(c(1,1,2,2), c("a","a","b","b"))
    # p-value = NA

But:

    fligner.test(c(1,1,1,2,2,2), c("a","a","a","b","b","b"))
    # p-value < 2.2e-16
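
To see where the two results come from, the first steps of the computation
can be traced by hand. This is only a sketch: it reuses the centring and
scoring lines from fligner.test.R, and the closing remark about dividing by
var(a) is an assumption about the remainder of the function.

```r
# Trace the first steps of the statistic by hand for the second example.
x <- c(1, 1, 1, 2, 2, 2)
g <- factor(c("a", "a", "a", "b", "b", "b"))
n <- length(x)

# Centre each observation on its group median (same step as in fligner.test.R).
xc <- x - tapply(x, g, median)[g]
print(unname(xc))

# All absolute deviations are tied, so every observation gets the same score.
a <- qnorm((1 + rank(abs(xc)) / (n + 1)) / 2)
print(unique(a))

# The scores have (numerically) no spread. Assumption: the statistic is later
# standardised by var(a), so the result is either 0/0 or rounding noise
# divided by (almost) zero.
print(var(a))
```

If that assumption holds, whether the outcome is NA or a tiny p-value depends
only on floating-point rounding, not on the data.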

This can be dangerous when people run many tests of this type in parallel
(e.g. one test per gene in genomic studies).

I am submitting a proposed patch that should solve the issue by raising the
error "data are essentially constant", which is the same error message that
t.test() gives under similar conditions.
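
For comparison, a minimal sketch of how t.test() reacts to the same kind of
input. The tryCatch() wrapper is only there to capture the message instead
of aborting the script.

```r
# Two groups with no within-group variation, as in the fligner.test() example.
res <- tryCatch(
    t.test(c(1, 1, 1), c(2, 2, 2)),
    error = function(e) conditionMessage(e)
)
print(res)
```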

P.S. This is my first time writing to this list. I have read all the posting
guides, but apologies in advance if I still missed any rules.

svn.diff:

Index: src/library/stats/R/fligner.test.R
===================================================================
--- src/library/stats/R/fligner.test.R  (revision 76710)
+++ src/library/stats/R/fligner.test.R  (working copy)
@@ -55,6 +55,8 @@

     ## Careful. This assumes that g is a factor:
     x <- x - tapply(x,g,median)[g]
+    if (all(x == 0))
+      stop("data are essentially constant")

     a <- qnorm((1 + rank(abs(x)) / (n + 1)) / 2)
     STATISTIC <- sum(tapply(a, g, "sum")^2 / tapply(a, g, "length"))
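
On an unpatched R, the same check could be applied on the caller side. The
sketch below is not part of the patch: safe_fligner_p() is a hypothetical
helper, the expression matrix is made-up toy data, and it assumes there are
no missing values. It returns NA rather than stopping, so that a per-gene
loop can continue.

```r
# Hypothetical helper (not part of R): return NA instead of a p-value
# when every observation equals its group median.
safe_fligner_p <- function(x, g) {
    g <- factor(g)
    centred <- x - tapply(x, g, median)[g]
    if (all(centred == 0)) return(NA_real_)
    fligner.test(x, g)$p.value
}

# Toy "expression matrix": one row per gene, one column per sample.
g <- c("a", "a", "a", "b", "b", "b")
expr <- rbind(
    gene1 = c(1, 1, 1, 2, 2, 2),              # constant within groups
    gene2 = c(1.0, 1.2, 0.9, 2.5, 1.1, 3.0)   # varies within groups
)

# One test per row; gene1 is flagged as NA instead of yielding a p-value.
pvals <- apply(expr, 1, safe_fligner_p, g = g)
print(pvals)
```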


---
Karolis Koncevičius



