[Rd] [Suggested patch] to fligner.test - constant values can produce significant results
Martin Maechler
maechler sending from stat.math.ethz.ch
Tue Jun 25 10:11:53 CEST 2019
>>>>> Karolis K
>>>>> on Fri, 21 Jun 2019 18:00:36 +0300 writes:
> In specific cases fligner.test() can produce a small p-value even when both
> groups consist of constant values (i.e. each group has zero variance).
> Here is an illustration:
> fligner.test(c(1,1,2,2), c("a","a","b","b"))
> # p-value = NA
> But:
> fligner.test(c(1,1,1,2,2,2), c("a","a","a","b","b","b"))
> # p-value < 2.2e-16
> This is potentially dangerous when people perform many parallel tests
> of this type (e.g. one test per gene in genomic studies).
I agree; this is really misleading and dangerously wrong.
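For concreteness, the degeneracy can be traced by replaying the score computation from fligner.test by hand (a sketch; the variables x0 and a mirror those in the source):

```r
# Replaying the internals of stats::fligner.test for the 4-value example.
x <- c(1, 1, 2, 2)
g <- factor(c("a", "a", "b", "b"))
n <- length(x)

x0 <- x - tapply(x, g, median)[g]                # centered residuals: all exactly 0
a  <- qnorm((1 + rank(abs(x0)) / (n + 1)) / 2)   # ranks all tied, so all a are equal

var(a)   # exactly 0 here: the statistic is normalized by var(a), giving 0/0 = NaN
# With 3 values per group, mean(a) can pick up floating-point rounding, so
# var(a) may become tiny but nonzero and the statistic explodes -> p < 2.2e-16.
```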
> Submitted a proposed patch that should solve the issue by producing the
> error "data are essentially constant" - which is the same error message
> found in t.test() under similar conditions.
I agree much less with that remedy (and also with the solution in t.test()):
In many similar situations, it has been very fruitful to have R's
algorithms behave "generalized continuous"ly. I have defined
this as (something like)
"The value at infinity should correspond to the limit going to infinity"
(and also applying that to 1/0 == Inf which is the correct
limit in the case where the "0" is known to be non-negative,
as here for the variance/sd)
In this case (H0: variances in groups are equal), I'd argue that
H0 should *not* be rejected, and the "most correct" P-value to
be 1. After all, both groups have the same variance, 0.
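A minimal sketch of that limiting behavior (a hypothetical wrapper, fligner_gc, not part of stats): when every group is constant, all centered residuals are zero, the group spreads are all equal (to 0), and P = 1 is returned.

```r
# Hypothetical wrapper: return p = 1 in the all-constant case instead of
# NA or a spurious tiny p-value; otherwise delegate to fligner.test.
fligner_gc <- function(x, g) {
  g <- factor(g)
  centered <- x - tapply(x, g, median)[g]
  if (all(centered == 0))               # every group constant: variances all 0
    return(list(statistic = 0, p.value = 1))
  fligner.test(x, g)
}

fligner_gc(c(1, 1, 1, 2, 2, 2), c("a", "a", "a", "b", "b", "b"))$p.value  # 1
```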
In the t.test() case (H0: group means are equal; variances are
equal or not, depending on the option var.equal = TRUE / FALSE):
When the two group variances are 0, there are 2 cases, and I
claim something like the following should happen by
"generalized continuity":
1) if the group means are equal H0 is not rejected (P = 1)
2) if the group means differ, H0 is clearly rejected (P = 0)
{where for '1)' I could also agree on being undecided and returning P = NaN}
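The two limiting cases can be sketched as follows (a hypothetical helper, t_test_gc, not the actual stats implementation):

```r
# Hypothetical sketch of "generalized continuity" for t.test when both
# groups have zero variance: compare the means directly.
t_test_gc <- function(x, y) {
  if (var(x) == 0 && var(y) == 0) {
    p <- if (mean(x) == mean(y)) 1 else 0   # case 1) vs case 2) above
    return(list(p.value = p))
  }
  t.test(x, y)   # otherwise the ordinary Welch/Student test applies
}

t_test_gc(c(1, 1, 1), c(1, 1, 1))$p.value  # 1: equal means, H0 kept
t_test_gc(c(1, 1, 1), c(2, 2, 2))$p.value  # 0: means differ, H0 rejected
```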
Returning an error in this case, as t.test() has been doing,
seems a waste (loss of information) in my view.
But for now, let's not discuss t.test() but rather the "var tests"
(tests for homogeneity of variances).
> P.S. First time writing to this list. Read all the guides of posting, but
> sorry in advance if I still missed any rules.
well, thank you, but your post is really "perfect" in all formal senses
(correct mailing list, reproducible example code, using plain
text, being polite ;-), even proposing a patch via diff)
==> really very well done and a role model for others!
Thank you indeed for raising the issue and proposing a patch.
We should discuss here ... i.e. hear other opinions etc.
Note that there are five such tests for homogeneity of variances
(fligner.test, bartlett.test, var.test, ansari.test, mood.test),
and the behavior of the other four tests should also be considered ...
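A quick, hedged survey of how each of the five tests currently reacts to two constant groups (the exact results may differ by R version, so none is asserted here; errors are captured rather than propagated):

```r
# Run all five variance-homogeneity tests on two constant groups and
# collect either the p-value or the error message for each.
x <- c(1, 1, 1, 2, 2, 2)
g <- factor(rep(c("a", "b"), each = 3))
xa <- x[g == "a"]; xb <- x[g == "b"]

tests <- list(
  fligner  = function() fligner.test(x, g),
  bartlett = function() bartlett.test(x, g),
  var      = function() var.test(xa, xb),
  ansari   = function() ansari.test(xa, xb),
  mood     = function() mood.test(xa, xb)
)
survey <- lapply(tests, function(f)
  tryCatch(suppressWarnings(f()$p.value),
           error = function(e) paste("error:", conditionMessage(e))))
str(survey)   # one entry per test: a p-value (possibly NA) or an error message
```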
> svn.diff:
> Index: src/library/stats/R/fligner.test.R
> ===================================================================
> --- src/library/stats/R/fligner.test.R (revision 76710)
> +++ src/library/stats/R/fligner.test.R (working copy)
> @@ -55,6 +55,8 @@
> ## Careful. This assumes that g is a factor:
> x <- x - tapply(x,g,median)[g]
> + if (all(x == 0))
> + stop("data are essentially constant")
> a <- qnorm((1 + rank(abs(x)) / (n + 1)) / 2)
> STATISTIC <- sum(tapply(a, g, "sum")^2 / tapply(a, g, "length"))
> ---
> Karolis Koncevičius
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel