[Rd] Confusion about ks.test() handling of ties and exact vs approximate results
Karolis Koncevičius
k@ro||@@koncev|c|u@ @end|ng |rom gm@||@com
Fri Apr 21 10:32:41 CEST 2023
Hello,
Today I was investigating ks.test() with two numerical arguments (x and y) and was left a bit confused about the policy behind handling ties.
I might be missing something, so sorry in advance, but here is what confuses me:
The documentation states: "The presence of ties always generates a warning, since continuous distributions do not generate them"
But when I run a test with ties there is no warning:
ks.test(1:4, 4:7)
However, when I specify that I do not want an exact test, a warning appears saying that the p-value will be approximate:
ks.test(1:4, 4:7, exact=FALSE)
# Warning: p-value will be approximate in the presence of ties
But doesn't specifying exact=FALSE already make the test approximate?
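For anyone who wants to reproduce the comparison, here is a minimal sketch that records the warnings from both calls side by side. The helper name collect_warnings is my own and not part of R. Which warnings (if any) appear may depend on the R version, so I am not listing expected output.

```r
# Hypothetical helper: evaluate an expression and collect any warning
# messages it raises, instead of letting them print to the console.
collect_warnings <- function(expr) {
  msgs <- character()
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(value = value, warnings = msgs)
}

x <- 1:4
y <- 4:7  # the value 4 appears in both samples, so the pooled data has a tie

res_default <- collect_warnings(ks.test(x, y))
res_approx  <- collect_warnings(ks.test(x, y, exact = FALSE))

# Compare which method was used, the p-value, and the warnings raised.
for (res in list(res_default, res_approx)) {
  cat(res$value$method, "\n")
  cat("  p-value: ", res$value$p.value, "\n")
  cat("  warnings:", if (length(res$warnings)) res$warnings else "(none)", "\n")
}
```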
I tried inspecting the source code for guidance but was also left a bit puzzled. In ks.test.R, under the if(is.numeric(y)) clause, there is a variable called TIES that is set and changed, but it never seems to have any effect on the result. Here are examples:
line 55 TIES <- FALSE
line 61 TIES <- TRUE
line 74 if (TIES)
line 75 z <- w
But this z variable is not used anywhere later in the code. It looks to me as if the TIES variable could be deleted without affecting anything else.
What I gathered from the investigation is that ties are probably now handled by psmirnov(), and that for numeric x and y the computations are exact even with ties. However, I am still puzzled about the warning for approximate values when exact = FALSE is set anyway.
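To make the tie in this example concrete, here is a small sketch that detects ties in the pooled sample and computes the two-sample statistic directly from the empirical CDFs. This is only my own illustration, not the code in ks.test.R, and the variable names are mine.

```r
x <- 1:4
y <- 4:7
pooled <- c(x, y)

# A tie exists when any value occurs more than once in the pooled sample.
has_ties <- anyDuplicated(pooled) > 0

# Two-sample KS statistic: the largest absolute difference between the
# two empirical CDFs, evaluated at every pooled observation.
d_manual <- max(abs(ecdf(x)(pooled) - ecdf(y)(pooled)))

# Statistic reported by ks.test(), with warnings suppressed because
# whether one is raised may depend on the R version.
d_ks <- unname(suppressWarnings(ks.test(x, y))$statistic)

cat("ties present:", has_ties, "\n")
cat("manual D:", d_manual, " ks.test D:", d_ks, "\n")
```

The statistic itself is well defined with ties; my question above concerns only how the p-value is computed and which warnings are issued.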
So my question is: is everything currently OK with the code and the documentation?