[Rd] ks.test (Kolmogorov-Smirnov) (PR#7725)

jpnolan at american.edu jpnolan at american.edu
Mon Mar 14 06:12:27 CET 2005


Full_Name: John Nolan
Version: 2.0.1
OS: Win XP
Submission from: (NULL) (151.200.9.43)


I think there are two small bugs in the Kolmogorov-Smirnov test routines.

(1) In the R function ks.test, toward the bottom, the internal function pkstwo
does:

 p[IND] <- .C("pkstwo", as.integer(length(x)), p = as.double(x[IND]), 
                as.double(tol), PACKAGE = "stats")$p

Instead of length(x), shouldn't length(x[IND]) be passed?  If there are NAs,
length(x) > length(x[IND]) and the called C routine will try to process more
values than it should.
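
As a rough illustration of point (1), here is a minimal C sketch. It is not
the actual R or ks.c source: compact_non_nan, halve_in_place and process_valid
are hypothetical names. The first stands in for x[IND], the second for a
routine called through .C that walks n doubles in place. The point is only
that the count handed to the routine must be the size of the compacted
buffer, not the length of the original vector.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical stand-in for x[IND]: copy the non-NaN entries of x
 * (length n) into out and return how many were copied. */
int compact_non_nan(const double *x, int n, double *out)
{
    int m = 0;
    for (int i = 0; i < n; i++) {
        if (!isnan(x[i]))
            out[m++] = x[i];
    }
    return m;
}

/* Hypothetical stand-in for a routine called through .C: it receives
 * a pointer to the element count and overwrites that many doubles. */
void halve_in_place(const int *n, double *x)
{
    for (int i = 0; i < *n; i++)
        x[i] *= 0.5;
}

/* Process only the valid entries of x.  The count passed on is m, the
 * size of the compacted data.  Passing n instead would make the routine
 * read and write beyond the m values that were actually filled in. */
int process_valid(const double *x, int n, double *buf)
{
    int m = compact_non_nan(x, n, buf);
    assert(m <= n);
    halve_in_place(&m, buf);
    return m;
}
```

With x = {1.0, NaN, 3.0, NaN}, process_valid should report 2 processed values
even though the original length is 4, which is the length(x[IND]) versus
length(x) distinction raised above.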


(2) In the C file ks.c, the comments for the function pkstwo describe two
series that it evaluates, depending on whether x > 1 or x < 1.  For the second
case, the comments at the head of the function give the formula as

 *   = \sqrt{2\pi/x} \sum_{k=1}^\infty \exp(-(2k-1)^2\pi^2/(8x^2))

The code actually evaluates \sqrt{2\pi}/x, not \sqrt{2\pi/x}.  I am not sure
which expression is right: both agree at x=1, so there is no easy test here.
If the equation in the comment is correct, the code should be changed to have
w = 0.5*log(x[i]), not w = log(x[i]).  If the code is right, the equation in
the comment should be changed.

John Nolan


