[R] Kolmogorov-Smirnov statistic
Gad Abraham
gabraham at csse.unimelb.edu.au
Sat Aug 29 16:02:50 CEST 2009
Hi,
This is more of a statistical question: I'm trying to understand the
formulation of the one-sample, two-sided Kolmogorov-Smirnov statistic in
stats::ks.test() when testing against a uniform distribution.
Basically, it boils down to:
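x <- rnorm(100)
n <- length(x)
# F(x_(i)) - (i - 1)/n for each order statistic, with F the uniform CDF
z <- punif(sort(x)) - (0:(n - 1)) / n
# largest of F(x_(i)) - (i - 1)/n and i/n - F(x_(i)) over all i
max(z, 1 / n - z)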
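which is equivalent to the textbook definition:
n <- length(x)
z <- punif(sort(x))
# D+ : largest amount by which the ecdf exceeds F at a jump point
Dplus <- max(sapply(1:n, function(i) i / n - z[i]))
# D- : largest amount by which F exceeds the ecdf just before a jump point
Dminus <- max(sapply(1:n, function(i) z[i] - (i - 1) / n))
max(Dplus, Dminus)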
(See, e.g.,
http://www.itl.nist.gov/div898/handbook/eda/section3/eda35g.htm, and
Durbin (1971) ``Distribution theory for tests based on the sample
distribution function'', p. 6)
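The equivalence of the two forms can be checked numerically. The following is only a sketch: the seed and the names d1, d2 and d3 are arbitrary choices for illustration, and the third value is simply what ks.test() itself reports for the same sample.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
x <- rnorm(100)
n <- length(x)

# Form that the computation in ks.test() boils down to
z1 <- punif(sort(x)) - (0:(n - 1)) / n
d1 <- max(z1, 1 / n - z1)

# Textbook form, vectorised over i = 1, ..., n
z2 <- punif(sort(x))
Dplus  <- max(seq_len(n) / n - z2)
Dminus <- max(z2 - (seq_len(n) - 1) / n)
d2 <- max(Dplus, Dminus)

# Statistic reported by ks.test() for the same data
d3 <- unname(ks.test(x, "punif")$statistic)

all.equal(d1, d2)
all.equal(d1, d3)
```

Since 1/n - z1[i] is just i/n - F(x_(i)), the two forms differ only in how the terms are grouped.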
Why does the definition of Dminus have an i-1 in the numerator instead
of i? I have a hunch it's got to do with right-continuity of the ecdf,
but perhaps someone can shed some light on it.
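A small sketch of that hunch, under my own assumptions (a uniform sample of size 20, an arbitrary seed, and a small offset eps standing in for "just to the left of" a data point): the ecdf equals i/n at the i-th order statistic but is still (i - 1)/n immediately before it, so the largest gap with F above the ecdf is approached from the left of each jump.

```r
set.seed(1)                  # arbitrary seed, only for reproducibility
x  <- sort(runif(20))        # illustrative sample, not from the original post
n  <- length(x)
Fn <- ecdf(x)                # right-continuous step function

eps <- 1e-12                 # assumed to be smaller than any gap between points

at_jump     <- Fn(x)         # value at x_(i): i/n
before_jump <- Fn(x - eps)   # value just left of x_(i): (i - 1)/n

all.equal(at_jump, (1:n) / n)
all.equal(before_jump, (0:(n - 1)) / n)

# Largest |Fn - F| over the points at and just before each jump ...
t_pts <- c(x, x - eps)
d_sup <- max(abs(Fn(t_pts) - punif(t_pts)))

# ... compared with the textbook D = max(D+, D-)
d_txt <- max(seq_len(n) / n - punif(x), punif(x) - (seq_len(n) - 1) / n)
abs(d_sup - d_txt) < 1e-10
```

If this reading is right, using i rather than i - 1 in Dminus would compare F with the ecdf value after the jump and miss the gap just before it.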
Thanks,
Gad
--
Gad Abraham
MEng Student, Dept. CSSE and NICTA
The University of Melbourne
Parkville 3010, Victoria, Australia
email: gabraham at csse.unimelb.edu.au
web: http://www.csse.unimelb.edu.au/~gabraham