[Rd] apparently incorrect p-values from 2-sided Kolmogorov-Smirnov (PR#14157)

allwrigh at maths.ox.ac.uk allwrigh at maths.ox.ac.uk
Fri Dec 18 13:40:10 CET 2009


Dear Thomas, Right, thank you. Yes, I haven't looked at the source code 
(because I don't know C) but something like what you mention could 
well cause the kind of problems I am seeing: a loop being executed one too 
few or one too many times. And yes, I think those quantities should be 
multiplied up by m*n to all become integers so we escape rounding error 
problems. David.
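
For illustration only, here is a minimal C sketch of the integer idea. It is not taken from ks.c beyond the comparison quoted below, and it has not been checked against the R sources. The function names (exceeds_double, exceeds_int, ks_exact_pvalue) and the parameter k are hypothetical. k is assumed to be the largest integer with k/(m*n) strictly below the observed statistic, so k = 3 for D = 0.4 with m = 2, n = 5. The p-value is obtained by counting the lattice paths from (0,0) to (m,n) that stay within the band |i*n - j*m| <= k.

```c
#include <math.h>
#include <stdlib.h>

/* Floating-point test in the style quoted from ks.c:
   is |i/m - j/n| > q ?  Subject to rounding in i/md, j/nd and q. */
int exceeds_double(int i, int j, int m, int n, double q)
{
    double md = (double) m, nd = (double) n;
    return fabs(i / md - j / nd) > q;
}

/* Hypothetical integer version: multiplying through by m*n turns
   |i/m - j/n| > k/(m*n) into |i*n - j*m| > k, which is exact. */
int exceeds_int(int i, int j, int m, int n, long k)
{
    return labs((long) i * n - (long) j * m) > k;
}

/* Sketch of an exact two-sided p-value using only integer comparisons.
   u[j] holds the number of monotone lattice paths from (0,0) to (i,j)
   on which every point satisfies |i*n - j*m| <= k.
   Returns 1 - paths_within_band / choose(m+n, m), or -1.0 if malloc fails.
   Assumption: m and n are small enough that the counts fit in
   unsigned long long; this is a sketch, not production code. */
double ks_exact_pvalue(int m, int n, long k)
{
    unsigned long long *u = malloc((size_t) (n + 1) * sizeof *u);
    unsigned long long total = 1;   /* builds up choose(n + i, i) */
    double p;
    int i, j;

    if (u == NULL)
        return -1.0;

    /* Row i = 0: a point is reachable only while the band is not left. */
    for (j = 0; j <= n; j++)
        u[j] = exceeds_int(0, j, m, n, k) ? 0 : 1;

    for (i = 1; i <= m; i++) {
        u[0] = exceeds_int(i, 0, m, n, k) ? 0 : u[0];
        for (j = 1; j <= n; j++) {
            /* old u[j] counts paths arriving from (i-1, j),
               new u[j-1] counts paths arriving from (i, j-1) */
            u[j] = exceeds_int(i, j, m, n, k) ? 0 : u[j] + u[j - 1];
        }
        /* exact at every step: the result is choose(n + i, i) */
        total = total * (unsigned long long) (n + i) / (unsigned long long) i;
    }

    p = 1.0 - (double) u[n] / (double) total;
    free(u);
    return p;
}
```

With the example from the original report, ks_exact_pvalue(2, 5, 3) finds 1 path inside the band out of choose(7, 2) = 21, giving 20/21 = 0.9524. The point (i, j) = (1, 4) sits exactly on the boundary, since |1*5 - 4*2| = 3, so exceeds_int keeps it, whereas the double-precision test was reported to reject it.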
------------------------------------------------------------------------------
On Wed, 16 Dec 2009, tlumley at u.washington.edu wrote:

> On Tue, 15 Dec 2009, allwrigh at maths.ox.ac.uk wrote (in part):
>
>> 
>> x<-1:5
>> y<-c(2.5,4.5)
>> ks.test(x,y)
>> 
>> The value of the D_2,5 statistic is calculated as 0.4 correctly, but the
>> p-value is stated by R as 1, though in fact it should be 20/21=0.9524
>
>
> What we seem to have here is a rounding error problem.
>
> In ks.c:psmirnov2x,  there is a double loop including
> 	    if(fabs(i / md - j / nd) > q)
> 		u[j] = 0;
>
> where md=2, nd=5, and q=3/10.
>
> Now,  to full precision  abs(1/2 - 4/5) > 3/10 is false, but at least on my 
> MacBook it is true in C double precision.
>
> I'm not sure why the loop is working with doubles, since multiplying by m*n 
> should make everything an integer.
>
>     -thomas
>
> Thomas Lumley			Assoc. Professor, Biostatistics
> tlumley at u.washington.edu	University of Washington, Seattle
>
>
>


