[R] Kolmogorov-Smirnov test and the plot of max distance between two ecdf curves

user1234 mehenderson11 at gmail.com
Fri Oct 5 13:15:07 CEST 2012


Rui, 

Your response nearly answers a similar question of mine, except that my two
samples (and so my two ecdfs) have different lengths.

Do you know how I can adjust  x <- seq(min(loga, logb), max(loga, logb),
length.out=length(loga))
to account for this?  I assume the change belongs in the length.out argument,
but I'm unsure how to proceed.

Any advice is much appreciated.

-L
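
One possible way around the length.out question, offered as a sketch rather than a definitive answer: both ecdfs are step functions that jump only at observed values, so the largest vertical gap is reached at one of the pooled sample points, and no evenly spaced grid is needed. The samples below are made up for illustration (they are not from this thread); the names loga, logb, f.a, f.b, x0, y0 and y1 follow Rui's code, while d and D are names introduced here.

```r
# Hypothetical samples of unequal length (illustration only).
set.seed(1)
loga <- log10(rexp(40, rate = 1/500) + 1)
logb <- log10(rexp(25, rate = 1/100) + 1)

f.a <- ecdf(loga)
f.b <- ecdf(logb)

# Evaluate both ecdfs at every observed value from either sample.
# The two curves are constant between these points, so the maximum
# distance cannot be missed, whatever the two sample sizes are.
x <- sort(unique(c(loga, logb)))
d <- abs(f.a(x) - f.b(x))

D  <- max(d)
x0 <- x[which.max(d)]   # first location where the maximum is reached
y0 <- f.a(x0)
y1 <- f.b(x0)

plot(f.a, verticals = TRUE, do.points = FALSE, col = "blue")
plot(f.b, verticals = TRUE, do.points = FALSE, col = "green", add = TRUE)
points(c(x0, x0), c(y0, y1), pch = 16, col = "red")
segments(x0, y0, x0, y1, col = "red", lty = "dotted")
```

With this choice of x, D should agree with ks.test(loga, logb)$statistic, which is a convenient check. which.max() returns only the first maximum; if the maximum is reached at several points, x[d == D] would list them all.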


Rui Barradas wrote
> Hello,
> 
> Try the following.
> (I've changed the color of the first ecdf.)
> 
> 
> loga <- log10(a+1) # do this
> logb <- log10(b+1) # only once
> 
> f.a <- ecdf(loga)
> f.b <- ecdf(logb)
> # (2) max distance D
> 
> x <- seq(min(loga, logb), max(loga, logb), length.out=length(loga))
> x0 <- x[which( abs(f.a(x) - f.b(x)) == max(abs(f.a(x) - f.b(x))) )]
> y0 <- f.a(x0)
> y1 <- f.b(x0)
> 
> plot(f.a, verticals=TRUE, do.points=FALSE, col="blue")
> plot(f.b, verticals=TRUE, do.points=FALSE, col="green", add=TRUE)
> ## alternative, use standard R plot of ecdf
> #plot(f.a, col="blue")
> #lines(f.b, col="green")
> 
> points(c(x0, x0), c(y0, y1), pch=16, col="red")
> segments(x0, y0, x0, y1, col="red", lty="dotted")
> ## alternative, down to x axis
> #segments(x0, 0, x0, y1, col="red", lty="dotted")
> 
> 
> Hope this helps,
> 
> Rui Barradas
> maxbre wrote
>> Hi all, 
>> 
>> given this example 
>> 
>> #start 
>> 
>> a<-c(0,70,50,100,70,650,1300,6900,1780,4930,1120,700,190,940,
>>      760,100,300,36270,5610,249680,1760,4040,164890,17230,75140,1870,22380,5890,2430)
>> length(a)
>> 
>> b<-c(0,0,10,30,50,440,1000,140,70,90,60,60,20,90,180,30,90, 
>>      3220,490,20790,290,740,5350,940,3910,0,640,850,260) 
>> length(b)
>> 
>> out<-ks.test(log10(a+1),log10(b+1)) 
>> 
>> # max distance D 
>> out$statistic 
>> 
>> f.a<-ecdf(log10(a+1)) 
>> f.b<-ecdf(log10(b+1)) 
>> 
>> plot(f.a, verticals=TRUE, do.points=FALSE, col="red") 
>> plot(f.b, verticals=TRUE, do.points=FALSE, col="green", add=TRUE) 
>> 
>> # jump points (x) and ecdf values (y) of ecdf a
>> x.a<-get("x", environment(f.a))
>> y.a<-get("y", environment(f.a))
>> 
>> # jump points (x) and ecdf values (y) of ecdf b
>> x.b<-get("x", environment(f.b))
>> y.b<-get("y", environment(f.b))
>> 
>> 
>> #end
>> 
>> I want to mark the max distance between the two ecdf curves on the
>> chart produced above
>> 
>> Is that possible and how? 
>> 
>> 
>> Thanks for your help
>> 
>> PS: this is an amended version of a previous thread (which got no reply)
>> that I’ve deleted from the Nabble repository because I realised it was
>> not clear enough (I hope it’s a little better now, sorry for that)
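
A side note on the get("x", environment(f.a)) lines in the original example: the same values can be read through the documented accessor knots(), which avoids reaching into the function's environment. This is a minimal sketch using only the first six values of a from the example; x.a and y.a are the names used in the original post.

```r
# First six values of `a` from the example, to keep the sketch short.
a <- c(0, 70, 50, 100, 70, 650)
f.a <- ecdf(log10(a + 1))

x.a <- knots(f.a)   # sorted unique sample values, i.e. where the ecdf jumps
y.a <- f.a(x.a)     # height of the ecdf at each jump

cbind(x.a, y.a)
```

These are the jump locations and heights of the step function, not its inverse; for the inverse direction, quantile(f.a, probs) is available for ecdf objects.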




