[R] Kolmogorov-Smirnov GoF test

Sun May 20 12:05:13 CEST 2007

Hi!
I want to run ks.test on some sample data, say "x", against a 
theoretical distribution, e.g. a Weibull.

So suppose we have these data:

set.seed(1);
x <- rweibull( 200, 1.3, 8.7 );

1. Is it better to do a 1-sample or a 2-sample test?

    ks.test( x, "pweibull", 1.3, 8.7 ); # 1-sample
    ks.test( x, rweibull( 200, 1.3, 8.7 ) ); # 2-sample
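
For comparison, here is a minimal sketch that runs both variants on the same data and prints the statistic and p-value of each. The names "one", "two" and "y" are only for illustration. Note that the 2-sample result depends on the random draw "y", so it changes from run to run unless the seed is fixed.

```r
set.seed(1)
x <- rweibull(200, shape = 1.3, scale = 8.7)

# 1-sample: compare x with the theoretical Weibull CDF
one <- ks.test(x, "pweibull", shape = 1.3, scale = 8.7)

# 2-sample: compare x with another random sample of the same size
# (y is an illustrative name, not part of the original question)
y <- rweibull(200, shape = 1.3, scale = 8.7)
two <- ks.test(x, y)

cat("1-sample: D =", one$statistic, " p-value =", one$p.value, "\n")
cat("2-sample: D =", two$statistic, " p-value =", two$p.value, "\n")
```

Both calls return an object of class "htest", so the statistic and p-value are extracted with "$" in the same way.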

2. If I perform a 2-sample test, my idea was to resample repeatedly from 
the theoretical distribution and then average the KS statistics obtained 
over all resamplings:

    n <- 1000; # number of resamplings
    ks.mean <- 0; # running sum of KS statistics, then their mean
    ks.sd <- 0; # running sum of squared KS statistics, then their std-err
    for ( k in 1:n )
    {
      ks <- ks.test( x, rweibull( 200, 1.3, 8.7 ) );
      ks.mean <- ks.mean + ks$statistic;
      ks.sd <- ks.sd + ks$statistic^2;
    }
    ks.mean <- ks.mean/n;
    ks.sd <- sqrt( (ks.sd - n*ks.mean^2)/(n-1) );

    # Calculate p-value with Marsaglia K(n,d) function (used by R);
    # K is not defined here, so this step stays commented out
    #p.value <- 1-K(200, ks.mean);

    cat( "KS statistic: ", ks.mean, "\n" );
    cat( "Standard Error: ", ks.sd, "\n" );
    #cat( "p-value: ", p.value, "\n" );
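
The same resampling idea can be sketched more compactly with replicate(), letting mean() and sd() do the bookkeeping. This is only an illustration of the procedure described above, not a claim that it is the right test; the vector name "d" is introduced here for the example.

```r
set.seed(1)
x <- rweibull(200, shape = 1.3, scale = 8.7)

n <- 1000  # number of resamplings

# KS statistic of x against n fresh samples from the theoretical Weibull
# (d is an illustrative name for the vector of statistics)
d <- replicate(n, ks.test(x, rweibull(200, shape = 1.3, scale = 8.7))$statistic)

ks.mean <- mean(d)  # mean of the KS statistics
ks.sd <- sd(d)      # sample standard deviation (n - 1 denominator)

cat("KS statistic: ", ks.mean, "\n")
cat("Standard Error: ", ks.sd, "\n")
```

Here sd(d) gives the same quantity as the sqrt( (sum of squares - n*mean^2)/(n-1) ) formula in the loop version.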

Does this make sense?

Any other criticism or suggestion is appreciated.

Thank you very much!

-- Marco