[R] How to interpret Kolmogorov-Smirnov stats

Fri Jul 29 12:43:45 CEST 2011

Hi,

I have an interpretation problem. I fitted a beta distribution to my data using:

>fit1 <- fitdist(vectNorm,"beta")

Warning messages:
1: In dbeta(x, shape1, shape2, log) : NaNs produced
2: In dbeta(x, shape1, shape2, log) : NaNs produced
3: In dbeta(x, shape1, shape2, log) : NaNs produced
4: In dbeta(x, shape1, shape2, log) : NaNs produced
5: In dbeta(x, shape1, shape2, log) : NaNs produced
6: In dbeta(x, shape1, shape2, log) : NaNs produced

Is this a real problem? The input consists of 900 data points, of which 6
caused this message.
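
A possible reading of these warnings, offered as an assumption rather than a diagnosis: they may come from the optimizer trying invalid shape values (for example a negative shape1) during the search, not from six particular data points. The sketch below only shows that dbeta() returns NaN for a negative shape parameter while valid shapes give finite densities. The vector x is made-up illustrative data, not vectNorm.

```r
# Illustrative values inside (0, 1); stand-in data, not the real vectNorm
x <- c(0.001, 0.002, 0.005)

# A negative shape parameter is invalid: dbeta() returns NaN and warns
# "NaNs produced" (the warning is suppressed here to keep the output clean)
bad <- suppressWarnings(dbeta(x, shape1 = -1, shape2 = 800))

# Valid shape parameters give finite densities for the same x
good <- dbeta(x, shape1 = 2, shape2 = 800)

all(is.nan(bad))      # every value is NaN
all(is.finite(good))  # every value is finite
```

If the final estimates are positive and the fit converged, warnings of this kind are usually harmless, but it is still worth checking with range(vectNorm) that all data lie strictly between 0 and 1.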

I got the following shape parameters for my distribution:

>summary(fit1)

Fitting of the distribution ' beta ' by maximum likelihood 
Parameters : 
         estimate Std. Error
shape1   2.148779  0.1458042
shape2 810.067515 61.8608126
Loglikelihood:  1917.51   AIC:  -3831.02   BIC:  -3823.15 
Correlation matrix:
          shape1    shape2
shape1 1.0000000 0.8880194
shape2 0.8880194 1.0000000

Now if I run:

>gofstat(fit1, print.test=TRUE)

Kolmogorov-Smirnov statistic:  0.06630064 
Kolmogorov-Smirnov test:  not rejected 
   The result of this test may be too conservative as it  
   assumes that the distribution parameters are known
Cramer-von Mises statistic:  0.3866663 
Crame-von Mises test: not calculated 
Anderson-Darling statistic:  2.820576 
Anderson-Darling test: not calculated 

I then bootstrapped the data based on the estimated parameters a and b
(shape1 and shape2):

r_beta <- rbeta(378, 2.148779, 810.067515, ncp = 0);
ks.boot(vectNorm, r_beta, nboots=1000, alternative = c("two.sided", "less",
"greater"), print.level=0)

and got:
$ks.boot.pvalue
[1] 0.002

$ks

	Two-sample Kolmogorov-Smirnov test

data:  Tr and Co 
D = 0.1323, p-value = 0.002684
alternative hypothesis: two.sided 

$nboots
[1] 1000

attr(,"class")
[1] "ks.boot"
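
For comparison, the same kind of check can be sketched with ks.test() from base R. This is a minimal sketch under stated assumptions: vect_demo is simulated stand-in data (the real vectNorm is not shown here), and the shape values are the estimates from summary(fit1). The two-sample call names a single alternative. The one-sample call compares the data directly with the beta CDF, which avoids the extra noise of a simulated reference sample.

```r
set.seed(1)  # reproducible stand-in data

shape1_hat <- 2.148779    # estimate reported by summary(fit1)
shape2_hat <- 810.067515  # estimate reported by summary(fit1)

# Stand-in for vectNorm (assumption: the real data are not available here)
vect_demo <- rbeta(378, shape1_hat, shape2_hat)

# Simulated reference sample, as in the ks.boot() call above
r_beta <- rbeta(378, shape1_hat, shape2_hat)

# Two-sample KS test with one explicitly chosen alternative
res_two <- ks.test(vect_demo, r_beta, alternative = "two.sided")

# One-sample KS test against the beta CDF with the same parameters
res_one <- ks.test(vect_demo, "pbeta", shape1_hat, shape2_hat)

res_two$statistic; res_two$p.value
res_one$statistic; res_one$p.value
```

Because the parameters were estimated from the same data, the one-sample p-value is only approximate, as the gofstat() note about conservativeness also says.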

I'm not a statistician, so I need some reassurance that I did this by the
book. The part that confuses me is this: if the Kolmogorov-Smirnov
statistic reports a difference of 0.06630064, which seems to indicate that
my data fit the beta distribution quite well, why does the bootstrap reject
the hypothesis that both data sets come from the same population? Or did I
misunderstand something? Please do correct me.
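
One way to put the two statistics on a comparable footing is to compare each D with its large-sample 5% critical value, which shrinks as the sample size grows. The sketch below uses the standard asymptotic approximation with the constant 1.358. The helper names ks_crit_one and ks_crit_two are illustrative, not functions from any package, and n = 378 is an assumption taken from the rbeta() call (with the 900 points mentioned earlier the threshold would be smaller).

```r
# Approximate constant for alpha = 0.05 in the asymptotic KS distribution
c_alpha <- 1.358

# Illustrative helpers (hypothetical names, not from any package)
ks_crit_one <- function(n) c_alpha / sqrt(n)                     # one-sample
ks_crit_two <- function(n, m) c_alpha * sqrt((n + m) / (n * m))  # two-sample

n     <- 378         # assumed sample size, taken from rbeta(378, ...)
d_one <- 0.06630064  # statistic reported by gofstat()
d_two <- 0.1323      # statistic reported by ks.boot()

ks_crit_one(n)             # threshold for the one-sample statistic
ks_crit_two(n, n)          # threshold for the two-sample statistic
d_one > ks_crit_one(n)     # FALSE: not rejected at this sample size
d_two > ks_crit_two(n, n)  # TRUE: rejected
ks_crit_one(900)           # threshold if n were 900 instead
```

Under these assumptions the two results do not contradict each other: they are different statistics. The one-sample D measures the distance to a curve fitted to the same data, while the two-sample D also includes the sampling noise of a single simulated reference sample.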

Furthermore, should I be worried that the other two tests were not computed?

Thank you

baxy

--
View this message in context: http://r.789695.n4.nabble.com/How-to-interpret-Kolmogorov-Smirnov-stats-tp3703655p3703655.html
Sent from the R help mailing list archive at Nabble.com.