# [R] How to interpret Kolmogorov-Smirnov stats

Greg Snow Greg.Snow at imail.org
Sat Jul 30 20:23:46 CEST 2011

```
Which package is gofstat in? Can you show us your data, or some details about your data?

Note that the KS test (like all goodness-of-fit tests) is a rule-out test: it can show that the data are unlikely to come from a distribution, but it can never prove that they do come from that distribution.

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of baxy77
Sent: Friday, July 29, 2011 4:44 AM
To: r-help at r-project.org
Subject: [R] How to interpret Kolmogorov-Smirnov stats

Hi,

I have an interpretation problem. What I did was fit the data using:

>fit1 <- fitdist(vectNorm,"beta")

Warning messages:
1: In dbeta(x, shape1, shape2, log) : NaNs produced
2: In dbeta(x, shape1, shape2, log) : NaNs produced
3: In dbeta(x, shape1, shape2, log) : NaNs produced
4: In dbeta(x, shape1, shape2, log) : NaNs produced
5: In dbeta(x, shape1, shape2, log) : NaNs produced
6: In dbeta(x, shape1, shape2, log) : NaNs produced

## Is this a real problem? The input consists of 900 data points, of which 6
## caused this message.

I got the following shape parameters for my distribution:

>summary(fit1)

Fitting of the distribution ' beta ' by maximum likelihood
Parameters :
         estimate Std. Error
shape1   2.148779  0.1458042
shape2 810.067515 61.8608126
Loglikelihood:  1917.51   AIC:  -3831.02   BIC:  -3823.15
Correlation matrix:
          shape1    shape2
shape1 1.0000000 0.8880194
shape2 0.8880194 1.0000000

Now if I run:

>gofstat(fit1, print.test=TRUE)

Kolmogorov-Smirnov statistic:  0.06630064
Kolmogorov-Smirnov test:  not rejected
The result of this test may be too conservative as it
assumes that the distribution parameters are known
Cramer-von Mises statistic:  0.3866663
Crame-von Mises test: not calculated
Anderson-Darling statistic:  2.820576
Anderson-Darling test: not calculated

So then I bootstrapped the data based on the estimated parameters a and b:

r_beta <- rbeta(378, 2.148779, 810.067515, ncp = 0)
ks.boot(vectNorm, r_beta, nboots = 1000, alternative = "two.sided",
        print.level = 0)

and got :
$ks.boot.pvalue
[1] 0.002

$ks

Two-sample Kolmogorov-Smirnov test

data:  Tr and Co
D = 0.1323, p-value = 0.002684
alternative hypothesis: two.sided

$nboots
[1] 1000

attr(,"class")
[1] "ks.boot"

I'm not a stats person, so I need some reassurance that I did this by the
book. The part that confuses me is this: if the Kolmogorov-Smirnov statistic
reports a difference of 0.06630064, which indicates that my data fit the
beta quite well, why does the bootstrap reject the hypothesis that both data
sets come from the same population? Or did I misunderstand something?
Please do correct me.

Furthermore, should I be worried that the other two tests were not computed?

Thank you

baxy

--
View this message in context: http://r.789695.n4.nabble.com/How-to-interpret-Kolmogorov-Smirnov-stats-tp3703655p3703655.html
Sent from the R help mailing list archive at Nabble.com.

```
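The `gofstat` output quoted above warns that the KS result "may be too conservative as it assumes that the distribution parameters are known". One common way to allow for estimated parameters is a parametric bootstrap of the KS statistic, sketched below in base R only. This is an illustration under stated assumptions, not the method used in the thread: `x` is simulated stand-in data (the poster's `vectNorm` was never posted), `fit_beta_mom`, `ks_stat` and `ks_param_boot` are hypothetical helper names, and method-of-moments estimates replace the maximum-likelihood fit to keep the example dependency-free.

```r
set.seed(1)

# Hypothetical stand-in for the poster's `vectNorm` (the real data were not posted).
x <- rbeta(900, shape1 = 2, shape2 = 800)

# Method-of-moments estimates for a beta distribution. This is a simple
# substitute for the maximum-likelihood fit used in the original post.
fit_beta_mom <- function(x) {
  m <- mean(x)
  v <- var(x)
  k <- m * (1 - m) / v - 1
  c(shape1 = m * k, shape2 = (1 - m) * k)
}

# One-sample KS statistic of x against a beta with parameters p.
ks_stat <- function(x, p) {
  unname(ks.test(x, "pbeta",
                 shape1 = p[["shape1"]], shape2 = p[["shape2"]])$statistic)
}

# Parametric bootstrap: simulate from the fitted beta, refit on each
# simulated sample, and record the KS statistic against the refitted beta.
# The p-value is the share of simulated statistics at least as large as
# the observed one.
ks_param_boot <- function(x, nboot = 200) {
  p_hat <- fit_beta_mom(x)
  d_obs <- ks_stat(x, p_hat)
  d_sim <- replicate(nboot, {
    xs <- rbeta(length(x), p_hat[["shape1"]], p_hat[["shape2"]])
    ks_stat(xs, fit_beta_mom(xs))
  })
  list(statistic = d_obs,
       p.value   = (1 + sum(d_sim >= d_obs)) / (nboot + 1),
       estimate  = p_hat)
}

res <- ks_param_boot(x)
print(res)

# Comparing the data with ONE simulated sample of size 378, as in the post,
# gives a different p-value on every draw. Repeating the comparison shows
# how much that p-value moves from draw to draw.
p_hat <- res$estimate
p_two_sample <- replicate(100, {
  ks.test(x, rbeta(378, p_hat[["shape1"]], p_hat[["shape2"]]))$p.value
})
summary(p_two_sample)
```

The second part may help explain why a single two-sample comparison is hard to read: its p-value depends on the particular random sample drawn. The refitting step inside the bootstrap loop is what accounts for the parameters having been estimated from the same data.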