[R] How to interpret Kolmogorov-Smirnov stats

Sat Jul 30 20:23:46 CEST 2011

Which package is gofstat in? Can you show us your data, or some details about your data?

Note that the KS test (like all goodness-of-fit tests) is a rule-out test: it can show that the data are unlikely to come from a given distribution, but it can never prove that they do come from that distribution.
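
The "rule out" point can be illustrated with a small simulation sketch in base R. The setup is hypothetical and not taken from the data in this thread: samples are drawn from a t distribution with 5 degrees of freedom and tested against a standard normal, so the null hypothesis is false in every run.

```r
# Hypothetical illustration: the data are NOT normal (t with 5 df),
# yet a small sample rarely lets the KS test detect that.
set.seed(1)

reject_rate <- function(n, reps = 200) {
  # Fraction of simulated samples for which the KS test
  # rejects N(0, 1) at the 5% level
  mean(replicate(reps, ks.test(rt(n, df = 5), "pnorm")$p.value < 0.05))
}

small <- reject_rate(30)    # few observations
large <- reject_rate(5000)  # many observations
print(c(n30 = small, n5000 = large))
```

With few observations the test should reject only occasionally even though the null is false every time, while with many observations it should reject far more often. A non-rejection therefore means "not ruled out", not "confirmed".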

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of baxy77
Sent: Friday, July 29, 2011 4:44 AM
To: r-help at r-project.org
Subject: [R] How to interpret Kolmogorov-Smirnov stats

Hi,

I have an interpretation problem. I fitted a beta distribution using:

>fit1 <- fitdist(vectNorm,"beta")

Warning messages:
1: In dbeta(x, shape1, shape2, log) : NaNs produced
2: In dbeta(x, shape1, shape2, log) : NaNs produced
3: In dbeta(x, shape1, shape2, log) : NaNs produced
4: In dbeta(x, shape1, shape2, log) : NaNs produced
5: In dbeta(x, shape1, shape2, log) : NaNs produced
6: In dbeta(x, shape1, shape2, log) : NaNs produced

## Is this a real problem? The input consists of 900 data points, of which 6
## caused this message.

I got the following shape parameters for my distribution:

>summary(fit1)

Fitting of the distribution ' beta ' by maximum likelihood 
Parameters : 
         estimate Std. Error
shape1   2.148779  0.1458042
shape2 810.067515 61.8608126
Loglikelihood:  1917.51   AIC:  -3831.02   BIC:  -3823.15 
Correlation matrix:
          shape1    shape2
shape1 1.0000000 0.8880194
shape2 0.8880194 1.0000000

Now if I do:

>gofstat(fit1, print.test=TRUE)

Kolmogorov-Smirnov statistic:  0.06630064 
Kolmogorov-Smirnov test:  not rejected 
   The result of this test may be too conservative as it  
   assumes that the distribution parameters are known
Cramer-von Mises statistic:  0.3866663 
Crame-von Mises test: not calculated 
Anderson-Darling statistic:  2.820576 
Anderson-Darling test: not calculated 

Then I bootstrapped the data based on the estimated parameters a
and b:

r_beta <- rbeta(378, 2.148779, 810.067515, ncp = 0);
ks.boot(vectNorm, r_beta, nboots=1000, alternative = c("two.sided", "less",
"greater"), print.level=0)

and got:
$ks.boot.pvalue
[1] 0.002

$ks

	Two-sample Kolmogorov-Smirnov test

data:  Tr and Co 
D = 0.1323, p-value = 0.002684
alternative hypothesis: two.sided 

$nboots
[1] 1000

attr(,"class")
[1] "ks.boot"

I'm not a stats person, so I need some reassurance that I did this by the
book. The part that confuses me is this: if the Kolmogorov-Smirnov statistic
reports a difference of 0.06630064, which indicates that my data fit the
beta quite well, why does the bootstrap reject the hypothesis that both data
sets come from the same population? Or did I misunderstand something?
Please do correct me.
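
One way to look at this question without relying on a single simulated sample is a parametric bootstrap of the KS statistic. The following is only a sketch under stated assumptions: `x` is simulated stand-in data (vectNorm is not available here), the shape parameters are estimated by the method of moments rather than by fitdist(), and the helper names `beta_mom` and `ks_fitted` are made up for illustration.

```r
# Sketch of a parametric-bootstrap KS test for a fitted beta distribution.
# Assumptions: 'x' is simulated stand-in data, and the parameters are
# estimated by the method of moments, not by fitdist().
set.seed(42)
x <- rbeta(378, 2, 800)  # hypothetical data in (0, 1)

# Method-of-moments estimates of the two beta shape parameters
beta_mom <- function(v) {
  m <- mean(v)
  k <- m * (1 - m) / var(v) - 1
  c(shape1 = m * k, shape2 = (1 - m) * k)
}

# KS distance between a sample and the beta fitted to that same sample
ks_fitted <- function(v) {
  p <- beta_mom(v)
  unname(ks.test(v, "pbeta",
                 shape1 = p[["shape1"]], shape2 = p[["shape2"]])$statistic)
}

d_obs <- ks_fitted(x)
p_hat <- beta_mom(x)

# Simulate from the fitted model, refit each time, recompute the distance
d_sim <- replicate(500, ks_fitted(
  rbeta(length(x), p_hat[["shape1"]], p_hat[["shape2"]])
))

# Bootstrap p-value: share of simulated distances at least as large
# as the observed one
p_boot <- mean(d_sim >= d_obs)
print(c(D = d_obs, p.value = p_boot))
```

Because every simulated sample is refitted, the reference distribution accounts for the parameters having been estimated from the data. Comparing the data with one rbeta() draw, as in the ks.boot call, also mixes in the sampling noise of that single draw.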

Furthermore, should I be worried that the other two tests were not computed?

Thank you

baxy

--
View this message in context: http://r.789695.n4.nabble.com/How-to-interpret-Kolmogorov-Smirnov-stats-tp3703655p3703655.html
Sent from the R help mailing list archive at Nabble.com.