[R] Kolmogorov-Smirnov test
m.marcinmichal
m.marcinmichal at gmail.com
Thu Apr 28 23:53:39 CEST 2011
Hi,
thanks for the response.
>> The Kolmogorov-Smirnov test is designed for distributions on continuous
>> variable, not discrete like the poisson. That is why you are getting
>> some of your warnings.
I read in "Fitting distributions with R" by Vito Ricci, page 19, that: "...
Kolmogorov-Smirnov test is used to decide if a sample comes from a
population with a specific distribution. It can be applied both for discrete
(count) data and continuous binned (even if some Authors do not agree on
this point) and both for continuous variables", but on page 16 I read that
"... while the Kolmogorov-Smirnov and Anderson-Darling tests are restricted
to continuous distribution". I was a little confused, but I tried this test on
my discrete data anyway.
Generally, as a first step, I try to fit my data to a discrete or continuous
distribution (task: find a distribution for the empirical data). Question: can I
approximate my discrete data by a continuous distribution? I know that
sometimes the Poisson distribution can be approximated by the normal distribution.
But what happens if I use another distribution, like the log-normal or gamma?
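For the Poisson/normal case mentioned above, one way to see how good the approximation is would be to compare the two cumulative distribution functions directly. The following is only a minimal sketch in base R: the helper name max_cdf_gap and the two lambda values are illustrative assumptions, not part of the original analysis, and it says nothing about the log-normal or gamma case.

```r
# Largest gap between the Poisson CDF and a continuity-corrected normal CDF
# with the same mean and variance (both equal to lambda).
# max_cdf_gap is a hypothetical helper written for this illustration.
max_cdf_gap <- function(lambda, k_max = qpois(1 - 1e-9, lambda)) {
  k <- 0:k_max
  max(abs(ppois(k, lambda) - pnorm(k + 0.5, mean = lambda, sd = sqrt(lambda))))
}

gap_small <- max_cdf_gap(2)    # small mean: the approximation is rough
gap_large <- max_cdf_gap(200)  # large mean: the approximation is much closer
print(c(lambda_2 = gap_small, lambda_200 = gap_large))
```

The gap shrinks as lambda grows, which suggests that whether a continuous approximation is acceptable depends on the size of the counts in the data.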
I ran three more tests, all chi-square tests, but they return three
different results. Suppose that we have the same data, i.e. vectorSentence.
Test:
1. One
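library(MASS)  # provides fitdistr()
param <- fitdistr(vectorSentence, "poisson")
# dpois(1:9, ...) assumes the observed values are exactly 1, 2, ..., 9
chisq.test(table(vectorSentence), p = dpois(1:9, lambda = param[[1]][1]),
           rescale.p = TRUE)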
X-squared = 272.8958, df = 8, p-value < 2.2e-16
2. Two
library(vcd)
gf <- goodfit(vectorSentence, type="poisson", method="MinChisq")
summary(gf)
             X^2 df     P(> X^2)
Pearson 404.3607  8 2.186332e-82
3. Three
library(fitdistrplus)  # provides fitdist() and gofstat()
fdistc <- fitdist(vectorSentence, "pois")
g <- gofstat(fdistc, print.test = TRUE)
Chi-squared statistic: 535.344
Degree of freedom of the Chi-squared distribution: 8
Chi-squared p-value: 1.824112e-110
Question: which result is correct?
I know that in all three cases I can reject the null hypothesis, i.e. conclude
that the data do not come from a Poisson distribution. But which of the three
results is the right one?
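One possible source of such differences is how the cells and their expected probabilities are defined. In the calls above, the first test rescales dpois() over the observed values, while goodfit() is asked for method="MinChisq" rather than a maximum likelihood fit. The sketch below does not reproduce what each package does internally; it only illustrates, in base R and on simulated data standing in for vectorSentence (an assumption), that two reasonable cell definitions already give different Pearson statistics for the same sample.

```r
set.seed(1)
# Hypothetical stand-in for vectorSentence: counts with values from 1 upwards.
x <- rpois(500, lambda = 3) + 1

lambda_hat <- mean(x)  # ML estimate of the Poisson mean
obs <- table(factor(x, levels = min(x):max(x)))
k <- as.integer(names(obs))

# Variant 1: Poisson probabilities on the observed values only,
# rescaled to sum to 1 (what rescale.p = TRUE does).
p_rescaled <- dpois(k, lambda_hat) / sum(dpois(k, lambda_hat))

# Variant 2: same cells, but the first and last cells absorb the
# lower and upper tails of the Poisson distribution.
p_tails <- dpois(k, lambda_hat)
p_tails[1] <- ppois(k[1], lambda_hat)
p_tails[length(k)] <- 1 - ppois(k[length(k)] - 1, lambda_hat)

# Pearson chi-square statistic computed by hand.
pearson <- function(observed, p) {
  expected <- sum(observed) * p
  sum((observed - expected)^2 / expected)
}

stat_rescaled <- pearson(as.numeric(obs), p_rescaled)
stat_tails <- pearson(as.numeric(obs), p_tails)
print(c(rescaled = stat_rescaled, tails = stat_tails))
```

Comparing how each function builds its cells, estimates lambda and counts degrees of freedom would be the next step in deciding which of the reported statistics to trust.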
On the other hand, I am trying to solve another problem:
1. Suppose that we have reference data (dr) from some process (pr), which is
saved in vectorSentence.
2. Suppose that we have two other data samples, d1 and d2, from two other
processes, p1 and p2.
3. We know that all the data are discrete.
Task:
One: check whether the data d1, d2 are equal to the reference data (dr). This is
not a problem: I use the cdf, a histogram, other measures etc. and the chi-square
test. But can I use Kolmogorov-Smirnov to test the cumulative distribution
function hypothesis, i.e. F(d1) = F(dr), for my data?
Two: find the distribution of dr, discrete or, if possible, continuous.
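For task One, a permutation version of the two-sample Kolmogorov-Smirnov comparison is one option that does not rely on the continuous-data null distribution when the samples contain ties. This is only a sketch under stated assumptions: dr and d1 below are simulated stand-ins for the real data, and ks_stat and perm_pvalue are hypothetical helpers written for this illustration, not functions from any package.

```r
set.seed(2)
# Hypothetical stand-ins for the reference data dr and one sample d1.
dr <- rpois(300, lambda = 4)
d1 <- rpois(300, lambda = 4)

# Two-sample KS statistic: largest gap between the two empirical CDFs,
# evaluated at every distinct observed value.
ks_stat <- function(a, b) {
  grid <- sort(unique(c(a, b)))
  max(abs(ecdf(a)(grid) - ecdf(b)(grid)))
}

# Permutation p-value: reshuffle the pooled data, recompute the statistic,
# and count how often it is at least as large as the observed one.
perm_pvalue <- function(a, b, n_perm = 2000) {
  observed <- ks_stat(a, b)
  pooled <- c(a, b)
  n_a <- length(a)
  perm <- replicate(n_perm, {
    idx <- sample(length(pooled), n_a)
    ks_stat(pooled[idx], pooled[-idx])
  })
  (1 + sum(perm >= observed)) / (1 + n_perm)
}

p_val <- perm_pvalue(dr, d1)
print(p_val)
```

The same call with d2 in place of d1 would cover the second sample. A small p-value would speak against F(d1) = F(dr).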
Best
Marcin M.