[R] Problems with ks.test()
Peter Ehlers
ehlers at ucalgary.ca
Mon Aug 1 10:25:29 CEST 2011
(I'm replying to your original post because your follow-up omits the
context.)
The K-S test is designed for continuous distributions. You have far
too many zeros in your data to get anything reasonable out of the
test. For your data, the K-S statistic is the difference in the
(e)cdfs at zero. Your results just show that this can be sensitive
to the degree of rounding used for the theoretical cdf.
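
A minimal sketch of this point, using the bin heights quoted below (the names obs, fit and pgumbel are mine, not from the original post). It compares the proportion of exact zeros in each vector, which is the value of each ecdf at zero, with the statistic that ks.test() reports. The second part is only an illustration of the usual one-sample form of the test: the raw observations were not posted, so it simulates data from a Gumbel distribution with parameters rounded from the posted estimates.

```r
# Bin heights as posted: observed counts, and fitted Gumbel values (3 decimals).
obs <- c(0, 0, 0, 0, 0, 24, 74, 98, 133, 147, 134, 120, 89, 69, 46, 31, 16, 7,
         7, 3, 2, rep(0, 79))
fit <- c(0.000, 0.000, 0.000, 0.006, 0.622, 10.094, 49.271, 112.776, 160.174,
         168.419, 146.527, 113.137, 81.026, 55.344, 36.690, 23.870, 15.347, 9.793,
         6.220, 3.939, 2.490, 1.572, 0.992, 0.625, 0.394, 0.248, 0.157,
         0.099, 0.062, 0.039, 0.025, 0.016, 0.010, 0.006, 0.004, 0.002,
         0.002, 0.001, 0.001, rep(0, 61))
stopifnot(length(obs) == 100, length(fit) == 100)

# Proportion of exact zeros in each vector, i.e. each ecdf evaluated at 0.
p_obs  <- mean(obs == 0)         # 84 of 100 values are zero
p_fit0 <- mean(round(fit) == 0)  # 80 of 100 after rounding to 0 digits
p_fit3 <- mean(fit == 0)         # 64 of 100 at 3 decimals

# ks.test() may warn about ties here; suppressed only to keep the output short.
d0 <- unname(suppressWarnings(ks.test(obs, round(fit)))$statistic)
d3 <- unname(suppressWarnings(ks.test(obs, fit))$statistic)

cat("ecdf gap at zero, 0 digits:", p_obs - p_fit0, " D:", d0, "\n")
cat("ecdf gap at zero, 3 digits:", p_obs - p_fit3, " D:", d3, "\n")

# Illustration only: one-sample K-S test on raw (here simulated) observations.
# pgumbel() is defined by hand because base R has no Gumbel cdf.
pgumbel <- function(q, mu, beta) exp(-exp(-(q - mu) / beta))
set.seed(1)
mu   <- 0.092    # rounded from the posted estimate, an assumption
beta <- 0.0216   # rounded from the posted estimate, an assumption
x    <- mu - beta * log(-log(runif(1000)))  # inverse-cdf sampling
res  <- ks.test(x, pgumbel, mu = mu, beta = beta)
print(res)
```

Note that if mu and beta are estimated from the same data that are then tested, the p-value from ks.test() is no longer exact and tends to be too large.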
Peter Ehlers
On 2011-07-29 02:07, Jochen1980 wrote:
> Hi,
>
> I have two vectors of data points and want to run ks.test() on them. If you
> plot both vectors you will see that they fit quite well. Here is a picture:
> http://www.jochen-bauer.net/downloads/kstest-r-help-list-plot.png
>
> As you can see, there is a histogram with the Gumbel density function plotted
> over it. I took the bin midpoints and the bin heights for vector1, and
> computed the values of the distribution at all bin midpoints as vector2.
>
> I pass these two vectors to ks.test(). Are those the right vectors if I want
> to decide afterwards whether my experimental data is Gumbel-distributed?
>
> Surprisingly, the p-value changes tremendously if I compute more digits from
> my theoretical formula. If I round to 0 digits, p is 1; if I round to 4 digits,
> p drops to 0. How can this happen? I thought more digits would give more
> accurate results.
>
> XXXX Case 0 digits: XXXXXXXXXXXXXXXXXXXXXXXXXXX
> [1] 0 0 0 0 0 24 74 98 133 147 134 120 89 69 46 31 16 7
> [19] 7 3 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [37] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [55] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [73] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [91] 0 0 0 0 0 0 0 0 0 0
> [1] 0 0 0 0 1 10 49 113 160 168 147 113 81 55 37 24 15 10
> [19] 6 4 2 2 1 1 0 0 0 0 0 0 0 0 0 0 0 0
> [37] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [55] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [73] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [91] 0 0 0 0 0 0 0 0 0 0
> [1] "Results"
> [1] "Analysis of the input data"
> [1] "Mean: 0.104537195"
> [1] "Std. dev.: 0.0277657985898433"
> [1] "Parameter estimation for the data, assuming a Gumbel distribution"
> [1] "Mu: 0.0920411082987717"
> [1] "Beta: 0.0216489043196013"
> [1] "KS test -> 1000 values, 100 bins, x: bin midpoints, y1, y2 = histogram heights"
> [1] "KST D: 0.04"
> [1] "KST P: 1"
>
> XXX Case 4 digits: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> [1] 0 0 0 0 0 24 74 98 133 147 134 120 89 69 46 31 16 7
> [19] 7 3 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [37] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [55] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [73] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [91] 0 0 0 0 0 0 0 0 0 0
> [1] 0.000 0.000 0.000 0.006 0.622 10.094 49.271 112.776 160.174
> [10] 168.419 146.527 113.137 81.026 55.344 36.690 23.870 15.347 9.793
> [19] 6.220 3.939 2.490 1.572 0.992 0.625 0.394 0.248 0.157
> [28] 0.099 0.062 0.039 0.025 0.016 0.010 0.006 0.004 0.002
> [37] 0.002 0.001 0.001 0.000 0.000 0.000 0.000 0.000 0.000
> [46] 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
> [55] 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
> [64] 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
> [73] 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
> [82] 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
> [91] 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
> [100] 0.000
> [1] "Results"
> [1] "Analysis of the input data"
> [1] "Mean: 0.104537195"
> [1] "Std. dev.: 0.0277657985898433"
> [1] "Parameter estimation for the data, assuming a Gumbel distribution"
> [1] "Mu: 0.0920411082987717"
> [1] "Beta: 0.0216489043196013"
> [1] "KS test -> 1000 values, 100 bins, x: bin midpoints, y1, y2 = histogram heights"
> [1] "KST D: 0.2"
> [1] "KST P: 0.0366"
>
> Thanks in advance for some help.
> Jochen