[R] ks.test troubles

Uwe Ligges ligges at statistik.tu-dortmund.de
Sat Mar 8 21:55:07 CET 2008



Martin Kaffanke wrote:
> On Saturday, 08.03.2008, at 19:34 +0100, Uwe Ligges wrote:
>> Martin Kaffanke wrote:
>>> Hi there!
>>>
>>> I have two slightly different data sets.  One comes from a computer test
>>> on people, the other from a paper-and-pencil test.  Two boxplots show me
>>> that the data are almost the same.
>>>
>>> So now I'd like to know if I could handle all data as one, by testing
>>> with ks.test:
>>>
>>> ====
>>>> ks.test(el$angststoer, fl$angststoer)
>>>         Two-sample Kolmogorov-Smirnov test
>>>
>>> data:  el$angststoer and fl$angststoer 
>>> D = 0.1413, p-value = 0.9112
>>> alternative hypothesis: two-sided 
>>>
>>> Warning message:
>>> In ks.test(el$angststoer, fl$angststoer) :
>>>   cannot compute correct p-values with ties
>>> ====
>>>
>>> Ok, so how can I get the p-value?
>>
>> You already got it, it is 0.9112, but since you have ties in your data, 
>> R warns you about it (it's not an error, just a warning). And indeed, 
>> you have some ties in fl$angststoer.
> 
> Thanks,
> But what exactly are ties? 

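Ties (German: Bindungen) are values that occur more than once in the 
data. The theory behind the test assumes a continuous distribution, 
under which ties do not occur, hence R warns. But with p = 0.9, I would 
not care about that at all. It might matter with p close to alpha.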
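
To see where the ties are, something along these lines should do. It is 
only a sketch: x holds the first six values of fl$angststoer as posted, 
and the names counts, tied_values and has_ties are made up for the 
illustration.

```r
# First six values of fl$angststoer as posted; -0.30103 occurs twice.
x <- c(1.22184871, -0.30103000, 1.00000000,
       -1.30103000, 0.69897000, -0.30103000)

# Count how often each distinct value occurs and keep the repeated ones.
counts <- table(x)
tied_values <- counts[counts > 1]
print(tied_values)

# anyDuplicated() returns 0 only if no value is repeated.
has_ties <- anyDuplicated(x) > 0
print(has_ties)
```

Run on the full vector, the same two calls list every tied value.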

In fact, I also would not "handle all data as one", but simply add a 
factor variable that distinguishes computer from pencil tests so you can 
inspect it in further analyses...
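
A rough sketch of what I mean, under some assumptions: el_scores and 
fl_scores are short made-up stand-ins for el$angststoer and 
fl$angststoer, and since the thread does not say which of the two is 
the computer test, the factor levels are simply the object names.

```r
# Hypothetical stand-ins for el$angststoer and fl$angststoer.
el_scores <- c(-2.24, -2.86, -0.50, -2.40)
fl_scores <- c(1.22, -0.30, 1.00, -1.30, 0.70)

# Stack both samples and record which sample each value came from.
all_scores <- data.frame(
  angststoer = c(el_scores, fl_scores),
  source     = factor(rep(c("el", "fl"),
                          times = c(length(el_scores), length(fl_scores))))
)

# The factor can then be used for grouping in later analyses.
print(table(all_scores$source))
print(tapply(all_scores$angststoer, all_scores$source, median))
```

This keeps the information about the test mode in the data, so a 
difference between the two modes can still be checked later.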

Uwe Ligges



> And how can I interpret the answers?  (I'm not sure about the German
> translation, and the translations don't give me a hint.)
> 
> Thanks,
> Martin
> 
>> Uwe Ligges
>>
>>
>>
>>> I tried two tests:
>>> ====
>>>> ks.test(fl$angststoer, "dnorm")
>>>         One-sample Kolmogorov-Smirnov test
>>>
>>> data:  fl$angststoer 
>>> D = 0.8109, p-value < 2.2e-16
>>> alternative hypothesis: two-sided 
>>>
>>> Warning message:
>>> In ks.test(fl$angststoer, "dnorm") :
>>>   cannot compute correct p-values with ties
>>> ====
>>>
>>> So I see that the warning in the first test comes from fl$angststoer.
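
A note on the one-sample call quoted above: ks.test compares the data 
with a cumulative distribution function, so the second argument should 
name the CDF ("pnorm"), not the density ("dnorm"). A minimal sketch, 
with a simulated sample z standing in for the real data and an 
arbitrary seed:

```r
set.seed(1)        # arbitrary seed, for reproducibility only
z <- rnorm(50)     # hypothetical sample, not the posted data

# One-sample test against the standard normal: pass the CDF, pnorm.
res <- ks.test(z, "pnorm")
print(res$p.value)

# Parameters of the CDF can be passed on as further arguments.
res2 <- ks.test(z, "pnorm", mean = 0, sd = 1)
print(res2$p.value)
```

Both calls test against the same distribution, since mean = 0 and 
sd = 1 are the defaults of pnorm.
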
>>>
>>> Then I looked at the two vectors:
>>>
>>> ====
>>>> fl$angststoer
>>>  [1]  1.22184871 -0.30103000  1.00000000 -1.30103000  0.69897000 -0.30103000
>>>  [7] -2.30103000 -1.00000000 -2.00000000  0.22184832 -1.77819468 -0.30103000
>>> [13] -2.00000000 -0.30103000 -0.30103000  0.22184832 -0.90308999 -1.14611935
>>> [19] -1.30103000 -3.20411998 -0.60205999 -2.25531594 -3.60205999 -1.30103000
>>> [25] -2.30103000 -0.07918038 -2.14599777  0.74472745 -3.30103000 -0.30103000
>>> [31] -0.30103000 -4.30103000 -0.60205999 -0.14612847 -1.30103000 -1.30103000
>>> [37]  0.00000000 -0.17609234 -0.47711908 -1.77819468 -1.00000000 -1.20411998
>>> [43] -0.07918038 -2.00000000 -2.00000000 -1.30103000
>>>> el$angststoer
>>>  [1] -2.2407100 -2.8601209 -0.5005659 -2.4007721 -0.3474336 -2.6653452
>>>  [7]  0.6548865 -1.6281751 -1.2940679 -0.1316566 -1.4541612 -1.6560206
>>> [13] -0.7441850  0.8219399  0.1746081 -1.2314248 -3.8910969  0.1328448
>>> [19] -1.8439508 -0.8833972 -0.4936052 -0.1664593 -0.8694749 -2.8253588
>>> ====
>>>
>>> Doesn't seem to be a problem?
>>>
>>> What can I do to get a correct computation?
>>>
>>> Thanks,
>>> Martin
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.


