[R] cor.test() -> p-values may be incorrect due to ties
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Fri Mar 19 17:30:49 CET 2004
Jan Verbesselt <Jan.Verbesselt at agr.kuleuven.ac.be> writes:
> Hi R specialists,
>
> When testing the association between two time series, cor.test gives
> the following message: p-values may be incorrect due to ties
>
> What does it mean? (it is not described in the help)
It means what it says. The p-value in the test for rho=0 is based
on the assumption that the ranks are 1:n for both variables. In the
presence of ties (multiple x or y having the same value) we calculate
a modified rho, but we still use the same formula for the p-value.
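
To make the point about ties concrete, here is a minimal sketch in R.
The vectors x and y are made-up illustration data (not the poster's),
and the sketch assumes tied values get average (mid-) ranks, which is
the default behaviour of rank().

```r
# Hypothetical data with ties in both variables (illustration only).
x <- c(1, 2, 2, 3, 5, 5, 5, 8)
y <- c(2, 1, 4, 4, 6, 7, 7, 9)
n <- length(x)

# With ties, the ranks are no longer a permutation of 1:n.
rx <- rank(x)   # 1.0 2.5 2.5 4.0 6.0 6.0 6.0 8.0
ry <- rank(y)

# "Modified" rho: Pearson correlation of the mid-ranks.
rho_tied <- cor(rx, ry)

# cor() with method = "spearman" gives the same value.
rho_builtin <- cor(x, y, method = "spearman")

# Textbook formula, which is derived assuming the ranks are exactly 1:n.
rho_classic <- 1 - 6 * sum((rx - ry)^2) / (n^3 - n)

print(c(tied = rho_tied, builtin = rho_builtin, classic = rho_classic))
```

Without ties, the textbook formula and the correlation of the ranks
agree exactly; with ties they can differ, which is why a p-value
formula built on ranks 1:n is no longer strictly valid.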
There are really two issues. There is a nice theory that allows you to
calculate the exact p-value when ties are absent; this becomes much
harder when there are ties. However, there is also an asymptotic
approximation to a normal distribution, which I believe would
actually be rather easy to compute in the tied case, but we don't do
that either.
I don't think you have anything to worry about with the example you
provide, though.
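
An asymptotic p-value of the kind mentioned above can be sketched by
hand. Note that this sketch uses a t approximation with n - 2 degrees
of freedom rather than the normal approximation, and the data are
again made up for illustration. As far as I know, more recent versions
of R do the same computation when cor.test() is called with
exact = FALSE; that may not hold for the version discussed in this
thread.

```r
# Hypothetical data with ties (illustration only).
x <- c(1, 2, 2, 3, 5, 5, 5, 8)
y <- c(2, 1, 4, 4, 6, 7, 7, 9)
n <- length(x)

# Spearman's rho computed from mid-ranks, so ties are handled.
r <- cor(rank(x), rank(y))

# Asymptotic t statistic with n - 2 degrees of freedom.
t_stat <- r * sqrt((n - 2) / (1 - r^2))

# Two-sided p-value.
p_manual <- 2 * pt(-abs(t_stat), df = n - 2)

# Same test via cor.test(), asking explicitly for the approximation.
p_builtin <- cor.test(x, y, method = "spearman", exact = FALSE)$p.value

print(c(manual = p_manual, builtin = p_builtin))
```

With n as small as 8 this is only a rough approximation; it becomes
more trustworthy as the number of observations grows.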
> > cor.test(Origi[,1],Origi[,2], alternative = c("two.sided"),method =
> c("spearman"), conf.level = 0.95)
>
> Spearman's rank correlation rho
>
> data: Origi[, 1] and Origi[, 2]
> S = 101457, p-value = < 2.2e-16
> alternative hypothesis: true rho is not equal to 0
> sample estimates:
> rho
> 0.8938577
>
> Warning message:
> p-values may be incorrect due to ties in: cor.test.default(Origi[, 1],
> Origi[, 2], alternative = c("two.sided"),
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907