[R] spearman rank correlation problem
William T Morgan
wmorgan at mitre.org
Mon Mar 15 22:37:08 CET 2004
Hello R gurus,
I want to calculate the Spearman rho between two ranked lists. The
results I get from cor.test differ from those of my own Spearman
function:
> my.spearman
function(l1, l2) {
    if(length(l1) != length(l2)) stop("lists must have same length")
    r1 <- rank(l1)
    r2 <- rank(l2)
    dsq <- sapply(r1-r2,function(x) x^2)
    1 - ((6 * sum(dsq)) / (length(l1) * (length(l1)^2 - 1)))
}
Perhaps I'm doing something wrong in that code, but it's a pretty
straightforward calculation, so it's hard to see what, especially with
rank() handling the ties correctly. One example of the difference:
> a
[1] 0.112761940 0.130260949 -0.010567817 -0.411906701 0.004588443
[6] -0.034337846 -0.148082981 -0.243724351 0.186690390 0.408983820
> b
[1] 8 13 14 15 5 7 8 2 19 19
> cor.test(a,b,method="spearman")
        Spearman's rank correlation rho

data:  a and b
S = 85, p-value = 0.1544
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho
0.4878139
Warning message:
p-values may be incorrect due to ties in: cor.test.default(a, b, method = "spearman")
> my.spearman(a,b)
[1] 0.4909091
Which, as you can see, isn't quite the same. And also:
> c
[1] 0 0 0 0 0 0 0 0 0 0
> cor.test(a,c,method="spearman")
        Spearman's rank correlation rho

data:  a and c
S = NA, p-value = NA
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
 NA
Warning message:
The standard deviation is zero in: cor(x, y, na.method, method == "kendall")
> my.spearman(a,c)
[1] 0.5
Any suggestions as to what I'm doing wrong?
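One possible explanation, offered as a sketch rather than a confirmed diagnosis: the shortcut formula 1 - 6*sum(d^2)/(n*(n^2 - 1)) is exact only when neither list contains ties, whereas Spearman's rho is in general the Pearson correlation of the ranks. Both b (8 and 19 each appear twice) and the all-zero list contain ties. The names shortcut.rho, rank.rho and z below are made up for this illustration; the data are copied from the transcripts above.

```r
# Data copied from the transcripts above.
a <- c(0.112761940, 0.130260949, -0.010567817, -0.411906701, 0.004588443,
       -0.034337846, -0.148082981, -0.243724351, 0.186690390, 0.408983820)
b <- c(8, 13, 14, 15, 5, 7, 8, 2, 19, 19)
z <- rep(0, 10)   # the all-zero list; called z here rather than reusing the name c

# Shortcut formula, as in my.spearman; exact only when there are no ties.
shortcut.rho <- function(l1, l2) {
    n <- length(l1)
    d <- rank(l1) - rank(l2)
    1 - 6 * sum(d^2) / (n * (n^2 - 1))
}

# Tie-aware version: Pearson correlation of the (mid)ranks.
rank.rho <- function(l1, l2) cor(rank(l1), rank(l2))

shortcut.rho(a, b)     # 0.4909091, same as my.spearman(a, b)
rank.rho(a, b)         # 0.4878139, same as the rho reported by cor.test
anyDuplicated(b) > 0   # TRUE: b contains ties

# Every element of z gets rank 5.5, so the ranks have zero variance and the
# correlation is undefined, yet the shortcut formula still returns a number.
shortcut.rho(a, z)     # 0.5
sd(rank(z))            # 0
```

If this reading is right, the two methods should agree whenever both lists are free of ties, and the NA for the all-zero list reflects a correlation that is genuinely undefined rather than a failure in cor.test.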
Thanks in advance,
--
William Morgan
wmorgan at mitre dot org