[Rd] bug in cor.test(method = "spearman")
savicky at cs.cas.cz
Sat Jan 17 16:19:50 CET 2009
Dear R developers:
There is a possible bug in calculating the p-value
for Spearman's rank correlation.
Line 155 in file
R-patched/src/library/stats/R/cor.test.R
is
as.double(round(q) + lower.tail),
I think it should be
as.double(round(q) + 2*lower.tail),
The reason is that round(q) is expected to be an even number
(the S statistic), so the next feasible value is round(q)+2.
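
The parity claim can be checked numerically. The following is a minimal sketch, assuming no ties, so that S is the sum of squared rank differences; the seed, the number of replications and the variable names are arbitrary choices made for illustration and do not come from cor.test.R.

```r
# Sketch: without ties, S = sum((rank(x) - rank(y))^2) is always even.
set.seed(1)                    # arbitrary seed, for reproducibility only
n <- 17
S <- replicate(1000, {
  y <- sample(n)               # random permutation of 1..n, hence no ties
  sum((seq_len(n) - y)^2)      # Spearman's S for x = 1:n
})
all(S %% 2 == 0)               # TRUE: only even values occur
```

The rank differences sum to zero and each squared difference has the same parity as the difference itself, so S is even for every permutation, not only for the sampled ones.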
To demonstrate the effect, consider the code
x <- 1:17
y <- c(8:12,1:7,13:17)
out1 <- cor.test(x,y,method="spearman")
out2 <- cor.test(x,rev(y),method="spearman")
rbind(
c(out1$statistic, out1$estimate, p.val=out1$p.value),
c(out2$statistic, out2$estimate, p.val=out2$p.value))
Output in R version 2.8.1 Patched (2009-01-16 r47630) is
        S        rho      p.val
[1,]  420  0.4852941 0.04968029
[2,] 1212 -0.4852941 0.05032193
After correction
        S        rho      p.val
[1,]  420  0.4852941 0.05032193
[2,] 1212 -0.4852941 0.05032193
Since 420 and 1212 are symmetric around n*(n^2-1)/6 = 816, the two
values of rho have opposite signs (this is correct) and the two
p-values should be equal (before the correction they are not).
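
The symmetry of the two statistics can be verified directly, without calling cor.test. This is a small sketch that recomputes S from the ranks for the data above; the names S1, S2 and center are introduced here for illustration only.

```r
# Sketch: recompute S for both orderings and check symmetry around n*(n^2-1)/6.
x <- 1:17
y <- c(8:12, 1:7, 13:17)
n <- length(x)
S1 <- sum((rank(x) - rank(y))^2)        # 420
S2 <- sum((rank(x) - rank(rev(y)))^2)   # 1212
center <- n * (n^2 - 1) / 6             # 816
c(S1 = S1, S2 = S2, center = center)
S1 + S2 == 2 * center                   # TRUE
```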
The exact p-value for n = 17 and S = 420 to 7-digit precision
is 0.05030371. Hence, the corrected value not only satisfies the
symmetry, but is indeed more accurate.
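
For small n the exact null distribution of S can be enumerated, which makes both the symmetry and the step of 2 visible. The sketch below is a brute-force illustration only and is not the algorithm used by cor.test; the helper perms and the values n = 7 and s = 20 are hypothetical choices (n = 17 would require 17! permutations and is not feasible this way).

```r
# Sketch: exact distribution of S by enumerating all permutations (small n only).
perms <- function(v) {
  # all permutations of v, one per row (hypothetical helper, base R only)
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  do.call(rbind, lapply(seq_along(v), function(i) cbind(v[i], perms(v[-i]))))
}
n <- 7                                    # 7! = 5040 permutations
P <- perms(seq_len(n))
S <- apply(P, 1, function(p) sum((p - seq_len(n))^2))
center <- n * (n^2 - 1) / 6               # 56 for n = 7
s <- 20                                   # an even value below the center
lower <- mean(S <= s)                     # exact P(S <= s)
upper <- mean(S >= 2 * center - s)        # exact P(S >= mirrored value)
lower == upper                            # TRUE: the two tails agree
mean(S <= s + 1) == lower                 # TRUE: no mass at the odd value s + 1
```

Since no odd value of S occurs, the next value that changes a tail probability is s + 2, which is the reasoning behind adding 2*lower.tail rather than lower.tail.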
All the best, Petr.