[Rd] bug in cor.test(method = "spearman")

savicky at cs.cas.cz savicky at cs.cas.cz
Sat Jan 17 16:19:50 CET 2009


Dear R developers:

There is a possible bug in calculating the p-value
for Spearman's rank correlation.

Line 155 in file
  R-patched/src/library/stats/R/cor.test.R
is
     as.double(round(q) + lower.tail),
I think it should be
     as.double(round(q) + 2*lower.tail),

The reason is that round(q) is expected to be an even number
(the S statistic), so the next feasible value is round(q)+2.
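
The parity claim can be checked empirically. The following is only a
sketch, assuming no ties; the seed and the number of random permutations
are arbitrary choices made for this illustration. It relies on the fact
that the rank differences d sum to zero and that d^2 has the same parity
as d, so S = sum(d^2) must be even.

```r
# Sketch: S = sum of squared rank differences is even when there are no ties.
set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 17
S <- replicate(1000, {           # 1000 random permutations (arbitrary count)
  d <- seq_len(n) - sample(n)    # rank differences for a random permutation
  sum(d^2)
})
all(S %% 2 == 0)                 # TRUE: no odd value of S occurs
```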

To demonstrate the effect, consider the code

  x <- 1:17
  y <- c(8:12,1:7,13:17)
  out1 <- cor.test(x,y,method="spearman")
  out2 <- cor.test(x,rev(y),method="spearman")
  rbind(
    c(out1$statistic, out1$estimate, p.val=out1$p.value),
    c(out2$statistic, out2$estimate, p.val=out2$p.value))

Output in R version 2.8.1 Patched (2009-01-16 r47630) is

          S        rho      p.val
  [1,]  420  0.4852941 0.04968029
  [2,] 1212 -0.4852941 0.05032193

After the correction, the output is

          S        rho      p.val
  [1,]  420  0.4852941 0.05032193
  [2,] 1212 -0.4852941 0.05032193

Since 420 and 1212 are symmetric around n*(n^2-1)/6 = 816, the two
values of rho have opposite signs (this is correct in both outputs) and
the two p-values should be equal (before the correction, they are not).
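
The symmetry of the two S values can be verified directly, without
calling cor.test(). This is a minimal sketch using the same x and y as
above; the names S1, S2 and center are introduced here only for
illustration.

```r
# Sketch: the two S statistics are mirror images around n*(n^2-1)/6.
x <- 1:17
y <- c(8:12, 1:7, 13:17)
n <- length(x)
S1 <- sum((rank(x) - rank(y))^2)        # 420
S2 <- sum((rank(x) - rank(rev(y)))^2)   # 1212
center <- n * (n^2 - 1) / 6             # 816
S1 + S2 == 2 * center                   # TRUE
```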

The exact p-value for n = 17 and S = 420, to 7 significant digits,
is 0.05030371. Hence, the corrected value not only satisfies the
symmetry, but is also more accurate.

All the best, Petr.


