[Rd] kendall tau correlation test for ties: Potential error (PR#8076)
dkoschuetzki@gmx.de
dkoschuetzki at gmx.de
Thu Aug 18 13:43:29 CEST 2005
Full_Name: Dirk Koschuetzki
Version: 2.1.1
OS: source code
Submission from: (NULL) (194.94.136.34)
Hello,
>From the source code (R-2.1.1, file: .../R-2.1.1/src/library/stats/R/)
******************************
cor.test.default <-
function(x, y, alternative = c("two.sided", "less", "greater"),
method = c("pearson", "kendall", "spearman"), exact = NULL,
conf.level = 0.95, ...)
{
alternative <- match.arg(alternative)
method <- match.arg(method)
DNAME <- paste(deparse(substitute(x)), "and", deparse(substitute(y)))
if(length(x) != length(y))
stop("'x' and 'y' must have the same length")
OK <- complete.cases(x, y)
x <- x[OK]
y <- y[OK]
n <- length(x)
PVAL <- NULL
NVAL <- 0
conf.int <- FALSE
if(method == "pearson") {
// Omitted
}
else {
if(n < 2)
stop("not enough finite observations")
PARAMETER <- NULL
TIES <- (min(length(unique(x)), length(unique(y))) < n)
if(method == "kendall") {
method <- "Kendall's rank correlation tau"
names(NVAL) <- "tau"
r <- cor(x,y, method = "kendall")
ESTIMATE <- c(tau = r)
if(!is.finite(ESTIMATE)) { # all x or all y the same
ESTIMATE[] <- NA
STATISTIC <- c(T = NA)
PVAL <- NA
}
else {
if(is.null(exact))
exact <- (n < 50)
if(exact && !TIES) {
q <- round((r + 1) * n * (n - 1) / 4)
pkendall <- function(q, n) {
.C("pkendall",
length(q),
p = as.double(q),
as.integer(n),
PACKAGE = "stats")$p
}
PVAL <-
switch(alternative,
"two.sided" = {
if(q > n * (n - 1) / 4)
p <- 1 - pkendall(q - 1, n)
else
p <- pkendall(q, n)
min(2 * p, 1)
},
"greater" = 1 - pkendall(q - 1, n),
"less" = pkendall(q, n))
STATISTIC <- c(T = q)
} else {
STATISTIC <- c(z = r / sqrt((4 * n + 10) / (9 * n*(n-1))))
p <- pnorm(STATISTIC)
if(exact && TIES)
warning("Cannot compute exact p-value with ties")
}
}
} else {
// OMITTED
}
}
if(is.null(PVAL)) # for "pearson" only, currently
PVAL <- switch(alternative,
"less" = p,
"greater" = 1 - p,
"two.sided" = 2 * min(p, 1 - p))
RVAL <- list(statistic = STATISTIC,
parameter = PARAMETER,
p.value = as.numeric(PVAL),
estimate = ESTIMATE,
null.value = NVAL,
alternative = alternative,
method = method,
data.name = DNAME)
if(conf.int)
RVAL <- c(RVAL, list(conf.int = cint))
class(RVAL) <- "htest"
RVAL
}
*************
Please look at the computation of the p-value for Kendalls tau. There is an
assignment to "p" right above the warning. In the bottom of the function there
is a comment that for the pearson case we have to use the modification and set
PVAL.
The problem is:
* Either the comment is wrong because the modification should be done with
kendall too, or
* The variable PVAL has to be assigned in the kendall block.
I hope this is clear so far.
Please send me some comments, because I'm not sure if my observation is ok. And
currently I try to figure out the significance in the biserial case which of
course makes heavy use of the tied case.
Cheers,
Dirk
More information about the R-devel
mailing list