[Rd] about mantelhaen.test (PR#7779)
Kurt Hornik
Kurt.Hornik at wu-wien.ac.at
Thu Apr 7 16:46:22 CEST 2005
>>>>> chienyu writes:
> Full_Name: Chien-yu Peng
> Version: 2.0.1
> OS: Windows XP Professional
> Submission from: (NULL) (140.109.72.181)
> Dear all:
> Although I don't know you, I am thankful for your help.
> When I use the function mantelhaen.test for R x C x K (R, C > 2) table,
> the output is not the same as SAS's. I don't know that the result consist with
> one of SAS's. But it works correctly for 2 x 2 x K table.
> In addition, the function mantelhaen.test does not contain such as test for
> general association, test for row means scores differ, and test for nonzero
> correlation in SAS.
> Could you tell me "Is it something wrong in the function (mantelhaen.test) for R
> x C x K (R, C > 2) table"?
> Therefore, I modify part of the function such that output looks as the same as
> SASs.
> mh.test <- function(x) {
> if (any(apply(x, 3, sum) < 2))
> stop("sample size in each stratum must be > 1")
> I <- dim(x)[1]; J <- dim(x)[2]; K <- dim(x)[3]
> df_GMH <- (I - 1) * (J - 1); df_SMH <- I - 1; df_CSMH <- 1
> A_GMH <- cbind(diag(I-1), 0) %x% cbind(diag(J-1), 0)
> A_SMH <- cbind(diag(I-1), 0) %x% t(1:J)
> A_CSMH <- t(1:I) %x% t(1:J)
> Y_GMH <- matrix(0, nc = 1, nr = df_GMH)
> Y_SMH <- matrix(0, nc = 1, nr = df_SMH)
> Y_CSMH <- matrix(0, nc = 1, nr = df_CSMH)
> S_GMH <- matrix(0, nc = df_GMH, nr = df_GMH)
> S_SMH <- matrix(0, nc = df_SMH, nr = df_SMH)
> S_CSMH <- matrix(0, nc = df_CSMH, nr = df_CSMH)
> for(k in 1:K) {
> V <- NULL
> f <- x[, , k]
> ntot <- sum(f)
> p_ip <- apply(f, 1, sum) / ntot
> p_pj <- apply(f, 2, sum) / ntot
> m <- p_ip %x% p_pj * ntot
> V <- ntot^2 * ((diag(p_ip) - p_ip %*% t(p_ip)) %x% (diag(p_pj) - p_pj
> %*% t(p_pj))) / (ntot-1)
> Y_GMH <- Y_GMH + A_GMH %*% (c(t(f)) - m)
> Y_SMH <- Y_SMH + A_SMH %*% (c(t(f)) - m)
> Y_CSMH <- Y_CSMH + A_CSMH %*% (c(t(f)) - m)
> S_GMH <- S_GMH + A_GMH %*% V %*% t(A_GMH)
> S_SMH <- S_SMH + A_SMH %*% V %*% t(A_SMH)
> S_CSMH <- S_CSMH + A_CSMH %*% V %*% t(A_CSMH)
> }
> Q_GMH <- t(Y_GMH) %*% solve(S_GMH) %*% Y_GMH
> Q_SMH <- t(Y_SMH) %*% solve(S_SMH) %*% Y_SMH
> Q_CSMH <- t(Y_CSMH) %*% solve(S_CSMH) %*% Y_CSMH
> STATISTIC <- c(Q_CSMH, Q_SMH, Q_GMH)
> PARAMETER <- c(df_CSMH, df_SMH, df_GMH)
> PVAL <- pchisq(STATISTIC, PARAMETER, lower = FALSE)
> result <- cbind(STATISTIC, PARAMETER, PVAL)
> rownames(result) <- list("Nonzero Correlation", "Row Mean Scores Differ",
> "General Association")
> colnames(result) <- list("Statistic", "df", "p-value")
> return (result)
> }
> x <- array(c(23,23,20,24,7,10,13,10,2,5,5,6,18,18,13,9,6,6,13,15,1,2,2,2,8,12,11,7,6,4,6,7,3,4,2,4,12,15,14,13,9,3,8,6,1,2,3,4),
> dim = c(4,3,4))
> mh.test(x)
> Statistic df p-value
> Nonzero Correlation 6.34043 1 0.01180163
> Row Mean Scores Differ 6.59013 3 0.08617498
> General Association 10.59828 6 0.10161441
This is already fixed in R-devel, to be released as R 2.1.0 on Apr 18:
R> mantelhaen.test(x)
Cochran-Mantel-Haenszel test
data: x
Cochran-Mantel-Haenszel M^2 = 10.5983, df = 6, p-value = 0.1016
as you get, whereas in 2.0.1, quite incorrectly
Cochran-Mantel-Haenszel M^2 = 35.9735, df = 6, p-value = 2.790e-06
Best
-k
More information about the R-devel
mailing list