[Rd] about mantelhaen.test (PR#7779)
Thomas Lumley
tlumley at u.washington.edu
Thu Apr 7 17:35:10 CEST 2005
Thanks for the report and code.
There's two parts to this: a suggested enhancement to show components of
the (R-1)(C-1) df "general association" test that are useful for ordinal
variables, and a bug report on the test itself: R gives 35.9 and the
suggested code (and apparently SAS) give 10.6. I'll look into the bug.
To be a proper replacement for mantelhaen.test your code would need to
return an object of class "htest", the way the other test functions do.
If the function is going to compute the test statistics for ordinal data
It would also be better to have the option to test for differences in
column means rather than row means, and to specify scores.
-thomas
On Thu, 7 Apr 2005 chienyu at stat.sinica.edu.tw wrote:
> Full_Name: Chien-yu Peng
> Version: 2.0.1
> OS: Windows XP Professional
> Submission from: (NULL) (140.109.72.181)
>
>
> Dear all:
> Although I don't know you, I am thankful for your help.
> When I use the function mantelhaen.test for R x C x K (R, C > 2) table,
> the output is not the same as SAS's. I don't know that the result consist with
> one of SAS's. But it works correctly for 2 x 2 x K table.
> In addition, the function mantelhaen.test does not contain such as test for
> general association, test for row means scores differ, and test for nonzero
> correlation in SAS.
>
> Could you tell me "Is it something wrong in the function (mantelhaen.test) for R
> x C x K (R, C > 2) table"?
>
> Therefore, I modify part of the function such that output looks as the same as
> SASs.
>
> mh.test <- function(x) {
> if (any(apply(x, 3, sum) < 2))
> stop("sample size in each stratum must be > 1")
> I <- dim(x)[1]; J <- dim(x)[2]; K <- dim(x)[3]
> df_GMH <- (I - 1) * (J - 1); df_SMH <- I - 1; df_CSMH <- 1
> A_GMH <- cbind(diag(I-1), 0) %x% cbind(diag(J-1), 0)
> A_SMH <- cbind(diag(I-1), 0) %x% t(1:J)
> A_CSMH <- t(1:I) %x% t(1:J)
> Y_GMH <- matrix(0, nc = 1, nr = df_GMH)
> Y_SMH <- matrix(0, nc = 1, nr = df_SMH)
> Y_CSMH <- matrix(0, nc = 1, nr = df_CSMH)
> S_GMH <- matrix(0, nc = df_GMH, nr = df_GMH)
> S_SMH <- matrix(0, nc = df_SMH, nr = df_SMH)
> S_CSMH <- matrix(0, nc = df_CSMH, nr = df_CSMH)
> for(k in 1:K) {
> V <- NULL
> f <- x[, , k]
> ntot <- sum(f)
> p_ip <- apply(f, 1, sum) / ntot
> p_pj <- apply(f, 2, sum) / ntot
> m <- p_ip %x% p_pj * ntot
> V <- ntot^2 * ((diag(p_ip) - p_ip %*% t(p_ip)) %x% (diag(p_pj) - p_pj
> %*% t(p_pj))) / (ntot-1)
> Y_GMH <- Y_GMH + A_GMH %*% (c(t(f)) - m)
> Y_SMH <- Y_SMH + A_SMH %*% (c(t(f)) - m)
> Y_CSMH <- Y_CSMH + A_CSMH %*% (c(t(f)) - m)
> S_GMH <- S_GMH + A_GMH %*% V %*% t(A_GMH)
> S_SMH <- S_SMH + A_SMH %*% V %*% t(A_SMH)
> S_CSMH <- S_CSMH + A_CSMH %*% V %*% t(A_CSMH)
> }
> Q_GMH <- t(Y_GMH) %*% solve(S_GMH) %*% Y_GMH
> Q_SMH <- t(Y_SMH) %*% solve(S_SMH) %*% Y_SMH
> Q_CSMH <- t(Y_CSMH) %*% solve(S_CSMH) %*% Y_CSMH
> STATISTIC <- c(Q_CSMH, Q_SMH, Q_GMH)
> PARAMETER <- c(df_CSMH, df_SMH, df_GMH)
> PVAL <- pchisq(STATISTIC, PARAMETER, lower = FALSE)
> result <- cbind(STATISTIC, PARAMETER, PVAL)
> rownames(result) <- list("Nonzero Correlation", "Row Mean Scores Differ",
> "General Association")
> colnames(result) <- list("Statistic", "df", "p-value")
> return (result)
> }
>
> x <- array(c(23,23,20,24,7,10,13,10,2,5,5,6,18,18,13,9,6,6,13,15,1,2,2,2,8,12,11,7,6,4,6,7,3,4,2,4,12,15,14,13,9,3,8,6,1,2,3,4),
> dim = c(4,3,4))
>
> mh.test(x)
> Statistic df p-value
> Nonzero Correlation 6.34043 1 0.01180163
> Row Mean Scores Differ 6.59013 3 0.08617498
> General Association 10.59828 6 0.10161441
>
> ______________________________________________
> R-devel at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle
More information about the R-devel
mailing list