[Rd] p.adjust(<NA>s), was "Re: [BioC] limma and p-values"
Gordon Smyth
smyth at wehi.edu.au
Sun Jan 16 09:55:35 CET 2005
I append below a suggested update for p.adjust().
1. A new method "yh" for control of FDR is included which is valid for any
dependency structure. Reference is Benjamini, Y., and Yekutieli, D. (2001).
The control of the false discovery rate in multiple testing under
dependency. Annals of Statistics 29, 1165-1188.
2. I've re-named the "fdr" method to "bh" but kept "fdr" as a synonym for
backward compatability.
3. Upper case values for method "BH" or "YH" are also accepted.
4. p.adust() now preserves attributes like names for named vectors (as does
cumsum and friends for example).
5. p.adjust() now works columnwise on numeric data.frames (as does cumsum
and friends).
6. method="hommel" now works correctly even for n=2
7. The 'n' argument is removed. Setting this argument for any methods other
than "none" or "bonferroni" make the p-values indeterminate, and the
argument seems to be seldom used. (It isn't used in the R default
distribution.) I think trying to combine this argument with NAs would get
you into lots of hot water. For example, what does
p.adjust(c(NA,NA,0.05),n=2) mean? Which 2 values should be adjusted?
8. NAs are treated in na.exclude style. This is the correct approach for
most applications. The only other consistent thing you could do would be to
treat the NAs as if they all had value=1. But then you would have to
explain clearly that the values being returned are not actually the correct
adjusted p-values, which are unknown, but are the most conservative
possible values assuming the worst-case for the missing values. This would
become arbitrarily unreasonable as the number of NAs increases.
Gordon
p.adjust.methods <-
c("holm", "hochberg", "hommel", "bonferroni", "yh", "bh", "fdr", "none")
p.adjust <- function(p, method = p.adjust.methods) {
method <- match.arg(tolower(method),p.adjust.methods)
if(method=="fdr") method <- "bh"
if(is.data.frame(p)) {
if(length(p)) for(i in 1:length(p)) p[[i]] <-
Recall(p[[i]],method=method)
return(p)
}
porig <- p
notna <- !is.na(p)
p <- as.vector(p[notna])
n <- length(p)
if (n == 1) return(porig)
if (n == 2 && method=="hommel") method <- "hochberg"
porig[notna] <- switch(method,
holm = {
i <- 1:n
o <- order(p)
ro <- order(o)
pmin(1, cummax( (n - i + 1) * p[o] ))[ro]
},
hochberg = {
i <- n:1
o <- order(p, decreasing = TRUE)
ro <- order(o)
pmin(1, cummin( (n - i + 1) * p[o] ))[ro]
},
hommel = {
i <- 1:n
s <- sort(p, index = TRUE)
p <- s$x
ro <- order(s$ix)
q <- pa <- rep.int( min(n*p/(1:n)), n)
for (j in (n-1):2) {
q1 <- min(j*p[(n-j+2):n]/(2:j))
q[1:(n-j+1)] <- pmin( j*p[1:(n-j+1)], q1)
q[(n-j+2):n] <- q[n-j+1]
pa <- pmax(pa,q)
}
pmax(pa,p)[ro]
},
bh = {
i <- n:1
o <- order(p, decreasing = TRUE)
ro <- order(o)
pmin(1, cummin( n / i * p[o] ))[ro]
},
yh = {
i <- n:1
o <- order(p, decreasing = TRUE)
ro <- order(o)
q <- sum(1/(1:n))
pmin(1, cummin(q * n/i * p[o]))[ro]
},
bonferroni = pmin(n * p, 1),
none = p)
porig
}
More information about the R-devel
mailing list