[R] a question about sort and BH

R. Michael Weylandt michael.weylandt at gmail.com
Tue Oct 4 21:39:19 CEST 2011


On Mon, Oct 3, 2011 at 10:08 PM, chunjiang he <camelbbs at gmail.com> wrote:
> Hi,
>
> I have two questions want to ask.
>
> 1. If I have a matrix like this, and I want to figure out the rows whose
> value in the 3rd column are less than 0.05. How can I do it with R.
> hsa-let-7a--MBTD1    0.528239197    2.41E-05
> hsa-let-7a--APOBEC1    0.507869409    5.51E-05
> hsa-let-7a--PAPOLA    0.470451884    0.000221774
> hsa-let-7a--NF2    0.469280186    0.000231065
> hsa-let-7a--SLC17A5    0.454597978    0.000381713
> hsa-let-7a--THOC2    0.447714054    0.000479322
> hsa-let-7a--SMG7    0.444972282    0.000524129
>

Suppose your data is "d": then try which(d[,3] < 0.05)

> 2. I got the p.adjust.R from R source. In the method "BH", I am not clear
> with the code:
>           i <- lp:1L


# Just the same as seq(lp, 1 , by = -1)


>           o <- order(p, decreasing = TRUE)
>           ro <- order(o)
>           pmin(1, cummin( n / i * p[o] ))[ro]

# pmin does parallel minimums, p[o] is the same as sort(p) and
ordering by [ro] puts the outputted values in reverse order than the
went in.

As an exercise, I'd suggest you get the original paper, see how the
calculation is done there, implement it in R as best you can, even if
it seems loop-y, and refine it down to R Core's implementation. One of
the best ways I know to learn to think vectorwise.

Sorry I can't help more, but I don't know the method so I dont want to
read too much into the code and say something that I havent thought
through (Lord knows I do that enough on this list!!)

Michael

>


> How to explain the first and the fourth row.
> ====================p.adjust.R=======================================
> p.adjust.methods <-
>    c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none")
> p.adjust <- function(p, method = p.adjust.methods, n = length(p))
> {
>    ## Methods 'Hommel', 'BH', 'BY' and speed improvements contributed by
>    ## Gordon Smyth <smyth at wehi.edu.au>.
>    method <- match.arg(method)
>    if(method == "fdr") method <- "BH" # back compatibility
>    nm <- names(p)
>    p <- as.numeric(p); names(p) <- nm
>    p0 <- p
>    if(all(nna <- !is.na(p))) nna <- TRUE
>    p <- p[nna]
>    lp <- length(p)
>    stopifnot(n >= lp)
>    if (n <= 1) return(p0)
>    if (n == 2 && method == "hommel") method <- "hochberg"
>    p0[nna] <-
>  switch(method,
>        bonferroni = pmin(1, n * p),
>        holm = {
>     i <- seq_len(lp)
>     o <- order(p)
>     ro <- order(o)
>     pmin(1, cummax( (n - i + 1L) * p[o] ))[ro]
>        },
>        hommel = { ## needs n-1 >= 2 in for() below
>     if(n > lp) p <- c(p, rep.int(1, n-lp))
>     i <- seq_len(n)
>     o <- order(p)
>     p <- p[o]
>     ro <- order(o)
>     q <- pa <- rep.int( min(n*p/i), n)
>     for (j in (n-1):2) {
>         ij <- seq_len(n-j+1)
>         i2 <- (n-j+2):n
>         q1 <- min(j*p[i2]/(2:j))
>         q[ij] <- pmin(j*p[ij], q1)
>         q[i2] <- q[n-j+1]
>         pa <- pmax(pa,q)
>     }
>     pmax(pa,p)[if(lp < n) ro[1:lp] else ro]
>        },
>        hochberg = {
>     i <- lp:1L
>     o <- order(p, decreasing = TRUE)
>     ro <- order(o)
>     pmin(1, cummin( (n - i + 1L) * p[o] ))[ro]
>        },
>        BH = {
>     i <- lp:1L
>     o <- order(p, decreasing = TRUE)
>     ro <- order(o)
>     pmin(1, cummin( n / i * p[o] ))[ro]
>        },
>        BY = {
>     i <- lp:1L
>     o <- order(p, decreasing = TRUE)
>     ro <- order(o)
>     q <- sum(1L/(1L:n))
>     pmin(1, cummin(q * n / i * p[o]))[ro]
>        },
>        none = p)
>    p0
> }
> ============================================================
>
>
> I wrote a code to do my work in BH correction like the following:
>
> rm(list=ls())
> a<-read.csv("test.txt",sep="\t",header=F,quote="")
> b<-a[order(a[,3],decreasing=TRUE),]
> c<-p.adjust(b[,3],method="BH")
> b[,4]<-c
> write.table(b,"zz.txt",sep="\t")
>
> Is that right? Thanks for all.
>
> Jiang
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



More information about the R-help mailing list