[Bioc-devel] vectorize default dist2 function in genefilter

Wolfgang Huber whuber at embl.de
Tue Mar 12 22:11:52 CET 2013


Dear James

Thank you. What would the saved time be (e.g. compared to the overall runtime of arrayQualityMetrics)? I would be surprised if the saving was worth the added complexity, but am always happy to be surprised.

A patch of the .R and .Rd files would be most welcome and would expedite the change.

Btw, colSums apparently also works with 3-dim arrays, so both loops (over i and j) could be vectorised, however, as far as I can see, at the cost of constructing an object of size nrow(x) * ncol(x)^2 in memory, which might again break performance.
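
To make the comparison concrete, here is a minimal sketch, not a proposed patch. It assumes the default mean-absolute-difference function and a small random matrix whose dimensions are arbitrary; the names d_loop, d_apply and d_array are made up for illustration. It checks that the double loop, the apply() variant and the 3-dim array variant agree:

```r
# Small test matrix; the dimensions are arbitrary choices for illustration.
set.seed(1)
x <- matrix(rnorm(200 * 10), nrow = 200, ncol = 10)
n <- nrow(x)
p <- ncol(x)

# Reference: explicit double loop, as in the current dist2 default.
d_loop <- matrix(0, p, p)
for (j in 2:p) for (i in 1:(j - 1))
    d_loop[i, j] <- d_loop[j, i] <- mean(abs(x[, i] - x[, j]), na.rm = TRUE)

# One loop vectorised: apply() over columns, colMeans() inside.
d_apply <- apply(x, 2, function(v) colMeans(abs(x - v), na.rm = TRUE))

# Both loops vectorised: build an n x p x p array of differences and
# let colMeans() average over the first dimension.
a <- array(x, dim = c(n, p, p))      # a[k, i, j] == x[k, i]
b <- aperm(a, c(1, 3, 2))            # b[k, i, j] == x[k, j]
d_array <- colMeans(abs(a - b), na.rm = TRUE)

# All three should agree up to floating-point tolerance.
stopifnot(isTRUE(all.equal(d_loop, d_apply)),
          isTRUE(all.equal(d_loop, d_array)))
```

The intermediate arrays a and b each hold n * p * p doubles, which is where the memory cost of the fully vectorised version comes from. Wrapping each variant in system.time() on a realistically sized matrix would show whether the saving matters in practice.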

	Best wishes
	Wolfgang

On Mar 12, 2013, at 4:43 PM, James F. Reid <reidjf at gmail.com> wrote:

> Dear bioc-devel,
> 
> the dist2 function in genefilter, defined as:
> 
> dist2 <- function (x, fun = function(a, b) mean(abs(a - b), na.rm = TRUE), diagonal = 0) {
> 
>    if (!(is.numeric(diagonal) && (length(diagonal) == 1L)))
>        stop("'diagonal' must be a numeric scalar.")
>    res = matrix(diagonal, ncol = ncol(x), nrow = ncol(x))
>    colnames(res) = rownames(res) = colnames(x)
>    if (ncol(x) >= 2) {
>        for (j in 2:ncol(x))
>            for (i in 1:(j - 1))
>                res[i, j] = res[j, i] = fun(x[, i], x[, j])
>    }
>    return(res)
> }
> 
> could have its default function vectorized as:
> 
> res <- apply(x, 2, function(i) colMeans(abs(x - i), na.rm=TRUE))
> 
> to improve performance, for example in the arrayQualityMetrics package.
> 
> Best.
> James.
> 
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel


