[R] efficient conversion of matrix column rows to list elements
Charles C. Berry
cberry at tajo.ucsd.edu
Wed Nov 17 21:10:42 CET 2010
On Wed, 17 Nov 2010, Chris Carleton wrote:
> Hi List,
>
> I'm hoping to get opinions for enhancing the efficiency of the following
> code designed to take a vector of probabilities (outcomes) and calculate a
> union of the probability space. As part of the union calculation, combn()
> must be used, which returns a matrix, and the parallelized version of
> lapply() provided in the multicore package requires a list. I've found that
> parallelization is very necessary for vectors of outcomes greater in length
> than about 10 or 15 elements, which is why I need to make use of multicore
> (and, therefore, convert the combn() matrix to a list). It would speed the
> process up if there was a more direct way to convert the columns of combn()
> to elements of a single list.
I think you are mistaken about where the time is going.
Is this what Rprof() tells you?
On my system, combn() is the culprit, not the conversion to a list:
> Rprof()
> outcomes <- 1:25
> nada <- replicate(200, {apply(combn(outcomes,2),2,column2list);NULL})
> Rprof(NULL)
> summaryRprof()
$by.self
          self.time self.pct total.time total.pct
"combn"        0.64    61.54       0.70     67.31
"apply"        0.20    19.23       1.04    100.00
"FUN"          0.10     9.62       1.04    100.00
"!="           0.04     3.85       0.04      3.85
"<"            0.02     1.92       0.02      1.92
"-"            0.02     1.92       0.02      1.92
"is.null"      0.02     1.92       0.02      1.92
And it hardly takes any time at that!
HTH,
Chuck
p.s. Isn't
as.data.frame( combn( outcomes, 2 ) )
or
combn(outcomes, 2, list )
good enough?
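
A minimal sketch of those two suggestions, for anyone who wants to try them. The input 1:6 and all variable names are illustrative assumptions, not taken from the original post, and combn(..., simplify = FALSE) is a third option added here that was not suggested above. Plain lapply() stands in for mclapply() so the example does not depend on the multicore package.

```r
outcomes <- 1:6                       # small illustrative input
m <- combn(outcomes, 2)               # matrix with one combination per column

via_df   <- as.data.frame(m)          # a data frame is a list of its columns
via_fun  <- combn(outcomes, 2, list)  # FUN = list is applied to each combination
via_list <- combn(outcomes, 2, simplify = FALSE)  # plain list, one vector per combination

# Either list-like object can be passed straight to lapply()
# (or to a parallel lapply such as mclapply()).
prods    <- unlist(lapply(via_list, prod))
prods_df <- unlist(lapply(via_df, prod), use.names = FALSE)
```

With any of these, the apply(..., column2list) step and the loop that unwraps outcomes_[[j]][[1]] should no longer be needed.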
> Any constructive suggestions will be greatly
> appreciated. Thanks for your consideration,
>
> C
>
> code:
> ------------
> unionIndependant <- function(outcomes) {
>   intsctn <- c()
>   column2list <- function(x){list(x)}
>   pb <- ProgressBar(max=length(outcomes),stepLength=1,newlineWhenDone=TRUE)
>   for (i in 2:length(outcomes)){
>     increase(pb)
>     outcomes_ <- apply(combn(outcomes,i),2,column2list)
>     for (j in 1:length(outcomes_)){outcomes_[[j]] <- outcomes_[[j]][[1]]}
>     outcomes_container <- mclapply(outcomes_,prod,mc.cores=3)
>     intsctn[i] <- sum(unlist(outcomes_container))
>   }
>   intsctn <- intsctn[-1]
>   return(sum(outcomes)
>          - sum(intsctn[which(which((intsctn %in% intsctn)) %% 2 == 1)])
>          + sum(intsctn[which(which((intsctn %in% intsctn)) %% 2 == 0)])
>          + ((-1)^length(intsctn) * prod(outcomes)))
> }
> ------------
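
As a cross-check on the quoted function, here is a hedged sketch of the same inclusion-exclusion sum that lets combn() apply prod() to each combination, so no list conversion is involved. The helper names union_incl_excl and union_complement and the test vector are hypothetical, introduced only for illustration. The second helper uses the standard identity for independent events, P(union) = 1 - prod(1 - p). Note that the loop in the quoted code already reaches i = length(outcomes), so the full product appears to be counted once in intsctn and again in the trailing term; the sketch counts it once.

```r
# Hypothetical helper, not the poster's function: inclusion-exclusion for the
# union of independent events, with combn() applying prod() to each combination.
union_incl_excl <- function(p) {
  n <- length(p)
  stopifnot(n >= 2)
  # k-th term: sum of the products over all size-k subsets of p
  terms <- vapply(seq_len(n),
                  function(k) sum(combn(p, k, FUN = prod)),
                  numeric(1))
  # alternating signs: + singles, - pairs, + triples, ...
  sum((-1)^(seq_len(n) + 1) * terms)
}

# Closed form for independent events: 1 - P(no event occurs)
union_complement <- function(p) 1 - prod(1 - p)

p <- c(0.1, 0.2, 0.3)   # illustrative probabilities
union_incl_excl(p)
union_complement(p)
```

The two results should agree up to floating-point error. The closed form needs no combn() call at all, which may matter more than parallelization once length(outcomes) grows.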
> PS This code has been tested on vectors of up to length(outcomes) == 25 and
> it should be noted that ProgressBar() requires the R.utils package.
>
Charles C. Berry                            Dept of Family/Preventive Medicine
cberry at tajo.ucsd.edu                     UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901