[R] formatC slow? (or how can I make this function faster?)
hadley wickham
h.wickham at gmail.com
Mon Jan 23 04:42:24 CET 2006
I'm trying to convert a matrix of capture occasions to a format that an
external program can read. The job is basically to take a row of the
matrix, like
> smp[1,]
[1] 1 1 0 1 1 1 0 0 0 0
and convert it to the equivalent string "1101110000"
I'm having problems doing this in a speedy way. The simplest solution
(calc_history below, using apply and paste with collapse) takes about 2
seconds for a 10,000 x 10 matrix. I thought perhaps paste might be
building up the string in an inefficient manner, so I tried using matrix
multiplication and formatC (as in calc_history2). This is about 25%
faster, but still seems slow.
smp <- matrix(rbinom(100000, 1, 0.5), nrow=10000)
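# Version 1: paste the digits of each row together
calc_history <- function(smp) {
  apply(smp, 1, paste, collapse="")
}
# Version 2: read each row as a base-10 number, then zero-pad to ncol(smp) digits
calc_history2 <- function(smp) {
  mul <- 10 ^ ((ncol(smp)-1):0)
  as.vector(formatC(smp %*% mul, format="d", width=ncol(smp), flag="0"))
}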
system.time(calc_history(smp))
system.time(calc_history2(smp))
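To make the matrix-multiplication idea concrete, here is a minimal sketch on two hand-written rows. The names row_a, row_b and encode_row are made up for illustration only, and the sketch assumes the row is short enough that the resulting number still fits in an integer (10 columns does).

```r
# Hypothetical helper: encode one 0/1 row via powers of ten, then zero-pad
encode_row <- function(row) {
  mul <- 10 ^ ((length(row) - 1):0)   # 1e9, 1e8, ..., 10, 1 for 10 columns
  num <- sum(row * mul)               # the row read as a base-10 number
  formatC(num, format = "d", width = length(row), flag = "0")
}

row_a <- c(1, 1, 0, 1, 1, 1, 0, 0, 0, 0)  # same values as smp[1,] above
row_b <- c(0, 0, 1, 0, 1, 1, 0, 0, 0, 1)  # leading zeros need the padding

res_a <- encode_row(row_a)   # "1101110000"
res_b <- encode_row(row_b)   # "0010110001"

# The paste-based approach should give the same strings
paste_a <- paste(row_a, collapse = "")
paste_b <- paste(row_b, collapse = "")
```

The flag="0" padding matters for rows like row_b: without it the leading zeros would be replaced by spaces.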
Any ideas for improvement?
Thanks,
Hadley