[R] formatC slow? (or how can I make this function faster?)

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Jan 23 08:44:15 CET 2006


First, your timings seem slow: even my laptop is using 0.4 secs.  So the 
simple solution is to use a better computer.

I would just write such things in C.  The following runs in 0.01sec on my 
machine (timed by looping over it)

system.time(.Call("Cpaste", smp))

using

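#include <string.h>
#include <R.h>
#include <Rinternals.h>

SEXP Cpaste(SEXP A)
{
     SEXP dims, ans;
     double *rA;
     int i, j, nr, nc;
     char buf[100], one[] = "1", zero[] = "0";

     dims = getAttrib(A, R_DimSymbol);
     if(!isInteger(dims) || LENGTH(dims) != 2) error("'A' must be a matrix");
     nr = INTEGER(dims)[0]; nc = INTEGER(dims)[1];
     /* buf holds nc characters plus the terminating '\0' */
     if(nc >= (int) sizeof(buf)) error("too many columns for the buffer");
     /* REAL() is only valid on doubles, so coerce integer matrices first */
     PROTECT(A = coerceVector(A, REALSXP));
     rA = REAL(A);
     /* mkChar allocates, so ans must be protected from garbage collection */
     PROTECT(ans = allocVector(STRSXP, nr));
     for(i = 0; i < nr; i ++) {
 	buf[0] = '\0';
 	/* A is stored column-major: element (i, j) is rA[i + nr*j] */
 	for(j = 0; j < nc; j++) strcat(buf, rA[i + nr*j] > 0 ? one : zero);
 	SET_STRING_ELT(ans, i, mkChar(buf));
     }
     UNPROTECT(2);
     return ans;
}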

and perhaps that could be made more efficient by avoiding strcat but I 
would expect mkChar to be taking much of the time.
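
As a rough sketch of what avoiding strcat could look like (untimed, and 
not part of the code above): each character can be written straight into 
its slot in the buffer, so the string is never rescanned from the start. 
The helper name encode_row is hypothetical, and it deliberately uses no R 
API so that it can be tried on its own.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the R API: encode row i of a
   column-major nr x nc matrix as '1'/'0' characters.
   buf must have room for nc + 1 chars (including the '\0'). */
void encode_row(const double *a, int nr, int nc, int i, char *buf)
{
    int j;
    for (j = 0; j < nc; j++)
        buf[j] = a[i + (size_t) nr * j] > 0 ? '1' : '0';
    buf[nc] = '\0';   /* terminate once, after the last column */
}
```

Inside Cpaste the inner loop would then become a call such as 
encode_row(rA, nr, nc, i, buf), followed by mkChar(buf) as before.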


On Sun, 22 Jan 2006, hadley wickham wrote:

> I'm trying to convert a matrix of capture occasions to a format that an
> external program can read.  The job is basically to take a row of the
> matrix, like
>
>> smp[1,]
> [1] 1 1 0 1 1 1 0 0 0 0
>
> and convert it to the equivalent string "1101110000"
>
> I'm having problems doing this in a speedy way.  The simplest solution
> (calc_history below, using apply, paste and collapse) takes about 2
> seconds for a 10,000 x 10 matrix.   I thought perhaps paste might be
> building up the string in an inefficient manner, so I tried using matrix
> multiplication and formatC (as in calc_history2).  This is about 25%
> faster, but still seems slow.
>
> smp <- matrix(rbinom(100000, 1, 0.5), nrow=10000)
>
> calc_history <- function(smp) {
> 	apply(smp, 1, paste, collapse="")
> }
>
> calc_history2 <- function(smp) {
> 	mul <- 10 ^ ((ncol(smp)-1):0)
> 	as.vector(formatC(smp %*% mul, format="d", width=ncol(smp), flag="0"))
> }
>
> system.time(calc_history(smp))
> system.time(calc_history2(smp))
>
> Any ideas for improvement?
>
> Thanks,
>
> Hadley
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



