[Rd] faster base::sequence
Romain Francois
romain at r-enthusiasts.com
Sun Nov 28 09:45:35 CET 2010
Hello,
Based on yesterday's R-help thread (help: program efficiency), and
following Bill's suggestions, it appeared that sequence:
> sequence
function (nvec)
unlist(lapply(nvec, seq_len))
<environment: namespace:base>
could benefit from being written in C to avoid unnecessary memory
allocations.
I made this version using inline:
require( inline )
sequence_c <- local( {
    fx <- cfunction( signature( x = "integer"), '
        int n = length(x) ;
        int* px = INTEGER(x) ;
        int x_i, s = 0 ;
        /* error checking: first pass computes the total length */
        for( int i=0; i<n; i++){
            x_i = px[i] ;
            /* NA_INTEGER is INT_MIN, so this also catches NA;
               zero is allowed, as in seq_len(0) */
            if( x_i < 0 ) error( "needs non negative integer" ) ;
            s += x_i ;
        }
        /* single allocation for the whole result */
        SEXP res = PROTECT( allocVector( INTSXP, s ) ) ;
        int * p_res = INTEGER(res) ;
        /* second pass: write 1..x_i for each element */
        for( int i=0; i<n; i++){
            x_i = px[i] ;
            for( int j=0; j<x_i; j++, p_res++)
                *p_res = j+1 ;
        }
        UNPROTECT(1) ;
        return res ;
    ' )
    function( nvec ){
        fx( as.integer(nvec) )
    }
})
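
For readers who want to try the two-pass idea outside R, here is a minimal standalone sketch in plain C. It is not the proposed patch: the name sequence_plain, the malloc-based allocation and the NULL-on-error convention are assumptions made for illustration, and it adds a guard against overflow of the total length that the inline version above does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of R. Returns a malloc'ed array holding
   1..lengths[0], 1..lengths[1], ... and stores its size in *out_len.
   Returns NULL (leaving *out_len untouched) on a negative length, if the
   total would overflow, or if allocation fails. The caller frees the result. */
int *sequence_plain(const int *lengths, size_t n, size_t *out_len)
{
    assert(out_len != NULL);
    assert(lengths != NULL || n == 0);

    /* Largest element count whose byte size still fits in size_t. */
    const size_t max_elems = SIZE_MAX / sizeof(int);

    /* Pass 1: validate and sum, so that one allocation is enough. */
    size_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (lengths[i] < 0) return NULL;
        if ((size_t)lengths[i] > max_elems - total) return NULL;
        total += (size_t)lengths[i];
    }

    /* Allocate at least one element so an empty result is still non-NULL. */
    int *res = malloc((total ? total : 1) * sizeof *res);
    if (res == NULL) return NULL;

    /* Pass 2: fill each run 1..lengths[i] in place. */
    int *p = res;
    for (size_t i = 0; i < n; i++)
        for (int j = 0; j < lengths[i]; j++)
            *p++ = j + 1;

    *out_len = total;
    return res;
}
```

With lengths {3, 0, 2} this yields 1 2 3 1 2, the same values as sequence(c(3, 0, 2)) in R.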
And here are some timings:
> x <- 1:10000
> system.time( a <- sequence(x ) )
user system elapsed
0.191 0.108 0.298
> system.time( b <- sequence_c(x ) )
user system elapsed
0.060 0.063 0.122
> identical( a, b )
[1] TRUE
> system.time( for( i in 1:10000) sequence(1:10) )
user system elapsed
0.119 0.000 0.119
>
> system.time( for( i in 1:10000) sequence_c(1:10) )
user system elapsed
0.019 0.000 0.019
I would write a proper patch if someone from R-core is willing to push it.
Romain
--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/9VOd3l : ZAT! 2010
|- http://bit.ly/c6DzuX : Impressionnism with R
`- http://bit.ly/czHPM7 : Rcpp Google tech talk on youtube