[R] python

Stefan Evert stefan.evert at uos.de
Sun Nov 22 12:37:37 CET 2009


> Sure, badly written R code does not perform as well as well written  
> python code or C code. On the other hand badly written python code  
> does not perform as well as well written R code.
>
> What happens when you try one of these :
>
> sum <- sum( 1:N )

R runs out of memory and crashes. :-)  I didn't tell you how big N is,  
did I?

But this is exactly the point I was trying to make (though perhaps not  
prominently enough).  In many cases, you can vectorize at least parts  
of your code or find a more efficient algorithm, which may be faster  
in R than a brute-force solution in C.  But sometimes, you just cannot  
avoid loops (let's not forget that all the forms of apply() are just  
loops and don't give much of a speed benefit over a for-loop),  
function calls, etc.; in this case, performance differences between  
interpreted languages can matter.
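
To make the distinction concrete, here is a minimal sketch (not part of the original benchmark) that computes the same sum with an explicit loop, with sapply(), and with a single vectorized call. The function names and the problem size n are made up for illustration, and timings will differ from machine to machine.

```r
# Illustrative problem size (an assumption, much smaller than the benchmark)
n <- 1e5

# 1. Explicit interpreted loop: one addition per iteration at R level
loop_sum <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + i
  total
}

# 2. sapply(): still one R-level function call per element
apply_sum <- function(n) sum(sapply(seq_len(n), function(i) as.numeric(i)))

# 3. Fully vectorized: a single call into compiled code
#    (as.numeric() avoids integer overflow for large n)
vec_sum <- function(n) sum(as.numeric(seq_len(n)))

print(system.time(s1 <- loop_sum(n)))
print(system.time(s2 <- apply_sum(n)))
print(system.time(s3 <- vec_sum(n)))

# All three variants must agree with the closed form n(n+1)/2
stopifnot(s1 == s2, s2 == s3, s3 == n * (n + 1) / 2)
```

Only the third variant avoids per-element work in the interpreter; the first two differ mainly in syntax.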

Personally, I'd never switch from R to Perl just for speed, though.

BTW, I also tried a partially vectorized algorithm in R, which calculates the  
sum above in a small number of chunks:

> N1 <- 50
> N2 <- 1000000
> N <- N1 * N2
> sum <- 0
>
> for (i in 1:N1) {
>         x <- as.numeric(i-1) * N2 + 1:N2
>         sum <- sum + sum(x)
> }

which gives

R/simple_count_vec.R              31.30 Mops/s  (50000000 ops in 1.60 s)

So an interpreted loop in Lua is still faster than this partially  
vectorized code in R:

>> lua/simple_count.lua 65.78 Mops/s (100000000 ops in 1.52 s)
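
For anyone who wants to reproduce the R side, the chunked loop above can be wrapped in a function and checked against the closed form N(N+1)/2. This is only a sketch: the function name chunked_sum and its argument names are hypothetical, and it does not reproduce the timing harness that produced the Mops/s figures.

```r
# Hypothetical wrapper around the chunked loop: sums 1..(n_chunks * chunk_size)
# by building one numeric vector of length chunk_size per iteration.
chunked_sum <- function(n_chunks, chunk_size) {
  total <- 0
  for (i in seq_len(n_chunks)) {
    # as.numeric() keeps the offset in double precision, so large values
    # do not overflow R's 32-bit integers
    x <- as.numeric(i - 1) * chunk_size + seq_len(chunk_size)
    total <- total + sum(x)
  }
  total
}

N1 <- 50
N2 <- 1000000
N  <- N1 * N2

result <- chunked_sum(N1, N2)

# The result is well below 2^53, so it is represented exactly as a double
stopifnot(result == N * (N + 1) / 2)
print(result)
```

Larger chunks mean fewer interpreted iterations but more memory per iteration, so the chunk size is a trade-off rather than a fixed choice.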

As people on the SQLite mailing list always say: there's no general  
answer as to which language/implementation/query/... is faster and  
better.  You just have to test the different options for your specific  
application setting, and be prepared for one or two surprises.

Just in case this isn't obvious: if I rewrote matrix multiplication in  
C myself and linked this code into R, it would most likely run much  
slower than if I just typed "A %*% B", which calls an optimized BLAS  
routine.
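
A rough way to see the same effect without leaving R is to compare a textbook triple loop against the built-in operator. Note that this is an interpreted R loop, not the C rewrite mentioned above, so it only illustrates the general point; the helper naive_matmul and the matrix sizes are made up for this sketch.

```r
# Hypothetical helper: textbook triple-loop matrix product in plain R
naive_matmul <- function(A, B) {
  stopifnot(ncol(A) == nrow(B))
  C <- matrix(0, nrow(A), ncol(B))
  for (i in seq_len(nrow(A)))
    for (j in seq_len(ncol(B)))
      for (k in seq_len(ncol(A)))
        C[i, j] <- C[i, j] + A[i, k] * B[k, j]
  C
}

set.seed(1)
A <- matrix(runif(60 * 40), 60, 40)
B <- matrix(runif(40 * 50), 40, 50)

print(system.time(C1 <- naive_matmul(A, B)))
print(system.time(C2 <- A %*% B))

# Both approaches must give the same product up to rounding error
stopifnot(isTRUE(all.equal(C1, C2)))
```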

All the best,
Stefan



