[R] python
Stefan Evert
stefan.evert at uos.de
Sun Nov 22 12:37:37 CET 2009
> Sure, badly written R code does not perform as well as well written
> python code or C code. On the other hand badly written python code
> does not perform as well as well written R code.
>
> What happens when you try one of these :
>
> sum <- sum( 1:N )
R runs out of memory and crashes. :-) I didn't tell you how big N is,
did I?
But this is exactly the point I was trying to make, though perhaps not
prominently enough. In many cases, you can vectorize at least parts
of your code or find a more efficient algorithm, which may be faster
in R than a brute-force solution in C. But sometimes you just cannot
avoid loops, function calls, etc. (let's not forget that all the forms
of apply() are just loops and don't give much of a speed benefit over
a for-loop); in this case, performance differences between
interpreted languages can matter.
Personally, I'd never switch from R to Perl just for speed, though.
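To make the loop / apply() / vectorized distinction concrete, here is a minimal sketch in R. The function names and the problem size are made up for illustration and are not the benchmark scripts discussed in this thread. All three variants compute the same sum; only the last one keeps the loop inside R's compiled code.

```r
# Hypothetical problem size, kept small so every variant finishes quickly.
N <- 100000

# 1. Explicit interpreted for-loop.
loop_sum <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + i
  total
}

# 2. sapply(): still one R-level function call per element.
#    as.numeric() avoids integer overflow when the results are summed.
apply_sum <- function(n) {
  sum(sapply(seq_len(n), function(i) as.numeric(i)))
}

# 3. Vectorized: the iteration happens inside sum() itself.
vec_sum <- function(n) sum(as.numeric(seq_len(n)))

results <- c(loop = loop_sum(N), apply = apply_sum(N), vectorized = vec_sum(N))
print(results)

# Timings depend on machine and R version, so none are quoted here.
print(system.time(loop_sum(N)))
print(system.time(apply_sum(N)))
print(system.time(vec_sum(N)))
```

Wrapping each call in system.time() is only a rough measurement; for anything serious, repeat the runs several times.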
BTW, I also tried a vectorized algorithm in R, which calculates the
sum above in a small number of chunks:
> N1 <- 50
> N2 <- 1000000
> N <- N1 * N2
> sum <- 0
>
> for (i in 1:N1) {
>   # chunk i covers (i-1)*N2 + 1 ... i*N2; the offset is kept in double precision
>   x <- as.numeric(i-1) * N2 + 1:N2
>   sum <- sum + sum(x)
> }
which gives
R/simple_count_vec.R 31.30 Mops/s (50000000 ops in 1.60 s)
So an interpreted loop in Lua is still faster than this partially
vectorized code in R:
>> lua/simple_count.lua 65.78 Mops/s (100000000 ops in 1.52 s)
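The chunked summation above can be sanity-checked against the closed form N(N+1)/2. The sketch below wraps the same idea in a function; the name chunked_sum and the smaller chunk size are assumptions chosen so the check runs instantly, and they are not the settings used for the timing quoted above.

```r
# Sum 1..(n_chunks * chunk_size) one chunk at a time, so that only
# chunk_size numbers are held in memory at once.
chunked_sum <- function(n_chunks, chunk_size) {
  total <- 0
  for (i in seq_len(n_chunks)) {
    # Chunk i covers (i-1)*chunk_size + 1 ... i*chunk_size, as doubles.
    x <- as.numeric(i - 1) * chunk_size + seq_len(chunk_size)
    total <- total + sum(x)
  }
  total
}

N1 <- 50       # number of chunks, as in the post
N2 <- 10000    # chunk size (assumption: much smaller than the post's 1000000)
N  <- N1 * N2

chunk_total <- chunked_sum(N1, N2)
closed_form <- N * (N + 1) / 2   # Gauss's formula for 1 + 2 + ... + N

print(chunk_total == closed_form)
```

The chunk size trades memory for interpreter overhead: larger chunks mean fewer trips through the interpreted loop, but a larger temporary vector.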
As people on the SQLite mailing list always say: there's no general
answer as to which language/implementation/query/... is faster and
better. You just have to test the different options for your specific
application setting, and be prepared for one or two surprises.
Just in case this isn't obvious: If I rewrote matrix multiplication in
C and linked this code into R, it would run much slower than if I just
typed "A %*% B".
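As a rough illustration of that last point, here is a naive triple-loop matrix product written in plain R (standing in for the hand-written C version, which is an assumption made to keep the example self-contained). The name naive_matmul and the matrix sizes are hypothetical. It only checks that the result agrees with the built-in operator; timing both on your own machine is left as an experiment.

```r
# Textbook triple loop: C[i, j] = sum over k of A[i, k] * B[k, j].
naive_matmul <- function(A, B) {
  stopifnot(ncol(A) == nrow(B))
  C <- matrix(0, nrow = nrow(A), ncol = ncol(B))
  for (i in seq_len(nrow(A))) {
    for (j in seq_len(ncol(B))) {
      acc <- 0
      for (k in seq_len(ncol(A))) {
        acc <- acc + A[i, k] * B[k, j]
      }
      C[i, j] <- acc
    }
  }
  C
}

set.seed(1)                               # reproducible random matrices
A <- matrix(runif(30 * 20), nrow = 30)    # 30 x 20
B <- matrix(runif(20 * 25), nrow = 20)    # 20 x 25

# Compare with the built-in operator, allowing for floating-point rounding.
agree <- isTRUE(all.equal(naive_matmul(A, B), A %*% B))
print(agree)
```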
All the best,
Stefan