[R] Kudos for R team

Vele Samak v.samak at verizon.net
Sun Sep 22 03:22:29 CEST 2002


Hi,

I just wanted to give kudos to the R team for building
performance-oriented software for statistics. As someone who regularly
deals with tons of data, speed of analysis is a priority for me. I'll
mention one example where the implementations of R and Splus differ so
much that it leaves you no choice but to go with R.

In my line of work, quantitative investment research, we often need to
create lots of different but realistic portfolios for backtesting our
models. One function that we developed takes an existing portfolio and
generates N different portfolios which are similar to the starting
portfolio and meet certain constraints. The function is all S code and
uses a couple of sapplys, a while loop at the core and some calls to
runif. Nothing special. It returns an S x N matrix (S is the number of
stocks in the starting portfolio).
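
Purely as an illustration (this is not our proprietary function), a
function of that shape might look roughly like the sketch below. The
name gen_portfolios, the relative perturbation rel, the turnover limit
max_turnover and the retry cap max_tries are all assumptions made up
for the example, and it is written for a current version of R:

```r
# Hypothetical sketch only: perturb a starting portfolio w0 into n
# similar portfolios. The constraint assumed here is a cap on one-way
# turnover relative to w0; real constraints would differ.
gen_portfolios <- function(w0, n, rel = 0.10, max_turnover = 0.025,
                           max_tries = 1000) {
  stopifnot(all(w0 >= 0), abs(sum(w0) - 1) < 1e-8, rel < 1)

  one_portfolio <- function(i) {
    tries <- 0
    while (tries < max_tries) {
      tries <- tries + 1
      # scale each weight by a random factor in [1 - rel, 1 + rel]
      w <- w0 * (1 + runif(length(w0), -rel, rel))
      w <- w / sum(w)                       # renormalise to sum to 1
      # accept the draw only if the turnover constraint is met
      if (sum(abs(w - w0)) / 2 <= max_turnover) return(w)
    }
    stop("no feasible portfolio after ", max_tries, " tries")
  }

  # one column per generated portfolio, so the result is S x N
  sapply(seq_len(n), one_portfolio)
}

w0 <- rep(1 / 500, 500)                     # 500 equally weighted stocks
system.time(p <- gen_portfolios(w0, 2000))  # timing depends on the machine
dim(p)                                      # 500 2000
```

The while loop is a plain rejection sampler: it keeps drawing until a
candidate satisfies the constraint, which is why the cost depends so
heavily on how cheap each pass through the loop is.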

We ran this function on the latest Splus (Windows 2000, PIII 1 GHz,
512MB RAM) with a 500-stock portfolio (vector) and asked it to give us
2000 portfolios. It took close to 1 hour. Same code, same machine, no
changes, on R 1.3.1 and R 1.5.1: under 1 minute!!!

This isn't geeky fast, fast-for-bragging-rights fast: 10%, 20% or even
100% faster. This is more than an order of magnitude faster (close to
1 hour versus under 1 minute). It is all in the sapply and while loop
implementations, probably with some cbinds in there, but you can
speculate. At this point, you have to wonder what goes on in Splus.
This kind of result also settles the question once and for all: not
why R, but why Splus? No reason at all. For the past 2 years I have had
a variety of reasons to use R over Splus, but this example outweighs
all the others.
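
On the cbind speculation above: the following is a minimal sketch for
anyone who wants to time the two usual ways of building an S x N matrix
column by column on their own system. The function names grow_cbind and
fill_prealloc are hypothetical, and the sketch makes no claim about how
either Splus or R implements these operations internally:

```r
# Illustrative only: two ways to build an S x N matrix one column at a
# time. Sizes match the example in this post.
S <- 500
N <- 2000

grow_cbind <- function(S, N) {
  m <- NULL
  # each cbind call builds a new, wider matrix from the old one
  for (j in seq_len(N)) m <- cbind(m, runif(S))
  m
}

fill_prealloc <- function(S, N) {
  m <- matrix(0, nrow = S, ncol = N)   # allocate the full matrix once
  for (j in seq_len(N)) m[, j] <- runif(S)
  m
}

system.time(a <- grow_cbind(S, N))     # timings depend on the machine
system.time(b <- fill_prealloc(S, N))
stopifnot(identical(dim(a), dim(b)))   # both are S x N
```

Running the same script under each system gives a like-for-like
comparison without needing any proprietary code.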

Lest I get fired, I can't show you the code of the function since it is
considered proprietary. I doubt, though, that you need much time and
thought to replicate something like this in S. At this point we have
moved the core into C and run such tests with all kinds of group
constraints in mere seconds.

Maybe most of you knew about this, but I just wanted to put this out
there. Another side issue, in case there are other believers in Sun
workstations out there: R 1.3.1 compiled on a 4-CPU Sun workstation
with 4GB RAM (I don't know the model, but I'm sure it's no more than 2
years old, a late 2000 model), running the same code and the same
process, mostly lots of regressions with minor disk I/O, was 7 times
slower than the same execution on the above-mentioned PC. Go figure!

P.S. I don't want to start flame wars, but I don't make these
statements lightly. This is what I get paid to do, and every resource
and every second of execution counts.

Enjoy,
--
Vele Samak
http://www.velesamak.com 




