[R-sig-hpc] What to Experiment With?

Sat Apr 21 00:16:47 CEST 2012

Dear R HPC experts:

I have about $5,000 to spend on building fast computer hardware to run
our problems.  If it works well, I may be able to scrounge up another
$10k/year to scale it up.  I do not have the resources to program very
complex algorithms, administer a full cluster, etc.  (The effective
programmer's rate here is about $50/hour and up, and I have severe
restrictions against hiring outsiders.)  The programs basically have
to work with minimal special tweaking.

There are no real-time needs.  Typically, I operate on historical CRSP
and Compustat data, which are about 1-5GB (depending on subset).  Most
of what I am doing involves linear regressions.  I often need to
calculate Newey-West/Hansen-Hodrick/White adjusted standard errors,
and I often need to sort and rank, and to calculate means and
covariances.  These are not highly sophisticated statistics, but there
are a lot of them to compute.  Most of what I do is embarrassingly
parallel.
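
To make the workload concrete, here is a minimal sketch of the kind of
per-group calculation described above.  It is written in plain Python
(standard library only) rather than R, purely as an illustration, and
it rests on several assumptions that are not from the post: a single
regressor plus an intercept, Bartlett kernel weights, no small-sample
correction, and a user-chosen lag length.  The names `ols_newey_west`
and `fit_by_group`, and the toy data, are hypothetical.

```python
from math import sqrt

def ols_newey_west(x, y, lags):
    """OLS of y on an intercept and x, with a Newey-West slope std. error.

    Assumes one regressor, Bartlett weights, no small-sample correction.
    With lags=0 the standard error reduces to the White (HC0) one.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    z = [xi - x_bar for xi in x]                  # demeaned regressor
    sxx = sum(zi * zi for zi in z)
    slope = sum(zi * (yi - y_bar) for zi, yi in zip(z, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    g = [zi * ei for zi, ei in zip(z, resid)]     # per-observation scores
    meat = sum(gi * gi for gi in g)               # lag-0 (White) term
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)         # Bartlett kernel weight
        meat += 2.0 * weight * sum(g[t] * g[t - lag] for t in range(lag, n))
    # Guard against a tiny negative value caused by floating-point rounding.
    return slope, intercept, sqrt(max(meat, 0.0)) / sxx

def fit_by_group(panel, lags=2):
    """panel maps a group key (e.g. a firm id) to a pair of lists (x, y)."""
    return {key: ols_newey_west(x, y, lags) for key, (x, y) in panel.items()}

# Toy data, invented for illustration only.
panel = {
    "firm_a": ([1.0, 2.0, 3.0, 4.0, 5.0], [2.1, 3.9, 6.2, 7.8, 10.1]),
    "firm_b": ([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [1.2, 0.7, 1.9, 1.4, 2.6, 2.2]),
}
for key, (slope, intercept, se) in fit_by_group(panel).items():
    print(key, round(slope, 4), round(se, 4))
```

Each call to `ols_newey_west` touches only its own group's data, which
is what makes the job embarrassingly parallel: the loop in
`fit_by_group` could in principle be handed to separate cores or
machines without the groups having to communicate.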

Now, I think in the $5k price range, I have a couple of options.
Roughly, the landscape seems to be:

* 1 dual-socket Xeon computer.
* 5 (desktop) i7 computers, networked (snow over sockets?).
* 1 i7 computer with 1 NVIDIA Tesla card.
* 1 i7 computer with 2-3 commodity graphics cards.
     --- Apparently, NVIDIA cripples the double-precision (DP)
     performance of its gamer cards, so AMD should be a *lot* faster
     at the same price, but I only see an lm() routine in the
     NVIDIA-specific gputools package.  Then again, for Newey-West,
     I may have to resort to my own calculations anyway.  Is there
     Newey-West OLS code for AMD GPUs?

I would presume that an internal PCI Express bus is a lot faster than
an Ethernet network, and a GPU could be faster than a CPU, but a GPU
is also less flexible.  Sigh... not sure.
What should I try?
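
As a rough back-of-the-envelope check on the bus-versus-network
presumption, the sketch below compares the nominal time to move a 1 GB
or 5 GB data set over each kind of link.  The two link types are
assumptions, not details from the post: gigabit Ethernet for the
networked desktops and a PCIe 2.0 x16 slot for a GPU.  The rates are
theoretical peaks, so real transfers would take longer; the function
name `transfer_seconds` is hypothetical.

```python
# Nominal peak rates in bytes per second; sustained throughput is lower.
LINKS = {
    "gigabit ethernet": 1e9 / 8,     # 1 Gbit/s = 125 MB/s
    "pcie 2.0 x16": 16 * 500e6,      # 16 lanes x 500 MB/s = 8 GB/s per direction
}

def transfer_seconds(n_bytes, link):
    """Lower bound on the time to push n_bytes over the named link."""
    return n_bytes / LINKS[link]

for size_gb in (1, 5):
    for link in LINKS:
        secs = transfer_seconds(size_gb * 1e9, link)
        print(f"{size_gb} GB over {link}: {secs:.2f} s")
```

On these nominal figures the bus is about 64 times faster than the
network.  Whether that matters depends on how often the data must be
moved: if each worker receives its slice once and then computes for a
long time, the transfer cost may be a small part of the total.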

/iaw

----
Ivo Welch (ivo.welch at gmail.com)