[R-sig-hpc] What to Experiment With?

M. Edward (Ed) Borasky znmeb at znmeb.net
Sat Apr 21 07:23:18 CEST 2012


On Fri, Apr 20, 2012 at 10:03 PM, ivo welch <ivo.welch at gmail.com> wrote:
> for swap, an SSD should do very nicely these days.  swapping used to
> be very costly, but I suspect that in the age of 500MB/s SSDs, this is
> no longer true.  in any case, an X79 motherboard with 8 ram slots
> costs about $250.  8GB costs about $50, so getting 64GB is about $400.
>   the i7 Sandy Bridge is another $400.  drives (SSD + HD) another
> $250.  so, around $1,500 per computer.   3 of those seem like a good
> idea.

For the hard drives, go with four rather than three if you can: RAID
10 needs at least four drives, and it gives you both mirroring for
redundancy and striping for speed.

> to do this, though, I also still need to find a simple example of
> socket snow-type use of library(parallel).  I posted a question on
> plain r-help.  see it as a suggestion of something to add to the
> vignette or to ?parallel .  and thanks for parallel.  it's great.

Yeah, if you find a demo for 'parallel', let me know too.
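
In the meantime, here is a minimal sketch of what such a demo might
look like. It assumes two workers on the local machine; for several
boxes you would pass a character vector of host names to
makePSOCKcluster() instead, which by default needs passwordless ssh
to each host. The toy function and the 'offset' argument are made up
for illustration.

```r
library(parallel)

# Start two worker R processes on this machine, connected over sockets
# (the same mechanism snow uses for its socket clusters).
cl <- makeCluster(2, type = "PSOCK")

# Workers do not share memory with the master, so anything the function
# needs is passed explicitly as an extra argument ('offset' here).
squares <- parLapply(cl, 1:8, function(i, offset) i^2 + offset, offset = 10)

# Always shut the workers down when finished.
stopCluster(cl)

print(unlist(squares))
```

Passing 'offset' as an argument avoids a separate clusterExport()
call, which is the other common way to get data onto the workers.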

> is standard ubuntu linux R and its main libraries now compiled with
> Intel AVX?  I hear this can double or triple the vector performance
> vs. SSE.

I'm pretty sure the Ubuntu (and Fedora and openSUSE) R packages are
compiled to run on any amd64 / x86_64 CPU, which means the build can't
assume AVX. But you can fairly easily rebuild them on your box from the
distro's source packages with "-march=native" in the compiler flags,
and link against the ATLAS linear algebra libraries. It's also not
unreasonable to build ATLAS and R from source rather than using the
distro's binary packages. I've almost got that nailed down for
openSUSE myself.
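
Before and after a rebuild, a rough check along these lines might help
show whether it was worth it. This is only a sketch: cpu_has_flag()
and time_matmul() are hypothetical helper names, the flag lookup
assumes a Linux-style /proc/cpuinfo, and the matrix size is arbitrary.

```r
# Report whether the CPU advertises a given feature flag.
# Linux/x86 only: returns NA if the file or a "flags" line is missing.
cpu_has_flag <- function(flag, cpuinfo = "/proc/cpuinfo") {
  if (!file.exists(cpuinfo)) return(NA)
  lines <- readLines(cpuinfo, warn = FALSE)
  flag_lines <- grep("^flags", lines, value = TRUE)
  if (length(flag_lines) == 0) return(NA)
  flags <- strsplit(sub("^flags[^:]*:", "", flag_lines[1]), "[[:space:]]+")[[1]]
  flag %in% flags
}

# Time a dense matrix product, which R hands to whatever BLAS it is
# linked against. Returns the best elapsed time over 'reps' runs.
time_matmul <- function(n = 500, reps = 3) {
  set.seed(1)
  a <- matrix(rnorm(n * n), n, n)
  elapsed <- replicate(reps, system.time(a %*% a)[["elapsed"]])
  min(elapsed)
}

cat("AVX advertised by CPU:", cpu_has_flag("avx"), "\n")
cat("Best of 3 for a 500 x 500 matrix product (s):", time_matmul(), "\n")
```

Run the same script against the distro build and the rebuilt one and
compare the timings; the numbers will vary from box to box.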

> A long-run suggestion: with parallel in the core R, and Whit's cloud
> interface, we may soon be able to build a shared R community cluster.
> see, I wish I could rent my own computers out to an R community
> cluster when I do not need them, not in exchange for money but in
> exchange for computation credits when I need to use this R community
> cluster to do calculations myself.

Opani sort of does this, but I don't remember their business /
licensing model. I've got a project on GitHub for a "Platform as a
Service" built on openSUSE Linux and R. See

http://susestudio.com/a/RQrRBY/computational-journalism-server

and

https://github.com/znmeb/Computational-Journalism-Server

for the details. I think the final 1.0 release next week will have the
R / ATLAS recompilation scripts - all I need to do is verify that
ATLAS isn't wrecking the numeric properties of the test cases.
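
To give a flavor of that kind of check (this is an illustrative
sketch, not the project's actual test cases), one could solve a
system with a known answer and compare a BLAS-backed product against
plain R arithmetic. The problem size and the 1e-8 tolerance are
assumptions picked for the example.

```r
# Build a well-conditioned symmetric positive definite system
# with a known solution.
set.seed(42)
n <- 200
a <- crossprod(matrix(rnorm(n * n), n, n)) + n * diag(n)
x_true <- seq_len(n) / n
b <- a %*% x_true

# Solve it through LAPACK/BLAS and measure the relative error.
x_hat <- solve(a, b)
rel_err <- max(abs(x_hat - x_true)) / max(abs(x_true))

# Cross-check a BLAS-backed product against a plain sum
# that does not go through BLAS.
v <- rnorm(n)
blas_dot <- drop(crossprod(v))
plain_dot <- sum(v * v)
dot_diff <- abs(blas_dot - plain_dot)

cat("relative solve error:", rel_err, "\n")
cat("dot product difference:", dot_diff, "\n")

# Fail loudly if the linked BLAS gives answers outside the tolerance.
stopifnot(rel_err < 1e-8, dot_diff < 1e-8 * plain_dot)
```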


-- 
Twitter: http://twitter.com/znmeb Computational Journalism Server
http://j.mp/compjournoserver

Data is the new coal - abundant, dirty and difficult to mine.


