[R-sig-hpc] distributed R on EC2, designing the software stack

Whit Armstrong armstrong.whit at gmail.com
Wed Apr 29 21:24:49 CEST 2009


You should contact Robert Grossman, who just gave a presentation on
this topic at R/Finance in Chicago.

link: http://rinfinance.quantmod.com/speakers/

-Whit


On Wed, Apr 29, 2009 at 3:06 PM, Stephen J. Barr <stephenjbarr at gmail.com> wrote:
> Greetings,
>
> I am trying to get into distributed computing with R, but do not have
> access to a cluster. Therefore, I am trying to get distributed R
> running on Amazon's EC2. ( http://aws.amazon.com/ec2/ )
>
> For those of you who don't know, EC2 allows you to instantiate large
> numbers of computers, bundled with whatever OS and software
> configuration you want. From my survey of things, there are a lot of
> different options available for distributed computing. For my needs, I
> would just like to run simple Monte Carlo simulations, and other
> things that don't require a ton of inter-node communication.
>
> What I would like to do is put together a public AMI and a howto
> guide, such that it would be very easy for anyone to instantiate an
> N-node cluster and start with parallel computing. I would like to have
> a discussion/brainstorm over what the exact software stack should be.
>
> My initial thoughts were:
>
> 1) R 2.9.0 + OpenMPI + Rmpi + snowfall/sfCluster
>   - Will Amazon's network work with OpenMPI? Perhaps it would be
> better to use PVM or something that is more tolerant of a non-optimal
> network.
>
> 2)  R 2.9.0 + "socket based communication" + snowfall/sfCluster
>  - Is this scalable?
>
> 3)  R 2.9.0 + Twisted + NetWorkSpaces
>   - Not sure if Amazon's network supports broadcast mode, which is
> required by Twisted.
>
> 4) Biocep-R
>   - This looks like it has the functionality to do what I want, but
> also a lot of other stuff.
>
> 5) RHIPE
>   - Hadoop is well supported by EC2. Perhaps this is the way to go.
> Seems like a very new package :)
>
> What are people's thoughts on what would be a good software stack with
> the constraint that it should be simple and run on EC2?
>
> Thanks,
> -stephen
> ==========================================
> Stephen J. Barr
> University of Washington
> WEB: www.econsteve.com
>
> _______________________________________________
> R-sig-hpc mailing list
> R-sig-hpc at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
>


