[R-sig-hpc] R on Amazon Web Service question

Takatsugu Kobayashi taquito2007 at gmail.com
Sun Jul 17 14:30:32 CEST 2011


I am new to R and AWS, so please forgive the basic question. I posted
this question on the AWS Forum but haven't heard back from the community.

My boss has asked me to set up a system to process a large dataset
and analyze it statistically. The input data ranges from 5 GB to
10 GB.

The set of AWS products I can think of includes:

- EC2 high memory instance for data processing
- RDS Oracle DB for data processing


- EC2 medium-large CPU instances for data analysis with R
- MapReduce for data analysis with R


- S3

Now I have two questions regarding R and AWS:

1. Is ONE medium-high CPU instance enough for statistical analysis of
a dataset that large?
2. Should I subscribe to the same number of MapReduce services as
EC2 instances?

Thanks in advance.

