[R] Resources for utilizing multiple processors

Mike Marchywka marchywka at hotmail.com
Thu Jun 9 12:10:24 CEST 2011

----------------------------------------
> From: rjeffries at ucla.edu
> Date: Wed, 8 Jun 2011 20:54:45 -0700
> To: r-help at r-project.org
> Subject: [R] Resources for utilizing multiple processors
>
> Hello,
>
> I know of various methods out there to utilize multiple processors, but
> am not sure what the best solution would be. First, some things to note:
> I'm running dependent simulations, so direct parallel coding is out
> (multicore, doSnow, etc.).

> [...] the
> *nix languages.

Well, for the situation below you seem to want a function
server. You could consider rApache and just write this like a big
web application. A web server, like a DB, is not the first thing
you think of for high-performance computing, but if your computationally
intensive tasks are in native code, this could be reasonable
overhead that requires little learning.
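
To make the function-server idea concrete, here is a rough sketch
in base R using socketConnection rather than rApache itself; the
one-request-per-connection text protocol ("function name, then one
numeric argument") is made up purely for illustration:

## Minimal function-server sketch in base R (not rApache; just the idea).
## Protocol is hypothetical: client sends "fname x", gets the result back.
serve <- function(port = 6011) {
  repeat {
    con <- socketConnection(host = "localhost", port = port,
                            blocking = TRUE, server = TRUE, open = "r+")
    req <- readLines(con, n = 1)                  # e.g. "sqrt 2"
    parts <- strsplit(req, " ")[[1]]
    fun <- get(parts[1], mode = "function")       # look up requested function
    res <- do.call(fun, list(as.numeric(parts[2])))
    writeLines(format(res), con)                  # send result back as text
    close(con)                                    # one request per connection
  }
}
## A client on the same machine could then do:
## con <- socketConnection("localhost", 6011, blocking = TRUE, open = "r+")
## writeLines("sqrt 2", con); readLines(con, n = 1); close(con)

rApache would give you the same shape, with Apache doing the
connection handling and concurrency for you.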

If you literally mean cores instead of machines, keep in mind
that cores can end up fighting over shared resources, like memory
bandwidth (the thread below cites an IEEE article in which adding
cores made things worse in a non-contrived case):

http://lists.boost.org/boost-users/2008/11/42263.php
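
A cheap sanity check before assuming more cores will help: time
the same total work serially and in parallel. This sketch uses
mclapply from the multicore package mentioned above; the sizes
are arbitrary, so substitute something shaped like your real task:

## If the task is memory-bound, the parallel timing will fall well
## short of a 4x speedup; the 5e6 size is just a placeholder.
library(multicore)
touch <- function(i) sum(rnorm(5e6))  # allocates and streams ~40 MB per call
system.time(lapply(1:8, touch))                   # serial baseline
system.time(mclapply(1:8, touch, mc.cores = 4))   # ideally ~4x faster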


I think people have mentioned packages like bigmemory, I forget
the names exactly, that let you handle objects larger than RAM.
Launching a bunch of threads and letting virtual memory thrash can
easily make things slower rather than faster.
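
For what it's worth, a sketch of the kind of thing I mean, using
bigmemory (dimensions and file names are made up): a file-backed
matrix lives on disk and is paged in on demand rather than sitting
in R's heap, which also speaks to the HD-space question below:

## File-backed matrix via the bigmemory package; data stays on disk
## and only the touched blocks are paged into memory.
library(bigmemory)
x <- filebacked.big.matrix(nrow = 1e6, ncol = 10, type = "double",
                           backingfile = "sims.bin",
                           descriptorfile = "sims.desc")
x[1:5, 1] <- rnorm(5)                 # indexed like an ordinary matrix
## A second R process (say, the other analysis) can attach the same data:
## y <- attach.big.matrix("sims.desc")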

I guess a better approach would be to get an implementation that
is block oriented, so you can do the memory/file staging yourself
in R, at least until someone writes a data frame that uses disk
transparently, with hints on expected access patterns (prefetch,
etc.).
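
By hand, the block staging looks something like the sketch below;
the file name and chunk size are placeholders:

## Stream a big file through a connection so only one block is in
## RAM at a time, updating running statistics per block.
con <- file("sims.txt", open = "r")
total <- 0; n <- 0
repeat {
  chunk <- scan(con, what = numeric(), n = 100000, quiet = TRUE)
  if (length(chunk) == 0) break       # end of file
  total <- total + sum(chunk)
  n <- n + length(chunk)
}
close(con)
total / n                             # grand mean without loading everything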



>
> My main concern deals with Multiple analyses on large data sets. By large I
> mean that when I'm done running 2 simulations R is using ~3G of RAM, the
> remaining ~3G is chewed up when I try to create the Gelman-Rubin statistic
> to compare the two resulting samples, grinding the process to a halt. I'd
> like to have separate cores simultaneously run each analysis. That will save
> on time and I'll have to ponder the BGR calculation problem another way. Can
> R temporarily use HD space to write calculations to instead of RAM?
>
> The second concern boils down to whether or not there is a way to split up
> dependent simulations. For example at iteration (t) I feed a(t-2) into FUN1
> to generate a(t), then feed a(t), b(t-1) and c(t-1) into FUN2 to simulate
> b(t) and c(t). I'd love to have one core run FUN1 and another run FUN2, and
> [...]
>
>
> So if anyone has any suggestions as to a direction I can look into, it would
> be appreciated.
>
>
> Robin Jeffries
> MS, DrPH Candidate
> Department of Biostatistics
> UCLA
> 530-633-STAT(7828)
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.