[R] using foreach (parallel processing)

Mike Marchywka marchywka at hotmail.com
Thu Dec 2 12:11:02 CET 2010

----------------------------------------
> Date: Thu, 2 Dec 2010 11:06:14 +0100
> From: r.m.krug at gmail.com
> To: santosh.srinivas at gmail.com
> CC: r-help at r-project.org
> Subject: Re: [R] using foreach (parallel processing)
>
> On 12/02/2010 10:56 AM, Santosh Srinivas wrote:
> > Hello group,
>
> Hi
> >
> > I am experimenting with parallel processing on my quad core Win 7 32
> > bit machine. Using these packages for the first time.
> >
> > I can see all my processor running at full performance when I use a
> > smaller dataset
[...]
> >
> > PROBLEM: However, when I do the same but with optData.df <- pristine
> > ... which has about 3.8 million options data ... the cores do not seem
> > to be fully utilized (they seem to run at 25%).
> >
> > I noticed some slight delay before the processing starts running ...
> > when I did with the 100k dataset ... do i need to wait longer for any
> > allocations to be done?
>
> Communication to set up the threads could definitely take some time.
> So why don't you try to increase from 100.000 to 1.000.000 and see how
> long it takes to initialize the threads?
>
> You did not mention how long you waited.

Recent Windows releases, including Windows 7, have more Task Manager
capabilities, although AFAIK it is still hard to reduce the display to text
for easy sharing. Two things to look at are disk usage and page faults.
Again, it is easy for I/O to take longer than the processing itself. With a
large data set the cores usually end up competing with each other for
memory, which eventually causes virtual-memory thrashing.
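
As a rough back-of-the-envelope check of the memory point above, the
sketch below (in Python, purely for the arithmetic) estimates how much
memory is needed if every worker receives its own full copy of the data.
The 3.8 million rows and the four cores come from the original post; the
column count, the 8 bytes per value, and the "one full copy per worker"
behaviour are assumptions made only for illustration.

```python
# Back-of-the-envelope memory estimate for copying a data set to each worker.
# ASSUMPTIONS (not from the original post): 10 numeric columns, 8 bytes per
# value, and one full copy of the data held by each worker process.

BYTES_PER_VALUE = 8  # size of a double-precision number


def dataset_bytes(n_rows, n_cols, bytes_per_value=BYTES_PER_VALUE):
    """Approximate in-memory size of a purely numeric table."""
    return n_rows * n_cols * bytes_per_value


def total_bytes_with_workers(n_rows, n_cols, n_workers):
    """Size of the master copy plus one full copy per worker."""
    return dataset_bytes(n_rows, n_cols) * (n_workers + 1)


def to_mib(n_bytes):
    """Convert bytes to mebibytes."""
    return n_bytes / (1024 ** 2)


rows = 3_800_000  # row count mentioned in the original post
cols = 10         # hypothetical column count
workers = 4       # quad-core machine, one worker per core

one_copy = dataset_bytes(rows, cols)
all_copies = total_bytes_with_workers(rows, cols, workers)

print(f"one copy: {to_mib(one_copy):.0f} MiB")
print(f"master + {workers} worker copies: {to_mib(all_copies):.0f} MiB")
```

Under these assumptions one copy is roughly 290 MiB and five copies come
to about 1.4 GiB, before counting any temporary copies made during
processing. On a 32-bit system that is already close to what the machine
can comfortably hold in RAM, so paging to disk would not be surprising.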

I posted a link to an IEEE blurb here, showing non-monotonic
performance as a function of the number of cores used (I can't remember
now whether this was cores or processors, but you get the idea):

http://lists.boost.org/boost-users/2008/11/42263.php

You can generally expect performance gains if each core
is off doing its own thing, not competing with the others
for memory, disk, or other limited resources.
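
One way to keep each core "doing its own thing" is to give every worker
only its own slice of the rows instead of the whole data set. The sketch
below (Python, with hypothetical function names) shows just the
partitioning step; it assumes the rows can be processed independently,
and it runs the slices one after another, whereas in practice each slice
would be handed to a separate worker.

```python
# Split row indices into contiguous, non-overlapping chunks, one per worker,
# so that each worker only needs its own slice of the data.
# ASSUMPTION: rows are independent; all names here are hypothetical.


def chunk_bounds(n_rows, n_workers):
    """Return half-open (start, stop) index ranges that cover 0..n_rows."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    base, extra = divmod(n_rows, n_workers)
    bounds = []
    start = 0
    for i in range(n_workers):
        # The first `extra` chunks get one additional row each.
        size = base + (1 if i < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def process_chunk(rows):
    """Placeholder for the real per-row work; here it just sums the values."""
    return sum(rows)


data = list(range(10))  # stand-in for the real data set
bounds = chunk_bounds(len(data), 4)

# Serial loop for illustration; each slice could go to its own worker.
partials = [process_chunk(data[a:b]) for a, b in bounds]

print(bounds)
print(sum(partials))
```

Because the chunks do not overlap and together cover every row, the
partial results can simply be combined at the end, and no worker has to
hold more than about a quarter of the data at once.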




>
> Cheers,
>
> Rainer
>
