[R-sig-hpc] [R-SIG-Mac] Grand Central Dispatch (simple loop optimization)

Thu Sep 17 22:16:58 CEST 2009

on my system (2 x 2.93 quad core Nehalem
with hyper-threading, so 16 threads max, 16GB RAM,
10.6.1, 64bit kernel, 64bit R)

 > system.time(threads(100000,1000,"omp"))
    user  system elapsed
  10.249   0.009   0.662
 > system.time(threads(100000,1000,"gcd"))
    user  system elapsed
  10.208   0.008   0.668
 > system.time(threads(100000,1000,"dcg"))
    user  system elapsed
   8.731   0.005   8.738

so omp == gcd, but for more complicated tasks the
tighter integration may favor gcd
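
for concreteness, a minimal sketch of what an "omp" variant of such a
loop could look like in C. this is not the downloadable threads code:
work(), fill_omp() and the sizes are made up for illustration, and only
the pragma is the one simon quotes further down. without -fopenmp the
pragma is ignored and the loop simply runs serially.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical per-item work: m floating point operations for item i */
static double work(int i, int m)
{
    double s = 0.0;
    for (int k = 1; k <= m; k++)
        s += 1.0 / ((double)i + k);
    return s;
}

/* fill out[0..n-1]; the iterations are independent, so they can be
   handed to threads in any order */
void fill_omp(double *out, int n, int m)
{
    int i;
    assert(out != NULL && n >= 0);
#pragma omp parallel for default(shared) private(i) schedule(dynamic)
    for (i = 0; i < n; i++)
        out[i] = work(i, m);
}
```

schedule(dynamic) hands out iterations to threads as they become free,
which should matter mostly when items take uneven amounts of time.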

comparing harpertown (about 2.4 s elapsed) and nehalem (about 0.66 s
elapsed) --> surprising difference (kernel ? hyper-threading ?)
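
to put numbers on that difference, a small sketch of the arithmetic.
the assumption (mine, not stated anywhere in the thread) is that the
"dcg" run is effectively single-threaded, since its user and elapsed
times are nearly equal, so it can serve as the serial baseline.
ratio() and report() are hypothetical helper names.

```c
#include <assert.h>
#include <stdio.h>

/* plain quotient, guarded against a zero or negative denominator */
static double ratio(double num, double den)
{
    assert(den > 0.0);
    return num / den;
}

/* speedup      = serial elapsed / parallel elapsed
   busy threads = parallel user time / parallel elapsed time */
void report(const char *label, double serial_elapsed,
            double par_user, double par_elapsed)
{
    printf("%-10s speedup %.1f, busy threads %.1f\n", label,
           ratio(serial_elapsed, par_elapsed),
           ratio(par_user, par_elapsed));
}
```

with the gcd timings from this thread, report("nehalem", 8.738, 10.208,
0.668) gives a speedup of about 13, and report("harpertown", 9.788,
9.592, 2.410) gives about 4.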

i have no idea how the open-sourced gcd works on
non-mac hardware
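
for comparison, a sketch of the same kind of loop with gcd, using
dispatch_apply_f (the function-pointer form, so the blocks extension is
not needed). again the names (struct job, one_item, fill_gcd) are
hypothetical and this is not the threads code itself. off apple
platforms it falls back to a plain serial loop, which is only a
placeholder and says nothing about how libdispatch behaves there.

```c
#include <assert.h>
#include <stddef.h>
#ifdef __APPLE__
#include <dispatch/dispatch.h>
#endif

/* context passed to every iteration */
struct job {
    double *out;
    int m;
};

/* hypothetical per-item work: m floating point operations for item i */
static double work(size_t i, int m)
{
    double s = 0.0;
    for (int k = 1; k <= m; k++)
        s += 1.0 / ((double)i + k);
    return s;
}

/* body of one iteration, in the shape dispatch_apply_f expects */
static void one_item(void *ctx, size_t i)
{
    struct job *j = ctx;
    j->out[i] = work(i, j->m);
}

void fill_gcd(double *out, size_t n, int m)
{
    struct job j = { out, m };
    assert(out != NULL);
#ifdef __APPLE__
    /* no thread count is given; the system decides at runtime */
    dispatch_apply_f(n,
        dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0),
        &j, one_item);
#else
    for (size_t i = 0; i < n; i++)   /* serial placeholder */
        one_item(&j, i);
#endif
}
```

dispatch_apply_f returns only after all n iterations have finished, so
out is complete when fill_gcd returns.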

code is downloadable using webdav from
public.me.com/jdeleeuw/software/threads
or using afp://gifi.stat.ucla.edu from
the deleeuw public directory

On Sep 17, 2009, at 12:35 , Simon Urbanek wrote:

> On Sep 17, 2009, at 15:20 , Simon Urbanek wrote:
>
>> Jan,
>>
>> thanks for sharing this. This is really interesting. We have been  
>> contemplating using GCD for R (mainly pnmath) but at the time OMP  
>> was faster. However, GCD apparently got really good in the meantime:
>>
>> > system.time(threads(100000,1000,"omp_try"))
>>  user  system elapsed
>> 9.671   0.009   2.441
>> > system.time(threads(100000,1000,"gcd_try"))
>>  user  system elapsed
>> 9.592   0.004   2.410
>> > system.time(threads(100000,1000,"dcg_try"))
>>  user  system elapsed
>> 9.784   0.003   9.788
>>
>> [This is on Harpertown 2.66GHz quad core]
>>
>> So GCD is surprisingly just a hair faster than OMP (also surprising  
>> to me is that using more threads than cores makes OMP faster - the  
>> above is with 16 threads).
>>
>
> Actually, with schedule(dynamic) the gap is almost at the level of  
> the measurement error:
>
> > system.time(threads(100000,1000,"omp_try"))
>   user  system elapsed
>  9.614   0.006   2.420
> > system.time(threads(100000,1000,"gcd_try"))
>   user  system elapsed
>  9.586   0.005   2.409
>
> -- the OMP line (to be placed before the for() loop) is
>
> #pragma omp parallel for default(shared) private(i) schedule(dynamic)
>
> Cheers,
> Simon
>
>
>>
>> On Sep 17, 2009, at 14:24 , Jan de Leeuw wrote:
>>
>>> a) Obviously OpenMP is more portable. Even on a Mac I had to use
>>> Apple's gcc for the GCD version
>>> (I normally use the GNU gcc-trunk).
>>>
>>> b) GCD does not require specifying the number of threads -- it  
>>> determines it at runtime.
>>>
>>> c) Coding is simpler.
>>>
>>
>> I would not say so - OMP takes just one #pragma - no need to change  
>> your code whereas GCD requires several special function calls...  
>> However, OMP is more limited in the kind of things you can do.
>>
>> Cheers,
>> Simon
>>
>>
>>> d) Since GCD is at a lower OS level than OpenMP, it will probably  
>>> handle resource allocation
>>> better. But my small example, on an otherwise idle Mac Pro (16  
>>> cores, 32 GB of RAM), does
>>> not really highlight that.
>>>
>>> e) For more info, and some OpenMP comparisons, see
>>>
>>> http://www.macresearch.org/cocoa-scientists-xxxi-all-aboard-grand-central
>>> http://arstechnica.com/apple/reviews/2009/08/mac-os-x-10-6.ars/12
>>>
>>> To quote Siracusa
>>>
>>> "Write your application as usual, but if there's any part of its  
>>> operation that can
>>> reasonably be expected to take more than a few seconds to  
>>> complete, then for the love of Zarzycki,
>>> get it off the main thread!"
>>>
>>> On Sep 17, 2009, at 11:03 , Saptarshi Guha wrote:
>>>
>>>> Nice, how does this compare when using OpenMP?
>>>> How does it compare when several other core-hungry processes are
>>>> running? (GCD is supposed to handle resource allocation nicely;
>>>> does OpenMP compete with the other processes?)
>>>>
>>>> Regards
>>>> Saptarshi
>>>>
>>>>
>>>
>>> ===
>>> Jan de Leeuw; Distinguished Professor and Chair, UCLA Department  
>>> of Statistics;
>>> Director: UCLA Center for Environmental Statistics (CES);
>>> Editor: Journal of Multivariate Analysis, Journal of Statistical  
>>> Software;
>>> US mail: 8125 Math Sciences Bldg, Box 951554, Los Angeles, CA  
>>> 90095-1554
>>> phone (310)-825-9550;  fax (310)-206-5658;  email: deleeuw at stat.ucla.edu
>>> .mac: jdeleeuw ++++++  aim: deleeuwjan ++++++ skype: j_deleeuw
>>> homepages: http://gifi.stat.ucla.edu ++++++ http://www.cuddyvalley.org
>>> -------------------------------------------------------------------------------------------------
>>>        No matter where you go, there you are. --- Buckaroo Banzai
>>>                 http://gifi.stat.ucla.edu/sounds/nomatter.au
>>>
>>> _______________________________________________
>>> R-SIG-Mac mailing list
>>> R-SIG-Mac at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mac
>>>
>>>
>>
>>
>
>

===
Jan de Leeuw; Distinguished Professor and Chair, UCLA Department of  
Statistics;
Director: UCLA Center for Environmental Statistics (CES);
Editor: Journal of Multivariate Analysis, Journal of Statistical  
Software;
US mail: 8125 Math Sciences Bldg, Box 951554, Los Angeles, CA 90095-1554
phone (310)-825-9550;  fax (310)-206-5658;  email: deleeuw at stat.ucla.edu
.mac: jdeleeuw ++++++  aim: deleeuwjan ++++++ skype: j_deleeuw
homepages: http://gifi.stat.ucla.edu ++++++ http://www.cuddyvalley.org
