[R-sig-hpc] socket cluster/gotoblas2 configuration confusion

Claudia Beleites claudia.beleites at ipht-jena.de
Mon Nov 28 17:59:59 CET 2011


  > system(sprintf('taskset -p 0xffffffff %d', Sys.getpid()))
works for me, too.
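
For reference, Cpus_allowed in /proc/<pid>/status is a hex bitmask of the
CPUs the process may run on, so 00ffffff means CPUs 0-23 are allowed while
00000001 means only CPU 0. A minimal sketch for checking the affinity
before and after resetting it (assuming taskset is installed on the node):

   ## print the current affinity as a readable CPU list
   system(sprintf('taskset -cp %d', Sys.getpid()))

   ## allow CPUs 0-31 again (0xffffffff), then re-check
   system(sprintf('taskset -p 0xffffffff %d', Sys.getpid()))
   system(sprintf('taskset -cp %d', Sys.getpid()))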

BTW: I didn't find the previous questions - but maybe I shouldn't have 
restricted my search to gotoblas2.

Thinking about the behaviour I observed, the interesting thing is that 
this restriction can only be half of the story - and I still have no 
idea where it comes from.

- With execution restricted to core 1 (as indicated by Cpus_allowed), I 
could still use multiple cores via multicore.
- snow socket clusters sometimes used one core per cluster node, but not 
more. Sometimes really only core 1 is used for all cluster nodes: top 
tells me e.g. that 12 processes consume about 9 % CPU each while only 
core 1 is working. (A sketch for checking the workers' affinity follows 
below.)
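
A minimal sketch for checking - and, on Linux, lifting - the restriction
in each worker rather than only in the master session; it assumes the
socket cluster object is called cl and that taskset is available:

   library(snow)
   cl <- makeCluster(2, type = "SOCK")

   ## show the Cpus_allowed lines of each worker's /proc status
   clusterEvalQ(cl, grep("Cpus_allowed",
                         readLines(sprintf("/proc/%d/status", Sys.getpid())),
                         value = TRUE))

   ## try to reset the affinity mask of every worker,
   ## returning taskset's output to the master
   clusterEvalQ(cl, system(sprintf("taskset -p 0xffffffff %d", Sys.getpid()),
                           intern = TRUE))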


Claudia




Am 28.11.2011 17:19, schrieb Stephen Weston:
> I ran some tests on my Linux machine, and I was able to
> undo a cpuset restriction from within an R session using:
>
>      >  system(sprintf('taskset -p 0xffffffff %d', Sys.getpid()))
>
> I was also able to start an unrestricted R session from a
> restricted shell session using:
>
>      $ taskset 0xffffffff R
>
> or
>
>      $ numactl -C 0-3 R
>
> I have no idea how your R sessions are becoming restricted, so
> I have no idea if this will work, but it's worth a try.
>
> - Steve
>
>
> On Thu, Nov 24, 2011 at 8:32 AM, Claudia Beleites
> <claudia.beleites at ipht-jena.de>  wrote:
>> Dear Steve,
>>
>>> This is really a shot in the dark, but you could try executing:
>>>
>>>    clusterEvalQ(cl, readLines(sprintf('/proc/%d/status', Sys.getpid())))
>>>
>>> and look for the lines that mention "Cpus_allowed".  It's
>>> conceivable that your snow workers have been restricted to
>>> execute on a subset of the node's cores.  But that seems
>>> rather unlikely since you're using a socket cluster.
>>
>> It's getting even weirder - I don't seem to be able to reproduce the
>> behaviour ...
>>
>> I was able (twice, but not reproducibly) to get it working as I want:
>>
>> I opened a terminal (xfce4-term) on my desktop, logged into the server with
>> ssh -X -C claudia at 172.17.42.86
>> and started an R session there;
>> all worked well: 2 workers using 6 cores each for the multiplication.
>>
>> proc status output:
>>
>>> cat (readLines(sprintf('/proc/%d/status', Sys.getpid())), sep = "\n")
>> Name:   R
>> State:  R (running)
>> SleepAVG:       98%
>> Tgid:   31571
>> Pid:    31571
>> PPid:   7664
>> TracerPid:      0
>> Uid:    508     508     508     508
>> Gid:    509     509     509     509
>> FDSize: 256
>> Groups: 509
>> VmPeak:   560904 kB
>> VmSize:   560904 kB
>> VmLck:         0 kB
>> VmHWM:    276500 kB
>> VmRSS:    276496 kB
>> VmData:   405528 kB
>> VmStk:       140 kB
>> VmExe:      2856 kB
>> VmLib:     18248 kB
>> VmPTE:       916 kB
>> StaBrk: 19587000 kB
>> Brk:    22682000 kB
>> StaStk: 7fffabd50130 kB
>> Threads:        6
>> SigQ:   1/79872
>> SigPnd: 0000000000000000
>> ShdPnd: 0000000000000000
>> SigBlk: 0000000000000000
>> SigIgn: 0000000000000000
>> SigCgt: 0000000180001e4a
>> CapInh: 0000000000000000
>> CapPrm: 0000000000000000
>> CapEff: 0000000000000000
>> Cpus_allowed:
>> 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00ffffff
>> Mems_allowed:   00000000,00000001
>>
>> For the other processes, started via
>> * terminal ->  ssh ->  R
>> * terminal ->  emacs ->  ess ->  R
>> * terminal ->  ssh ->  xfce4-panel ->  terminal ->  R
>> * terminal ->  ssh ->  xfce4-panel ->  emacs ->  ess ->  R
>> I sometimes get 2 cores working in parallel (as shown by the CPU graph
>> applet); sometimes the applet indicates only one core, and the snow timing
>> plot then shows that both workers ran at the same time but took twice as
>> long as system.time of the matrix multiplication.
>>
>> The proc status is different for those:
>>
>>
>>> cat (readLines(sprintf('/proc/%d/status', Sys.getpid())), sep = "\n")
>> Name:   R
>> State:  R (running)
>> SleepAVG:       98%
>> Tgid:   2983
>> Pid:    2983
>> PPid:   2956
>> TracerPid:      0
>> Uid:    508     508     508     508
>> Gid:    509     509     509     509
>> FDSize: 256
>> Groups: 509
>> VmPeak:   560768 kB
>> VmSize:   545148 kB
>> VmLck:         0 kB
>> VmHWM:    342960 kB
>> VmRSS:    327336 kB
>> VmData:   389768 kB
>> VmStk:       144 kB
>> VmExe:      2856 kB
>> VmLib:     18248 kB
>> VmPTE:      1012 kB
>> StaBrk: 121ab000 kB
>> Brk:    1a342000 kB
>> StaStk: 7fffa9d4aa10 kB
>> Threads:        6
>> SigQ:   2/79872
>> SigPnd: 0000000000000000
>> ShdPnd: 0000000000000000
>> SigBlk: 0000000000000000
>> SigIgn: 0000000000000004
>> SigCgt: 0000000180001e4a
>> CapInh: 0000000000000000
>> CapPrm: 0000000000000000
>> CapEff: 0000000000000000
>> Cpus_allowed:
>> 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
>> Mems_allowed:   00000000,00000001
>>
>> So the last part of cpus_allowed is 00000001 instead of 00ffffff.
>>
>> What exactly does that tell me? The man page was not particularly
>> enlightening ...
>>
>> How can I change that restriction?
>>
>> Thanks a lot for your help,
>>
>> Claudia
>>
>> PS: I have to leave soon for a seminar over the weekend, so I won't be able
>> to try out things again before Monday.
>>
>>
>>>
>>> - Steve
>>>
>>>
>>> On Wed, Nov 23, 2011 at 11:09 AM, Claudia Beleites
>>> <claudia.beleites at ipht-jena.de>    wrote:
>>>>
>>>> Steve,
>>>>
>>>>> You don't say how you set GOTO_NUM_THREADS to 6.
>>>>
>>>> sorry, I forgot to tell you all:
>>>>
>>>> I used
>>>> export GOTO_NUM_THREADS=6
>>>> in the shell before starting R.
>>>>
>>>> and I did check by
>>>>>
>>>>>       clusterEvalQ(cl, system ("echo $GOTO_NUM_THREADS"))
>>>>
>>>> which gave me 6 for both workers.
>>>>
>>>> so does:
>>>>>
>>>>>    clusterEvalQ(cl, Sys.getenv('GOTO_NUM_THREADS'))
>>>>
>>>> [[1]]
>>>> [1] "6"
>>>>
>>>> [[2]]
>>>> [1] "6"
>>>>
>>>> I did not know the Sys.getenv/Sys.setenv functions, though.
>>>>
>>>> Thanks,
>>>>
>>>> Claudia
>>>>
>>>>> You
>>>>> might want to verify that it did get set in each of the snow worker
>>>>> processes by using the command:
>>>>>
>>>>>      clusterEvalQ(cl, Sys.getenv('GOTO_NUM_THREADS'))
>>>>>
>>>>> If it returns any empty strings in the resulting list, then the
>>>>> environment variable is not set in the corresponding worker.
>>>>>
>>>>> You probably should set this variable through an appropriate
>>>>> shell startup file, but you could at least temporarily use:
>>>>>
>>>>>      clusterEvalQ(cl, Sys.setenv(GOTO_NUM_THREADS=6))
>>>>
>>>>
>>>>>
>>>>> - Steve
>>>>>
>>>>>
>>>>> On Wed, Nov 23, 2011 at 9:44 AM, Claudia Beleites
>>>>> <claudia.beleites at ipht-jena.de>      wrote:
>>>>>>
>>>>>> Dear all,
>>>>>>
>>>>>> I'm just taking my first steps with parallelized calculations and got
>>>>>> quite confused.
>>>>>>
>>>>>> Here's what I want, what I have and what I did:
>>>>>>
>>>>>> - I want to parallelize calculations on a CentOS server with 2 x 6
>>>>>> cores and 8 GB RAM (it is actually part of a cluster, but I have access
>>>>>> only to this node, and the other nodes do not (yet) have R installed).
>>>>>>
>>>>>> - My data is too large to work with in one piece, but it comes in
>>>>>> separate files of suitable size: I can work nicely with 2 to 3 samples
>>>>>> in memory at the same time.
>>>>>>
>>>>>> - So my idea was to start up a snow socket cluster with 2 or 3 workers.
>>>>>>
>>>>>> - In addition I want to use an optimized, multithreaded BLAS. Linear
>>>>>> algebra is only a small part of the analysis, so it makes sense to have
>>>>>> the socket cluster with as many workers as possible and let the linear
>>>>>> algebra parts use up to n / nworkers cores each.
>>>>>>
>>>>>> So I built R 2.14.0 using gotoblas2 and set $GOTO_NUM_THREADS to 6.
>>>>>> Matrix
>>>>>> multiplication in a fresh R session now is much faster and CPU usage
>>>>>> shows
>>>>>> the expected 6 cores working:
>>>>>>
>>>>>>> system.time ({m<- matrix (1:9e6, 3e3); m%*%m; NULL})
>>>>>>
>>>>>>        user      system     elapsed
>>>>>>       5.219       0.126       1.111
>>>>>>
>>>>>> However, the socket cluster workers seem not to use GOTO_NUM_THREADS:
>>>>>>>
>>>>>>> library (snow)
>>>>>>> cl<- makeCluster(2,type="SOCK")
>>>>>>> tm<- snow.time(clusterEvalQ(cl, {m<- matrix (1:9e6, 3e3); m%*%m;
>>>>>>> NULL}))
>>>>>>> tm
>>>>>>
>>>>>> elapsed    send receive  node 1  node 2
>>>>>>   9.553   0.001   0.010   9.510   9.543
>>>>>>>
>>>>>>> tm$data
>>>>>>
>>>>>> [[1]]
>>>>>>      send_start send_end recv_start recv_end exec
>>>>>> [1,]          0    0.001      9.511    9.512 9.51
>>>>>>
>>>>>> [[2]]
>>>>>>      send_start send_end recv_start recv_end  exec
>>>>>> [1,]      0.001    0.001      9.544    9.553 9.543
>>>>>>
>>>>>>> tm$elapsed
>>>>>>
>>>>>> elapsed
>>>>>>   9.553
>>>>>>>
>>>>>>
>>>>>> CPU usage shows 2 cores working, and the times correspond to that.
>>>>>>
>>>>>> What configuration do I need in order to make the BLAS use more
>>>>>> threads for the worker processes? Anything else I should do
>>>>>> differently?
>>>>>>
>>>>>>
>>>>>>> sessionInfo ()
>>>>>>
>>>>>> R version 2.14.0 (2011-10-31)
>>>>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>>>>
>>>>>> locale:
>>>>>>   [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C
>>>>>>   [3] LC_TIME=de_DE.UTF-8        LC_COLLATE=de_DE.UTF-8
>>>>>>   [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=de_DE.UTF-8
>>>>>>   [7] LC_PAPER=C                 LC_NAME=C
>>>>>>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>>>> [11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
>>>>>>
>>>>>> attached base packages:
>>>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>>
>>>>>> other attached packages:
>>>>>> [1] snow_0.3-8
>>>>>>
>>>>>> Additional questions:
>>>>>> - Is there some command like sessionInfo () that yields information
>>>>>> about the BLAS (particularly NUM_THREADS)?
>>>>>> - Is there some command that I can use to tell the BLAS how many
>>>>>> threads to use during an R session? Can I set environment variables
>>>>>> from within R? (Searching didn't help, as I only got info about R
>>>>>> environments.) Would that actually help here?
>>>>>>
>>>>>> Thanks a lot for your help.
>>>>>>
>>>>>> Claudia
>>>>>>
>>>>>> --
>>>>>> Claudia Beleites
>>>>>> Spectroscopy/Imaging
>>>>>> Institute of Photonic Technology
>>>>>> Albert-Einstein-Str. 9
>>>>>> 07745 Jena
>>>>>> Germany
>>>>>>
>>>>>> email: claudia.beleites at ipht-jena.de
>>>>>> phone: +49 3641 206-133
>>>>>> fax:   +49 2641 206-399
>>>>>>
>>>>>> _______________________________________________
>>>>>> R-sig-hpc mailing list
>>>>>> R-sig-hpc at r-project.org
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
>>>>>>
>>>>
>>>>
>>>> --
>>>> Claudia Beleites
>>>> Spectroscopy/Imaging
>>>> Institute of Photonic Technology
>>>> Albert-Einstein-Str. 9
>>>> 07745 Jena
>>>> Germany
>>>>
>>>> email: claudia.beleites at ipht-jena.de
>>>> phone: +49 3641 206-133
>>>> fax:   +49 2641 206-399
>>>>
>>
>>
>> --
>> Claudia Beleites
>> Spectroscopy/Imaging
>> Institute of Photonic Technology
>> Albert-Einstein-Str. 9
>> 07745 Jena
>> Germany
>>
>> email: claudia.beleites at ipht-jena.de
>> phone: +49 3641 206-133
>> fax:   +49 2641 206-399
>>


-- 
Claudia Beleites
Spectroscopy/Imaging
Institute of Photonic Technology
Albert-Einstein-Str. 9
07745 Jena
Germany

email: claudia.beleites at ipht-jena.de
phone: +49 3641 206-133
fax:   +49 2641 206-399


