[R-sig-hpc] socket cluster/gotoblas2 configuration confusion
Stephen Weston
stephen.b.weston at gmail.com
Mon Nov 28 17:19:18 CET 2011
I ran some tests on my Linux machine, and I was able to
undo a cpuset restriction from within an R session using:
> system(sprintf('taskset -p 0xffffffff %d', Sys.getpid()))
I was also able to start an unrestricted R session from a
restricted shell session using:
$ taskset 0xffffffff R
or
$ numactl -C 0-3 R
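If you would rather not hard-code 0xffffffff, the mask can be computed from the number of CPUs you want to allow. This is only a sketch: cpu_mask is a helper name made up for this example, and it assumes you want the lowest-numbered N CPUs, with N small enough (below 63) for shell arithmetic.

```shell
# cpu_mask N: print a hex affinity mask with the lowest N bits set.
# Hypothetical helper for illustration; not part of taskset or numactl.
cpu_mask() {
  printf '0x%x\n' $(( (1 << $1) - 1 ))
}

cpu_mask 4     # prints 0xf
cpu_mask 24    # prints 0xffffff
```

It could then be used as `taskset "$(cpu_mask 24)" R`. On Linux, `getconf _NPROCESSORS_ONLN` should report the number of online CPUs if you don't want to type it in.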
I have no idea how your R sessions are becoming restricted, so
I have no idea if this will work, but it's worth a try.
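As for what Cpus_allowed means: as far as I know, it is a hexadecimal bitmask in which bit n is set if the process may run on CPU n, and the commas just separate 32-bit words. So the 00000001 in your second status listing allows only CPU 0, while 00ffffff allows CPUs 0 through 23. A rough sketch for decoding such a mask in a POSIX shell (mask_to_cpus is a name invented for this example, not an existing tool):

```shell
# mask_to_cpus MASK: list the CPU numbers enabled in a Cpus_allowed hex mask.
# Hypothetical helper; MASK is the value as printed in /proc/<pid>/status.
mask_to_cpus() {
  hex=$(printf '%s' "$1" | tr -d ',')   # drop the 32-bit word separators
  pos=0                                 # CPU number of the lowest bit of the current digit
  out=""
  while [ -n "$hex" ]; do
    rest=${hex%?}                       # everything but the last hex digit
    digit=${hex#"$rest"}                # the last (least significant) hex digit
    val=$(printf '%d' "0x$digit")
    bit=0
    while [ "$bit" -lt 4 ]; do
      if [ $(( ($val >> $bit) & 1 )) -eq 1 ]; then
        out="$out $((pos + bit))"
      fi
      bit=$((bit + 1))
    done
    pos=$((pos + 4))
    hex=$rest
  done
  echo "${out# }"
}

mask_to_cpus 00000000,00000001    # prints 0
mask_to_cpus 00000000,00ffffff    # lists CPUs 0 through 23
```

If the second call lists all the CPUs you expect but your R session only shows CPU 0, the session has been pinned to a single core.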
- Steve
On Thu, Nov 24, 2011 at 8:32 AM, Claudia Beleites
<claudia.beleites at ipht-jena.de> wrote:
> Dear Steve,
>
>> This is really a shot in the dark, but you could try executing:
>>
>> clusterEvalQ(cl, readLines(sprintf('/proc/%d/status', Sys.getpid())))
>>
>> and look for the lines that mention "Cpus_allowed". It's
>> conceivable that your snow workers have been restricted to
>> execute on a subset of the node's cores. But that seems
>> rather unlikely since you're using a socket cluster.
>
> It's getting even more weird - I don't seem to be able to reproduce the
> behaviour ...
>
> I was able (twice, but not reproducibly) to get it working as I want:
>
> I opened a terminal (xfce4-term) on my desktop, logged into the server with
> ssh -X -C claudia at 172.17.42.86
> and started an R session there.
> All worked well: 2 workers using 6 cores each for the multiplication.
>
> proc status output:
>
>> cat (readLines(sprintf('/proc/%d/status', Sys.getpid())), sep = "\n")
> Name: R
> State: R (running)
> SleepAVG: 98%
> Tgid: 31571
> Pid: 31571
> PPid: 7664
> TracerPid: 0
> Uid: 508 508 508 508
> Gid: 509 509 509 509
> FDSize: 256
> Groups: 509
> VmPeak: 560904 kB
> VmSize: 560904 kB
> VmLck: 0 kB
> VmHWM: 276500 kB
> VmRSS: 276496 kB
> VmData: 405528 kB
> VmStk: 140 kB
> VmExe: 2856 kB
> VmLib: 18248 kB
> VmPTE: 916 kB
> StaBrk: 19587000 kB
> Brk: 22682000 kB
> StaStk: 7fffabd50130 kB
> Threads: 6
> SigQ: 1/79872
> SigPnd: 0000000000000000
> ShdPnd: 0000000000000000
> SigBlk: 0000000000000000
> SigIgn: 0000000000000000
> SigCgt: 0000000180001e4a
> CapInh: 0000000000000000
> CapPrm: 0000000000000000
> CapEff: 0000000000000000
> Cpus_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00ffffff
> Mems_allowed: 00000000,00000001
>
> For the other R processes, started via any of
> * terminal -> ssh -> R
> * terminal -> emacs -> ess -> R
> * terminal -> ssh -> xfce4-panel -> terminal -> R
> * terminal -> ssh -> xfce4-panel -> emacs -> ess -> R
> I sometimes get 2 cores working in parallel, as shown by the CPU graph
> applet. At other times the applet shows only one core busy, and the snow
> timing plot indicates that both workers ran at the same time but took twice
> as long as system.time of the matrix multiplication.
>
> The proc status is different for those:
>
>
>> cat (readLines(sprintf('/proc/%d/status', Sys.getpid())), sep = "\n")
> Name: R
> State: R (running)
> SleepAVG: 98%
> Tgid: 2983
> Pid: 2983
> PPid: 2956
> TracerPid: 0
> Uid: 508 508 508 508
> Gid: 509 509 509 509
> FDSize: 256
> Groups: 509
> VmPeak: 560768 kB
> VmSize: 545148 kB
> VmLck: 0 kB
> VmHWM: 342960 kB
> VmRSS: 327336 kB
> VmData: 389768 kB
> VmStk: 144 kB
> VmExe: 2856 kB
> VmLib: 18248 kB
> VmPTE: 1012 kB
> StaBrk: 121ab000 kB
> Brk: 1a342000 kB
> StaStk: 7fffa9d4aa10 kB
> Threads: 6
> SigQ: 2/79872
> SigPnd: 0000000000000000
> ShdPnd: 0000000000000000
> SigBlk: 0000000000000000
> SigIgn: 0000000000000004
> SigCgt: 0000000180001e4a
> CapInh: 0000000000000000
> CapPrm: 0000000000000000
> CapEff: 0000000000000000
> Cpus_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
> Mems_allowed: 00000000,00000001
>
> So the last part of Cpus_allowed is 00000001 instead of 00ffffff.
>
> What exactly does that tell me? The man page was not particularly
> enlightening...
>
> How can I change that restriction?
>
> Thanks a lot for your help,
>
> Claudia
>
> PS: I have to leave soon for a seminar over the weekend, so I won't be able
> to try out things again before Monday.
>
>>
>> - Steve
>>
>>
>> On Wed, Nov 23, 2011 at 11:09 AM, Claudia Beleites
>> <claudia.beleites at ipht-jena.de> wrote:
>>>
>>> Steve,
>>>
>>>> You don't say how you set GOTO_NUM_THREADS to 6.
>>>
>>> sorry, I forgot to tell you all:
>>>
>>> I used
>>> export GOTO_NUM_THREADS=6
>>> in the shell before starting R.
>>>
>>> and I did check by
>>>>
>>>> clusterEvalQ(cl, system ("echo $GOTO_NUM_THREADS"))
>>>
>>> which gave me 6 for both workers.
>>>
>>> so does:
>>>>
>>>> clusterEvalQ(cl, Sys.getenv('GOTO_NUM_THREADS'))
>>>
>>> [[1]]
>>> [1] "6"
>>>
>>> [[2]]
>>> [1] "6"
>>>
>>> I did not know the Sys.getenv/Sys.setenv functions, though.
>>>
>>> Thanks,
>>>
>>> Claudia
>>>
>>>> You
>>>> might want to verify that it did get set in each of the snow worker
>>>> processes by using the command:
>>>>
>>>> clusterEvalQ(cl, Sys.getenv('GOTO_NUM_THREADS'))
>>>>
>>>> If it returns any empty strings in the resulting list, then the
>>>> environment variable is not set in the corresponding worker.
>>>>
>>>> You probably should set this variable through an appropriate
>>>> shell startup file, but you could at least temporarily use:
>>>>
>>>> clusterEvalQ(cl, Sys.setenv(GOTO_NUM_THREADS=6))
>>>
>>>
>>>>
>>>> - Steve
>>>>
>>>>
>>>> On Wed, Nov 23, 2011 at 9:44 AM, Claudia Beleites
>>>> <claudia.beleites at ipht-jena.de> wrote:
>>>>>
>>>>> Dear all,
>>>>>
>>>>> I'm just doing my first steps with parallelized calculations and got
>>>>> quite confused.
>>>>>
>>>>> Here's what I want, what I have and what I did:
>>>>>
>>>>> - I want to parallelize calculations on a CentOS server with 2 x 6 cores
>>>>> and 8 GB RAM (it is actually part of a cluster, but I have access only
>>>>> to this node, and the other nodes do not (yet) have R installed).
>>>>>
>>>>> - My data is too large to work with in one piece.
>>>>> But it comes in separate files of suitable size: I can work nicely with
>>>>> 2 to 3 samples in memory at the same time.
>>>>>
>>>>> - So my idea was to start up a snow socket cluster with 2 or 3 workers.
>>>>>
>>>>> - In addition I want to use an optimized BLAS. Linear algebra is only a
>>>>> small part of the analysis, so it does make sense to have the socket
>>>>> cluster with as many workers as possible and have the linear algebra
>>>>> parts use up to n / nworkers cores.
>>>>>
>>>>> So I built R 2.14.0 using gotoblas2 and set $GOTO_NUM_THREADS to 6.
>>>>> Matrix multiplication in a fresh R session now is much faster and CPU
>>>>> usage shows the expected 6 cores working:
>>>>>
>>>>>> system.time ({m<- matrix (1:9e6, 3e3); m%*%m; NULL})
>>>>>
>>>>> user system elapsed
>>>>> 5.219 0.126 1.111
>>>>>
>>>>> However, the socket clusters seem not to use the GOTO_NUM_THREADS:
>>>>>>
>>>>>> library (snow)
>>>>>> cl<- makeCluster(2,type="SOCK")
>>>>>> tm<- snow.time(clusterEvalQ(cl, {m<- matrix (1:9e6, 3e3); m%*%m; NULL}))
>>>>>> tm
>>>>>
>>>>> elapsed send receive node 1 node 2
>>>>> 9.553 0.001 0.010 9.510 9.543
>>>>>>
>>>>>> tm$data
>>>>>
>>>>> [[1]]
>>>>> send_start send_end recv_start recv_end exec
>>>>> [1,] 0 0.001 9.511 9.512 9.51
>>>>>
>>>>> [[2]]
>>>>> send_start send_end recv_start recv_end exec
>>>>> [1,] 0.001 0.001 9.544 9.553 9.543
>>>>>
>>>>>> tm$elapsed
>>>>>
>>>>> elapsed
>>>>> 9.553
>>>>>>
>>>>>
>>>>> CPU usage shows 2 cores working, and the times correspond to that.
>>>>>
>>>>> What configuration do I need to do in order to make the blas use more
>>>>> threads for the worker processes? Anything else I should do differently?
>>>>>
>>>>>
>>>>>> sessionInfo ()
>>>>>
>>>>> R version 2.14.0 (2011-10-31)
>>>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>>>
>>>>> locale:
>>>>> [1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C
>>>>> [3] LC_TIME=de_DE.UTF-8 LC_COLLATE=de_DE.UTF-8
>>>>> [5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=de_DE.UTF-8
>>>>> [7] LC_PAPER=C LC_NAME=C
>>>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>>>> [11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
>>>>>
>>>>> attached base packages:
>>>>> [1] stats graphics grDevices utils datasets methods base
>>>>>
>>>>> other attached packages:
>>>>> [1] snow_0.3-8
>>>>>
>>>>> Additional questions:
>>>>> - Is there some command like sessionInfo () that yields information about
>>>>> the blas (particularly NUM_THREADS)?
>>>>> - Is there some command that I can use to tell the blas how many threads
>>>>> to use during an R session? Can I set environment variables from within
>>>>> R? (Searching didn't help as I got only info about R environments...)
>>>>> Would that actually help here?
>>>>>
>>>>> Thanks a lot for your help.
>>>>>
>>>>> Claudia
>>>>>
>>>>> --
>>>>> Claudia Beleites
>>>>> Spectroscopy/Imaging
>>>>> Institute of Photonic Technology
>>>>> Albert-Einstein-Str. 9
>>>>> 07745 Jena
>>>>> Germany
>>>>>
>>>>> email: claudia.beleites at ipht-jena.de
>>>>> phone: +49 3641 206-133
>>>>> fax: +49 2641 206-399
>>>>>
>>>>> _______________________________________________
>>>>> R-sig-hpc mailing list
>>>>> R-sig-hpc at r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
>>>>>