[R-sig-hpc] SNOW Hybrid Cluster in R, Network problems
M.Seilmayer at hzdr.de
Tue Jul 3 11:31:30 CEST 2012
Hi all of you,
I successfully created a hybrid cluster of several Windows and Linux
machines using snow and MPICH2. Basically I set up a SOCK cluster, and
MPICH2 comes into play to start the Rscript processes on each machine.
Because it is platform independent, one can start processes on both
OSes, Windows and Linux, more or less remotely. I know SSH is possible
on Linux, but I'd like to have a clean solution for Windows too.
The first problem I have is the following. When I started writing
scripts that run parallel code, I noticed that parLapply and the related
functions distribute a lot of data to the nodes IF they are called in a
subroutine of a script. Calling them from the console, the network load
is minimal; calling a function which then calls parLapply causes a heavy
load on the network. Now I have a big array to calculate, and all the
traffic is slowing it down. I tried to read the R code of parApply and
deeper, but couldn't find a useful hint.
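A possible explanation, stated here as an assumption and not confirmed
for this setup: a function defined inside another function is a closure,
and serialize() writes out the closure's enclosing environment together
with the function, unless that environment is the global environment.
If the SOCK transport serializes the worker function for each node, any
large object living in the calling function would travel with it. The
sketch below needs no cluster; it only compares serialized sizes. The
names make_worker, worker_local and worker_global are hypothetical.

```r
# Hypothetical helper: builds a worker inside a function that also
# holds a large local object (about 8 MB of doubles).
make_worker <- function() {
  big <- numeric(1e6)
  function(x) x^2          # closure; its environment contains `big`
}

worker_local <- make_worker()

# Same function body, but attached to the global environment,
# which serialize() stores only as a reference.
worker_global <- function(x) x^2
environment(worker_global) <- globalenv()

size_local  <- length(serialize(worker_local, NULL))
size_global <- length(serialize(worker_global, NULL))

# Possible workaround (an assumption, only valid if the worker does
# not need anything from the enclosing function): detach the closure
# from its environment before handing it to parLapply.
worker_detached <- worker_local
environment(worker_detached) <- globalenv()
size_detached <- length(serialize(worker_detached, NULL))

cat("local closure:   ", size_local, "bytes\n")
cat("global function: ", size_global, "bytes\n")
cat("detached closure:", size_detached, "bytes\n")
```

If the sizes differ as expected, the closure created inside the
function is several megabytes larger than the other two, which would
match the extra traffic seen when parLapply is called from a subroutine.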
The second problem is connected to the first. Two of the four Windows
machines start at 100 Mbit/s and collapse to 2.8 Mbit/s after 1 s.
Now imagine snow tries to transfer a lot of data... this slows down the
whole process enormously. So why does the data transfer break down? I
checked the cables, switches, firewalls and everything related to
physical networking. Nothing, everything is fine. One can transfer files
via FTP on the communication port of R (10187) without any restriction;
100 Mbit/s is absolutely possible. So my opinion is that this must be
another software problem, maybe in R itself?!
Many thanks for any idea!
Dipl.-Ing. Martin Seilmayer
Helmholtz-Zentrum Dresden-Rossendorf e. V.
Institut fuer Fluiddynamik
Bautzner Landstraße 400
01328 Dresden, Germany
@fon: +49 351 260 3165
@fax: +49 351 260 12969