[R-sig-hpc] Difference between PSOCK and MPI?

Norm Matloff matloff at cs.ucdavis.edu
Thu Apr 11 22:56:30 CEST 2013


I think Simon has a good point here.  MPI is great, but it can be very
tricky to set up and to interface with R.

Other than the real power users, who won't be using snow/parallel for
their heavy lifting anyway, I think PSOCK is just fine.
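
For most uses, something like the following is all one needs (an
untested sketch; adjust the worker count to your machine):

    library(parallel)
    cl <- makeCluster(4)                    # type = "PSOCK" is the default
    res <- parSapply(cl, 1:100, function(i) sqrt(i))
    stopCluster(cl)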

Even we fans of parallel programming don't necessarily want to squeeze
every last ounce of speed from a program at a major expense in
convenience.  Note that even with MPI, there are various versions and
configurations to tweak if one really wants to; my guess is that most
people don't.
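
If one does want to offer both in a package, one hedged sketch (the
function name and arguments here are made up purely for illustration)
is to expose the cluster type as an argument and keep "PSOCK" as the
default:

    ## Hypothetical wrapper: "PSOCK" by default, "MPI" only on request.
    ## (type = "MPI" additionally needs the snow and Rmpi packages.)
    myParApply <- function(X, FUN, ncores = 2L, type = "PSOCK") {
        cl <- parallel::makeCluster(ncores, type = type)
        on.exit(parallel::stopCluster(cl), add = TRUE)
        parallel::parSapply(cl, X, FUN)
    }

A user with a working MPI setup can then call, e.g.,
myParApply(1:100, sqrt, type = "MPI"), while everyone else gets the
portable default.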

Norm

On Thu, Apr 11, 2013 at 04:43:45PM -0400, Simon Urbanek wrote:
> 
> On Apr 11, 2013, at 2:23 PM, Marius Hofert wrote:
> 
> > 
> > 
> > Dirk Eddelbuettel <edd at debian.org> writes:
> > 
> >> On 9 April 2013 at 23:24, Marius Hofert wrote:
> >> | What are the main differences (advantages/drawbacks) between parallel's
> >> | makeCluster(, type="PSOCK") and makeCluster(, type="MPI")? According to
> >> | http://cran.r-project.org/web/views/HighPerformanceComputing.html, MPI has
> >> | become the 'standard', although the default type of makeCluster() is "PSOCK".
> >> | Is "PSOCK" more compatible in that it does not require an additional
> >> | installation such as (Open)MPI? And is it slower, faster, or no different?
> >> 
> >> You don't want PSOCK.  
> >> 
> >> It is the lowest common denominator, which even works (for various
> >> definitions of "work") on Windoze.
> > 
> > Hi Dirk,
> > 
> > thanks for helping. Indeed, we don't want that. The reason I was asking is
> > the following: I am writing a package (jointly with Martin) in which we make
> > use of makeCluster(). The 'natural' default would be to use the default of
> > the 'type' argument ("PSOCK") and leave it up to the user to decide whether
> > to replace it with "MPI". On the other hand, one could make "MPI" the
> > default to advocate 'good practice'. Since we are unsure, I thought I'd ask
> > about the differences between the two to get a better feeling for what a
> > good default would be.
> > 
> 
> One thing to consider for a package is that MPI is probably not available in the majority of cases. In fact, the MPI back-end is not even available in parallel by default (it routes to "snow", strangely ...). Although PSOCK is not the most performant back-end, it is reasonably easy to set up, so I would certainly keep it as the default. If someone knows how to set up MPI, then they also know how to set the optional argument; the reverse is not true ;).
> 
> Cheers,
> Simon
> 
> 
> 
> >> 
> >> 
> >> As your mail address reveals that you are coming from a serious place, 
> > 
> > :-)
> > 
> >> you should look into MPI.
> > 
> > we have always used it; that's why it has been the (internal) default so far. 
> > 
> > Cheers,
> > 
> > Marius
> > 
> > 
> >> Or just use N-core machines, where N is as big as your grant allows.
> >> 
> >> Dirk
> > 
> > -- 
> > ETH Zurich
> > Dr. Marius Hofert
> > RiskLab, Department of Mathematics
> > HG E 65.2
> > Rämistrasse 101
> > 8092 Zurich
> > Switzerland
> > 
> > Phone +41 44 632 2423
> > http://www.math.ethz.ch/~hofertj
> > GPG key fingerprint 8EF4 5842 0EA2 5E1D 3D7F  0E34 AD4C 566E 655F 3F7C


