[R-sig-hpc] Using parallel package on Windows

Ross Boylan ross at biostat.ucsf.edu
Sat Jun 8 03:17:39 CEST 2013


makePSOCKcluster from the parallel package is supposed to set up a 
cluster, even on Windows.  Can anyone tell me exactly how to make that 
work when multiple networked machines are involved?

It would be possible, though not ideal, to start R on the remote machine 
manually if there is a way to hook it up to a cluster with my local machine.

The main problem I see is getting R to launch on the remote machine and 
execute an appropriate script.  I assume the script has been written, 
but I don't know where it is.  Maybe parallel invokes Rscript 
appropriately without additional intervention?  Maybe R could run as a 
Windows service?  The documentation says the package uses ssh by 
default, but there is no such command at the Windows command prompt (the 
docs allude to PuTTY on Windows).  I do have Cygwin installed, but have 
never needed to set up an ssh server on Windows.

The documentation that ships with parallel 
(file:///C:/Users/rdboylan/Documents/R/R-2.15.3/library/parallel/doc/parallel.pdf) 
alludes to using system("Rscript") to kick things off, but that seems 
more a hint than a description of what to do.
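
To make the question concrete, here is a minimal sketch of the kind of 
call involved.  It uses "localhost" twice so that it runs on a single 
machine; the host list is a placeholder, and the rshcmd and manual 
options mentioned in the comments are taken from ?makePSOCKcluster and 
are assumptions that have not been verified across Windows machines.

```r
library(parallel)

# Placeholder host list: "localhost" twice so the sketch runs on one machine.
# For the multi-machine case these would be the remote machine names.
hosts <- c("localhost", "localhost")

# Workers on localhost are launched directly via Rscript.  For remote hosts,
# ?makePSOCKcluster documents options that may be relevant (assumptions here):
#   rshcmd = "plink"  # launcher to use in place of the default "ssh"
#   manual = TRUE     # print the worker command so it can be started by hand
cl <- makePSOCKcluster(hosts)

# Ask each worker which machine it is running on.
nodes <- unlist(clusterCall(cl, function() Sys.info()[["nodename"]]))
print(nodes)

stopCluster(cl)
```

With real remote names in hosts, the open question is what has to be in 
place on each Windows machine for the launch step to succeed.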

My searches have found similar questions asked multiple times, but no 
specific answers.  I've seen several assertions that makePSOCKcluster 
will work on Windows, many discussions assuming all jobs will be on one 
machine, advice to use various other packages (much of that from before 
the existence of parallel), and the suggestion to use Amazon's cluster 
instead.

I've used Rmpi on Linux clusters a fair amount, but Windows and the 
current parallel package are relatively unfamiliar to me.  I am also 
looking to minimize the need to perform sysadmin-level actions; for 
starters I'd like to avoid installing MPI if that's possible, or even 
installing Cygwin on the remote system.

Thanks.
Ross Boylan


