[R-sig-hpc] Using parallel package on Windows
Ross Boylan
ross at biostat.ucsf.edu
Sat Jun 8 03:17:39 CEST 2013
makePSOCKcluster from the parallel package is supposed to set up a
cluster, even on Windows. Can anyone tell me exactly how to make that
work when multiple networked machines are involved?
It would be possible, though not ideal, to start R on the remote machine
manually if there is a way to hook it up to a cluster with my local machine.
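
For what it's worth, the manual route might look roughly like the sketch below. It relies on the manual = TRUE option that makePSOCKcluster accepts, which makes the master print the Rscript command for each worker and wait for it to connect instead of launching it through ssh. The host names and the port are placeholders I made up, and I have not verified this across two Windows machines.

```r
library(parallel)

# Hypothetical names: replace with the real machine names or IP addresses.
remote_host <- "remote-pc"   # assumption: the Windows worker machine
master_host <- "my-pc"       # assumption: name by which the worker can reach this machine

start_manual_cluster <- function(worker = remote_host,
                                 master = master_host,
                                 port = 11000) {  # assumption: any free, unblocked port
  # manual = TRUE: print the command to run on the worker and wait for it
  # to connect back, rather than starting it remotely via ssh.
  makePSOCKcluster(worker, master = master, port = port, manual = TRUE)
}

# Not run here, because the call blocks until the printed command has been
# executed by hand on the worker machine:
# cl <- start_manual_cluster()
# clusterCall(cl, function() Sys.info()[["nodename"]])
# stopCluster(cl)
```

The printed command would then be pasted into a command prompt on the remote machine, which presumably also requires that the chosen port is open in the firewall on the master.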
The main problem I see is getting R to launch on the remote machine and
execute an appropriate script. I assume the script has been written,
but I don't know where it is. Maybe parallel invokes Rscript
appropriately without additional intervention? Maybe R could run as a
Windows service? The documentation says the package uses ssh by
default, but there is no such command at the Windows command prompt (the
docs allude to PuTTY on Windows). I do have Cygwin installed, but have
never needed to set up an ssh server on Windows.
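
If the PuTTY route is what the docs have in mind, I imagine the call would look something like the following. It uses the rshcmd, user, rscript and homogeneous options of makePSOCKcluster; the host name, user name and Rscript path are all placeholders, and the sketch assumes the remote machine already runs an ssh server that plink can log in to without prompting, which is exactly the part I have not set up.

```r
library(parallel)

# Every default below is a placeholder, not a setting I have verified.
start_plink_cluster <- function(worker  = "remote-pc",    # hypothetical host name
                                user    = "remoteuser",   # hypothetical login on the worker
                                rscript = "Rscript") {    # or a full path to Rscript.exe on the worker
  makePSOCKcluster(worker,
                   rshcmd = "plink",      # PuTTY's command-line client in place of ssh
                   user = user,
                   rscript = rscript,
                   homogeneous = FALSE)   # do not assume the worker mirrors this machine
}

# Not run: needs an ssh server on the worker and non-interactive authentication.
# cl <- start_plink_cluster()
# stopCluster(cl)
```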
The documentation shipped with parallel
(file:///C:/Users/rdboylan/Documents/R/R-2.15.3/library/parallel/doc/parallel.pdf)
alludes to using system("Rscript") to kick things off, but that seems
more a hint than a description of what to do.
My searches have found similar questions asked multiple times, but no
specific answers. I've seen several assertions that makePSOCKcluster
will work on Windows, many discussions assuming all jobs will be on one
machine, advice to use various other packages (much of that from before
the existence of parallel), and the suggestion to use Amazon's cluster
instead.
I've used Rmpi on Linux clusters a fair amount, but Windows and the
current parallel package are relatively unfamiliar to me. I am also
looking to minimize the need to perform sysadmin-level actions; for
starters I'd like to avoid installing MPI if that's possible, or even
installing Cygwin on the remote system.
Thanks.
Ross Boylan