[R-sig-hpc] Parallel File System support in R (e.g. GPFS)

bart bart at njn.nl
Fri Feb 17 16:33:34 CET 2012


Hi Jonathan

Have a look at the raster package; it has support for dealing with large 
rasters. Some of its functions have an implementation that uses snow 
clusters. For others you may need to put in some more effort to make them 
parallel, or to process small chunks in parallel. I'm not familiar with 
GPFS support.
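
As a rough illustration of the "process small chunks" idea applied to a flat binary file, here is a minimal base-R sketch. It assumes that each chunk covers a fixed byte range, so the output file can be pre-sized and every chunk written at its own offset. The dimensions, the file name and the write_chunk() helper are hypothetical, the "processing" is a stand-in, and none of this has been tested on GPFS.

```r
# Hypothetical dimensions and output file, chosen only for illustration.
ncols <- 8L
nrows <- 6L
chunk_rows <- 2L   # rows handled per chunk
bytes <- 8L        # size of one double on disk
out_file <- tempfile(fileext = ".bin")

# Pre-size the output so every chunk has a fixed byte range to write into.
con <- file(out_file, "wb")
writeBin(numeric(ncols * nrows), con, size = bytes)
close(con)

# Process one chunk and write it at its own offset in the shared file.
# The "processing" is a placeholder: each cell is set to its row number.
write_chunk <- function(first_row) {
  rows <- first_row:min(first_row + chunk_rows - 1L, nrows)
  values <- as.numeric(rep(rows, each = ncols))
  con <- file(out_file, "r+b")   # open for update without truncating
  on.exit(close(con))
  seek(con, where = (first_row - 1L) * ncols * bytes, rw = "write")
  writeBin(values, con, size = bytes)
  invisible(length(values))
}

# Chunks are written one after another here; the order does not matter
# because the byte ranges do not overlap.
starts <- seq(1L, nrows, by = chunk_rows)
invisible(lapply(starts, write_chunk))

# Read the whole file back as a row-major image to check the layout.
result <- matrix(readBin(out_file, "numeric", n = ncols * nrows, size = bytes),
                 nrow = nrows, byrow = TRUE)
```

The lapply() call is the part that could, in principle, be handed to a cluster (for example with parallel::parLapply(), after exporting the variables the helper uses), with each worker opening its own connection. Whether simultaneous writes from several nodes to one file are safe and fast depends on the file system and is an open assumption here.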

Bart

On 02/17/2012 04:20 PM, Jonathan Greenberg wrote:
> R-sig-hpc'ers:
>
> I've started running R on a large cluster at my university, which uses the
> IBM GPFS parallel file system.  I'm wondering if there is any support
> within R for parallel writes to a single file or if there are any
> suggestions on how to implement, say, writing to a large binary file
> representing an image.  The parallelization I'm thinking of is:
>
> given an image of x by y columns and rows represented by a flat binary
> file, process chunks of this image on different cpus/nodes, then write the
> results to a single file.  The alternative is to write each chunk out
> separately then "mosaic" them back together, but this would involve
> reading/writing the data twice, and this process is going to be an I/O
> intensive one.  Thoughts?
>
> --j
>


