[R-sig-hpc] Parallel File System support in R (e.g. GPFS)

George Ostrouchov ostrouchovg at ornl.gov
Fri Feb 17 20:20:28 CET 2012


Jonathan,

Also, if you are familiar with NetCDF files, we would be glad to have 
you test-drive our pncdf package when it is ready. Let me know.

If you are interested in running on some NSF-funded resources and 
getting help with your parallel R code, consider 
http://rdav.nics.tennessee.edu/. Resource applications are handled 
through XSEDE (https://portal.xsede.org/). I am interested in getting 
more R users on the RDAV system. It is a 1024-core shared-memory system 
that is instrumented to use most of R's parallel capabilities.

George

On 2/17/12 10:20 AM, Jonathan Greenberg wrote:
> R-sig-hpc'ers:
>
> I've started running R on a large cluster at my university, which uses the
> IBM GPFS parallel file system.  I'm wondering if there is any support
> within R for parallel writes to a single file or if there are any
> suggestions on how to implement, say, writing to a large binary file
> representing an image.  The parallelization I'm thinking of is:
>
> given an image of x columns by y rows represented by a flat binary
> file, process chunks of this image on different CPUs/nodes, then write the
> results to a single file.  The alternative is to write each chunk out
> separately then "mosaic" them back together, but this would involve
> reading/writing the data twice, and this process is going to be an
> I/O-intensive one.  Thoughts?
>
> --j
>
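
A minimal sketch of the single-file approach described in the quoted
question, written in Python purely for illustration. The output file is
preallocated at its final size, and each chunk of rows is then written at
its own byte offset, so no "mosaic" pass is needed. The image size, the
4-byte integer pixel type, and the names preallocate, write_chunk, process
and read_image are all hypothetical. The loop runs sequentially here; in
practice each write_chunk call would come from a separate worker. Whether
concurrent writes to disjoint byte ranges are safe and fast on a given
parallel file system is an assumption this sketch does not verify.

```python
import os
import struct
import tempfile

NCOLS, NROWS = 8, 6            # hypothetical image size (columns x rows)
ITEM = struct.calcsize("<i")   # 4-byte little-endian integers (assumed pixel type)


def preallocate(path):
    """Create the output file at its final size so each chunk has a fixed byte range."""
    with open(path, "wb") as f:
        f.truncate(NCOLS * NROWS * ITEM)


def write_chunk(path, first_row, rows):
    """Write a block of processed rows at the byte offset of first_row (0-based)."""
    flat = [v for row in rows for v in row]
    with open(path, "r+b") as f:           # open for update without truncating
        f.seek(first_row * NCOLS * ITEM)   # row-major layout: rows are contiguous
        f.write(struct.pack(f"<{len(flat)}i", *flat))


def process(first_row, n):
    """Stand-in for the real per-chunk computation: every pixel gets its row index."""
    return [[r] * NCOLS for r in range(first_row, first_row + n)]


def read_image(path):
    """Read the whole file back as a list of rows, for checking the result."""
    with open(path, "rb") as f:
        flat = struct.unpack(f"<{NCOLS * NROWS}i", f.read())
    return [list(flat[r * NCOLS:(r + 1) * NCOLS]) for r in range(NROWS)]


path = os.path.join(tempfile.mkdtemp(), "image.bin")
preallocate(path)
# Chunks are written out of order to mimic workers finishing at different times.
for first_row in (4, 0, 2):
    write_chunk(path, first_row, process(first_row, 2))
image = read_image(path)
```

Splitting by whole rows keeps each chunk contiguous in a row-major file, so
every worker needs only one seek and one write. The same pattern could
presumably be expressed in R with seek() and writeBin() on a connection
opened in "r+b" mode, though that is not tested here.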

-- 
George Ostrouchov, Ph.D.
Scientific Data Group
Computer Science and Mathematics Division
Oak Ridge National Laboratory

and

Remote Data Analysis and Visualization Center
National Institute for Computational Sciences
The University of Tennessee

http://www.csm.ornl.gov/~ost


