[R-sig-hpc] ff and parallel computing (multiple nodes, shared disk)
atp at piskorski.com
Sat Nov 14 11:02:57 CET 2009
On Thu, Nov 12, 2009 at 04:29:34PM -0200, Benilton Carvalho wrote:
> I wrote my own code to use NetCDF, which doesn't perform well when I
> need random access to the data.
What sort of I/O numbers do you actually see?
You're hitting a single shared disk server with random-access I/O
requests from multiple nodes? If so, isn't that probably the problem
right there? Random access is a disk speed killer. I wouldn't expect
playing with NetCDF vs. SQLite vs. ff vs. bigmemory to make much
difference. Things I'd expect to help in that case:
- Massively faster shared disk I/O (hardware upgrade).
- Moving I/O to the slave nodes.
- Perhaps running an RDBMS that knows how to better optimize incoming
  client I/O requests.
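
As a rough illustration of the second item (moving I/O to the slave nodes), here is a minimal Python sketch that copies a file once from the shared disk to node-local scratch space, so that later random reads hit the local disk instead of the shared server. The function name `stage_locally`, its `scratch_dir` parameter, and the size-only freshness check are all assumptions made for this example; nothing here comes from ff, NetCDF, or any cluster scheduler.

```python
import os
import shutil
import tempfile


def stage_locally(shared_path, scratch_dir=None):
    """Copy shared_path to node-local scratch once; return the local path.

    Hypothetical helper: the name, the scratch_dir parameter, and the
    size-only freshness check are assumptions for illustration.
    """
    scratch_dir = scratch_dir or tempfile.gettempdir()
    local_path = os.path.join(scratch_dir, os.path.basename(shared_path))

    up_to_date = (
        os.path.exists(local_path)
        and os.path.getsize(local_path) == os.path.getsize(shared_path)
    )
    if not up_to_date:
        # One large sequential copy from the shared disk, written to a
        # per-process temporary name and then renamed into place, so other
        # workers on the same node never see a half-written file.
        part_path = "{}.{}.part".format(local_path, os.getpid())
        shutil.copyfile(shared_path, part_path)
        os.replace(part_path, local_path)
    return local_path
```

Each worker would call `stage_locally()` before opening its data and then do all random access against the returned path. This only helps if the file fits on the node's local disk and is read often enough to repay the one-time copy.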
Or is your situation a bit different than the original poster's, and
your code is I/O limited even with just one node?
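
One rough way to get such numbers from a single node is to time the same number of fixed-size reads done sequentially and at random offsets. The Python sketch below does that; the 4096-byte block size, the read count, and the names `time_reads` and `compare` are assumptions chosen for the example, not taken from any of the packages mentioned. Note that the OS page cache will hide most of the difference unless the file is much larger than RAM or has not been read recently, so point it at a real data file on the shared disk rather than a small test file.

```python
import os
import random
import time

BLOCK = 4096  # bytes per read; an assumed size, adjust to your access pattern


def time_reads(path, offsets, block=BLOCK):
    """Read `block` bytes at each offset; return (seconds, bytes_read)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:  # unbuffered at the Python level
        for off in offsets:
            f.seek(off)
            total += len(f.read(block))
    return time.perf_counter() - start, total


def compare(path, n_reads=1000, block=BLOCK, seed=0):
    """Time n_reads sequential reads and n_reads random reads of one file."""
    n_blocks = os.path.getsize(path) // block
    n_reads = min(n_reads, n_blocks)
    sequential = [i * block for i in range(n_reads)]
    rng = random.Random(seed)
    scattered = [rng.randrange(n_blocks) * block for _ in range(n_reads)]

    results = {}
    for name, offsets in (("sequential", sequential), ("random", scattered)):
        seconds, nbytes = time_reads(path, offsets, block)
        results[name] = {"seconds": seconds, "bytes": nbytes}
    return results
```

Running `compare("/path/to/datafile")` on one node, and then on several nodes at once, gives a first indication of whether the slowdown comes from random access as such or from contention on the shared server. The path is a placeholder.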
Andrew Piskorski <atp at piskorski.com>