[Bioc-sig-seq] Parallel version of the Biostrings::read.DNAStringSet and write.XStringSet functions ?

Steve Lianoglou mailinglist.honeypot at gmail.com
Thu Mar 4 01:57:10 CET 2010


> In general no, these (and other R) functions are not parallelized. The usual
> strategy would be to write a script that operates on one file or other
> 'chunk' of data, and then use one of snow ('easiest'), multicore (best for
> multiple cores on a Linux computer), or Rmpi (computation distributed across
> clusters) to do a version of 'lapply' (e.g., mclapply, mpi.parLapply) that
> is distributed across cores / nodes.

I'd just add that I think the foreach package with its various backends
(e.g., doMC, which uses multicore as its "parallelization strategy") is
actually the easiest.

But I guess that's a matter of taste :-)
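
For what it's worth, here is a minimal sketch of the foreach + doMC pattern,
applied to the "one file per chunk" strategy quoted above. The worker
function process_one_file, the temporary files, and the choice of 2 cores
are all made up for illustration; in a real script the worker body would be
whatever reads and processes a single sequence file (e.g. a call to
read.DNAStringSet).

```r
library(foreach)
library(doMC)   # multicore backend for foreach (Unix-alikes only)

# Register the backend; 2 worker processes is an arbitrary choice.
registerDoMC(cores = 2)

# Hypothetical per-chunk worker: a stand-in for reading one sequence file
# and summarising it. Here it simply counts the lines in the file.
process_one_file <- function(path) {
  length(readLines(path))
}

# Create three small temporary files so the sketch runs on its own.
paths <- vapply(1:3, function(i) {
  f <- tempfile(fileext = ".txt")
  writeLines(rep("ACGT", i), f)
  f
}, character(1))

# %dopar% hands each iteration to the registered backend; .combine = c
# collects the per-file results into one vector, in input order.
line_counts <- foreach(p = paths, .combine = c) %dopar% process_one_file(p)
```

Replacing %dopar% with %do% runs the same loop sequentially, which is handy
for debugging the worker before going parallel.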

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


