[BioC] multicore Vignette or HowTo??

Edwin Groot edwin.groot at biologie.uni-freiburg.de
Wed Oct 20 16:24:25 CEST 2010


On Mon, 18 Oct 2010 09:34:31 -0700, Tim Triche <tim.triche at gmail.com> wrote:

> This may or may not help, but for truly independent calculations (e.g.
> reading and normalizing a pile of arrays) I find that writing a function
> to do the task end-to-end and then handing it off to
> mclapply(list.of.keys, function) typically results in a near-linear
> speedup.  However, multicore is really not the most elegant way to do
> that sort of thing.  If you look at what Benilton Carvalho has done in
> the 'oligo' package, you will see a far more memory-efficient approach
> using the 'ff' and 'bit' packages to share a (supercached) flat-file
> image to successively stride through the chunks of data.
>
> Anyways, I'm dumb so I just use mclapply() and keep my memory image
> small, run gc() a lot, and mull over using 'oligo'.

Hello Tim

Well, I am dumber. How do I set up my data so that your suggestion of
mclapply(list.of.keys, function) would work under the multicore
package?
My inkling is that with 20 scanner files and 4 CPU cores, it would have
something to do with a list of 4 vectors of 5 elements each. What would
such code look like?
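Here is my rough guess at the shape of the call, with made-up file names
and a trivial placeholder standing in for the real read-and-normalize
step:

library(multicore)

## Hypothetical list of 20 scanner files (names made up for illustration)
cel.files <- sprintf("scan_%02d.CEL", 1:20)

## Placeholder for the real end-to-end task on one file; kept trivial
## here so the sketch is self-contained
process.one <- function(f) {
    Sys.sleep(1)                      # stand-in for reading + normalizing
    data.frame(file = f, done = TRUE)
}

## Hand the whole list to mclapply; mc.cores = 4 should spread the 20
## jobs over the 4 cores, if I understand the documentation correctly
res <- mclapply(cel.files, process.one, mc.cores = 4)

Is that roughly the right idea, or do I still need to chunk the files
into groups myself?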
Thanks for the gc() tip.

Edwin

> 
> 
> On Mon, Oct 18, 2010 at 9:05 AM, Edwin Groot
> <edwin.groot at biologie.uni-freiburg.de> wrote:
>
> > Hello all,
> > I have difficulty getting the multicore package to do what it
> > promises. Does anybody have a benchmark that demonstrates something
> > intensive with and without multicore assistance?
> > I have a dual dual-core Xeon, and $ top tells me all R can squeeze
> > from my Linux system is 25% us. Here is my example:
> >
<snip>
-- 
Dr. Edwin Groot, postdoctoral associate
AG Laux
Institut fuer Biologie III
Schaenzlestr. 1
79104 Freiburg, Deutschland
+49 761-2032945


