[Bioc-devel] BiocParallel: BatchJobs backend (Was: Re: BiocParallel)

Fri Jun 7 02:48:29 CEST 2013

Great - this looks promising already.

What are your test systems, beyond standard SSH and multicore
clusters?  I'm on a Torque/PBS system.

I'm happy to test, give feedback etc.  I don't see an 'Issues' tab on
the GitHub page.  Michel, how do you prefer to get feedback?

/Henrik

On Thu, Jun 6, 2013 at 5:21 PM, Michael Lawrence
<lawrence.michael at gene.com> wrote:
> And here is the on-going development of the backend:
> https://github.com/mllg/BiocParallel/tree/batchjobs
>
> Not sure how well it's been tested.
>
> Kudos to Michel Lang for making so much progress so quickly.
>
> Michael
>
> On Thu, Jun 6, 2013 at 1:59 PM, Dan Tenenbaum <dtenenba at fhcrc.org> wrote:
>>
>> On Thu, Jun 6, 2013 at 1:56 PM, Henrik Bengtsson <hb at biostat.ucsf.edu>
>> wrote:
>> > Hi, I'd like to pick up the discussion on a BatchJobs backend for
>> > BiocParallel where it was left back in Dec 2012 (Bioc-devel thread
>> > 'BiocParallel'
>> > [https://stat.ethz.ch/pipermail/bioc-devel/2012-December/003918.html]).
>> >
>> > Florian, would you mind sharing your BatchJobs backend code?  Is it
>> > independent of BiocParallel and/or have you tried it with the most
>> > recent BiocParallel implementation
>> > [https://github.com/Bioconductor/BiocParallel/]?
>> >
>>
>> You should be aware that there is a Google Summer of Code project in
>> progress to address this.
>>
>> http://www.bioconductor.org/developers/gsoc2013/ (towards the bottom)
>>
>> Dan
>>
>>
>> > /Henrik
>> >
>> > On Tue, Dec 4, 2012 at 12:38 PM, Henrik Bengtsson <hb at biostat.ucsf.edu>
>> > wrote:
>> >> Thanks.
>> >>
>> >> On Tue, Dec 4, 2012 at 3:47 AM, Vincent Carey
>> >> <stvjc at channing.harvard.edu> wrote:
>> >>> I have been booked up so no chance to deploy but I do have access to
>> >>> SGE and LSF so will try and will report ASAP.
>> >>
>> >> ...and I'll try it out on PBS (... but I most likely won't have time
>> >> to do this until the end of the year).
>> >>
>> >> Henrik
>> >>
>> >>>
>> >>>
>> >>> On Tue, Dec 4, 2012 at 4:08 AM, Hahne, Florian
>> >>> <florian.hahne at novartis.com>
>> >>> wrote:
>> >>>>
>> >>>> Hi Henrik,
>> >>>> I have now come up with a relatively generic version of this
>> >>>> SGEcluster approach. It does indeed use BatchJobs under the hood and
>> >>>> should thus support all available cluster queues, assuming that the
>> >>>> necessary BatchJobs routines are available. I could only test this on
>> >>>> our SGE cluster, but Vince wanted to try other queuing systems. Not
>> >>>> sure how far he got. For now the code is wrapped in a little package
>> >>>> called Qcluster with some documentation. If you want, I can send you
>> >>>> a version in a separate mail. It would be good to test this on other
>> >>>> systems, and I am sure there remain some bugs that need to be ironed
>> >>>> out. In particular, the fault tolerance you mentioned needs to be
>> >>>> addressed properly. Currently the code may leave unwanted garbage if
>> >>>> things fail in the wrong places because all the communication is
>> >>>> file-based.
>> >>>> Martin, I'll send you my updated version in case you want to include
>> >>>> this in BiocParallel for others to contribute.
>> >>>> Florian
>> >>>> --
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> On 12/4/12 5:46 AM, "Henrik Bengtsson" <hb at biostat.ucsf.edu> wrote:
>> >>>>
>> >>>> >Picking up this thread for lack of other places (= where should
>> >>>> >BiocParallel be discussed?)
>> >>>> >
>> >>>> >I saw Martin's updates on BiocParallel - great.  Florian's SGE
>> >>>> >scheduler was also mentioned; is that one built on top of BatchJobs?
>> >>>> >If so I'd be interested in looking into that/generalizing that to
>> >>>> >work with any BatchJobs scheduler.
>> >>>> >
>> >>>> >I believe there is going to be a new release of BatchJobs rather
>> >>>> >soon, so it's probably worth waiting until that is available.
>> >>>> >
>> >>>> >The main use case I'm interested in is to launch batch jobs on a
>> >>>> >PBS/Torque cluster, and then use multicore processing on each
>> >>>> >compute node.  It would be nice to be able to do this using the
>> >>>> >BiocParallel model, but maybe it is too optimistic to get everything
>> >>>> >to work under the same model.  Also, as Vince hinted, fault tolerance
>> >>>> >etc. needs to be addressed, and needs to be addressed differently in
>> >>>> >the different setups.
>> >>>> >
>> >>>> >/Henrik
>> >>>> >
>> >>>> >On Tue, Nov 20, 2012 at 6:59 AM, Ramon Diaz-Uriarte
>> >>>> > <rdiaz02 at gmail.com>
>> >>>> >wrote:
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >> On Sat, 17 Nov 2012 13:05:29 -0800,"Ryan C. Thompson"
>> >>>> >><rct at thompsonclan.org> wrote:
>> >>>> >>
>> >>>> >>> On 11/17/2012 02:39 AM, Ramon Diaz-Uriarte wrote:
>> >>>> >>> > In addition to Steve's comment, is it really a good thing that
>> >>>> >>> > "all code stays the same."?  I mean, multiple machines vs.
>> >>>> >>> > multiple cores are, often, _very_ different things: for
>> >>>> >>> > instance, shared vs. distributed memory, communication overhead
>> >>>> >>> > differences, whether or not you can assume packages and objects
>> >>>> >>> > to be automagically present in the slaves/child process, etc.
>> >>>> >>> > So, given they are different situations, I think it sometimes
>> >>>> >>> > makes sense to want to write different code for each situation
>> >>>> >>> > (I often do); not to mention Steve's hybrid cases ;-).
>> >>>> >>> >
>> >>>> >>> >
>> >>>> >>> > Since BiocParallel seems to be a major undertaking, maybe it
>> >>>> >>> > would be appropriate to provide a flexible approach, instead of
>> >>>> >>> > hard wiring the foreach approach.
>> >>>> >>> Of course there are cases where the same code simply can't work
>> >>>> >>> for both multicore and multi-machine situations, but those
>> >>>> >>> generally don't fall into the category of things that can be done
>> >>>> >>> using lapply. Lapply and all of its parallelized buddies like
>> >>>> >>> mclapply, parLapply, and foreach are designed for data-parallel
>> >>>> >>> operations with no interdependence between results, and these
>> >>>> >>> kinds of operations generally parallelize as well across machines
>> >>>> >>> as across cores, unless your network is not fast enough (in which
>> >>>> >>> case you would choose not to use multi-machine parallelism). If
>> >>>> >>> you want a parallel algorithm for something like the disjoin
>> >>>> >>> method of GRanges, you might need to write some special purpose
>> >>>> >>> code, and that code might be very different for multicore vs
>> >>>> >>> multi-machine.
>> >>>> >>
>> >>>> >>> So yes, sometimes there is a fundamental reason that you have to
>> >>>> >>> change the code to make it run on multiple machines, and neither
>> >>>> >>> foreach nor any other parallelization framework will save you from
>> >>>> >>> having to rewrite your code. But often there is no fundamental
>> >>>> >>> reason that the code has to change, but you end up changing it
>> >>>> >>> anyway because of limitations in your parallelization framework.
>> >>>> >>> This is the case that foreach saves you from.
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >> Hummm... I guess you are right, and we are talking about "often"
>> >>>> >> or "most of the time", which is where all this would fit. Point
>> >>>> >> taken.
>> >>>> >>
>> >>>> >>
>> >>>> >> Best,
>> >>>> >>
>> >>>> >> R.
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >> --
>> >>>> >> Ramon Diaz-Uriarte
>> >>>> >> Department of Biochemistry, Lab B-25
>> >>>> >> Facultad de Medicina
>> >>>> >> Universidad Autónoma de Madrid
>> >>>> >> Arzobispo Morcillo, 4
>> >>>> >> 28029 Madrid
>> >>>> >> Spain
>> >>>> >>
>> >>>> >> Phone: +34-91-497-2412
>> >>>> >>
>> >>>> >> Email: rdiaz02 at gmail.com
>> >>>> >>        ramon.diaz at iib.uam.es
>> >>>> >>
>> >>>> >> http://ligarto.org/rdiaz
>> >>>> >>
>> >>>> >> _______________________________________________
>> >>>> >> Bioc-devel at r-project.org mailing list
>> >>>> >> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>> >>>>
>> >>>
>> >
>
>