[Bioc-devel] BiocParallel: flattening iteration

Ryan C. Thompson rct at thompsonclan.org
Thu Nov 14 20:22:41 CET 2013


Just a note: the foreach package has solved this by providing a 
"nesting" operator, which effectively converts multiple nested foreach 
loops into one big one: 
http://cran.r-project.org/web/packages/foreach/vignettes/nested.pdf
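
For illustration, here is a minimal sketch of that operator (%:%) applied to the sample X chromosome case from the original post. The sample and chromosome names are made up, and the loop body is only a placeholder for real work:

```r
library(foreach)

# Hypothetical inputs: two dimensions to cross, e.g. sample X chromosome.
samples <- c("s1", "s2")
chroms  <- c("chr1", "chr2", "chrX")

# %:% merges the two foreach loops into a single stream of (s, chr) tasks.
# %do% runs them sequentially; with a registered parallel backend,
# %dopar% would distribute the flattened tasks instead.
res <- foreach(s = samples, .combine = c) %:%
  foreach(chr = chroms, .combine = c) %do% {
    paste(s, chr, sep = ":")
  }

length(res)  # 6: one result per (sample, chromosome) pair
```

The .combine arguments decide how results are reassembled at each level, which is the "combination step" question raised below; c is just the simplest choice here.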

On Thu 14 Nov 2013 09:24:29 AM PST, Michael Lawrence wrote:
> I like the general idea of having iterators; was just checking out the
> itertools package after not having looked at it for a while. I could see
> having a BiocIterators package, and a bpiterate(iterator, FUN, ...,
> BPPARAM). My suggestion was simpler though. Right now, bpmapply runs a
> single job per iteration, which is fine; the flattening could just be made
> more convenient.
>
>
> On Thu, Nov 14, 2013 at 8:47 AM, Michel Lang <michellang at gmail.com> wrote:
>
>> We use a design iterator in BatchExperiments::makeDesign for a Cartesian
>> product. I found an old version of designIterator (cf.
>> <https://github.com/tudo-r/BatchExperiments/blob/master/R/designs.R>)
>> without the optional data.frame input, which is easier to read:
>> <https://gist.github.com/mllg/7469844>.
>>
>> AFAIR the speed was decent. Major advantage is the low memory footprint.
>> Could easily be ported to a reference class and should run on arbitrary
>> objects with small modifications.
>>
>> In parallel it might get tricky, though. You could sequentially request
>> [n] elements from the iterator, chunk them to match [desired number of
>> jobs], and then parallelize. Or write a function which converts a single
>> integer to the respective integer vector "state" in the gist referenced
>> above, which is in my opinion much more flexible.
>>
>>
>>
>> 2013/11/14 Michael Lawrence <lawrence.michael at gene.com>
>>
>>> Hi guys,
>>>
>>> We often need to iterate over the Cartesian product of two dimensions,
>>> like sample X chromosome. This is preferable to nested iteration, which is
>>> complicated. I've been using expand.grid and bpmapply for this, but it
>>> seems like this could be made easier. For example, bpmapply could gain a
>>> CARTESIAN argument with a list of arguments to multiply. But somehow you
>>> would want the results combined in a way that does not require generating
>>> the product manually. Maybe a list of lists? We need to think more about
>>> the combination step in general.
>>>
>>> Michael
>>>
>>
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>


