[Bioc-devel] timeout on perceval server macos snow leopard

Giorgio Melloni melloni.giorgio at gmail.com
Tue Sep 8 09:38:08 CEST 2015


Hi Valerie,

Thanks for the useful advice. I applied the modifications you suggested, exposing BPPARAM and removing the "Parallel Options" section. The same change was applied to lfmSingleSequence, an internal method called by allPfamAnalysis, so that there are no nested parallel parts anymore. I also renamed the internal variable allPfamAnalysis so that it no longer has the same name as the function. Finally, I reduced all the parallelization in the vignette and examples to a maximum of 2 cores.

The only doubt I have is with this apply:

This is how I wrote it (lines 130-141):
singleSequence <- lapply(allPfamsLM , function(object) {
		if(Sys.info()[['sysname']] == 'Windows') library(LowMACA)
		if(!verbose)
			suppressMessages(lfmSingleSequence(object, metric='qvalue', threshold=.05
				, conservation=conservation , mail=NULL , perlCommand="perl" , BPPARAM=BPPARAM))
		else
			lfmSingleSequence(object, metric='qvalue', threshold=.05
				, conservation=conservation , mail=NULL , perlCommand="perl" , BPPARAM=BPPARAM
				, verbose=TRUE)
		})

In this case the parallelization is inside the lapply and is delegated to the lfmSingleSequence method, which inherits BPPARAM from the allPfamAnalysis function (e.g. BPPARAM=MulticoreParam(2)).
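
Reduced to a minimal sketch, this first layout looks roughly as follows. The names outerAnalysis and innerStep are hypothetical stand-ins for allPfamAnalysis and lfmSingleSequence, and the squaring is placeholder work; only bplapply(), SerialParam() and the BPPARAM argument come from BiocParallel.

```r
library(BiocParallel)

# Hypothetical stand-in for lfmSingleSequence: the only parallel layer.
innerStep <- function(x, BPPARAM = SerialParam()) {
    unlist(bplapply(x, function(v) v^2, BPPARAM = BPPARAM))
}

# Hypothetical stand-in for allPfamAnalysis: BPPARAM is exposed to the
# caller and handed down unchanged, while the outer loop stays serial.
outerAnalysis <- function(inputs, BPPARAM = SerialParam()) {
    lapply(inputs, innerStep, BPPARAM = BPPARAM)
}

# SerialParam() keeps the sketch portable; pass e.g. MulticoreParam(2)
# on Linux or Mac OS to use two cores in the inner layer.
res <- outerAnalysis(list(1:3, 4:6), BPPARAM = SerialParam())
```

With this layout there are never more workers active than the single BPPARAM object requests, because only one layer calls bplapply().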

I also tried this second version, with the parallelization on the outer layer and the inner call forced to run serially, but I dropped it because it was slower (in my limited experience it is generally faster to parallelize the outer layer, but apparently that is not the case here):

singleSequence <- bplapply(allPfamsLM , function(object) {
		if(Sys.info()[['sysname']] == 'Windows') library(LowMACA)
		if(!verbose)
			suppressMessages(lfmSingleSequence(object, metric='qvalue', threshold=.05
				, conservation=conservation , mail=NULL , perlCommand="perl" , BPPARAM=bpparam("SerialParam"))
		else
			lfmSingleSequence(object, metric='qvalue', threshold=.05
				, conservation=conservation , mail=NULL , perlCommand="perl" , BPPARAM=bpparam("SerialParam")
				, verbose=TRUE)
		} , BPPARAM=BPPARAM)
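
A speed comparison between the two layouts could be sketched as below. This is a toy harness under stated assumptions, not the LowMACA code: compareLayouts, work and inner are hypothetical names, and the Sys.sleep() call is a placeholder workload standing in for the real analysis.

```r
library(BiocParallel)

# Hypothetical timing harness; `param` is the back-end under test.
compareLayouts <- function(inputs, param) {
    # Placeholder workload (assumption): sleep briefly, then square.
    work <- function(v) { Sys.sleep(0.01); v^2 }
    inner <- function(x, BPPARAM) {
        unlist(BiocParallel::bplapply(x, work, BPPARAM = BPPARAM))
    }

    # Layout 1: serial outer loop, parallel inner layer.
    t1 <- system.time(
        r1 <- lapply(inputs, inner, BPPARAM = param)
    )

    # Layout 2: parallel outer loop, inner layer forced to serial.
    outerFun <- function(x) inner(x, BPPARAM = BiocParallel::SerialParam())
    t2 <- system.time(
        r2 <- bplapply(inputs, outerFun, BPPARAM = param)
    )

    # Both layouts must return the same values.
    stopifnot(identical(r1, r2))
    c(innerParallel = t1[["elapsed"]], outerParallel = t2[["elapsed"]])
}

# SerialParam() only checks that the harness runs; swap in the back-end
# of interest, e.g. MulticoreParam(2), to get a meaningful comparison.
timings <- compareLayouts(list(1:4, 5:8), SerialParam())
```

Which layout wins depends on how many elements each layer has and how long each one takes, so the result for the toy workload says nothing about the real data.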

Apart from speed, which is the better option?


Thanks again,

Giorgio

On Sep 4, 2015, at 6:52 PM, Obenchain, Valerie wrote:

> Hi,
> 
> I've taken a quick look at LowMACA. It looks like you're doing more work
> than you need to with 'applyfun'. BiocParallel automatically detects the
> OS and will use SnowParam, MulticoreParam or SerialParam as needed.
> 
> The more important problem is that you have several nested layers of
> parallel evaluation, each requesting maximum cores. That doesn't explain
> why the package fails only on perceval but it probably points to
> requesting more resources than are available. I can help you work
> through this offline (valerie.obenchain at roswellpark.org) or here on the
> list.
> 
> I would start by making these changes:
> 
> - replace 'applyfun' with bplapply()
> 
> - remove "Parallel Options" section in code files
> 
> - expose the BPPARAM arg to users so they can select the back-end and
> number of cores
> 
> - modify allPfamAnalysis.R
> 
> Inside the allPfamAnalysis() function you have another function with the
> same name and allPfamAnalysis is also used as a variable name. This
> makes the code difficult to read and makes it hard to determine the
> number of times the function is called.
> 
> It looks like inside allPfamAnalysis() you create another
> allPfamAnalysis() that calls lfmSingleSequence() (2 nested layers of
> bplapply with max cores). The same goes for singleSequence() which calls
> lfmSingleSequence() (2 nested layers again).
> 
> BiocParallel does not manage the number of cores used so you need to be
> careful of nesting layers too deep. The best approach is to expose
> BPPARAM to the user so they can manage resources. By 'expose' I mean
> allow the argument to be passed to the function and used in bplapply().
> 
> Let me know if you have questions.
> Valerie
> 
> 
> On 09/04/2015 03:22 AM, Giorgio Melloni wrote:
>> I recently enhanced my package LowMACA on Bioconductor and I finally obtained the OK on build and check from Linux and Windows. On Mac OS, the build gets stuck at some point while building the vignette, but I don't know how to find out which function is taking so long.
>> 
>> Here is the result of the build report. http://bioconductor.org/checkResults/devel/bioc-LATEST/LowMACA/perceval-buildsrc.html
>> 
>> Do you know how to figure it out?
>> 
>> thanks,
>> Giorgio