[Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re: strange error in Jenkins build for singleCellWorkflow

Martin Morgan martin.morgan at roswellpark.org
Tue Sep 26 10:05:22 CEST 2017


On 09/26/2017 03:04 AM, Aaron Lun wrote:
> Hi Herve,
> 
> 
> I tried out the .BBSoptions approach, but it seems that the build system 
> is still having some trouble:
> 
> 
> http://docbuilder.bioconductor.org:8080/job/simpleSingleCell/label=master/59/console
> 
> 
> I bumped up the maximum number of DLLs to 200 in .BBSoptions, but to no 
> effect. Any ideas?

This was bad advice on my part; as Herve mentions, the workflow builders do
not respect .BBSoptions. We will adjust the maximum number of DLLs on our
end. Please be patient.

Martin

> 
> 
> -Aaron
> 
> ------------------------------------------------------------------------
> *From:* Hervé Pagès <hpages at fredhutch.org>
> *Sent:* Thursday, 21 September 2017 3:06:18 PM
> *To:* Aaron Lun; Martin Morgan; bioc-devel at r-project.org
> *Subject:* Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re:
> strange error in Jenkins build for singleCellWorkflow
> Hi,
> 
> @Martin: It's good news that the workflows have been standardized as
> packages but aren't we still using the traditional workflow builder?
> AFAIK .BBSoptions files are only honoured on the main build system
> (a.k.a. BBS).
> 
> @Aaron: If we decide to use BBS (our main build system) to build the
> workflows, then you'll be able to control R_MAX_NUM_DLLS by adding
> the following lines to your .BBSoptions file:
> 
> RbuildPrepend: R_MAX_NUM_DLLS=150
> RbuildPrepend.win: set R_MAX_NUM_DLLS=150&&
> RcheckPrepend: R_MAX_NUM_DLLS=150
> RcheckPrepend.win: set R_MAX_NUM_DLLS=150&&
> 
> You might not need all of them but it doesn't hurt to have them
> all. Note that you should not try to put a space before && in the
> RbuildPrepend.win or RcheckPrepend.win value.
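
A minimal shell sketch of what a prepend value such as R_MAX_NUM_DLLS=150 does. It assumes (as the option names suggest, but the thread does not spell out) that the build system places the value directly in front of the build command. The 'sh -c ...' call is only a stand-in for that command. On Windows, cmd.exe keeps a space typed before && as part of the value being set, which would explain the warning about spaces above.

```shell
# Make sure the variable is not already set in this shell.
unset R_MAX_NUM_DLLS

# Prefix form: the assignment applies only to the command that follows it.
# 'sh -c ...' stands in for the real build command here.
inside=$(R_MAX_NUM_DLLS=150 sh -c 'printf "%s" "$R_MAX_NUM_DLLS"')

# Back in the calling shell the variable is still unset.
after=${R_MAX_NUM_DLLS:-unset}

echo "inside the command: $inside"
echo "after the command:  $after"
```

The point of the prefix form is that the raised limit stays scoped to the one build or check command and does not leak into the rest of the session.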
> 
> H.
> 
> On 09/19/2017 05:51 PM, Aaron Lun wrote:
>> Thanks Martin. I think I will stick to one workflow for now, until the
>> BioC-workflows page provides some formal support for multiple workflows
>> representing different components of the same workflow (i.e., other than
>> me manually writing in the abstract that "This workflow is based on the
>> concepts introduced in the previous workflow X").
>>
>>
>> @Herve can you help me out with the .BBSoptions configuration for
>> R_MAX_NUM_DLLS? I guess we should also indicate to the user that this
>> needs to be increased in order for the workflow to run.
>>
>>
>> -Aaron
>>
>>
>>
>> ------------------------------------------------------------------------
>> *From:* Bioc-devel <bioc-devel-bounces at r-project.org> on behalf of
>> Martin Morgan <martin.morgan at roswellpark.org>
>> *Sent:* Wednesday, 20 September 2017 2:16 AM
>> *To:* Wolfgang Huber; bioc-devel at r-project.org
>> *Subject:* Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re:
>> strange error in Jenkins build for singleCellWorkflow
>> On 09/19/2017 09:50 AM, Wolfgang Huber wrote:
>>>
>>> My 3 cents:
>>> - I think this is a more and more common problem that I'm also
>>> encountering in everyday work and that asks for a general solution.
>>> - I agree with Martin that setting R_MAX_NUM_DLLS is better than
>>> unloading. AFAIK it is not even possible to cleanly unload every package
>>> ('as if it had never been loaded') due to irreversible global effects;
>>> although I'd be happy to be educated otherwise.
>>> - R_MAX_NUM_DLLS is not a sustainable solution either: the current
>>> default is 100, but e.g. on my MacOS 10.12 any value >152 leads to an
>>> error. Upping to the maximum 152 will give us some temporary respite but
>>> seems not really future-proof.
>>
>> This was the R-core motivation for increasing the max to only 100, but
>> it's still surprising to me that a modern OS has such a tight limit.
>> I'll see if there are ideas in R-core.
>>
>> From our internal discussions there is some willingness to (continue to)
>> support large and complicated workflows, but it is valuable to think
>> carefully about the consequences for users following along. Maybe part
>> of this is clearly alerting the user to the fact that 500G of data are
>> going to be downloaded, the workflow requires advanced configuration of
>> R, etc.
>>
>> @Aaron -- if you'd like to continue with one work flow, contact Herve
>> (cc'd) and he'll provide the .BBSoptions configuration to allow the
>> build system to use an appropriate R_MAX_NUM_DLLS. If instead you'd like
>> to produce two workflows, then the best strategy in your case would be
>> to simply have two independent packages (DESCRIPTION + vignettes/) each
>> with more modest numbers of DLLs; contact Lori (cc'd) when you've
>> decided on a second name, and we'll create the svn location for you.
>>
>> Martin
>>
>>>
>>>      Wolfgang
>>>
>>> 19.9.17 12:02, Martin Morgan scripsit:
>>>> On 09/18/2017 10:42 PM, Shian Su wrote:
>>>>> Hi Aaron,
>>>>>
>>>>> Would you mind sharing the code for flushing DLLs? This is a problem
>>>>> that others working with single cells and I have faced.
>>>>>
>>>>
>>>> For the user encountering this problem I think a better solution is to
>>>> increase the number of DLLs allowed by R, for instance editing
>>>> .Renviron to contain the line
>>>>
>>>> R_MAX_NUM_DLLS=120
>>>>
>>>> or similar. This can be on an installation-wide, per-user, or
>>>> project-specific basis, as described in ?Startup
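
One way to apply the .Renviron suggestion above without piling up duplicate lines, sketched in shell. The helper name set_renviron_var and the scratch directory are made up for illustration. The value 120 is the one from the message, and R only reads .Renviron at startup, so a running session has to be restarted to pick it up.

```shell
# Hypothetical helper: set KEY=VALUE in an .Renviron-style file,
# replacing any existing line for KEY instead of appending a duplicate.
set_renviron_var() {
    file=$1 key=$2 value=$3
    touch "$file"
    # Keep every line that does not already assign KEY, then add the new one.
    grep -v "^${key}=" "$file" > "${file}.tmp" || true
    printf '%s=%s\n' "$key" "$value" >> "${file}.tmp"
    mv "${file}.tmp" "$file"
}

# Example: a project-specific .Renviron in a scratch directory
# (use "$HOME/.Renviron" instead for a per-user setting).
demo_dir=$(mktemp -d)
set_renviron_var "$demo_dir/.Renviron" R_MAX_NUM_DLLS 100
set_renviron_var "$demo_dir/.Renviron" R_MAX_NUM_DLLS 120
cat "$demo_dir/.Renviron"
```

Calling the helper twice leaves a single R_MAX_NUM_DLLS line holding the most recent value.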
>>>>
>>>> @Aaron -- we are still discussing things internally; for instance it
>>>> is possible to set the maximum number of DLLs in the build system.
>>>>
>>>> Martin
>>>>
>>>>> Better yet, would anyone know of code that would allow unused DLLs to
>>>>> be identified and unloaded? I suspect not as it would require keeping
>>>>> track of the dependency tree of your current environment but I’m
>>>>> hopeful.
>>>>>
>>>>> Kind regards,
>>>>> Shian Su
>>>>>
>>>>>> On 19 Sep 2017, at 12:30 pm, Aaron Lun <alun at wehi.edu.au> wrote:
>>>>>>
>>>>>> Well, inertia won out in the end, and so I've just moved a whole
>>>>>> stack of packages into "Suggests" for now. This is probably not a
>>>>>> sustainable solution as the workflow can potentially get larger over
>>>>>> time; I would prefer to have some formal support for splitting up
>>>>>> the workflow into modules that can be independently installed.
>>>>>>
>>>>>> -Aaron
>>>>>> ________________________________
>>>>>> From: Vincent Carey <stvjc at channing.harvard.edu>
>>>>>> Sent: Saturday, 16 September 2017 10:08:13 PM
>>>>>> To: Aaron Lun
>>>>>> Cc: Martin Morgan; bioc-devel at r-project.org
>>>>>> Subject: Re: [Bioc-devel] [Untrusted Server]Re: strange error in
>>>>>> Jenkins build for singleCellWorkflow
>>>>>>
>>>>>> IMHO the pedagogic value of a unified document that treats a topic
>>>>>> thoroughly
>>>>>> is quite high.  Building the whole workflow on an arbitrary user's
>>>>>> system seems to
>>>>>> me to be a lower priority.  Thus using the environment variable in
>>>>>> the build system
>>>>>> to avoid this limit seems an appropriate solution.
>>>>>>
>>>>>> On Sat, Sep 16, 2017 at 7:43 AM, Aaron Lun
>>>>>> <alun at wehi.edu.au> wrote:
>>>>>> Thanks Martin. Yes, it's quite unfortunate that scater drags in
>>>>>> dplyr and ggplot2, which - combined with Bioconductor's core
>>>>>> packages - already puts us pretty close to the limit without doing
>>>>>> anything else!
>>>>>>
>>>>>>
>>>>>> A solution might be to split my workflow into self-contained
>>>>>> components, each of which can become its own workflow package (e.g.,
>>>>>> simpleSingleCell1, simpleSingleCell2, simpleSingleCell3 and so on).
>>>>>> This should avoid all of the problems and our associated hacks.
>>>>>>
>>>>>>
>>>>>> I'm happy to do this, but is it possible for the website to indicate
>>>>>> that there is a connection between the component workflows? For
>>>>>> example, the link that ordinarily goes to the compiled workflow
>>>>>> could instead go to an indexing page, which contains links to
>>>>>> individual component workflows.
>>>>>>
>>>>>>
>>>>>> -Aaron
>>>>>>
>>>>>>
>>>>>> ________________________________
>>>>>> From: Martin Morgan <martin.morgan at roswellpark.org>
>>>>>> Sent: Saturday, 16 September 2017 8:18:09 PM
>>>>>> To: Aaron Lun; bioc-devel at r-project.org
>>>>>> Subject: Re: [Bioc-devel] [Untrusted Server]Re: strange error in
>>>>>> Jenkins build for singleCellWorkflow
>>>>>>
>>>>>> On 09/16/2017 01:53 AM, Aaron Lun wrote:
>>>>>>> Bumping this rather old thread. To re-iterate, I'm updating my
>>>>>>> simpleSingleCell workflow and I'm running into R's DLL limit. I've
>>>>>>> added a code block halfway through the workflow that unloads all
>>>>>>> DLLs and cleans them out, and this works fine during compilation on
>>>>>>> my local machine.
>>>>>>>
>>>>>>>
>>>>>>> However, it seems that the BioC workflow builder uses a
>>>>>>> pre-processing step whereby it first tries to load all packages
>>>>>>> contained within library() calls. This hits the DLL limit as it
>>>>>>> doesn't execute the protective code block, which defeats the
>>>>>>> purpose of all my fiddling in the first place.
>>>>>>>
>>>>>>>
>>>>>>> What options are there? I'm happy to split my workflow into
>>>>>>> multiple smaller Rmarkdown files that get compiled separately,
>>>>>>> provided there is appropriate support for this setup from the build
>>>>>>> system
>>>>>>
>>>>>> The workflows have been standardized as packages. The packages put the
>>>>>> workflow dependencies in the 'Depends:' field, with the idea being that
>>>>>> the user installing the workflow package 'in the usual way' will get
>>>>>> the
>>>>>> packages used in the vignette installed in their system 'in the usual
>>>>>> way' without having to execute special variants of biocLite() /
>>>>>> install.packages() / funky code in the vignette itself to be able to
>>>>>> build the vignette.
>>>>>>
>>>>>> Loading a package loads its Depends: (and Imports:) so triggers the
>>>>>> problem.
>>>>>>
>>>>>> Writing separate vignettes would not help with this (but might make the
>>>>>> workflow more palatable; I'm not 100% sure of support for separate work
>>>>>> flows in a single package, there is no problem with having multiple
>>>>>> workflow packages on the same general topic).
>>>>>>
>>>>>> One could move (some?) packages to Suggests: and use your trick of
>>>>>> unloading packages part-way through the vignette. But then users will
>>>>>> find that they need to install packages to complete the vignette.
>>>>>>
>>>>>> 'We' could add support for a BBS option that increases
>>>>>> R_MAX_NUM_DLLS, which would allow the workflow to build on the
>>>>>> build system, but not on the users' systems.
>>>>>>
>>>>>> I think also the R-core approach to this
>>>>>> (https://stat.ethz.ch/pipermail/r-devel/2016-December/073529.html,
>>>>>> https://github.com/wch/r-source/commit/757bfa1d7ff373a604d6d34617f9cad78e0c875e)
>>>>>>
>>>>>> is a little insightful, where one could imagine increasing the default
>>>>>> R_MAX_NUM_DLLS, but apparently on some OS these compete for number of
>>>>>> open files, and this in turn can be quite low.
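
For anyone wanting to see the open-files limit mentioned above on their own machine, a short shell sketch. How R's DLL count relates to this limit on a given OS is as described in the linked R-devel discussion and is not tested here.

```shell
# Soft limit on open file descriptors for the current shell and its children.
open_files_limit=$(ulimit -n)
echo "open files limit: $open_files_limit"
```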
>>>>>>
>>>>>> I note that users have already struggled with the DLL problem 'in the
>>>>>> wild' https://stackoverflow.com/a/45552926/547331. This seems
>>>>>> particularly problematic for workflows, which are appealing to
>>>>>> relatively novice users.
>>>>>>
>>>>>> At the end of the day I think the workflows should make realistic
>>>>>> use of
>>>>>> R resources. I think this means modifying the workflow to use fewer
>>>>>> DLLs. (this general comment is relevant to other workflows, which for
>>>>>> instance start by downloading very large data sets -- I know that less
>>>>>> constrained use of computing resources is supposed to be a selling
>>>>>> point
>>>>>> of the workflows, but in excess this seems counter-productive to their
>>>>>> primary use as pedagogic tools [rather than, for instance,
>>>>>> comprehensive
>>>>>> exemplars of reproducible research]).
>>>>>>
>>>>>> Maybe there is additional discussion about some of the technical
>>>>>> aspects
>>>>>> of workflows that others might contribute.
>>>>>>
>>>>>> Martin
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>>
>>>>>>> Aaron
>>>>>>>
>>>>>>> ________________________________
>>>>>>> From: Bioc-devel <bioc-devel-bounces at r-project.org>
>>>>>>> on behalf of Aaron Lun <alun at wehi.edu.au>
>>>>>>> Sent: Wednesday, 21 June 2017 12:09:13 AM
>>>>>>> To: bioc-devel at r-project.org
>>>>>>> Subject: [Untrusted Server]Re: [Bioc-devel] strange error in
>>>>>>> Jenkins build for singleCellWorkflow
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>>
>>>>>>> I'm getting a curious error in the Jenkins log when I try to build
>>>>>>> the singleCellWorkflow:
>>>>>>>
>>>>>>>
>>>>>>> http://docbuilder.bioconductor.org:8080/job/simpleSingleCell/48/label=master/console
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The key part is at the bottom:
>>>>>>>
>>>>>>>
>>>>>>> Error: package or namespace load failed for 'GenomicFeatures' in
>>>>>>> dyn.load(file, DLLpath = DLLpath, ...):
>>>>>>>   unable to load shared object
>>>>>>> '/var/lib/jenkins/R/x86_64-pc-linux-gnu-library/3.4/Rsamtools/libs/Rsamtools.so':
>>>>>>>
>>>>>>>    `maximal number of DLLs reached...
>>>>>>>
>>>>>>>
>>>>>>> The workflow had previously been running fine on the build system;
>>>>>>> I'm not quite sure what's going on here, given that it's not even
>>>>>>> failing at the point where I made the latest changes.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Aaron
>>>>>>>
>>>>>>>          [[alternative HTML version deleted]]
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Bioc-devel at r-project.org mailing list
>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>>>>
>>>>>>
>>>>>>
>>>>>> This email message may contain legally privileged and/or
>>>>>> confidential information.  If you are not the intended recipient(s),
>>>>>> or the employee or agent responsible for the delivery of this
>>>>>> message to the intended recipient(s), you are hereby notified that
>>>>>> any disclosure, copying, distribution, or use of this email message
>>>>>> is prohibited. If you have received this message in error, please
>>>>>> notify the sender immediately by e-mail and delete this email
>>>>>> message from your computer. Thank you.
>>>>>>
> 
> -- 
> Hervé Pagès
> 
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
> 
> E-mail: hpages at fredhutch.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319

