[Bioc-devel] build machines
Martin Morgan
martin.morgan at roswellpark.org
Fri Apr 27 19:50:08 CEST 2018
For what it's worth, BiocParallel, implemented as outlined in its
vignette, limits the number of cores via
    if (nzchar(Sys.getenv("BBS_HOME")))
        cores <- min(4L, cores)
i.e., checking an environment variable set on the build system. This is
highly fragile and I wouldn't necessarily recommend this outside the
BiocParallel context.
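
A minimal sketch of how a check like the one above might be wrapped in a
helper, with the same fragility caveat. The function name limit_cores is
hypothetical and not part of BiocParallel. The second test, on
_R_CHECK_LIMIT_CORES_, is an added assumption: that variable is described in
the R Internals manual as being set for CRAN-style checks, and capping at 2
workers when it is set is a convention chosen here, not something stated in
this thread.

```r
# Hypothetical helper (not part of BiocParallel): cap a requested worker
# count when a build or check system is detected via environment variables.
limit_cores <- function(cores) {
    cores <- as.integer(cores)

    # BBS_HOME is set on the Bioconductor build system (the check quoted above).
    if (nzchar(Sys.getenv("BBS_HOME")))
        cores <- min(4L, cores)

    # Assumption: treat any non-empty value other than "false"/"FALSE" of
    # _R_CHECK_LIMIT_CORES_ as a request to use at most 2 workers.
    chk <- Sys.getenv("_R_CHECK_LIMIT_CORES_")
    if (nzchar(chk) && !chk %in% c("false", "FALSE"))
        cores <- min(2L, cores)

    # Never return fewer than one worker.
    max(1L, cores)
}

# Usage: request 8 workers; the result is reduced only if one of the
# variables above is set in the current environment.
workers <- limit_cores(8L)
```

Both tests depend on variables that the build or check system happens to set,
so the helper silently does nothing wherever they are absent.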
Martin
On 04/27/2018 01:39 PM, Ludwig Geistlinger wrote:
> Hi Hervé,
>
>> Some packages are good citizens and limit the number of
>> cores to 1 or 2 only during 'R CMD check' but some packages
>> try to use all the cores that are available
>
> That seems to be an important note for developers using parallel computation.
> What's the best practice for doing this in my code, i.e. checking whether the code is currently running under R CMD check (and reducing the number of cores used accordingly)?
>
> Thanks,
> Ludwig
>
> --
> Dr. Ludwig Geistlinger
> CUNY School of Public Health
>
> ________________________________________
> From: Bioc-devel <bioc-devel-bounces at r-project.org> on behalf of Kasper Daniel Hansen <kasperdanielhansen at gmail.com>
> Sent: Friday, April 27, 2018 10:29 AM
> To: Hervé Pagès
> Cc: bioc-devel at r-project.org
> Subject: Re: [Bioc-devel] build machines
>
> Thanks.
>
> I used
> /usr/bin/time -v R CMD check ...
> to record the max memory usage of the check, which for minfi suggests
> around 5 GB. That's a lot.
>
> Best,
> Kasper
>
> On Thu, Apr 26, 2018 at 3:02 PM, Hervé Pagès <hpages at fredhutch.org> wrote:
>
>> Hi,
>>
>> The Linux and Windows builders have 32 GB of RAM, the Mac
>> builders 64 GB.
>>
>> We also run concurrent R CMD check's.
>>
>> Here is a summary:
>>
>>   platform           RAM    nb of    nb of concurrent
>>                      (GB)   cores    R CMD check's
>>   ---------------------------------------------------
>>   Linux (malbecs)     32      20          10
>>   Windows (tokays)    32      40          24
>>   Mac (meridas)       64      24          18
>>
>> That's a lot of concurrency. And there is actually more
>> concurrency than that if you consider the fact that many
>> packages run things in parallel during 'R CMD check'.
>> Some packages are good citizens and limit the number of
>> cores to 1 or 2 only during 'R CMD check' but some packages
>> try to use all the cores that are available. This will have
>> a strong impact on the overall progress of the builds. We
>> don't have an easy way to identify those packages right now.
>>
>> On average, based on our monitoring of the build machines,
>> things seem to work OK, i.e. the concurrent R CMD check's
>> don't seem to be competing too much for resources.
>>
>> But occasionally there could be too much competition. The
>> crazy big elapsed times compared to the relatively short user
>> and system times that you observed, Kasper, are likely to reflect
>> that. They could be a sign that the machine ran out of memory
>> and started swapping. The fact that it happens to your package
>> doesn't mean that your package uses too much memory. The swapping
>> is the result of the **cumulated** memory usage of all the
>> R CMD check's running at that moment. It could be worth checking
>> how much memory R CMD check'ing your package uses though.
>>
>> The exact set of packages that are being R CMD check'ed at any
>> given time is in constant fluctuation and will also vary from
>> one day to the other. This would explain why some days you see
>> timeouts on some platforms and some days not. We don't have
>> an easy way to know which packages were competing with yours
>> during the 40 min window that 'R CMD check' was running on your
>> package until the build system declared a timeout. It's possible
>> (by looking at the BBS logs) but is time consuming.
>>
>> We should probably add some memory at some point to the Windows
>> builders. 32 GB is not enough to smoothly run 24 R CMD check's
>> concurrently.
>>
>> H.
>>
>>
>> On 04/26/2018 08:48 AM, Diogo FT Veiga wrote:
>>
>>> Hi Daniel,
>>>
>>> I have the same issue with my package (new contribution). I just finished
>>> reviewing the package with the modifications requested.
>>>
>>> I am getting a warning because R CMD check is exceeding 5 min, but this is
>>> happening only on the Windows machine.
>>>
>>> On Linux and OSX the check finishes in <= 4 min, while on Windows it takes
>>> ~6 min.
>>>
>>> http://bioconductor.org/spb_reports/maser_buildreport_20180425114748.html
>>>
>>>
>>> Not sure how to proceed from here.
>>>
>>> Thanks,
>>> Diogo
>>>
>>>
>>> On Thu, Apr 26, 2018 at 9:52 AM, Kasper Daniel Hansen <
>>> kasperdanielhansen at gmail.com> wrote:
>>>
>>>> We have been working on the minfi package lately, with a move to a
>>>> DelayedArray backend.
>>>>
>>>> Right now there are some weird issues regarding timings in R CMD check.
>>>> Leaving aside the issue that the tests (now disabled) and examples are
>>>> too slow, we get some very weird behaviour.
>>>>
>>>> An example is the current (soon to be replaced) build report of minfi
>>>> 1.25.2, which prints
>>>>
>>>> Examples with CPU or elapsed time > 5s
>>>>                       user system  elapsed
>>>> preprocessFunnorm   99.388  0.632  148.554
>>>> combineArrays       64.104  2.120   68.329
>>>> bumphunter          62.540  1.392   64.107
>>>> preprocessNoob      43.944  0.016   44.955
>>>> preprocessQuantile  33.968  0.064   36.547
>>>> getAnnotation       31.072  0.024   31.126
>>>> compartments        18.668  0.188   18.871
>>>> minfiQC             10.124  6.628 1102.929
>>>> getSex              10.536  0.012   10.561
>>>> read.metharray       7.504  2.116   82.713
>>>> read.metharray.exp   9.076  0.032   10.592
>>>> mapToGenome-methods  4.648  0.548  163.648
>>>> mdsPlot              0.340  0.204   14.901
>>>>
>>>>
>>>> on Tokay (Linux). Note minfiQC, which has an elapsed time that is crazy
>>>> high compared to user+system. The previous build report (which I didn't
>>>> save) had a timeout on all platforms with seemingly similar behaviour,
>>>> but with the getSex function. The code did not change in the meantime.
>>>> For today's build we only see this on Linux, but yesterday all platforms
>>>> were affected.
>>>>
>>>> This is likely to be very hard to debug. But I am thinking memory
>>>> issues: this example requires loading an annotation package and a data
>>>> package, both of which are "big". How much RAM do the machines have, and
>>>> are multiple R CMD check's run concurrently?
>>>>
>>>> Best,
>>>> Kasper
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> _______________________________________________
>>>> Bioc-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>
>>>>
>> --
>> Hervé Pagès
>>
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M1-B514
>> P.O. Box 19024
>> Seattle, WA 98109-1024
>>
>> E-mail: hpages at fredhutch.org
>> Phone: (206) 667-5791
>> Fax: (206) 667-1319
>>
>