[Bioc-devel] build machines

Martin Morgan martin.morgan at roswellpark.org
Fri Apr 27 19:50:08 CEST 2018


For what it's worth, BiocParallel, as outlined in its 
vignette, limits the number of cores via

     if (nzchar(Sys.getenv("BBS_HOME")))
         cores <- min(4L, cores)

i.e., checking an environment variable set on the build system. This is 
highly fragile and I wouldn't necessarily recommend this outside the 
BiocParallel context.
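
For anyone who wants something along these lines in their own package, here 
is a minimal sketch, not what BiocParallel actually does. The helper name 
limit_cores() and its 'cap' argument are hypothetical. BBS_HOME is the 
variable from the snippet above; _R_CHECK_LIMIT_CORES_ is, as far as I know, 
the variable 'R CMD check --as-cran' sets to ask for at most 2 cores, and a 
value of "false" or "FALSE" is treated as "no limit requested" (an assumption 
worth checking against the R Internals manual).

```r
# Hypothetical helper: cap the number of workers when environment variables
# suggest the code is running on a build system or under a core-limited check.
limit_cores <- function(cores = parallel::detectCores(), cap = 2L) {
    # detectCores() can return NA on some platforms; fall back to one worker
    if (is.na(cores))
        cores <- 1L

    # Set on the Bioconductor build system (see the snippet above)
    on_build_system <- nzchar(Sys.getenv("BBS_HOME"))

    # Assumption: non-empty and not "false"/"FALSE" means a limit is requested
    chk <- Sys.getenv("_R_CHECK_LIMIT_CORES_")
    limit_requested <- nzchar(chk) && !(chk %in% c("false", "FALSE"))

    if (on_build_system || limit_requested)
        cores <- min(as.integer(cap), cores)

    # Never return fewer than one worker
    max(1L, as.integer(cores))
}
```

The result could then be passed on, e.g. as 
BiocParallel::MulticoreParam(workers = limit_cores()). The same caveat 
applies: relying on environment variables is fragile, so letting the user 
override the number of workers explicitly is probably the safer design.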

Martin

On 04/27/2018 01:39 PM, Ludwig Geistlinger wrote:
> Hi Hervé,
> 
>> Some packages are good citizens and limit the number of
>> cores to 1 or 2 only during 'R CMD check' but some packages
>> try to use all the cores that are available
> 
> That seems to be an important note for developers using parallel computation.
> What's best practice to realize this within my code, i.e. checking whether the code is currently subject to R CMD check (and accordingly reducing the number of cores used)?
> 
> Thanks,
> Ludwig
> 
> --
> Dr. Ludwig Geistlinger
> CUNY School of Public Health
> 
> ________________________________________
> From: Bioc-devel <bioc-devel-bounces at r-project.org> on behalf of Kasper Daniel Hansen <kasperdanielhansen at gmail.com>
> Sent: Friday, April 27, 2018 10:29 AM
> To: Hervé Pagès
> Cc: bioc-devel at r-project.org
> Subject: Re: [Bioc-devel] build machines
> 
> Thanks.
> 
> I used
>    /usr/bin/time -v R CMD check ...
> to record the max memory usage of the check, which for minfi suggests
> around 5 GB.  That's a lot.
> 
> Best,
> Kasper
> 
> On Thu, Apr 26, 2018 at 3:02 PM, Hervé Pagès <hpages at fredhutch.org> wrote:
> 
>> Hi,
>>
> >> The Linux and Windows builders have 32 GB of RAM, the Mac
> >> builders 64 GB.
>>
>> We also run concurrent R CMD check's.
>>
>> Here is a summary:
>>
> >>    platform           RAM   nb of     nb of concurrent
> >>                       (GB)  cores        R CMD check's
>>    ---------------------------------------------------
>>    Linux (malbecs)     32      20                   10
>>    Windows (tokays)    32      40                   24
>>    Mac (meridas)       64      24                   18
>>
>> That's a lot of concurrency. And there is actually more
>> concurrency than that if you consider the fact that many
>> packages run things in parallel during 'R CMD check'.
>> Some packages are good citizens and limit the number of
>> cores to 1 or 2 only during 'R CMD check' but some packages
>> try to use all the cores that are available. This will have
>> a strong impact on the overall progress of the builds. We
>> don't have an easy way to identify those packages right now.
>>
> >> On average, based on our monitoring of the build machines,
> >> things seem to work OK, i.e. the concurrent R CMD check's
> >> don't seem to be competing too much to access resources.
>>
>> But occasionally there could be too much competition. The
> >> crazy big elapsed time compared to the relatively short user
> >> and system times that you observed, Kasper, is likely to reflect
> >> that. It could be a sign that the machine ran out of memory
> >> and started swapping. That this happens to your package does not
> >> mean that your package uses too much memory. The swapping is
> >> the result of the **cumulative** memory usage of all the
>> R CMD check's running at that moment. It could be worth checking
>> how much memory R CMD check'ing your package uses though.
>>
>> The exact set of packages that are being R CMD check'ed at any
>> given time is in constant fluctuation and will also vary from
>> one day to the other. This would explain why some days you see
>> timeouts on some platforms and some days not. We don't have
>> an easy way to know which packages were competing with yours
>> during the 40 min window that 'R CMD check' was running on your
>> package until the build system declared a timeout. It's possible
>> (by looking at the BBS logs) but is time consuming.
>>
>> We should probably add some memory at some point to the Windows
> >> builders. 32 GB is not enough to smoothly run 24 R CMD check's
>> concurrently.
>>
>> H.
>>
>>
>> On 04/26/2018 08:48 AM, Diogo FT Veiga wrote:
>>
>>> Hi Daniel,
>>>
> >>> I have the same issue with my package (new contribution). I just finished
> >>> reviewing the package with the modifications requested.
> >>>
> >>> I am getting a warning because R CMD check exceeds 5 min, but this
> >>> happens only on the Windows machine.
> >>>
> >>> On Linux and OS X the check finishes in <= 4 min, while on Windows it
> >>> takes ~6 min.
>>>
> >>> http://bioconductor.org/spb_reports/maser_buildreport_20180425114748.html
>>>
>>>
>>> Not sure how to proceed from here.
>>>
>>> Thanks,
>>> Diogo
>>>
>>>
>>> On Thu, Apr 26, 2018 at 9:52 AM, Kasper Daniel Hansen <
>>> kasperdanielhansen at gmail.com> wrote:
>>>
> >>>> We have been working on the minfi package lately, with a move to a
>>>> DelayedArray backend.
>>>>
>>>> Right now there are some weird issues regarding timings in R CMD check.
>>>> Leaving aside the issue that the tests (now disabled) and examples are
>>>> too
>>>> slow, we get some very weird behaviour.
>>>>
> >>>> An example is the current (soon to be replaced) build report of minfi
> >>>> 1.25.2, which prints
>>>>
>>>> Examples with CPU or elapsed time > 5s
>>>>                         user system  elapsed
>>>> preprocessFunnorm   99.388  0.632  148.554
>>>> combineArrays       64.104  2.120   68.329
>>>> bumphunter          62.540  1.392   64.107
>>>> preprocessNoob      43.944  0.016   44.955
>>>> preprocessQuantile  33.968  0.064   36.547
>>>> getAnnotation       31.072  0.024   31.126
>>>> compartments        18.668  0.188   18.871
>>>> minfiQC             10.124  6.628 1102.929
>>>> getSex              10.536  0.012   10.561
>>>> read.metharray       7.504  2.116   82.713
>>>> read.metharray.exp   9.076  0.032   10.592
>>>> mapToGenome-methods  4.648  0.548  163.648
>>>> mdsPlot              0.340  0.204   14.901
>>>>
>>>>
> >>>> on Tokay (Linux).  Note minfiQC, whose elapsed time is crazy high
> >>>> compared to user+system.  The previous build report (which I didn't
> >>>> save) had a timeout on all platforms with seemingly similar behaviour,
> >>>> but with the getSex function.  The code did not change in the meantime.
> >>>> In today's build we only see this on Linux, but yesterday all platforms
> >>>> were affected.
>>>>
> >>>> This is likely to be very hard to debug.  But I suspect memory issues:
> >>>> this example requires loading an annotation package and a data package,
> >>>> both of which are "big".  How much RAM do the machines have, and are
> >>>> multiple R CMD check's run concurrently?
>>>>
>>>> Best,
>>>> Kasper
>>>>
>>>> _______________________________________________
>>>> Bioc-devel at r-project.org mailing list
> >>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>
>>>>
>>>
>> --
>> Hervé Pagès
>>
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M1-B514
>> P.O. Box 19024
>> Seattle, WA 98109-1024
>>
>> E-mail: hpages at fredhutch.org
>> Phone:  (206) 667-5791
>> Fax:    (206) 667-1319
>>
> 
