[Bioc-devel] library() calls removed in simpleSingleCell workflow

Wolfgang Huber wolfgang.huber at embl.de
Fri Oct 6 19:16:50 CEST 2017


Interesting! In iTerm2, I get
$ ulimit -Sn
4864

and
env R_MAX_NUM_DLLS=1000 R

works, which means that on Mac it IS possible to have many more DLLs 
open than 100 if R is started in the right way.
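
As a rough sketch of the arithmetic (not R's actual code): assuming the cap
min(0.6*fd_limit, 1000) that Henrik quotes below, rounded down, the largest
usable R_MAX_NUM_DLLS for a given soft limit could be worked out like this.
The helper name max_dlls is made up for illustration.

```shell
# Hypothetical helper: largest R_MAX_NUM_DLLS for a given soft fd limit,
# assuming the cap min(0.6 * fd_limit, 1000) with the product rounded down.
max_dlls() {
    fd_limit=$1
    cap=$(( fd_limit * 6 / 10 ))   # integer floor of 0.6 * fd_limit
    if [ "$cap" -gt 1000 ]; then
        cap=1000                   # hard ceiling from the quoted formula
    fi
    echo "$cap"
}

max_dlls 256    # Terminal / Emacs on OS X 10.12.6 -> 153
max_dlls 1024   # Ubuntu 16.04                     -> 614
max_dlls 4864   # iTerm2                           -> 1000
```

With a numeric soft limit, max_dlls "$(ulimit -Sn)" gives the value for the
current shell; the sketch does not handle a limit reported as "unlimited".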

Wolfgang

PS I meant OS X 10.12.6, too. Sorry for the typo.


6.10.17 14:50, Kasper Daniel Hansen scripsit:
> On OS X 10.12.6 (I don't think 10.12.16 exists), I get
> 
> $ ulimit -Sn
> 7168
> 
> Interestingly, this is because I use iTerm2 for my command line prompt.  
> If I do the same command in Terminal I get 256.  If I start R inside of 
> Emacs I get 256 as well.  I don't know anything about ulimit and how it 
> is set, but that is a pretty stark difference.
> 
> Best,
> Kasper
> 
> 
> 
> On Fri, Oct 6, 2017 at 3:12 AM, Wolfgang Huber <wolfgang.huber at embl.de> wrote:
> 
>     On Mac OSX 10.12.16:
>     $ ulimit -Sn
>     256
> 
>     so the maximum value of R_MAX_NUM_DLLS is 153 ...
> 
>              Wolfgang
> 
>     5.10.17 23:02, Henrik Bengtsson scripsit:
> 
>         About the DLL limit:
> 
>         Just wanna make sure you're aware of the "new" environment variable
>         R_MAX_NUM_DLLS available in R (>= 3.4.0).  It allows you to push the
>         current default limit of 100 open DLLs a bit higher.  It can be
>         set in .Renviron or on the command line when starting R, e.g.
> 
>         $ R_MAX_NUM_DLLS=500 R
> 
>         This, of course, assumes that you can set it, which you might not be
>         able to do on build servers.  Also, there is an upper limit
>         min(0.6*fd_limit,1000) that depends on the number of files you can
>         have open at the same time (fd_limit), e.g. on my Ubuntu 16.04 I've
>         got:
> 
>         $ ulimit -Sn
>         1024
> 
>         so R_MAX_NUM_DLLS=614 is the maximum for me.
> 
>         /Henrik
> 
>         On Thu, Oct 5, 2017 at 11:22 AM, Wolfgang Huber
>         <wolfgang.huber at embl.de> wrote:
> 
> 
>             Breaking up long workflows into several smaller "modules" each
>             with a clearly defined input and output is a good idea,
>             certainly for didactic & maintenance reasons.
> 
>             It doesn't "solve" the DLL issue though, it only avoids it
>             (for now)...
> 
>             I believe you can use a Makefile for your vignettes
>             (https://cran.r-project.org/doc/manuals/R-exts.html#Writing-package-vignettes),
>             and this might be a good way of managing which depends on
>             which. For passing along output/input, perhaps local .RData
>             files are good enough, perhaps some wheel-reinventing can also
>             be avoided by using
>             https://bioconductor.org/packages/release/bioc/html/BiocFileCache.html
>             (haven't actually used it yet, though).
> 
>                       Wolfgang
> 
> 
> 
>             5.10.17 20:02, Aaron Lun scripsit:
> 
> 
>                 This may relate to what I was thinking with respect to
>                 solving the DLL problem, by breaking up large workflows
>                 into modules that can be executed in separate R sessions.
>                 The same approach would also make it easier to associate
>                 package dependencies with specific parts of the workflow.
> 
>                 In my particular situation, it is easy to break up the
>                 workflow into sections that can be executed completely
>                 independently. However, I can also imagine situations
>                 where dependencies on previous objects, etc. make it
>                 difficult to break up the workflow. If multiple files are
>                 present in vignettes/, can they be directed to execute in
>                 a specific order, and would output files from one vignette
>                 persist during the execution of another?
> 
> 
>                 -Aaron
> 
>                 ------------------------------------------------------------------------
>                 *From:* Wolfgang Huber <wolfgang.huber at embl.de>
>                 *Sent:* Thursday, 5 October 2017 6:23:47 PM
>                 *To:* Laurent Gatto; Aaron Lun
>                 *Cc:* bioc-devel at r-project.org
>                 *Subject:* Re: [Bioc-devel] library() calls removed in
>                 simpleSingleCell workflow
> 
> 
>                 I agree it is nice to be able to only load the packages
>                 needed for a certain section of a vignette and not the
>                 whole thing. And that too many `::` can make code look
>                 unwieldy (though some may actually increase readability).
> 
>                 But relying on manually sprinkled in `library` calls seems
>                 like a hack prone to error. And there are always bound to
>                 be dependencies that are non-local, e.g. on general
>                 infrastructure like SummarizedExperiment, ggplot2, dplyr.
> 
>                 So: do we need a way to computationally determine the
>                 dependencies of a vignette section, including
>                 highlighting/eliminating potential name clashes (b/c the
>                 warnings about masking emitted at package loading are
>                 easily ignored)? This seems like a straightforward
>                 engineering task.
> 
>                 Eventually with such code analysis we could get rid of
>                 explicit `library` calls altogether :)
> 
>                            Wolfgang
> 
> 
> 
> 
> 
>                 5.10.17 08:53, Laurent Gatto scripsit:
> 
> 
> 
>                     On  5 October 2017 00:11, Aaron Lun wrote:
> 
>                         Here's another two cents from me:
> 
>                         The explicit library() calls allow for easy
>                         copy-pasting if people only want to use/adapt a
>                         section of the workflow. In such cases, calling
>                         "library(simpleSingleCell)" could drag in a lot of
>                         unnecessary packages (e.g., which could hit the DLL
>                         limit). Reading through the text to figure out the
>                         requirements for each code chunk seems like a pain,
>                         and lots of "::" are unwieldy.
> 
>                         More generally, the removal of individual library()
>                         calls seems to encourage the use of a single
>                         "library(simpleSingleCell)" call at the top of any
>                         user-developed custom analysis scripts based on the
>                         workflow. This seems conceptually odd to me - the
>                         simpleSingleCell package is simply a vehicle for the
>                         compiled workflow, it shouldn't be involved in
>                         analyses of other data.
> 
> 
> 
>                     I can confirm that this is a possibility.
> 
>                     Before workflows became available, I created the
>                     RforProteomics package that essentially provided one
>                     relatively large vignette to demonstrate a variety of
>                     applications of R/Bioconductor for mass spectrometry and
>                     proteomics. I think this has been a useful way to
>                     disseminate R and Bioconductor in these respective
>                     communities, but it also led to the confusion that it was
>                     that package that "did all the stuff", i.e. people saying
>                     that they were using RforProteomics to do a task that was
>                     described in the vignette. The RforProteomics vignette
>                     does explicitly call library at the beginning of each
>                     section and explains that the package is only a
>                     collection of analyses stemming from other packages, but
>                     that wasn't enough apparently.
> 
>                     Laurent
> 
> 
>                         -Aaron
> 
>                         ________________________________
>                         From: Bioc-devel <bioc-devel-bounces at r-project.org>
>                         on behalf of Wolfgang Huber <wolfgang.huber at embl.de>
>                         Sent: Thursday, 5 October 2017 8:26 AM
>                         To: bioc-devel at r-project.org
>                         Subject: Re: [Bioc-devel] library() calls removed in
>                         simpleSingleCell workflow
> 
> 
>                         I find `eval=FALSE` chunks not a good idea, since
>                         - they confuse users who only see the rendered
>                           HTML/PDF (where this flag is not shown)
>                         - they are not tested, so more prone to code rot.
> 
>                         I'd also like to object to the idea that proximity
>                         of a `library` call to code that uses a package is
>                         somehow didactic. It's actually a bad habit: the R
>                         interpreter does not care. The relevant package
>                         - can be mentioned in the narrative,
>                         - stated in the code with the pkgname:: prefix.
>                         The latter is good didactics to get people used to
>                         the idea of namespaces, especially since there is an
>                         increasing frequency of name clashes in CRAN,
>                         tidyverse, BioC (e.g. consider the various functions
>                         named 'filter' and the obscure malbehaviors that can
>                         result from these).
> 
>                         Best wishes
>                                             Wolfgang
> 
>                         On 04/10/2017 22:20, Turaga, Nitesh wrote:
> 
> 
>                             Hi Aaron,
> 
> 
>                             A workaround may be to put all the library()
>                             calls in an “eval=FALSE” R code chunk:
> 
>                             ```{r, eval=FALSE}
>                             library(scran)
>                             library(scater)
>                             ```
> 
>                             etc.
> 
> 
>                             This way the users can see the library()
>                             calls in the vignette.
> 
>                             Best,
> 
>                             Nitesh
> 
>                                 On Oct 4, 2017, at 4:14 PM, Obenchain,
>                                 Valerie
>                                 <Valerie.Obenchain at RoswellPark.org> wrote:
> 
>                                 Hi guys,
> 
>                                 A little background on this vignette ->
>                                 package conversion. The workflows were
>                                 converted to package form because we want to
>                                 integrate them into the nightly build system
>                                 instead of supporting separate machines as
>                                 we're now doing.
> 
>                                 As part of this conversion, packages loaded
>                                 in workflow vignettes were moved to Depends
>                                 in DESCRIPTION. This enables the user to load
>                                 a single package instead of many. Packages
>                                 were moved to Depends instead of Suggests (as
>                                 is usually done with software packages)
>                                 because the vignette is the only thing these
>                                 workflow packages have going - no defined
>                                 classes or methods. This seemed a more tidy
>                                 approach and the dependencies are listed in
>                                 Depends for the user to see. This was my
>                                 (maybe bad?) idea and Nitesh was the
>                                 messenger. If you feel the individual loading
>                                 of packages in the vignette is a key part of
>                                 the instruction/learning we can leave them as
>                                 is and list the packages in Suggests.
> 
>                                 I should also mention that incorporating the
>                                 workflows into the build system won't happen
>                                 until after the release. At that time we'll
>                                 move the repositories from svn to git and
>                                 it's likely we'll have to ask maintainers to
>                                 abide by some time/space guidelines. At that
>                                 point the build machines will be building
>                                 software, experimental data and workflows and
>                                 resources aren't unlimited. When that time
>                                 comes we'll update the workflow guidelines
>                                 and contact maintainers.
> 
> 
> 
>                                 Thanks.
>                                 Valerie
> 
> 
> 
>                                 On 10/04/2017 12:27 PM, Kasper Daniel
>                                 Hansen wrote:
> 
>                                 yeah, that is super super useful to people.
>                                 In my vignettes (granted, not workflows) I
>                                 have a separate "Dependencies" section which
>                                 is basically a series of library() calls.
> 
>                                 On Wed, Oct 4, 2017 at 3:18 PM, Aaron Lun
>                                 <alun at wehi.edu.au> wrote:
> 
> 
> 
>                                 Dear Nitesh, list;
> 
> 
>                                 The library() calls in the simpleSingleCell
>                                 workflow have been removed. Why is this? I
>                                 find explicit library() calls to be quite
>                                 useful for readers of the compiled vignette,
>                                 because it makes it easier for them to
>                                 determine the packages that are required to
>                                 adapt parts of the workflow for their own
>                                 analyses. If it doesn't hurt the build
>                                 system, I would prefer to have these
>                                 library() calls in the vignette.
> 
> 
>                                 Cheers,
> 
> 
>                                 Aaron
> 
>                                            [[alternative HTML version
>                                 deleted]]
> 
>                                 _______________________________________________
>                                 Bioc-devel at r-project.org mailing list
>                                 https://stat.ethz.ch/mailman/listinfo/bioc-devel
> 
> 
> 
>                                 This email message may contain legally
>                                 privileged and/or confidential information.
>                                 If you are not the intended recipient(s), or
>                                 the employee or agent responsible for the
>                                 delivery of this message to the intended
>                                 recipient(s), you are hereby notified that
>                                 any disclosure, copying, distribution, or use
>                                 of this email message is prohibited.  If you
>                                 have received this message in error, please
>                                 notify the sender immediately by e-mail and
>                                 delete this email message from your computer.
>                                 Thank you.
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
>                 --
>                 With thanks in advance-
>                 Wolfgang
> 
>                 -------
>                 Wolfgang Huber
>                 Principal Investigator, EMBL Senior Scientist
>                 European Molecular Biology Laboratory (EMBL)
>                 Heidelberg, Germany
> 
>                 wolfgang.huber at embl.de
>                 http://www.huber.embl.de
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 

-- 
With thanks in advance-
Wolfgang

-------
Wolfgang Huber
Principal Investigator, EMBL Senior Scientist
European Molecular Biology Laboratory (EMBL)
Heidelberg, Germany

wolfgang.huber at embl.de
http://www.huber.embl.de


