[Bioc-devel] how to trace 'Matrix' as package dependency for 'GenomicScores'

Sean Davis
Sun Feb 9 17:01:28 CET 2020


There are some good ideas here that would enhance BiocPkgTools. I don't
have the bandwidth to incorporate them right now, but filing issues or
opening a pull request with a skeleton would help keep track of them.

Sean

On Sun, Feb 9, 2020 at 7:31 AM Vincent Carey <stvjc using channing.harvard.edu>
wrote:

> On Sat, Feb 8, 2020 at 12:02 PM Martin Morgan <mtmorgan.bioc using gmail.com>
> wrote:
>
> > I find it quite interesting to identify formal strategies for removing
> > dependencies, but also a little outside my domain of expertise. This code
> >
>
> It would be nice to collect the ideas in this thread into some
> recommendations.  The themes I am thinking of are "how developers can
> make their packages robust to loss of external packages" and "how can
> the Bioc ecosystem best deal with departures of packages from itself
> and from CRAN?"  A good and well-adopted solution to the first one
> makes the second one moot.
>
> Two CRAN-related events I know of that required some effort are the
> (temporary) loss of ashr and the (recent) archiving of Seurat.
>
>
> > library(tools)
> > library(dplyr)
> >
> > ## non-base packages the user requires for GenomicScores
> > ## ('db' is the available.packages() matrix built in the earlier
> > ## messages quoted below)
> > deps <- package_dependencies("GenomicScores", db, recursive=TRUE)[[1]]
> > deps <- intersect(deps, rownames(db))
> >
> > ## only need the 'universe' of GenomicScores dependencies
> > db1 <- db[c("GenomicScores", deps),]
> >
> > ## sub-graph of packages between each dependency and GenomicScores
> > revdeps <- package_dependencies(deps, db1, recursive = TRUE, reverse = TRUE)
> >
> > tibble(
> >     package = names(revdeps),
> >     n_remove = lengths(revdeps)
> > ) %>%
> >     arrange(n_remove)
> >
> > produces a tibble
> >
> > # A tibble: 106 x 2
> >    package           n_remove
> >    <chr>                <int>
> >  1 BSgenome                 1
> >  2 AnnotationHub            1
> >  3 shinyjs                  1
> >  4 DT                       1
> >  5 shinycustomloader        1
> >  6 data.table               1
> >  7 shinythemes              1
> >  8 rtracklayer              2
> >  9 BiocFileCache            2
> > 10 BiocManager              2
> > # … with 96 more rows
> >
> > shows me, via n_remove, that I can remove the dependency on AnnotationHub
> > by removing the dependency on just one package (AnnotationHub!), but to
> > remove BiocFileCache I'd also have to remove another package
> > (AnnotationHub, I'd guess). So this provides some measure of the ease
> > with which a package can be removed.
> >
> > I'd like a 'benefit' column, too -- if I were to remove AnnotationHub,
> > how many additional packages would I also be able to remove, because
> > they are present only to satisfy the dependency on AnnotationHub? More
> > generally, perhaps there is a dependency of AnnotationHub that is only
> > used by AnnotationHub and BSgenome. So removing AnnotationHub as a
> > dependency would make it easier to remove BSgenome, etc. I guess this is
> > a graph optimization problem.
> >
> > Probably also worth mentioning the itdepends package
> > (https://github.com/r-lib/itdepends), which I think tries primarily to
> > determine the relationship between package dependencies and lines of
> > code, which seems like complementary information.
> >
> > Martin
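
A minimal base-R sketch of the 'benefit' column asked for above. The toy
dependency list `toy` and the helpers `reachable()` and `benefit()` are
made up for illustration and are not part of tools or BiocPkgTools; in
practice the edge list could be built from tools::package_dependencies().
The benefit of dropping a direct dependency is taken here to be the set of
packages that leave the recursive dependency set once that one edge is
removed (the dropped package itself included).

```r
## Toy dependency graph (hypothetical): each element lists the packages
## that the named package depends on directly.
toy <- list(
    root = c("A", "B"),
    A    = c("C", "D"),
    B    = "D",
    C    = character(),
    D    = character()
)

## All packages reachable from 'from' (excluding 'from' itself),
## found by breadth-first traversal of the named list 'edges'.
reachable <- function(edges, from) {
    seen <- character()
    todo <- edges[[from]]
    while (length(todo)) {
        pkg <- todo[1]
        todo <- todo[-1]
        if (pkg %in% seen) next
        seen <- c(seen, pkg)
        todo <- c(todo, setdiff(edges[[pkg]], seen))
    }
    seen
}

## Packages that drop out of the recursive dependency set of 'root'
## when the single direct dependency 'drop' is removed.
benefit <- function(edges, root, drop) {
    before <- reachable(edges, root)
    edges[[root]] <- setdiff(edges[[root]], drop)
    after <- reachable(edges, root)
    setdiff(before, after)
}

## One entry per direct dependency of 'root'.
benefits <- lapply(
    setNames(nm = toy$root),
    function(d) benefit(toy, "root", d)
)
lengths(benefits)
## A B
## 2 1
```

In the toy graph, dropping A also frees C, while D stays because B still
needs it; dropping B frees only B itself. This looks at one dependency at a
time, so it does not address the joint "graph optimization" question.
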
> >
> > On 2/6/20, 12:29 PM, "Robert Castelo" <robert.castelo using upf.edu> wrote:
> >
> >     true, i was just searching for the shortest path, we can search for
> >     all simple (i.e., without repeating "vertices") paths and there are up
> >     to five routes from "GenomicScores" to "Matrix"
> >
> >     igraph::all_simple_paths(igraph::igraph.from.graphNEL(g),
> >     from="GenomicScores", to="Matrix", mode="out")
> >     [[1]]
> >     + 7/117 vertices, named, from 04133ec:
> >     [1] GenomicScores        BSgenome             rtracklayer
> >     [4] GenomicAlignments    SummarizedExperiment DelayedArray
> >     [7] Matrix
> >
> >     [[2]]
> >     + 6/117 vertices, named, from 04133ec:
> >     [1] GenomicScores        BSgenome             rtracklayer
> >     [4] GenomicAlignments    SummarizedExperiment Matrix
> >
> >     [[3]]
> >     + 6/117 vertices, named, from 04133ec:
> >     [1] GenomicScores DT            crosstalk     ggplot2       mgcv
> >     [6] Matrix
> >
> >     [[4]]
> >     + 6/117 vertices, named, from 04133ec:
> >     [1] GenomicScores        rtracklayer          GenomicAlignments
> >     [4] SummarizedExperiment DelayedArray         Matrix
> >
> >     [[5]]
> >     + 5/117 vertices, named, from 04133ec:
> >     [1] GenomicScores        rtracklayer          GenomicAlignments
> >     [4] SummarizedExperiment Matrix
> >
> >     this is interesting, because it means that if i wanted to get rid of
> >     the "Matrix" dependence i'd need to get rid not only of the
> >     "rtracklayer" dependence but also of "BSgenome" and "DT".
> >
> >     robert.
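
The conclusion above (which direct dependencies would have to go to lose
"Matrix") can also be read off without enumerating every path. The sketch
below is base R only; `edges` is transcribed from the five paths listed
above and is not the full dependency graph, "shinyjs" is added as a leaf
purely for contrast (an assumption), and `has_path()` is a hypothetical
helper, not a function from igraph or RBGL.

```r
## Edges transcribed from the five paths shown above (partial graph only).
## "shinyjs" is treated as a leaf here for contrast; that is an assumption.
edges <- list(
    GenomicScores        = c("BSgenome", "DT", "rtracklayer", "shinyjs"),
    BSgenome             = "rtracklayer",
    rtracklayer          = "GenomicAlignments",
    GenomicAlignments    = "SummarizedExperiment",
    SummarizedExperiment = c("DelayedArray", "Matrix"),
    DelayedArray         = "Matrix",
    DT                   = "crosstalk",
    crosstalk            = "ggplot2",
    ggplot2              = "mgcv",
    mgcv                 = "Matrix"
)

## TRUE if 'to' can be reached from 'from' by following edges
## (breadth-first traversal; packages without an entry count as leaves).
has_path <- function(edges, from, to) {
    seen <- character()
    todo <- from
    while (length(todo)) {
        pkg <- todo[1]
        todo <- todo[-1]
        if (pkg == to) return(TRUE)
        if (pkg %in% seen) next
        seen <- c(seen, pkg)
        todo <- c(todo, edges[[pkg]])
    }
    FALSE
}

## Direct dependencies of GenomicScores from which Matrix is reachable.
direct <- edges[["GenomicScores"]]
culprits <- direct[vapply(direct, has_path, logical(1),
                          edges = edges, to = "Matrix")]
culprits
## [1] "BSgenome"    "DT"          "rtracklayer"
```
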
> >
> >
> >     On 2/6/20 5:41 PM, Martin Morgan wrote:
> >     > Excellent! I think there are other, independent, paths between your
> >     > immediate dependencies and Matrix...
> >     >
> >     > RBGL::sp.between(g, start="DT", finish="Matrix", detail=TRUE)[[1]]$path_detail
> >     > [1] "DT"        "crosstalk" "ggplot2"   "mgcv"      "Matrix"
> >     >
> >     > ??
> >     >
> >     > Martin
> >     >
> >     > On 2/6/20, 10:47 AM, "Robert Castelo" <robert.castelo using upf.edu> wrote:
> >     >
> >     >      hi Martin,
> >     >
> >     >      thanks for the hint!! i wasn't aware of
> >     >      'tools::package_dependencies()', adding a bit of graph sorcery
> >     >      i get the result i was looking for:
> >     >
> >     >      repos <- BiocManager::repositories()[c(1,5)]
> >     >      repos
> >     >                                            BioCsoft
> >     >      "https://bioconductor.org/packages/3.11/bioc"
> >     >                                                CRAN
> >     >                          "https://cran.rstudio.com"
> >     >
> >     >      db <- available.packages(repos=repos)
> >     >
> >     >      deps <- tools::package_dependencies("GenomicScores", db,
> >     >      recursive=TRUE)[[1]]
> >     >
> >     >      deps <- tools::package_dependencies(c("GenomicScores", deps), db)
> >     >
> >     >      g <- graph::graphNEL(nodes=names(deps), edgeL=deps, edgemode="directed")
> >     >
> >     >      RBGL::sp.between(g, start="GenomicScores", finish="Matrix",
> >     >      detail=TRUE)[[1]]$path_detail
> >     >      [1] "GenomicScores"        "rtracklayer"          "GenomicAlignments"
> >     >      [4] "SummarizedExperiment" "Matrix"
> >     >
> >     >      so, it was the rtracklayer dependency that leads to Matrix through
> >     >      GenomicAlignments and SummarizedExperiment.
> >     >
> >     >      maybe the BioC package 'pkgDepTools' should be deprecated if its
> >     >      functionality is part of 'tools' and it is neither as fast nor as
> >     >      correct as 'tools'.
> >     >
> >     >      cheers,
> >     >
> >     >      robert.
> >     >
> >     >
> >     >      On 2/6/20 2:51 PM, Martin Morgan wrote:
> >     >      > The first thing is to get the correct repositories
> >     >      >
> >     >      >    repos = BiocManager::repositories()
> >     >      >
> >     >      > (maybe trim the experiment and annotation repos from this). I
> >     >      > also tried pkgDepTools::makeDepGraph() but it took so long that I
> >     >      > moved on... it has an option 'keep.builtin' which might include
> >     >      > Matrix.
> >     >      >
> >     >      > There is also BiocPkgTools::buildPkgDependencyDataFrame() &
> >     >      > friends, but this seems to build dependencies within a single
> >     >      > repository...
> >     >      >
> >     >      > The building block for a solution is
> >     >      > `tools::package_dependencies()`, and I can confirm that "Matrix"
> >     >      > _is_ a dependency
> >     >      >
> >     >      >    db = available.packages(repos = BiocManager::repositories())
> >     >      >    revdeps <- tools::package_dependencies("GenomicScores", db, recursive = TRUE)
> >     >      >    "Matrix" %in% revdeps[[1]]
> >     >      >    ## [1] TRUE
> >     >      >
> >     >      > so I'll leave the clever recursive or graph-based algorithm up to
> >     >      > you, to report back to the mailing list?
> >     >      >
> >     >      > For what it's worth I think the last time this came up Martin
> >     >      > Maechler pointed to a function in base R (probably the tools
> >     >      > package) that implements this, too...?
> >     >      >
> >     >      > Martin Morgan
> >     >      >
> >     >      > On 2/6/20, 6:40 AM, "Bioc-devel on behalf of Robert Castelo"
> >     >      > <bioc-devel-bounces using r-project.org on behalf of
> >     >      > robert.castelo using upf.edu> wrote:
> >     >      >
> >     >      >      hi,
> >     >      >
> >     >      >      when i load the package 'GenomicScores' in a clean session i see through
> >     >      >      'sessionInfo()' that the package 'Matrix' is listed under "loaded
> >     >      >      via a namespace (and not attached)".
> >     >      >
> >     >      >      i'd like to know what is the dependency that 'GenomicScores' has that
> >     >      >      ends up requiring the package 'Matrix'.
> >     >      >
> >     >      >      i've tried using the package 'pkgDepTools' without success, because the
> >     >      >      dependency graph does not list any path from 'GenomicScores' to 'Matrix'.
> >     >      >
> >     >      >      i've been manually browsing the Bioc website and, unless i've overlooked
> >     >      >      something, the only association with 'Matrix' i could find is that
> >     >      >      'S4Vectors' and 'GenomicRanges', which are required by 'GenomicScores',
> >     >      >      list 'Matrix' in the 'Suggests' field, but my understanding is that
> >     >      >      those packages are not required and should not be loaded.
> >     >      >
> >     >      >      so, is there any way in which i can figure out which of the
> >     >      >      'GenomicScores' dependencies leads to loading the package 'Matrix'?
> >     >      >
> >     >      >      here are the depends, imports and suggests fields from 'GenomicScores':
> >     >      >
> >     >      >      Depends: R (>= 3.5), S4Vectors (>= 0.7.21), GenomicRanges, methods,
> >     >      >               BiocGenerics (>= 0.13.8)
> >     >      >      Imports: utils, XML, Biobase, IRanges (>= 2.3.23), Biostrings,
> >     >      >               BSgenome, GenomeInfoDb, AnnotationHub, shiny, shinyjs,
> >     >      >               DT, shinycustomloader, rtracklayer, data.table, shinythemes
> >     >      >      Suggests: BiocStyle, knitr, rmarkdown, BSgenome.Hsapiens.UCSC.hg19,
> >     >      >               phastCons100way.UCSC.hg19, MafDb.1Kgenomes.phase1.hs37d5,
> >     >      >               SNPlocs.Hsapiens.dbSNP144.GRCh37, VariantAnnotation,
> >     >      >               TxDb.Hsapiens.UCSC.hg19.knownGene, gwascat, RColorBrewer
> >     >      >
> >     >      >      and here is the session information in a fresh R-devel session after
> >     >      >      loading the package 'GenomicScores':
> >     >      >
> >     >      >      R Under development (unstable) (2020-01-29 r77745)
> >     >      >      Platform: x86_64-pc-linux-gnu (64-bit)
> >     >      >      Running under: CentOS Linux 7 (Core)
> >     >      >
> >     >      >      Matrix products: default
> >     >      >      BLAS:   /opt/R/R-devel/lib64/R/lib/libRblas.so
> >     >      >      LAPACK: /opt/R/R-devel/lib64/R/lib/libRlapack.so
> >     >      >
> >     >      >      locale:
> >     >      >        [1] LC_CTYPE=en_US.UTF8       LC_NUMERIC=C
> >     >      >        [3] LC_TIME=en_US.UTF8        LC_COLLATE=en_US.UTF8
> >     >      >        [5] LC_MONETARY=en_US.UTF8    LC_MESSAGES=en_US.UTF8
> >     >      >        [7] LC_PAPER=en_US.UTF8       LC_NAME=C
> >     >      >        [9] LC_ADDRESS=C              LC_TELEPHONE=C
> >     >      >      [11] LC_MEASUREMENT=en_US.UTF8 LC_IDENTIFICATION=C
> >     >      >
> >     >      >      attached base packages:
> >     >      >      [1] parallel  stats4    stats     graphics  grDevices utils     datasets
> >     >      >      [8] methods   base
> >     >      >
> >     >      >      other attached packages:
> >     >      >      [1] GenomicScores_1.11.4 GenomicRanges_1.39.2 GenomeInfoDb_1.23.10
> >     >      >      [4] IRanges_2.21.3       S4Vectors_0.25.12    BiocGenerics_0.33.0
> >     >      >      [7] colorout_1.2-2
> >     >      >
> >     >      >      loaded via a namespace (and not attached):
> >     >      >        [1] Rcpp_1.0.3                    lattice_0.20-38
> >     >      >        [3] shinycustomloader_0.9.0       Rsamtools_2.3.3
> >     >      >        [5] Biostrings_2.55.4             assertthat_0.2.1
> >     >      >        [7] digest_0.6.23                 mime_0.9
> >     >      >        [9] BiocFileCache_1.11.4          R6_2.4.1
> >     >      >      [11] RSQLite_2.2.0                 httr_1.4.1
> >     >      >      [13] pillar_1.4.3                  zlibbioc_1.33.1
> >     >      >      [15] rlang_0.4.4                   curl_4.3
> >     >      >      [17] data.table_1.12.8             blob_1.2.1
> >     >      >      [19] DT_0.12                       Matrix_1.2-18
> >     >      >      [21] shinythemes_1.1.2             shinyjs_1.1
> >     >      >      [23] BiocParallel_1.21.2           AnnotationHub_2.19.7
> >     >      >      [25] htmlwidgets_1.5.1             RCurl_1.98-1.1
> >     >      >      [27] bit_1.1-15.1                  shiny_1.4.0
> >     >      >      [29] DelayedArray_0.13.3           compiler_4.0.0
> >     >      >      [31] httpuv_1.5.2                  rtracklayer_1.47.0
> >     >      >      [33] pkgconfig_2.0.3               htmltools_0.4.0
> >     >      >      [35] tidyselect_1.0.0              SummarizedExperiment_1.17.1
> >     >      >      [37] tibble_2.1.3                  GenomeInfoDbData_1.2.2
> >     >      >      [39] interactiveDisplayBase_1.25.0 matrixStats_0.55.0
> >     >      >      [41] XML_3.99-0.3                  crayon_1.3.4
> >     >      >      [43] dplyr_0.8.4                   dbplyr_1.4.2
> >     >      >      [45] later_1.0.0                   GenomicAlignments_1.23.1
> >     >      >      [47] bitops_1.0-6                  rappdirs_0.3.1
> >     >      >      [49] grid_4.0.0                    xtable_1.8-4
> >     >      >      [51] DBI_1.1.0                     magrittr_1.5
> >     >      >      [53] XVector_0.27.0                promises_1.1.0
> >     >      >      [55] vctrs_0.2.2                   tools_4.0.0
> >     >      >      [57] bit64_0.9-7                   BSgenome_1.55.3
> >     >      >      [59] Biobase_2.47.2                glue_1.3.1
> >     >      >      [61] purrr_0.3.3                   BiocVersion_3.11.1
> >     >      >      [63] fastmap_1.0.1                 yaml_2.2.1
> >     >      >      [65] AnnotationDbi_1.49.1          BiocManager_1.30.10
> >     >      >      [67] memoise_1.1.0
> >     >      >
> >     >      >
> >     >      >
> >     >      >      thanks!!
> >     >      >
> >     >      >      robert.
> >     >      >
> >     >      >      _______________________________________________
> >     >      >      Bioc-devel using r-project.org mailing list
> >     >      >      https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >     >      >
> >     >      >
> >     >
> >     >      --
> >     >      Robert Castelo, PhD
> >     >      Associate Professor
> >     >      Dept. of Experimental and Health Sciences
> >     >      Universitat Pompeu Fabra (UPF)
> >     >      Barcelona Biomedical Research Park (PRBB)
> >     >      Dr Aiguader 88
> >     >      E-08003 Barcelona, Spain
> >     >      telf: +34.933.160.514
> >     >      fax: +34.933.160.550
> >     >
> >     >
> >
>


