[Bioc-devel] how to trace 'Matrix' as package dependency for 'GenomicScores'
Sean Davis
@e@nd@v| @end|ng |rom gm@||@com
Sun Feb 9 17:01:28 CET 2020
There are some good ideas here that would enhance BiocPkgTools. I don't
have the bandwidth to incorporate them right now, but filing issues or a
pull request with a skeleton would be helpful for keeping track.
Sean
On Sun, Feb 9, 2020 at 7:31 AM Vincent Carey <stvjc using channing.harvard.edu>
wrote:
> On Sat, Feb 8, 2020 at 12:02 PM Martin Morgan <mtmorgan.bioc using gmail.com>
> wrote:
>
> > I find it quite interesting to identify formal strategies for removing
> > dependencies, but also a little outside my domain of expertise. This code
> >
>
> It would be nice to collect the ideas in this thread into some
> recommendations. The themes I am thinking of
> are "how developers can make their packages robust to loss of external
> packages" and "how can the
> Bioc ecosystem best deal with departures of packages from itself and from
> CRAN?" A good and well-adopted
> solution to the first one makes the second one moot.
>
> Two CRAN-related events I know of that required some effort are (temporary)
> loss of ashr and (recently)
> archiving of Seurat.
>
>
> > library(tools)
> > library(dplyr)
> >
> > ## package database, assumed to be the one built earlier in this thread
> > db <- available.packages(repos = BiocManager::repositories())
> >
> > ## non-base packages the user requires for GenomicScores
> > deps <- package_dependencies("GenomicScores", db, recursive=TRUE)[[1]]
> > deps <- intersect(deps, rownames(db))
> >
> > ## only need the 'universe' of GenomicScores dependencies
> > db1 <- db[c("GenomicScores", deps),]
> >
> > ## sub-graph of packages between each dependency and GenomicScores
> > revdeps <- package_dependencies(deps, db1, recursive = TRUE, reverse = TRUE)
> >
> > ## n_remove: packages in the universe that (recursively) depend on each one
> > tibble(
> >     package = names(revdeps),
> >     n_remove = lengths(revdeps)
> > ) %>%
> >     arrange(n_remove)
> >
> > produces a tibble
> >
> > # A tibble: 106 x 2
> >    package           n_remove
> >    <chr>                <int>
> >  1 BSgenome                 1
> >  2 AnnotationHub            1
> >  3 shinyjs                  1
> >  4 DT                       1
> >  5 shinycustomloader        1
> >  6 data.table               1
> >  7 shinythemes              1
> >  8 rtracklayer              2
> >  9 BiocFileCache            2
> > 10 BiocManager              2
> > # … with 96 more rows
> >
> > shows me, via n_remove, that I can remove the dependency on AnnotationHub
> > by removing the dependency on just one package (AnnotationHub!), but to
> > remove BiocFileCache I'd also have to remove another package
> > (AnnotationHub, I'd guess). So this provides some measure of the ease
> > with which a package can be removed.
> >
> > I'd like a 'benefit' column, too -- if I were to remove AnnotationHub,
> > how many additional packages would I also be able to remove, because
> > they are present only to satisfy the dependency on AnnotationHub? More
> > generally, perhaps there is a dependency of AnnotationHub that is only
> > used by AnnotationHub and BSgenome. So removing AnnotationHub as a
> > dependency would make it easier to remove BSgenome, etc. I guess this is
> > a graph optimization problem.
> >
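A minimal sketch of the 'benefit' column described above, under stated assumptions: the toy database and the package names "top" and "pkgA" to "pkgD" are hypothetical stand-ins for the output of available.packages(), and benefit() is an illustrative helper, not a function from tools or BiocPkgTools. For each direct dependency it counts how many packages (the dependency itself included) would drop out of the recursive dependency set if that one direct dependency were removed.

```r
library(tools)

## Toy stand-in for available.packages() output (hypothetical packages).
db <- rbind(
    c("top",  NA, "pkgA, pkgB", NA),
    c("pkgA", NA, "pkgC, pkgD", NA),
    c("pkgB", NA, "pkgD",       NA),
    c("pkgC", NA, NA,           NA),
    c("pkgD", NA, NA,           NA)
)
colnames(db) <- c("Package", "Depends", "Imports", "LinkingTo")
rownames(db) <- db[, "Package"]

## benefit() is a hypothetical helper: for each direct dependency 'd' of
## 'pkg', count the packages that are needed only because of 'd'.
benefit <- function(pkg, db) {
    direct <- package_dependencies(pkg, db)[[1]]
    all_deps <- package_dependencies(pkg, db, recursive = TRUE)[[1]]
    vapply(direct, function(d) {
        ## what is still required once 'd' is no longer a direct dependency
        others <- setdiff(direct, d)
        kept <- others
        if (length(others) > 0) {
            kept <- unique(c(
                others,
                unlist(package_dependencies(others, db, recursive = TRUE),
                       use.names = FALSE)
            ))
        }
        ## packages that disappear from the full dependency set
        length(setdiff(all_deps, kept))
    }, integer(1))
}

benefit("top", db)
```

In this toy graph, dropping pkgA also frees pkgC, whereas pkgD stays because pkgB still imports it. With a real database the same call could be made with "GenomicScores" in place of "top", though the repeated recursive lookups may be slow.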
> > Probably also worth mentioning the itdepends package
> > (https://github.com/r-lib/itdepends), which I think tries primarily to
> > determine the relationship between package dependencies and lines of
> > code, which seems like complementary information.
> >
> > Martin
> >
> > On 2/6/20, 12:29 PM, "Robert Castelo" <robert.castelo using upf.edu> wrote:
> >
> > true, i was just searching for the shortest path, we can search for
> > all simple (i.e., without repeating "vertices") paths and there are up
> > to five routes from "GenomicScores" to "Matrix"
> >
> > igraph::all_simple_paths(igraph::igraph.from.graphNEL(g),
> > from="GenomicScores", to="Matrix", mode="out")
> > [[1]]
> > + 7/117 vertices, named, from 04133ec:
> > [1] GenomicScores BSgenome rtracklayer
> > [4] GenomicAlignments SummarizedExperiment DelayedArray
> > [7] Matrix
> >
> > [[2]]
> > + 6/117 vertices, named, from 04133ec:
> > [1] GenomicScores BSgenome rtracklayer
> > [4] GenomicAlignments SummarizedExperiment Matrix
> >
> > [[3]]
> > + 6/117 vertices, named, from 04133ec:
> > [1] GenomicScores DT crosstalk ggplot2 mgcv
> > [6] Matrix
> >
> > [[4]]
> > + 6/117 vertices, named, from 04133ec:
> > [1] GenomicScores rtracklayer GenomicAlignments
> > [4] SummarizedExperiment DelayedArray Matrix
> >
> > [[5]]
> > + 5/117 vertices, named, from 04133ec:
> > [1] GenomicScores rtracklayer GenomicAlignments
> > [4] SummarizedExperiment Matrix
> >
> > this is interesting, because it means that if i wanted to get rid of
> > the "Matrix" dependence i'd need to get rid not only of the
> > "rtracklayer" dependence but also of "BSgenome" and "DT".
> >
> > robert.
> >
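The observation above, that every direct dependency reaching "Matrix" has to be dropped, can also be checked without building a graph. A minimal sketch, assuming a toy database with hypothetical package names; via() is an illustrative helper, not an existing function in tools, RBGL or igraph. It returns the direct dependencies of a package whose own recursive dependencies include the target.

```r
library(tools)

## Toy stand-in for available.packages() output (hypothetical packages).
db <- rbind(
    c("top",    NA, "pkgA, pkgB, pkgC", NA),
    c("pkgA",   NA, "target",           NA),
    c("pkgB",   NA, "pkgD",             NA),
    c("pkgC",   NA, NA,                 NA),
    c("pkgD",   NA, "target",           NA),
    c("target", NA, NA,                 NA)
)
colnames(db) <- c("Package", "Depends", "Imports", "LinkingTo")
rownames(db) <- db[, "Package"]

## via() is a hypothetical helper: which direct dependencies of 'pkg'
## are, or recursively depend on, 'target'?
via <- function(pkg, target, db) {
    direct <- package_dependencies(pkg, db)[[1]]
    hits <- vapply(direct, function(d) {
        closure <- package_dependencies(d, db, recursive = TRUE)[[1]]
        target %in% c(d, closure)
    }, logical(1))
    direct[unname(hits)]
}

via("top", "target", db)
```

Here pkgA reaches the target directly and pkgB reaches it through pkgD, so both would have to go, while pkgC is unaffected. Against a real database the call would presumably be via("GenomicScores", "Matrix", db).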
> >
> > On 2/6/20 5:41 PM, Martin Morgan wrote:
> > > Excellent! I think there are other, independent, paths between your
> > > immediate dependents...
> > >
> > > RBGL::sp.between(g, start="DT", finish="Matrix",
> > >                  detail=TRUE)[[1]]$path_detail
> > > [1] "DT" "crosstalk" "ggplot2" "mgcv" "Matrix"
> > >
> > > ??
> > >
> > > Martin
> > >
> > > On 2/6/20, 10:47 AM, "Robert Castelo" <robert.castelo using upf.edu> wrote:
> > >
> > > hi Martin,
> > >
> > > thanks for the hint!! i wasn't aware of 'tools::package_dependencies()',
> > > adding a bit of graph sorcery i get the result i was looking for:
> > >
> > > repos <- BiocManager::repositories()[c(1,5)]
> > > repos
> > > BioCsoft
> > > "https://bioconductor.org/packages/3.11/bioc"
> > > CRAN
> > > "https://cran.rstudio.com"
> > >
> > > db <- available.packages(repos=repos)
> > >
> > > deps <- tools::package_dependencies("GenomicScores", db,
> > >                                     recursive=TRUE)[[1]]
> > >
> > > deps <- tools::package_dependencies(c("GenomicScores", deps), db)
> > >
> > > g <- graph::graphNEL(nodes=names(deps), edgeL=deps, edgemode="directed")
> > >
> > > RBGL::sp.between(g, start="GenomicScores", finish="Matrix",
> > >                  detail=TRUE)[[1]]$path_detail
> > > [1] "GenomicScores"        "rtracklayer"          "GenomicAlignments"
> > > [4] "SummarizedExperiment" "Matrix"
> > >
> > > so, it was the rtracklayer dependency that leads to Matrix through
> > > GenomicAlignments and SummarizedExperiment.
> > >
> > > maybe the BioC package 'pkgDepTools' should be deprecated if its
> > > functionality is part of 'tools' and it is neither as fast nor as
> > > correct as 'tools'.
> > >
> > > cheers,
> > >
> > > robert.
> > >
> > >
> > > On 2/6/20 2:51 PM, Martin Morgan wrote:
> > > > The first thing is to get the correct repositories
> > > >
> > > > repos = BiocManager::repositories()
> > > >
> > > > (maybe trim the experiment and annotation repos from this). I also
> > > > tried pkgDepTools::makeDepGraph() but it took so long that I moved
> > > > on... it has an option 'keep.builtin' which might include Matrix.
> > > >
> > > > There is also BiocPkgTools::buildPkgDependencyDataFrame() & friends,
> > > > but this seems to build dependencies within a single repository...
> > > >
> > > > The building block for a solution is `tools::package_dependencies()`,
> > > > and I can confirm that "Matrix" _is_ a dependency
> > > >
> > > > db = available.packages(repos = BiocManager::repositories())
> > > > deps = tools::package_dependencies("GenomicScores", db, recursive = TRUE)
> > > > "Matrix" %in% deps[[1]]
> > > > ## [1] TRUE
> > > >
> > > > so I'll leave the clever recursive or graph-based algorithm up to
> > > > you, to report back to the mailing list?
> > > >
> > > > For what it's worth I think the last time this came up Martin
> > > > Maechler pointed to a function in base R (probably the tools
> > > > package) that implements this, too...?
> > > >
> > > > Martin Morgan
> > > >
> > > > On 2/6/20, 6:40 AM, "Bioc-devel on behalf of Robert Castelo"
> > > > <bioc-devel-bounces using r-project.org on behalf of
> > > > robert.castelo using upf.edu> wrote:
> > > >
> > > > hi,
> > > >
> > > > when i load the package 'GenomicScores' in a clean session i see
> > > > through the 'sessionInfo()' that the package 'Matrix' is listed
> > > > under "loaded via a namespace (and not attached)".
> > > >
> > > > i'd like to know what is the dependency that 'GenomicScores' has
> > > > that ends up requiring the package 'Matrix'.
> > > >
> > > > i've tried using the package 'pkgDepTools' without success, because
> > > > the dependency graph does not list any path from 'GenomicScores' to
> > > > 'Matrix'.
> > > >
> > > > i've been manually browsing the Bioc website and, unless i've
> > > > overlooked something, the only association with 'Matrix' i could
> > > > find is that 'S4Vectors' and 'GenomicRanges', which are required by
> > > > 'GenomicScores', list 'Matrix' in the 'Suggests' field, but my
> > > > understanding is that those packages are not required and should
> > > > not be loaded.
> > > >
> > > > so, is there any way in which i can figure out which of the
> > > > 'GenomicScores' dependencies leads to loading the package 'Matrix'?
> > > >
> > > > here are the depends, import and suggests fields from
> > > > 'GenomicScores':
> > > >
> > > > Depends: R (>= 3.5), S4Vectors (>= 0.7.21), GenomicRanges, methods,
> > > >     BiocGenerics (>= 0.13.8)
> > > > Imports: utils, XML, Biobase, IRanges (>= 2.3.23), Biostrings,
> > > >     BSgenome, GenomeInfoDb, AnnotationHub, shiny, shinyjs,
> > > >     DT, shinycustomloader, rtracklayer, data.table, shinythemes
> > > > Suggests: BiocStyle, knitr, rmarkdown, BSgenome.Hsapiens.UCSC.hg19,
> > > >     phastCons100way.UCSC.hg19, MafDb.1Kgenomes.phase1.hs37d5,
> > > >     SNPlocs.Hsapiens.dbSNP144.GRCh37, VariantAnnotation,
> > > >     TxDb.Hsapiens.UCSC.hg19.knownGene, gwascat, RColorBrewer
> > > >
> > > > and here is the session information in a fresh R-devel session
> > > > after loading the package 'GenomicScores':
> > > >
> > > > R Under development (unstable) (2020-01-29 r77745)
> > > > Platform: x86_64-pc-linux-gnu (64-bit)
> > > > Running under: CentOS Linux 7 (Core)
> > > >
> > > > Matrix products: default
> > > > BLAS: /opt/R/R-devel/lib64/R/lib/libRblas.so
> > > > LAPACK: /opt/R/R-devel/lib64/R/lib/libRlapack.so
> > > >
> > > > locale:
> > > > [1] LC_CTYPE=en_US.UTF8 LC_NUMERIC=C
> > > > [3] LC_TIME=en_US.UTF8 LC_COLLATE=en_US.UTF8
> > > > [5] LC_MONETARY=en_US.UTF8 LC_MESSAGES=en_US.UTF8
> > > > [7] LC_PAPER=en_US.UTF8 LC_NAME=C
> > > > [9] LC_ADDRESS=C LC_TELEPHONE=C
> > > > [11] LC_MEASUREMENT=en_US.UTF8 LC_IDENTIFICATION=C
> > > >
> > > > attached base packages:
> > > > [1] parallel stats4 stats graphics grDevices utils datasets
> > > > [8] methods base
> > > >
> > > > other attached packages:
> > > > [1] GenomicScores_1.11.4 GenomicRanges_1.39.2 GenomeInfoDb_1.23.10
> > > > [4] IRanges_2.21.3 S4Vectors_0.25.12 BiocGenerics_0.33.0
> > > > [7] colorout_1.2-2
> > > >
> > > > loaded via a namespace (and not attached):
> > > > [1] Rcpp_1.0.3 lattice_0.20-38
> > > > [3] shinycustomloader_0.9.0 Rsamtools_2.3.3
> > > > [5] Biostrings_2.55.4 assertthat_0.2.1
> > > > [7] digest_0.6.23 mime_0.9
> > > > [9] BiocFileCache_1.11.4 R6_2.4.1
> > > > [11] RSQLite_2.2.0 httr_1.4.1
> > > > [13] pillar_1.4.3 zlibbioc_1.33.1
> > > > [15] rlang_0.4.4 curl_4.3
> > > > [17] data.table_1.12.8 blob_1.2.1
> > > > [19] DT_0.12 Matrix_1.2-18
> > > > [21] shinythemes_1.1.2 shinyjs_1.1
> > > > [23] BiocParallel_1.21.2 AnnotationHub_2.19.7
> > > > [25] htmlwidgets_1.5.1 RCurl_1.98-1.1
> > > > [27] bit_1.1-15.1 shiny_1.4.0
> > > > [29] DelayedArray_0.13.3 compiler_4.0.0
> > > > [31] httpuv_1.5.2 rtracklayer_1.47.0
> > > > [33] pkgconfig_2.0.3 htmltools_0.4.0
> > > > [35] tidyselect_1.0.0 SummarizedExperiment_1.17.1
> > > > [37] tibble_2.1.3 GenomeInfoDbData_1.2.2
> > > > [39] interactiveDisplayBase_1.25.0 matrixStats_0.55.0
> > > > [41] XML_3.99-0.3 crayon_1.3.4
> > > > [43] dplyr_0.8.4 dbplyr_1.4.2
> > > > [45] later_1.0.0 GenomicAlignments_1.23.1
> > > > [47] bitops_1.0-6 rappdirs_0.3.1
> > > > [49] grid_4.0.0 xtable_1.8-4
> > > > [51] DBI_1.1.0 magrittr_1.5
> > > > [53] XVector_0.27.0 promises_1.1.0
> > > > [55] vctrs_0.2.2 tools_4.0.0
> > > > [57] bit64_0.9-7 BSgenome_1.55.3
> > > > [59] Biobase_2.47.2 glue_1.3.1
> > > > [61] purrr_0.3.3 BiocVersion_3.11.1
> > > > [63] fastmap_1.0.1 yaml_2.2.1
> > > > [65] AnnotationDbi_1.49.1 BiocManager_1.30.10
> > > > [67] memoise_1.1.0
> > > >
> > > >
> > > >
> > > > thanks!!
> > > >
> > > > robert.
> > > >
> > > > _______________________________________________
> > > > Bioc-devel using r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> > > >
> > > >
> > >
> > > --
> > > Robert Castelo, PhD
> > > Associate Professor
> > > Dept. of Experimental and Health Sciences
> > > Universitat Pompeu Fabra (UPF)
> > > Barcelona Biomedical Research Park (PRBB)
> > > Dr Aiguader 88
> > > E-08003 Barcelona, Spain
> > > telf: +34.933.160.514
> > > fax: +34.933.160.550
> > >
> > >
> >
> >
>