[Bioc-devel] Reducing dependencies
Robert Castelo
robert@c@@te|o @end|ng |rom up|@edu
Wed Jun 3 10:12:59 CEST 2020
hi Koen,
you can analyse the dependencies using the BiocPkgTools package as
follows:
library(BiocPkgTools)
depdf <- buildPkgDependencyDataFrame(repo=c("BioCsoft", "CRAN"),
dependencies=c("Depends", "Imports"))
## if you get this error
##
## Error in readRDS(gzcon(con)) :
##   cannot open the connection to
##   'https://packagemanager.rstudio.com/all/__linux__/bionic/latest/web/packages/packages.rds'
##
## please change the CRAN mirror and choose anything but RStudio, by
## doing ..
chooseCRANmirror()
## then call the function 'pkgDepMetrics()'
pdm <- pkgDepMetrics("tradeSeq", depdf)
pdm
                     ImportedAndUsed Exported Usage DepOverlap DepGainIfExcluded
S4Vectors                          1      275  0.36       0.09                 0
dplyr                              2      261  0.77       0.30                 4
mgcv                               3      172  1.74       0.13                 0
ggplot2                           10      504  1.98       0.53                11
magrittr                           1       35  2.86       0.01                 0
BiocParallel                       2       67  2.99       0.12                 6
pbapply                            1       17  5.88       0.03                 1
SummarizedExperiment               6       79  7.59       0.32                 0
SingleCellExperiment               5       55  9.09       0.33                 0
slingshot                          4       23 17.39       0.43                 3
princurve                          1        5 20.00       0.08                 0
Biobase                           NA      128    NA       0.08                 0
edgeR                             NA      234    NA       0.13                 3
matrixStats                       NA      105    NA       0.01                 0
RColorBrewer                      NA        4    NA       0.01                 0
tibble                            NA       42    NA       0.24                 0
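as a quick aid for reading this table: the numbers are consistent with 'Usage' being the percentage of a dependency's exported functions that are imported and used, i.e. 100 * ImportedAndUsed / Exported. that formula is my reading of the output, so please check it against the help page. the sketch below copies a few rows from the table by hand (it does not call BiocPkgTools) and recomputes the column in base R:

```r
# A few rows copied by hand from the pkgDepMetrics() output above.
metrics <- data.frame(
  pkg               = c("S4Vectors", "dplyr", "mgcv", "ggplot2", "Biobase"),
  ImportedAndUsed   = c(1, 2, 3, 10, NA),
  Exported          = c(275, 261, 172, 504, 128),
  DepGainIfExcluded = c(0, 4, 0, 11, 0),
  stringsAsFactors  = FALSE
)

# Assumption: Usage is the percentage of exported functions actually used.
metrics$Usage <- round(100 * metrics$ImportedAndUsed / metrics$Exported, 2)

# Dependency whose removal would drop the most packages (among these rows).
best <- metrics$pkg[which.max(metrics$DepGainIfExcluded)]

# Dependencies for which no imported functionality could be identified.
unresolved <- metrics$pkg[is.na(metrics$Usage)]

print(metrics)
```

with these rows, 'best' is "ggplot2" and 'unresolved' is "Biobase", which matches what the full table shows.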
in the help page of 'pkgDepMetrics()' and the section "7 Dependency
burden" of the BiocPkgTools vignette, you can find a description of
these columns, but essentially we see that 'ggplot2' is the dependency
with the largest overlap with the dependency graph of 'tradeSeq'
(DepOverlap 0.53), and removing it would give you the largest reduction
in dependencies (DepGainIfExcluded 11). however, you're also using 10
functions from this package, so this is not a dependency you can easily
replace. you can try to explore whether you could get rid of the
dependencies for which 'BiocPkgTools' could not identify the
functionality imported, which are those with NA values in the column
'Usage'. you can explore what functions you're actually using with
'pkgDepImports()', for instance:
imp <- pkgDepImports("tradeSeq")
imp[imp$pkg %in% "dplyr", ]
# A tibble: 2 x 2
  pkg   fun
  <chr> <chr>
1 dplyr filter
2 dplyr mutate
this means that if you avoided using 'filter()' and 'mutate()', you
could in principle remove 'dplyr' as a dependency.
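to give an idea of what such a replacement could look like, here is a minimal base R sketch. the data frame and column names are made up for illustration and are not taken from tradeSeq, so how well this maps onto the actual calls in the package is an assumption:

```r
# Hypothetical toy data; not taken from tradeSeq.
df <- data.frame(
  gene  = c("a", "b", "c"),
  count = c(5, 0, 12),
  stringsAsFactors = FALSE
)

# Base R stand-in for dplyr::filter(df, count > 0).
# which() drops rows where the condition is NA, as filter() does.
kept <- df[which(df$count > 0), , drop = FALSE]

# Base R stand-in for dplyr::mutate(kept, logCount = log1p(count)).
kept$logCount <- log1p(kept$count)

print(kept)
```

one difference to keep in mind: base subsetting keeps the original row names (here 1 and 3), so reset them with 'rownames(kept) <- NULL' if later code relies on them.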
you also mentioned below that you moved packages from 'Imports' to
'Suggests'. to do this kind of analysis including the packages in
'Suggests', you need to call 'buildPkgDependencyDataFrame()' again,
adding 'Suggests' to the 'dependencies' argument, and then call
'pkgDepMetrics()'. however, i guess the packages in 'Suggests' are used
only in the vignette, so the solution there would be to try to simplify
the vignette.
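a sketch of that second run could look as follows. it reuses the same two BiocPkgTools calls shown earlier and only adds 'Suggests'; the wrapper name 'suggests_metrics' is made up here, and the function needs BiocPkgTools installed and network access when it is actually called:

```r
# Dependency fields to follow, now including 'Suggests'.
deps_with_suggests <- c("Depends", "Imports", "Suggests")

# Hypothetical wrapper; defining it does not contact any repository.
suggests_metrics <- function(pkg = "tradeSeq") {
  depdf <- BiocPkgTools::buildPkgDependencyDataFrame(
    repo = c("BioCsoft", "CRAN"),
    dependencies = deps_with_suggests
  )
  BiocPkgTools::pkgDepMetrics(pkg, depdf)
}

# Usage (not run here, requires BiocPkgTools and network access):
# pdm_suggests <- suggests_metrics("tradeSeq")
```

comparing the result with the earlier 'pdm' should show which suggested packages add the most to the dependency graph.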
cheers,
robert.
On 02/06/2020 23:18, Koen Van den Berge wrote:
> Dear All,
>
> We have recently extended our Bioconductor package tradeSeq <https://bioconductor.org/packages/devel/bioc/html/tradeSeq.html> to allow different input formats and accommodate extended downstream analyses, by building on other R/Bioconductor packages.
> However this has resulted in a significant increase in the number of dependencies due to relying on other packages that also have many dependencies, for example causing very long build times on Travis <https://travis-ci.com/github/statOmics/tradeSeq>.
>
> We are therefore wondering about current recommendations to reduce the dependency load. We have moved some larger packages from ‘Imports’ to ‘Suggests’, but to no avail.
>
> Best,
> Koen