[Bioc-devel] Moving minfi classes definition to a lighter package
Robert Castelo
robert@c@@te|o @end|ng |rom up|@edu
Wed Mar 3 14:42:40 CET 2021
hi,
about a year ago we had a developers' forum session devoted to this
subject. you might find the discussion useful; it starts around minute
29 here:
https://www.youtube.com/watch?v=xsM4nN85cok
part of the result of that discussion is in section 7 of this vignette:
http://bioconductor.org/packages/release/bioc/vignettes/BiocPkgTools/inst/doc/BiocPkgTools.html#dependency-burden
which illustrates how to calculate some metrics on the dependency burden
of a package, using functionality we implemented in the package
BiocPkgTools. in the case of minfi, this is the output:
library(BiocPkgTools)
depdf <- buildPkgDependencyDataFrame(repo=c("BioCsoft", "CRAN"),
                                     dependencies=c("Depends", "Imports"))
minfidepmetrics <- pkgDepMetrics("minfi", depdf)
minfidepmetrics
                     ImportedAndUsed Exported  Usage DepOverlap DepGainIfExcluded
DelayedArray                       1      188   0.53       0.11                 0
grDevices                          1      112   0.89       0.01                 0
data.table                         1      100   1.00       0.01                 1
MASS                               1       78   1.28       0.04                 0
limma                              4      310   1.29       0.04                 0
reshape                            1       67   1.49       0.03                 2
nlme                               2      109   1.83       0.05                 1
utils                              4      216   1.85       0.01                 0
lattice                            3      144   2.08       0.05                 0
BiocGenerics                       5      141   3.55       0.04                 0
stats                             16      449   3.56       0.01                 0
siggenes                           2       51   3.92       0.13                 3
genefilter                         2       49   4.08       0.38                 3
Biobase                            6      128   4.69       0.05                 0
GenomeInfoDb                       3       60   5.00       0.09                 0
preprocessCore                     2       39   5.13       0.02                 1
GEOquery                           1       17   5.88       0.32                 4
HDF5Array                          5       72   6.94       0.15                 4
bumphunter                         1       14   7.14       0.76                25
BiocParallel                       6       68   8.82       0.07                 0
Biostrings                        23      240   9.58       0.11                 0
graphics                           9       87  10.34       0.01                 0
IRanges                           40      254  15.75       0.06                 0
S4Vectors                         47      278  16.91       0.05                 0
DelayedMatrixStats                14       74  18.92       0.14                 2
GenomicRanges                     23      106  21.70       0.12                 0
RColorBrewer                       1        4  25.00       0.01                 1
SummarizedExperiment              23       82  28.05       0.19                 0
illuminaio                         1        3  33.33       0.04                 2
quadprog                           1        2  50.00       0.01                 1
beanplot                           1        1 100.00       0.01                 1
mclust                            NA      271     NA       0.04                 1
nor1mix                           NA       38     NA       0.02                 1
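As a quick sanity check on how to read the table, the Usage column looks like ImportedAndUsed expressed as a percentage of Exported. The sketch below recomputes it in base R for a few rows copied by hand from the output above. The formula is an assumption inferred from the numbers shown here, and the names `m`, `UsageCheck` and `usage_matches` are made up for this illustration.

```r
# A few rows copied by hand from the pkgDepMetrics() output above.
m <- data.frame(
  ImportedAndUsed = c(1, 4, 1, 23, 1),
  Exported        = c(188, 310, 14, 82, 1),
  Usage           = c(0.53, 1.29, 7.14, 28.05, 100.00),
  row.names = c("DelayedArray", "limma", "bumphunter",
                "SummarizedExperiment", "beanplot")
)

# Assumption: Usage = 100 * ImportedAndUsed / Exported, rounded to 2 digits.
m$UsageCheck <- round(100 * m$ImportedAndUsed / m$Exported, 2)
m

# TRUE if the recomputed values agree with the printed Usage column.
usage_matches <- isTRUE(all.equal(m$Usage, m$UsageCheck))
usage_matches
```

Read this way, a low Usage value combined with a high DepGainIfExcluded value marks a dependency that is barely used but expensive to keep.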
so, with the exception of 'bumphunter', it doesn't look like the removal
of a single dependency will give you much gain. it seems that minfi
imports a single function from bumphunter:
imp <- pkgDepImports("minfi")
imp[imp$pkg %in% "bumphunter", ]
# A tibble: 1 x 2
  pkg        fun
  <chr>      <chr>
1 bumphunter bumphunter
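To make the idea of a per-dependency gain concrete, here is a toy sketch in base R on a made-up dependency graph. It is not the BiocPkgTools implementation. The package names (`top`, `heavy`, `light`, `core`, `h1`, `h2`) and the helpers `reachable()` and `dep_gain()` are hypothetical, and the gain is assumed to be the number of packages that drop out of the full dependency set once one direct dependency is removed.

```r
# Hypothetical dependency graph: each name maps to its direct dependencies.
deps <- list(
  top   = c("heavy", "light", "core"),
  heavy = c("h1", "h2", "core"),
  light = c("core"),
  core  = character(0),
  h1    = c("h2"),
  h2    = character(0)
)

# All packages reachable from `pkg`, optionally ignoring some of its
# direct dependencies (those listed in `exclude`).
reachable <- function(pkg, graph, exclude = character(0)) {
  todo <- setdiff(graph[[pkg]], exclude)
  seen <- character(0)
  while (length(todo) > 0) {
    cur  <- todo[1]
    todo <- todo[-1]
    if (!cur %in% seen) {
      seen <- c(seen, cur)
      todo <- c(todo, graph[[cur]])
    }
  }
  seen
}

# Gain = number of packages that disappear when `dep` is no longer imported.
dep_gain <- function(pkg, dep, graph) {
  length(reachable(pkg, graph)) - length(reachable(pkg, graph, exclude = dep))
}

gains <- vapply(deps[["top"]], function(d) dep_gain("top", d, deps), integer(1))
gains
```

In this toy graph, dropping `core` gains nothing because `heavy` and `light` still pull it in, which would be one way to get the many zeros in the DepGainIfExcluded column, while a dependency with its own private subtree, like `heavy`, is the kind that pays off.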
you can explore the gain by excluding combinations of package
dependencies with the function 'pkgCombDependencyGain()':
pcd <- pkgCombDependencyGain("minfi", depdf, maxNbr=2L)
dim(pcd)
[1] 561 3
head(pcd[order(pcd$DepGain, decreasing = TRUE), ])
                          Packages NbrExcl DepGain
160           bumphunter, GEOquery       2      43
175         bumphunter, genefilter       2      40
 98       BiocParallel, bumphunter       2      31
161          bumphunter, HDF5Array       2      29
165           bumphunter, siggenes       2      28
157 bumphunter, DelayedMatrixStats       2      27
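Two observations about this output can be checked with a little arithmetic in base R. The numbers are copied by hand from the outputs above; the variable names are made up for this sketch, and the count of 33 comes from the rows of the pkgDepMetrics() table.

```r
# Number of direct dependencies listed in the pkgDepMetrics() table above.
n_deps <- 33L

# Rows expected when excluding every single package and every pair
# (maxNbr=2L); compare with dim(pcd)[1].
n_comb <- choose(n_deps, 1) + choose(n_deps, 2)
n_comb

# Single-package gains copied from the DepGainIfExcluded column.
single <- c(bumphunter = 25, GEOquery = 4, genefilter = 3)

# Pair gains copied from the head(pcd) output.
pair <- c("bumphunter, GEOquery" = 43, "bumphunter, genefilter" = 40)

# Gain of each pair beyond the sum of the two single exclusions.
extra <- c(
  pair[["bumphunter, GEOquery"]]   - sum(single[c("bumphunter", "GEOquery")]),
  pair[["bumphunter, genefilter"]] - sum(single[c("bumphunter", "genefilter")])
)
extra
```

The pair gains exceed the sum of the single gains, presumably because some packages are reachable only through those two dependencies and disappear only when both are excluded. That is why the combinations are worth looking at, not just the single-package column.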
have fun with the dependency exploration game! :)
robert.
On 3/3/21 1:28 PM, Kasper Daniel Hansen wrote:
> I am happy to engage in a discussion about this, although I'm not sure that
> I am ultimately interested in having two packages.
>
> But first I would like to look at some dependency graphs. I am wondering
> what makes the dependency tree this big (and my tree is smaller than yours,
> but still big: library(minfi) gives me 16 attached packages and 89 loaded
> packages for the current release). This includes some part of the tidyverse
> which we don't really use much though (and which could probably get removed
> from the package with almost no work).
>
> What's the current best tool for dependency graphs in Bioconductor?
> pkgDepTools?
>
> Best,
> Kasper
>
> On Mon, Mar 1, 2021 at 6:24 PM Carlos Ruiz <carlos.ruiz using isglobal.org> wrote:
>
>> Dear Bioc developers,
>>
>> I have been developing different packages to analyze DNA methylation. In
>> all of them, I have used minfi's class GenomicRatioSet to manage DNA
>> methylation data, in order to take advantage of the features of
>> RangedSummarizedExperiment.
>>
>> Although I am very happy with the potential of the class, importing its
>> definition from minfi forces me to add the package to Imports. As minfi has a
>> high number of dependencies (129 in the current release), my packages end
>> up having hundreds of dependencies too. This is particularly problematic as
>> I do not use any of the other functions of minfi.
>>
>> I am wondering whether it could be possible to move minfi's class (or at
>> least GenomicRatioSet) to a lighter package, so people developing packages
>> on DNA methylation could rely on this class without having to import the
>> whole minfi package and its dependencies.
>>
>> Thank you very much,
>> --
>>
>> Carlos Ruiz
>
--
Robert Castelo, PhD
Associate Professor
Dept. of Experimental and Health Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514
fax: +34.933.160.550