[Bioc-devel] Submit data package or use AnnotationHub?

Steve Lianoglou mailinglist.honeypot at gmail.com
Thu Dec 21 21:26:48 CET 2017


Hi all,

Briefly:

I'm looking to get guidance on how to handle data packages that
support a suite of software packages I'd like to submit to
Bioconductor.

More Detail:

We (Genentech) have open sourced some packages I've been developing
internally for the past few years that facilitate the execution and
exploration of gene set enrichment analyses.

https://github.com/lianos/multiGSEA
https://github.com/lianos/multiGSEA.shiny

I will submit them to Bioc "in the normal way". My question is how to
handle the data packages I have (listed under Suggests by multiGSEA)
that need to go in as well.

Would these data go in as data packages or via AnnotationHub?

multiGSEA provides convenience wrappers to retrieve gene sets from
different sources. One of these sources is the gene set collections
made available by MSigDB. Using multiGSEA, a user can get the hallmark
and c2 gene set collections like so:

```
library(multiGSEA)
gdb.human <- getMSigGeneSetDb(c("h", "c2"), "human")
gdb.mouse <- getMSigGeneSetDb(c("h", "c2"), "mouse")
```

These function calls check if the following data packages are
installed and retrieve the appropriate gene sets if so (otherwise they
raise an error):

https://github.com/lianos/GeneSetDb.MSigDB.Hsapiens.v61
https://github.com/lianos/GeneSetDb.MSigDB.Mmusculus.v61
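
To make the installed-or-error behaviour concrete, here is a minimal
sketch of that kind of check. The helper names (msigdb_pkg_name,
require_msigdb_pkg) are hypothetical and this is not multiGSEA's actual
implementation; it only assumes the package naming pattern shown in the
two URLs above.

```r
# Hypothetical helpers for illustration; NOT multiGSEA's actual code.

# Build the data package name from the species, following the pattern
# GeneSetDb.MSigDB.<Organism>.<version> seen in the repositories above.
msigdb_pkg_name <- function(species = c("human", "mouse"), version = "v61") {
  species <- match.arg(species)
  organism <- c(human = "Hsapiens", mouse = "Mmusculus")[[species]]
  paste("GeneSetDb", "MSigDB", organism, version, sep = ".")
}

# Raise an error if the matching data package is not installed;
# otherwise return its name invisibly.
require_msigdb_pkg <- function(species = "human", version = "v61") {
  pkg <- msigdb_pkg_name(species, version)
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Data package '", pkg, "' is not installed", call. = FALSE)
  }
  invisible(pkg)
}
```

Because the data packages are only suggested, a check like this keeps
multiGSEA installable without them and fails only when the MSigDB
collections are actually requested.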

I've set up these data packages to approximate what I think would be
suitable for AnnotationHub (i.e. with working
inst/scripts/make-data.R scripts). These data packages start with
MSigDB's gene set *.xml files (i.e. 'msigdb_v6.1.xml') and convert them
into multiGSEA::GeneSetDb *.rds objects, which are then used by the
multiGSEA and multiGSEA.shiny packages.
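
As a rough illustration of the last step only (writing an object to an
*.rds file and reading it back), here is a base R sketch. The table is a
made-up stand-in for gene sets parsed from the XML; the XML parsing and
the construction of a real GeneSetDb object are omitted, and none of
this is taken from the actual make-data.R scripts.

```r
# Toy stand-in for gene sets parsed from an MSigDB XML file.
# All values below are made up for illustration.
gene_sets <- data.frame(
  collection = c("h", "h", "c2"),
  name       = c("SET_A", "SET_A", "SET_B"),
  feature_id = c("1", "2", "3"),
  stringsAsFactors = FALSE
)

# Serialize to an .rds file, as a make-data script might do once
# the final object has been built.
rds_path <- file.path(tempdir(), "toy-genesets.rds")
saveRDS(gene_sets, rds_path)

# Consumers of the data package read the object back with readRDS().
restored <- readRDS(rds_path)
```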

How should I proceed from here?

Thanks,
-steve

ps: I know Bioc looks down on not using "foundational" Bioc classes,
and we can have this discussion during pkg review, but a GeneSetDb
object is a reimagined take on the GSEABase::GeneSetCollection.
Unfortunately the latter just wasn't providing the functionality I
wanted for how I like to interact with collections of
gene sets ... multiGSEA provides methods to convert a GeneSetCollection
to a GeneSetDb, and vice versa.


-- 
Steve Lianoglou
Bioinformatics Scientist
Cancer Immunology
Genentech


