[Bioc-devel] Feedback on OrganismDb development

Tim Triche, Jr. tim.triche at gmail.com
Thu Apr 7 17:24:38 CEST 2016

Great!  This is an awesome opportunity to move to ENSEMBL as a default ;-) (only half kidding, by the way)

1) BSgenome/2bit would be great -- I use this sometimes to generate fusion transcripts with defined breakpoints to supplement existing txomes

2) class name: don't change it

3) pre-made packages: god yes. Try creating an ENSEMBL TxDb from a GTF on a laptop sometime!  I am planning to help a bit in this respect with direct Reactome mappings of various ID types for downstream analysis, so this is not just a feature request; I will help with it.

Thanks for picking this up. I and others use the OrganismDbi packages all the time and were wondering what would become of them now that Marc has moved to Seattle Children's. It is great to hear that they will receive renewed attention because it is really handy infrastructure. About all I could ask for is Drosophila, Danio, and Caenorhabditis organism packages ;-)

Thank you, 


> On Apr 7, 2016, at 7:34 AM, Obenchain, Valerie <Valerie.Obenchain at roswellpark.org> wrote:
> BioC developers,
> After the release we plan to continue development of the OrganismDb class
> and packages. This email outlines some ideas for future direction. We're
> interested in feedback on these points as well as other thoughts people
> might have.
> ## Background
> The OrganismDb class is defined in the OrganismDbi package and consists
> of a TxDb object and the combined mappings from GO.db and an OrgDb. It
> supports the select() interface as well as several range-based
> extractors such as exons(), transcripts(), etc. The idea was that given
> a particular organism, a user would only need a single package to access
> both systems biology and transcript-centric annotations.
> We currently have 3 OrganismDb packages
> (http://www.bioconductor.org/packages/release/BiocViews.html#___OrganismDb).
> These are lightweight and don't contain any data themselves but instead
> point to the GO.db, OrgDb and TxDb packages.
> ## Current issues
> - Support for sequence representation
> We've discussed incorporating an optional sequence component, maybe
> BSgenome or 2bit or ... ?
> - Class name
> OrganismDb is similar to OrgDb which could cause some confusion. We are
> considering renaming ... here are a few ideas. Let us know what you
> think or add your suggestion.
> OrganismDb (fine as is, leave it)
> FullOrgDb
> CrossDb
> MultipleDb
> - Package name
> The current names are not very descriptive: Homo.sapiens, Mus.musculus
> and Rattus.norvegicus.  We'd like to follow the naming convention used
> in our BSgenome and TxDb packages which means including the source,
> build and track from the TxDb, prefixed with the class type.
> For example, the current 'Homo.sapiens' package would be renamed
> 'OrganismDb.Hsapiens.UCSC.hg19.knownGene'.
> - Pre-made packages
> Is it useful to supply pre-made packages or just increase awareness of
> the helpers so users can make their own? Current helpers:
>> ?makeOrganism
> ?makeOrganismDbFromBiomart  ?makeOrganismDbFromTxDb    
> ?makeOrganismDbFromUCSC     ?makeOrganismPackage
> NOTE: makeOrganismPackage() will be renamed to makeOrganismDbPackage().
> Thanks.
> Valerie