[Bioc-devel] Feedback on OrganismDb development

Vincent Carey stvjc at channing.harvard.edu
Thu Apr 7 17:08:49 CEST 2016


On Thu, Apr 7, 2016 at 10:34 AM, Obenchain, Valerie <
Valerie.Obenchain at roswellpark.org> wrote:

> BioC developers,
>
> After the release we plan to continue development of the OrganismDb class
> and packages. This email outlines some ideas for future direction. We're
> interested in feedback on these points as well as other thoughts people
> might have.
>
> ## Background
>
> The OrganismDb class is defined in the OrganismDbi package and consists
> of a TxDb object and the combined mappings from GO.db and an OrgDb. It
> supports the select() interface as well as several range-based
> extractors such as exons(), transcripts(), etc. The idea was that given
> a particular organism, a user would only need a single package to access
> both systems biology and transcript-centric annotations.
>
> We currently have 3 OrganismDb packages
> (http://www.bioconductor.org/packages/release/BiocViews.html#___OrganismDb).
> These are lightweight and don't contain any data themselves but instead
> point to the GO.db, OrgDb and TxDb packages.
>
> ## Current issues
>
> - Support for sequence representation
>
> We've discussed incorporating an optional sequence component, maybe
> BSgenome or 2bit or ... ?
>

It could be convenient to have a reference to a relevant sequence source,
presumably the BSgenome... packages.


>
>
> - Class name
>
> OrganismDb is similar to OrgDb which could cause some confusion. We are
> considering renaming ... here are a few ideas. Let us know what you
> think or add your suggestion.
>
> OrganismDb (fine as is, leave it)
>

Leave it, I have seen no objections or confusions.


> FullOrgDb
> CrossDb
> MultipleDb
>
>
> - Package name
>
> The current names are not very descriptive: Homo.sapiens, Mus.musculus
> and Rattus.norvegicus.  We'd like to follow the naming convention used
> in our BSgenome and TxDb packages which means including the source,
> build and track from the TxDb as well as prefixing the name with the
> class type.
>
> For example, the current 'Homo.sapiens' package would be renamed
> 'OrganismDb.Hsapiens.UCSC.hg19.knownGene'.
>

A simple package name is great for promoting and getting use.  My sense is
that the OrganismDb concept is underused.  I find it a convenient place to
go for seqinfo, seqlengths, symbol translations.
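
A minimal sketch of those lookups, assuming the pre-made Homo.sapiens
package and its dependencies are installed (the gene symbol is only an
illustrative key, not one from this thread):

```r
library(Homo.sapiens)

# Sequence names, lengths and genome build come from the underlying TxDb
seqinfo(Homo.sapiens)
seqlengths(Homo.sapiens)["chr1"]

# Symbol translation: one select() call spanning the OrgDb and the TxDb
select(Homo.sapiens, keys = "BRCA1", keytype = "SYMBOL",
       columns = c("ENTREZID", "TXNAME"))
```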

The objects are lightweight enough that it would seem to me that we really
want to focus on methods for creating appropriate and valid instances at
the session level.  Parameters of interest would seem to be the genome
reference build, the gene model source, and maps of genomic feature sets
(GO, KEGG, etc.) that one would like to use with "select" in some rational
way.
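
One way this could look with the helpers that already exist, as a sketch
only: it assumes OrganismDbi, TxDb.Hsapiens.UCSC.hg19.knownGene and
org.Hs.eg.db are installed, and the choice of TxDb is illustrative.  Here
the TxDb fixes the reference build and the gene model source in one step.

```r
library(OrganismDbi)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

# Build an OrganismDb for this session from an existing TxDb.
# 'orgdb' is left at its default, so the helper selects the OrgDb itself.
odb <- makeOrganismDbFromTxDb(TxDb.Hsapiens.UCSC.hg19.knownGene)

# The session-level object answers the same queries as a packaged one
seqinfo(odb)
select(odb, keys = "BRCA1", keytype = "SYMBOL",
       columns = c("ENTREZID", "TXNAME"))
```

Choosing which feature-set maps (GO, KEGG, etc.) get attached is not
covered by this sketch.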



>
> - Pre-made packages
>
> Is it useful to supply pre-made packages or just increase awareness of
> the helpers so users can make their own? Current helpers:
>
> > ?makeOrganism
> ?makeOrganismDbFromBiomart  ?makeOrganismDbFromTxDb
> ?makeOrganismDbFromUCSC     ?makeOrganismPackage
>
> NOTE: makeOrganismPackage() will be renamed to makeOrganismDbPackage().
>
>
> Thanks.
> Valerie
>
