[Bioc-devel] Opinions on a meta-annotation package

Kasper Daniel Hansen
Thu Oct 24 22:11:08 CEST 2019


From your description it very much sounds like creating a new package is
the way to go.

On Thu, Oct 24, 2019 at 3:03 PM Pages, Herve wrote:

> Hi Panagiotis,
>
> Avoiding code repetition is always a good idea. An alternative to the
> creation of a 3rd package would be to have one of the 2 packages depend
> on the other. If that is not a good option (and there might be some
> valid reasons for that), then yes, factoring out the repeated code and
> putting it in a 3rd package is a good option.
>
> Note that your subject line is confusing: you're asking for opinions on a
> meta-annotation package but, IIUC, this is about the creation of a
> **software** package that would provide tools for building and/or
> querying a certain type of annotations, right? I think of a
> meta-annotation package as a data package that would contain searchable
> metadata about existing biological annotations, but that is not what we
> are talking about here, is it?
>
> Also I wonder how much overlap there would be between this new package
> and packages like AnnotationDbi, AnnotationForge, GenomicFeatures,
> ensembldb which also provide functionalities for creating and querying
> annotations. For example AnnotationForge and AnnotationDbi are used to
> create and query the hundreds of "classic" *db packages.
>
> Best,
> H.
>
> On 10/20/19 19:56, Panagiotis Moulos wrote:
> > Dear developers,
> >
> > I maintain two packages (metaseqR, recoup) and am about to submit an
> > enhanced (but different in many points, thus a new package) version of
> > the 1st (metaseqR2). During their development, maintenance and usage,
> > these packages have come to use a common underlying annotation system
> > for the genomic regions they operate on, which of course makes use of
> > Bioconductor facilities and structures (GenomicRanges,
> > GenomicAlignments, BSgenome, GenomicFeatures etc.).
> >
> > This annotation system:
> > - Builds a local SQLite database
> > - Supports certain "custom" genomic features which are required for the
> > modeling made by these packages
> > - Is currently embedded in each package
> > - Has almost evolved into a package of its own with respect to
> > independent functionalities
> >
> > The reason for this mail is that I would like to ask your opinion on
> > whether it is worthwhile to create a new package to host the annotation
> > functions and detach them from the other two. Some points to support
> > this idea:
> >
> > 1. It's used in the same manner by two other packages, thus there is a
> > lot of code repetition
> > 2. Users (including myself) often load one of these packages just to use
> > it to fetch genomic region annotations for other purposes outside the
> > scope of each package (metaseqR - RNA-Seq data analysis, recoup - NGS
> > signal visualization).
> > 3. It automatically constructs the required annotation regions to analyze
> > Lexogen Quant-Seq data (a protocol we are using a lot), a function which
> > may be useful to many others
> > 4. The database created can be expanded with custom user annotations
> > using a GTF file to create it (making use of makeTxDbFromGFF)
> > 5. Supports various annotation sources (Ensembl, UCSC, RefSeq, custom) in
> > one place
> > 6. Has a versioning system, allowing transparency and reproducibility
> > when required
> >
> > Some (maybe obvious) points against this idea:
> >
> > 1. Bioconductor already has a robust and rich genomic annotation system
> > which can be used and re-used as necessary
> > 2. Maybe there is no need for yet another annotation-related package
> > 3. There is possibly no wide acceptance for such a package, other than my
> > usage in the other two, and maybe a few more users that make use of the
> > annotation functionalities
> > 4. Does not follow standard Bioconductor guidelines for creating
> > annotation packages (on the other hand it's not an annotation package in
> > the strict sense, but more a meta-annotation package).
> >
> > Do you have any thoughts or opinions on the best course of action?
> >
> > Best regards,
> >
> > Panagiotis
> >
>
> --
> Hervé Pagès


-- 
Best,
Kasper
