[Bioc-devel] Request for comment metagenomeFeatures package

Nathan Olson nathandavidolson at gmail.com
Wed Aug 5 15:04:23 CEST 2015


Thanks, Martin.  I agree using AnnotationHub to manage the db resources is
a better option than how it is currently set up.  A pull request would be
much appreciated.
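
To make the discussion below concrete, here is a rough sketch of how the
explicit, user-triggered retrieval Vincent asks for could be combined with
Martin's AnnotationHub lookup. It is only an illustration under stated
assumptions: load_mgdb_resources() is a hypothetical helper, not a function
in metagenomeFeatures or greengenes13.5MgDb, and "AH12345" / "AH12346" are
the placeholder IDs from Martin's example, not real hub records.

```r
# Sketch only: load_mgdb_resources() is a hypothetical helper, and the
# default IDs are the placeholders from the thread, not real hub records.
# Nothing is fetched at package load; the user calls this explicitly.
load_mgdb_resources <- function(hub = AnnotationHub::AnnotationHub(),
                                seq_id = "AH12345",
                                taxa_id = "AH12346") {
    # Announce the retrieval so any download is never silent.
    message("Retrieving resources ", seq_id, " and ", taxa_id)
    list(db_seq = hub[[seq_id]],
         db_taxa_file = hub[[taxa_id]])
}
```

Because the hub is an argument, the helper can be exercised offline with any
object that supports `[[`, for example a named list standing in for the hub.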

On Tue, Aug 4, 2015 at 3:14 PM Vincent Carey <stvjc at channing.harvard.edu>
wrote:

> On Tue, Aug 4, 2015 at 3:00 PM, Martin Morgan <mtmorgan at fredhutch.org>
> wrote:
>
>> On 08/04/2015 06:43 AM, Nathan Olson wrote:
>>
>>> We are starting to work on an infrastructure for annotation of 16S
>>> metagenomic
>>> sequencing datasets and would like your comments and/or contributions.
>>> Below are
>>> links to two github repositories: metagenomeFeatures and
>>> greengenes13.5MgDb.
>>> The metagenomeFeatures package contains two classes: mgDb, for 16S
>>> sequence
>>> databases, and metagenomeAnnotation, for annotating a sequence dataset
>>> with
>>> taxonomic information from a mgDb object.  The greengenes13.5MgDb
>>> package loads
>>> a mgDb object with the greengenes 13.5 database.  greengenes 13.5 was
>>> used as an
>>>
>>
>> does it make sense to use AnnotationHub to manage these resources?
>
>
> I would think so.  At this time, trying to install greengenes13.5MgDb
> package, the process "testing whether the package
> can be loaded" takes a very long time -- I suspect it is doing some silent
> downloading.  IMHO such activities
> should be explicitly undertaken by the user.
>
>
>> Instead of downloading and managing the fasta and taxonomy files in
>> .onLoad and getGreenGenes13.5Db, .onLoad would be
>>
>>   hub = AnnotationHub()
>>   db_seq = hub[["AH12345"]]
>>   db_taxa_file = hub[["AH12346"]]
>>
>>
> With this setup the first installation of the package could involve a long
> download, silent by default.  It's feasible but
> quite unusual.
>
>
>> with a 'recipe' describing how the corresponding annotation hub resources
>> are to be created. This would move download and management to
>> AnnotationHub, and potentially allow use of the annotation hub records by
>> people with other interests. If that sounds interesting we can work up a
>> pull request.
>>
>> Martin
>>
>> example database, we plan on adding additional packages for other
>>> commonly used
>>> databases, e.g., RDP and Silva.
>>>
>>> The metagenomeFeatures package includes two vignettes demonstrating the mgDb
>>> and
>>> metagenomeAnnotation class methods using the greengenes13.5MgDb as an
>>> example
>>> database.
>>>
>>> We are planning on adding additional methods for the mgDb and
>>> metagenomeAnnotation classes.  For the mgDb class, assigning query
>>> sequences to
>>> database sequences using rRDP classifier, and/or sequence alignment
>>> methods that
>>> are part of the Biostrings package.  For the metagenomeAnnotation class
>>> we plan
>>> to include the ability to create a phylogenetic tree from a
>>> metagenomeAnnotation
>>> object.
>>> We would appreciate comments on the package and suggestions for
>>> additional features.
>>>
>>> Links to package github repositories
>>>
>>> https://github.com/HCBravoLab/metagenomeFeatures
>>>
>>> https://github.com/HCBravoLab/greengenes13.5MgDb
>>>
>>> Thanks
>>>
>>> Nate Olson and Hector Corrada Bravo
>>>
>>
>>
>> --
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>>
>> Location: Arnold Building M1 B861
>> Phone: (206) 667-2793
>
> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>
