[Bioc-devel] NCBI taxonomy annotation
Brian Schilder
br|@n_@ch||der @end|ng |rom @|umn|@brown@edu
Mon Aug 9 22:15:48 CEST 2021
Hi Levi,
I recently put together a new package called orthogene <https://github.com/neurogenomics/orthogene> (currently under review at Bioconductor). It has a convenience function, map_species(), for flexibly mapping species identifiers between ID types, including NCBI taxonomy IDs.
It may not be as comprehensive as GenomeInfoDbData, but it might still be useful.
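
For comparison, here is a minimal sketch of the kind of name-to-taxid lookup discussed in the quoted message below. It uses base R only and a small hand-built data frame that merely mimics the three columns of specData shown there. The names spec_toy and lookup_taxid are hypothetical and are not part of GenomeInfoDbData or orthogene.

```r
# Toy stand-in for GenomeInfoDbData::specData (assumption: same three
# columns as in the quoted head() output; the real table is far larger).
spec_toy <- data.frame(
  tax_id  = c(2L, 6L, 7L, 9L, 562L),
  genus   = c("Bacteria", "Azorhizobium", "Azorhizobium",
              "Buchnera", "Escherichia"),
  species = c(NA, NA, "caulinodans", "aphidicola", "coli"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split a "Genus species" string and return the
# tax_id(s) of rows matching both parts. Rows with a missing species
# (e.g. "Bacteria") are never matched.
lookup_taxid <- function(binomial, spec = spec_toy) {
  parts <- strsplit(binomial, " ", fixed = TRUE)[[1]]
  stopifnot(length(parts) == 2)
  hit <- !is.na(spec$species) &
    spec$genus == parts[1] &
    spec$species == parts[2]
  spec$tax_id[hit]
}

lookup_taxid("Escherichia coli")  # 562
```

A lookup like this only covers genus and species, which is the limitation of specData raised in the original question.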
Best,
Brian
___________
Brian Schilder
PhD Candidate
UK Dementia Research Institute at Imperial College London
Faculty of Medicine, Department of Brain Sciences, Neurogenomics Lab
Profile | bit.ly/imperial_profile <https://bit.ly/imperial_profile>
LinkedIn | linkedin.com/in/brian-schilder <https://www.linkedin.com/in/brian-schilder/>
Twitter | twitter.com/BMSchilder <http://www.twitter.com/BMSchilder>
Lab | neurogenomics.co.uk <http://neurogenomics.co.uk/>
UK DRI | www.ukdri.ac.uk <http://www.ukdri.ac.uk/>
> On 8 Aug 2021, at 19:10, Levi Waldron <lwaldron.research using gmail.com> wrote:
>
> Does anyone else do mapping between NCBI taxids, names, and ranks? We do
> this in curatedMetagenomicData and soon other packages, currently using
> external files that lack provenance and versioning, so Ludwig Geistlinger
> was looking for Bioconductor annotation resources. The closest he found was
> in GenomeInfoDbData <https://bioconductor.org/packages/GenomeInfoDbData> but
> this has only genus and species, and some quirks like Bacteria being listed
> as a genus:
>
>> library(GenomeInfoDbData)
>> data(specData)
>> head(specData)
>   tax_id        genus     species
> 1      1          all        <NA>
> 2      1         root        <NA>
> 3      2     Bacteria        <NA>
> 4      6 Azorhizobium        <NA>
> 5      7 Azorhizobium caulinodans
> 6      9     Buchnera  aphidicola
>> dim(specData)
> [1] 2521271 3
>> subset(specData, c(genus == "Escherichia" & species == "coli"))$tax_id
> [1] 562
>
> Any thoughts from the GenomeInfoDbData maintainer ("Bioconductor Maintainer
> <maintainer at bioconductor.org>") about a pull request either to a) update
> specData to add additional columns for all taxonomic levels, or b) create
> a new object? Or another approach altogether? See
> https://github.com/waldronlab/curatedMetagenomicData/issues/245.
>
> --
>
> Levi Waldron
>
> Associate Professor
>
> Department of Epidemiology and Biostatistics
>
> CUNY Graduate School of Public Health and Health Policy
>
> Institute for Implementation Science in Population Health
>
> 55 W 125th St, New York NY 10035
>
> https://waldronlab.io
>
> Join the microbiome Virtual International Forum: https://microbiome-vif.org
>
> _______________________________________________
> Bioc-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel