[Bioc-devel] 2 candidates for BiocGenerics

Hervé Pagès hpages at fredhutch.org
Mon Mar 16 22:45:53 CET 2015


Hi Laurent,

On 03/10/2015 07:28 AM, Laurent Gatto wrote:
>
> Dear all,
>
> Two possible candidates for BiocGenerics:
>
>> GenomeInfoDb::species
> standardGeneric for "species" defined from package "GenomeInfoDb"
>
> function (x)
> standardGeneric("species")
> <environment: 0x7278130>
> Methods may be defined for arguments: x
> Use  showMethods("species")  for currently available ones.
>> AnnotationDbi::species
> standardGeneric for "species" defined from package "AnnotationDbi"
>
> function (x)
> standardGeneric("species")
> <environment: 0x89feb18>
> Methods may be defined for arguments: x
> Use  showMethods("species")  for currently available ones.
>
>
> and
>
>> annotate::organism
> standardGeneric for "organism" defined from package "annotate"
>
> function (object)
> standardGeneric("organism")
> <environment: 0x8771908>
> Methods may be defined for arguments: object
> Use  showMethods("organism")  for currently available ones.
>> GenomeInfoDb::organism
> standardGeneric for "organism" defined from package "GenomeInfoDb"
>
> function (x)
> standardGeneric("organism")
> <environment: 0x8232da0>
> Methods may be defined for arguments: x
> Use  showMethods("organism")  for currently available ones.

The situation is actually worse than that: we have 5 organism
generics, 4 `organism<-` generics, 7 species generics, and 2
`species<-` generics, defined across all software packages.

The various "species" methods sometimes return the genus
and species, and sometimes return the GenBank common name
(e.g. Human or Mouse), which I've heard can upset some biologists
(but note that UCSC has been doing this forever:
http://genome.ucsc.edu/FAQ/FAQreleases.html#release1).

Anyway, given that:

   (1) The NCBI Assembly pages have an "Organism name" field that
       contains the genus and species:

         http://www.ncbi.nlm.nih.gov/assembly/883148

   (2) On our website, all the views under "Organism" are
       of the form "Genus_species" (note the underscore):

         http://bioconductor.org/packages/devel/BiocViews.html#___Organism

   (3) Almost all our annotation packages have an organism field
       that contains the genus and species.

it seems that maybe we should just keep using organism() the way
it's used (i.e. for genus and species). Then the question is
what to do with species(): the methods that return the genus and
species are redundant with organism() and those that return the
common name just seem wrong. Suggestions?

When I move the organism() and species() generics to BiocGenerics
I'll add a man page for them. This will be a good place (and the
perfect time) to clarify what they're supposed to return. Having some
agreement on that will make it easier to develop tools that search
the ocean of annotations (900 annotation packages + 18992 records
on AnnotationHub).
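
To make the proposed convention concrete, here is a minimal sketch of a
single organism() generic whose methods return the genus and species. It
is only an illustration, not the actual BiocGenerics code: the argument
name "object", the ToyAnnotation class, and the organismToBiocView()
helper are all hypothetical names made up for this example.

```r
library(methods)

# One shared generic. The argument name 'object' is an assumption; the
# existing generics quoted above disagree on it ('x' vs 'object').
setGeneric("organism", function(object) standardGeneric("organism"))

# Hypothetical toy class standing in for an annotation object.
setClass("ToyAnnotation", slots = c(organism = "character"))

# Under the proposed convention the method returns "Genus species".
setMethod("organism", "ToyAnnotation", function(object) object@organism)

# Hypothetical helper: convert "Genus species" to the "Genus_species"
# form used by the views under "Organism" on the website.
organismToBiocView <- function(name) gsub(" ", "_", name, fixed = TRUE)

ann <- new("ToyAnnotation", organism = "Homo sapiens")
organism(ann)
organismToBiocView(organism(ann))
```

With one generic and one documented return format, a search tool could
match organism() values against the "Organism" views with a simple
string conversion like the one above.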

Thanks,
H.

>
>
> Best wishes,
>
> Laurent
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


