[Bioc-devel] annotation data not updated?

James W. MacDonald jmacdon at uw.edu
Wed Nov 15 01:54:54 CET 2017


On Thu, Nov 9, 2017 at 9:48 AM, Van Twisk, Daniel <
Daniel.VanTwisk at roswellpark.org> wrote:

> Thanks for looking into this.  New versions of the OrgDbs and Db0s
> (v3.5.0) are now available that have up-to-date resources.  Here is the
> output of the new org.Hs.eg.db


Does this issue affect the OrgDbs on AnnotationHub as well? I am finding,
for example, that the OrgDb for Salmo salar contains GO IDs that no longer
exist in GO.db.

> zz
OrgDb object:
| DBSCHEMAVERSION: 2.1
| DBSCHEMA: NOSCHEMA_DB
| ORGANISM: Salmo salar
| SPECIES: Salmo salar
| CENTRALID: GID
| Taxonomy ID: 8030
| Db type: OrgDb
| Supporting package: AnnotationDbi

Please see: help('select') for usage information
> sum(!keys(zz, "GOALL") %in% keys(GO.db))
[1] 38
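
For anyone who wants to see which IDs are affected rather than just how
many, a minimal sketch follows. The helper name stale_go_ids is hypothetical
(not part of AnnotationDbi), the toy IDs are made-up placeholders, and the
commented lines assume zz is the Salmo salar OrgDb shown above with
AnnotationDbi and GO.db loaded.

```r
# Hypothetical helper: IDs found in an annotation object but absent from
# a reference set of current IDs, returned sorted and de-duplicated.
stale_go_ids <- function(orgdb_ids, current_ids) {
    sort(setdiff(orgdb_ids, current_ids))
}

# Toy illustration with placeholder IDs (not real GO terms):
stale_go_ids(c("GO:A", "GO:B", "GO:C"), c("GO:A", "GO:C"))

# With the real objects (assumes AnnotationDbi and GO.db are loaded):
# stale <- stale_go_ids(keys(zz, "GOALL"), keys(GO.db))
# length(stale)   # should match the count above if keys() returns unique IDs
# head(stale)
```
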

But this isn't true of, for example, the Homo sapiens OrgDb from
AnnotationHub:

> z
OrgDb object:
| DBSCHEMAVERSION: 2.1
| Db type: OrgDb
| Supporting package: AnnotationDbi
| DBSCHEMA: HUMAN_DB
| ORGANISM: Homo sapiens
| SPECIES: Human
| EGSOURCEDATE: 2017-Nov6
| EGSOURCENAME: Entrez Gene
| EGSOURCEURL: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA
| CENTRALID: EG
| TAXID: 9606
| GOSOURCENAME: Gene Ontology
| GOSOURCEURL: ftp://ftp.geneontology.org/pub/go/godatabase/archive/latest-lite/
| GOSOURCEDATE: 2017-Nov01
| GOEGSOURCEDATE: 2017-Nov6
| GOEGSOURCENAME: Entrez Gene
| GOEGSOURCEURL: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA
| KEGGSOURCENAME: KEGG GENOME
| KEGGSOURCEURL: ftp://ftp.genome.jp/pub/kegg/genomes
| KEGGSOURCEDATE: 2011-Mar15
| GPSOURCENAME: UCSC Genome Bioinformatics (Homo sapiens)
| GPSOURCEURL:
| GPSOURCEDATE: 2017-Oct9
| ENSOURCEDATE: 2017-Aug23
| ENSOURCENAME: Ensembl
| ENSOURCEURL: ftp://ftp.ensembl.org/pub/current_fasta
| UPSOURCENAME: Uniprot
| UPSOURCEURL: http://www.UniProt.org/
| UPSOURCEDATE: Tue Nov  7 20:57:02 2017

Please see: help('select') for usage information
> sum(!keys(z, "GOALL") %in% keys(GO.db))
[1] 0


But I am not sure when they were added, because the human OrgDb has an
rdatadateadded that is obviously not correct, since it precedes the
SOURCEDATEs from the OrgDb itself!

> mcols(hub["AH57973"])$rdatadateadded  # Human
[1] "2017-10-23"
> mcols(hub["AH58003"])$rdatadateadded  # Salmo salar
[1] "2017-10-27"
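
The comparison can be made explicit with something like the sketch below.
parse_source_date is a hypothetical helper written for this message, not an
AnnotationDbi function; it only understands the "2017-Nov6" style used by
most of the *SOURCEDATE fields and returns NA for anything else (such as the
UPSOURCEDATE timestamp). The commented lines assume that metadata(z) returns
a data.frame with name and value columns.

```r
# Hypothetical helper: parse dates written like "2017-Nov6" or "2017-Nov01".
# Month names are matched against base R's month.abb, so the result does
# not depend on the session locale. Unparseable strings become NA.
parse_source_date <- function(x) {
    pat <- "^([0-9]{4})-([A-Za-z]{3})([0-9]{1,2})$"
    ok <- grepl(pat, x)
    out <- rep(as.Date(NA), length(x))
    mon <- match(sub(pat, "\\2", x[ok]), month.abb)
    day <- as.integer(sub(pat, "\\3", x[ok]))
    iso <- sprintf("%s-%02d-%02d", sub(pat, "\\1", x[ok]), mon, day)
    out[ok] <- as.Date(iso, format = "%Y-%m-%d")
    out
}

# Toy check using dates quoted in this thread:
added <- as.Date("2017-10-23")   # rdatadateadded reported for AH57973
src <- parse_source_date(c("2017-Nov6", "2017-Nov01", "2017-Oct9"))
src > added   # TRUE where a source date is later than the hub date

# With the real object (assumption: metadata() gives name/value columns):
# md <- metadata(z)
# parse_source_date(md$value[grepl("SOURCEDATE$", md$name)])
```
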

Best,

Jim





>
> > x <- org.Hs.eg.db
> > x
> OrgDb object:
> | DBSCHEMAVERSION: 2.1
> | Db type: OrgDb
> | Supporting package: AnnotationDbi
> | DBSCHEMA: HUMAN_DB
> | ORGANISM: Homo sapiens
> | SPECIES: Human
> | EGSOURCEDATE: 2017-Nov6
> | EGSOURCENAME: Entrez Gene
> | EGSOURCEURL: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA
> | CENTRALID: EG
> | TAXID: 9606
> | GOSOURCENAME: Gene Ontology
> | GOSOURCEURL: ftp://ftp.geneontology.org/pub/go/godatabase/archive/latest-lite/
> | GOSOURCEDATE: 2017-Nov01
> | GOEGSOURCEDATE: 2017-Nov6
> | GOEGSOURCENAME: Entrez Gene
> | GOEGSOURCEURL: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA
> | KEGGSOURCENAME: KEGG GENOME
> | KEGGSOURCEURL: ftp://ftp.genome.jp/pub/kegg/genomes
> | KEGGSOURCEDATE: 2011-Mar15
> | GPSOURCENAME: UCSC Genome Bioinformatics (Homo sapiens)
> | GPSOURCEURL:
> | GPSOURCEDATE: 2017-Oct9
> | ENSOURCEDATE: 2017-Aug23
> | ENSOURCENAME: Ensembl
> | ENSOURCEURL: ftp://ftp.ensembl.org/pub/current_fasta
> | UPSOURCENAME: Uniprot
> | UPSOURCEURL: http://www.UniProt.org/
> | UPSOURCEDATE: Tue Nov  7 20:57:02 2017
>
>
> ________________________________
> From: Bioc-devel <bioc-devel-bounces at r-project.org> on behalf of
> Obenchain, Valerie <Valerie.Obenchain at RoswellPark.org>
> Sent: Thursday, November 2, 2017 12:47:43 PM
> To: Yu, Guangchuang; bioc-devel
> Subject: Re: [Bioc-devel] annotation data not updated?
>
> Guangchuang,
>
> Thanks for reporting this. We've looked into it and there is indeed a more
> recent version of the data. Daniel is working on re-generating the db0 and
> OrgDb packages. We'll post back with more information when the packages are
> ready.
>
> Valerie
>
>
> On 11/02/2017 05:40 AM, Yu, Guangchuang wrote:
>
> Dear all,
>
> I just upgraded BioC to 3.6 and found that the source data for org.Hs.eg.db
> and GO.db are still half a year old.
>
> I was wondering whether these packages have been updated in the current
> release.
>
>
>
> org.Hs.eg.db
>
>
> OrgDb object:
> | DBSCHEMAVERSION: 2.1
> | Db type: OrgDb
> | Supporting package: AnnotationDbi
> | DBSCHEMA: HUMAN_DB
> | ORGANISM: Homo sapiens
> | SPECIES: Human
> | EGSOURCEDATE: *2017-Mar29*
> | EGSOURCENAME: Entrez Gene
> | EGSOURCEURL: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA
> | CENTRALID: EG
> | TAXID: 9606
> | GOSOURCENAME: Gene Ontology
> | GOSOURCEURL: ftp://ftp.geneontology.org/pub/go/godatabase/archive/latest-lite/
> | GOSOURCEDATE: *2017-Mar29*
> | GOEGSOURCEDATE: 2017-Mar29
> | GOEGSOURCENAME: Entrez Gene
> | GOEGSOURCEURL: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA
> | KEGGSOURCENAME: KEGG GENOME
> | KEGGSOURCEURL: ftp://ftp.genome.jp/pub/kegg/genomes
> | KEGGSOURCEDATE: 2011-Mar15
> | GPSOURCENAME: UCSC Genome Bioinformatics (Homo sapiens)
> | GPSOURCEURL:
> | GPSOURCEDATE: 2017-Sep7
> | ENSOURCEDATE: 2017-Mar29
> | ENSOURCENAME: Ensembl
> | ENSOURCEURL: ftp://ftp.ensembl.org/pub/current_fasta
> | UPSOURCENAME: Uniprot
> | UPSOURCEURL: http://www.UniProt.org/
> | UPSOURCEDATE: Thu Oct  5 16:07:33 2017
>
> Please see: help('select') for usage information
>
>
> GO.db
>
>
> GODb object:
> | GOSOURCENAME: Gene Ontology
> | GOSOURCEURL: ftp://ftp.geneontology.org/pub/go/godatabase/archive/latest-lite/
> | GOSOURCEDATE: *2017-Mar29*
> | Db type: GODb
> | package: AnnotationDbi
> | DBSCHEMA: GO_DB
> | GOEGSOURCEDATE: 2017-Mar29
> | GOEGSOURCENAME: Entrez Gene
> | GOEGSOURCEURL: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA
> | DBSCHEMAVERSION: 2.1
>
> Please see: help('select') for usage information
>
>
> sessionInfo()
>
>
> R version 3.4.2 (2017-09-28)
> Platform: x86_64-apple-darwin15.6.0 (64-bit)
> Running under: macOS Sierra 10.12.6
>
> Matrix products: default
> BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
> LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] parallel  stats4    stats     graphics  grDevices utils     datasets
> [8] methods   base
>
> other attached packages:
>  [1] org.Hs.eg.db_3.4.2   GO.db_3.4.2          AnnotationDbi_1.40.0
>  [4] IRanges_2.12.0       S4Vectors_0.16.0     Biobase_2.38.0
>  [7] BiocGenerics_0.24.0  rvcheck_0.0.9        rmarkdown_1.6
> [10] roxygen2_6.0.1       magrittr_1.5         BiocInstaller_1.28.0
>
> loaded via a namespace (and not attached):
>  [1] Rcpp_0.12.13    knitr_1.17      xml2_1.1.1      bit_1.1-12
>  [5] R6_2.2.2        rlang_0.1.2     blob_1.1.0      stringr_1.2.0
>  [9] tools_3.4.2     DBI_0.7         htmltools_0.3.6 commonmark_1.4
> [13] bit64_0.9-7     rprojroot_1.2   digest_0.6.12   tibble_1.3.4
> [17] memoise_1.1.0   RSQLite_2.0     evaluate_0.10.1 stringi_1.1.5
> [21] compiler_3.4.2  backports_1.1.1 pkgconfig_2.0.1
>
>
> This email message may contain legally privileged and/or confidential
> information.  If you are not the intended recipient(s), or the employee or
> agent responsible for the delivery of this message to the intended
> recipient(s), you are hereby notified that any disclosure, copying,
> distribution, or use of this email message is prohibited.  If you have
> received this message in error, please notify the sender immediately by
> e-mail and delete this email message from your computer. Thank you.
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel



-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
