[Bioc-devel] Weird monkey identifiers in org.Hs.eg.db
Aaron Lun
Tue Apr 23 05:53:35 CEST 2019
Playing around with org.Hs.eg.db 3.8.0. What on earth is ENSPTRG0000...?
> library(org.Hs.eg.db)
> mapIds(org.Hs.eg.db, keys="GCG", keytype="SYMBOL", column="ENSEMBL")
'select()' returned 1:many mapping between keys and columns
                 GCG
"ENSPTRG00000000777"
Well, at least it still recovers the right identifier... eventually.
> select(org.Hs.eg.db, keys="GCG", keytype="SYMBOL", columns="ENSEMBL")
'select()' returned 1:many mapping between keys and columns
  SYMBOL            ENSEMBL
1    GCG ENSPTRG00000000777
2    GCG    ENSG00000115263
The SYMBOL->Entrez ID relational table seems to be okay:
> Y <- toTable(org.Hs.egSYMBOL)
> Y[which(Y[,2]=="GCG"),]
     gene_id symbol
2152    2641    GCG
So the cause is the Ensembl->Entrez mappings:
> Z <- toTable(org.Hs.egENSEMBL2EG)
> Z[Z[,1]==2641,]
     gene_id         ensembl_id
3028    2641 ENSPTRG00000000777
3029    2641    ENSG00000115263
Googling suggests that ENSPTRG00000000777 is an identifier for some
other gene in one of the other monkeys. Hardly "Hs" stuff.
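As a stopgap until the mappings are fixed, the stray entries can be dropped by prefix. This is only a sketch, assuming that human Ensembl gene IDs are always "ENSG" followed directly by digits, whereas other species carry an extra code (ENSPTRG is Ensembl's prefix for Pan troglodytes genes). pick_human() and keep_human_ensembl() are hypothetical helpers, not part of AnnotationDbi, and the data frame simply copies the select() output above. The guarded mapIds() call at the end relies on the documented option of passing a function as multiVals, and only runs if org.Hs.eg.db is installed.

```r
# Assumption: human Ensembl gene IDs are "ENSG" followed directly by digits.
human_pattern <- "^ENSG[0-9]+$"

# Hypothetical helper: keep only rows with a human-looking Ensembl ID.
keep_human_ensembl <- function(tab, column = "ENSEMBL") {
    is_human <- grepl(human_pattern, tab[[column]])
    tab[is_human, , drop = FALSE]
}

# Hypothetical helper: from several candidate IDs, return the first
# human-looking one, or NA if there is none.
pick_human <- function(ids) {
    hits <- grep(human_pattern, ids, value = TRUE)
    if (length(hits) > 0) hits[1] else NA_character_
}

# Stand-in for the select() result shown above.
res <- data.frame(
    SYMBOL = c("GCG", "GCG"),
    ENSEMBL = c("ENSPTRG00000000777", "ENSG00000115263"),
    stringsAsFactors = FALSE
)

filtered <- keep_human_ensembl(res)
print(filtered)

# Optional: apply the same rule inside mapIds(), if the package is available.
if (requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
    print(AnnotationDbi::mapIds(org.Hs.eg.db::org.Hs.eg.db, keys = "GCG",
                                keytype = "SYMBOL", column = "ENSEMBL",
                                multiVals = pick_human))
}
```

This only hides the symptom on the user side; the extra rows in org.Hs.egENSEMBL2EG are still there.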
Session info (not technically R 3.6, but I doubt that's the cause):
> R Under development (unstable) (2019-04-11 r76379)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 18.04.2 LTS
>
> Matrix products: default
> BLAS: /home/luna/Software/R/trunk/lib/libRblas.so
> LAPACK: /home/luna/Software/R/trunk/lib/libRlapack.so
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel stats4 stats graphics grDevices utils datasets
> [8] methods base
>
> other attached packages:
> [1] org.Hs.eg.db_3.8.0 AnnotationDbi_1.45.1 IRanges_2.17.5
> [4] S4Vectors_0.21.23 Biobase_2.43.1 BiocGenerics_0.29.2
>
> loaded via a namespace (and not attached):
> [1] Rcpp_1.0.1 digest_0.6.18 DBI_1.0.0 RSQLite_2.1.1
> [5] blob_1.1.1 bit64_0.9-7 bit_1.1-14 compiler_3.7.0
> [9] pkgconfig_2.0.2 memoise_1.1.0