[BioC] Discrepancies finding genes with a given GO term

Oscar Rueda Oscar.Rueda at cancer.org.uk
Wed Mar 16 12:51:37 CET 2011


Dear list, 

I'm trying to find the list of genes annotated with a given GO term. I'm using two
different methods, and I get results that differ from the AmiGO web page.
If I use: 

> library(GO.db)
> library(org.Hs.eg.db)
> res <- get("GO:0006913", revmap(org.Hs.egGO))
> res <- do.call('c', mget(res, org.Hs.egSYMBOL))
> sort(unique(res))
 [1] "AAAS"    "ANKRD54" "ANP32A"  "CAMK1"   "CDK5"    "EIF5A"   "MLX"
 [8] "MLXIP"   "MYBBP1A" "NPM1"    "NUP155"  "NUP205"  "NUP98"   "RAN"
[15] "RBM15B"  "RSRC1"   "SET"     "ZNF384"
 

I get 18 genes. According to AmiGO, AXIN1, CAMK4, NUP62, UPF3A and UPF3B
should be there too.

If I use biomaRt I get:

> library(biomaRt)
> ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
> res <- getBM(c("hgnc_symbol"), filters = "go", values = "GO:0006913", mart = ensembl)
> sort(unique(res[,1]))
  [1] "AAAS"    "ANKRD54" "ANP32A"  "AXIN1"   "CAMK1"   "CAMK4"   "CDC42"
  [8] "CDK5"    "DIRAS1"  "DIRAS2"  "DIRAS3"  "DNAJC27" "EFCAB4B" "EIF5A"
 [15] "HRAS"    "IFT27"   "KRAS"    "LRRK2"   "MLX"     "MLXIP"   "MRAS"
 [22] "MYBBP1A" "NPM1"    "NRAS"    "NUP155"  "NUP205"  "NUP54"   "NUP98"
 [29] "NUPL1"   "RAB10"   "RAB11A"  "RAB11B"  "RAB12"   "RAB13"   "RAB14"
 [36] "RAB15"   "RAB17"   "RAB18"   "RAB19"   "RAB1A"   "RAB1B"   "RAB20"
 [43] "RAB21"   "RAB22A"  "RAB23"   "RAB24"   "RAB25"   "RAB26"   "RAB27A"
 [50] "RAB27B"  "RAB28"   "RAB2A"   "RAB2B"   "RAB30"   "RAB31"   "RAB32"
 [57] "RAB33A"  "RAB33B"  "RAB34"   "RAB35"   "RAB36"   "RAB37"   "RAB38"
 [64] "RAB39"   "RAB39B"  "RAB3A"   "RAB3B"   "RAB3C"   "RAB3D"   "RAB40A"
 [71] "RAB40AL" "RAB40B"  "RAB40C"  "RAB41"   "RAB42P1" "RAB43"   "RAB43P1"
 [78] "RAB44"   "RAB4A"   "RAB4B"   "RAB5A"   "RAB5B"   "RAB5C"   "RAB6A"
 [85] "RAB6B"   "RAB6C"   "RAB7A"   "RAB7L1"  "RAB8A"   "RAB8B"   "RAB9A"
 [92] "RAB9B"   "RABL2A"  "RABL2B"  "RAC1"    "RAC2"    "RAC3"    "RALA"
 [99] "RALB"    "RAN"     "RAP1A"   "RAP1B"   "RAP2A"   "RAP2B"   "RAP2C"
[106] "RASD1"   "RASD2"   "RASEF"   "RASL11B" "RASL12"  "REM1"    "REM2"
[113] "RERG"    "RHOA"    "RHOB"    "RHOC"    "RHOD"    "RHOF"    "RHOG"
[120] "RHOH"    "RHOJ"    "RHOQ"    "RHOU"    "RHOV"    "RIT1"    "RIT2"
[127] "RND2"    "RND3"    "RRAS"    "RRAS2"   "RSRC1"   "SET"     "ZNF384"

This is a bigger list.
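
To see exactly where the two results disagree, the symbol vectors can be compared with base R set operations. The following is only a minimal sketch: the helper name compare_symbols is hypothetical (it is not part of any package), and the two short vectors are illustrative subsets copied from the outputs above, not the full lists.

```r
# Hypothetical helper (not part of any package): split two symbol vectors
# into the entries unique to each and the entries they share.
compare_symbols <- function(a, b) {
  list(
    only_a = sort(setdiff(a, b)),
    only_b = sort(setdiff(b, a)),
    both   = sort(intersect(a, b))
  )
}

# Illustrative subsets copied from the two outputs above, not the full lists.
orgdb_subset   <- c("AAAS", "RAN", "RBM15B", "SET")
biomart_subset <- c("AAAS", "AXIN1", "CAMK4", "RAN", "SET")

cmp <- compare_symbols(orgdb_subset, biomart_subset)
print(cmp)
```

With the full vectors from both queries in place of the subsets, only_a and only_b would list the symbols that each method reports and the other does not.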

I don't know exactly what's happening here. Can anyone tell me what I am
doing wrong?

Cheers, 
Oscar

> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] biomaRt_2.6.0        org.Hs.eg.db_2.4.6   GO.db_2.4.5
[4] RSQLite_0.9-2        DBI_0.2-5            AnnotationDbi_1.12.0
[7] Biobase_2.10.0

loaded via a namespace (and not attached):
[1] RCurl_1.4-3  tools_2.12.0 XML_3.2-0



Oscar M. Rueda, PhD.
Postdoctoral Research Fellow, Breast Cancer Functional Genomics.
Cancer Research UK Cambridge Research Institute.
Li Ka Shing Centre, Robinson Way.
Cambridge CB2 0RE 
England 



