[BioC] mismatch in GO terms between topGO_1.14.0 and org.Mm.eg.db_2.3.6
Dick Beyer
dbeyer at u.washington.edu
Wed Mar 3 01:15:33 CET 2010
Hello,
I've been running topGO (using mouse Entrez Gene IDs) and found that some GO terms that turn up in the topGO analysis are not in the GO terms from org.Mm.eg.db.
I'd like to give example code that reproduces the problem, but my topGO code is quite long. The output looks like this:
allResults[[1]][[1]][1:2,]
         GO.ID                                Term Annotated Significant Expected classic    elim weight
714 GO:0019222     regulation of metabolic process      2498         143   107.08 0.00010 0.17956 0.9057
762 GO:0006807 nitrogen compound metabolic process      3413         186   146.31 0.00011 0.45337 0.9434
So the topGO output includes a column of GO IDs (GO.ID), along with the term names and test statistics.
Some of the problem GOIDs from topGO are GO:0030522, GO:0051094, GO:0031497, GO:0046700.
I can't find these in names(Mm.egGO2EG).
library("org.Mm.eg.db")
Mm.egGO2EG <- as.list(org.Mm.egGO2EG)
grep("GO:0030522",names(Mm.egGO2EG))
integer(0)
Is it possible that topGO takes its GO terms from GO.db, while I am looking them up in org.Mm.eg.db? When I look up GO:0030522 for Mus musculus at geneontology.org, the term is valid.
I'm puzzled by the mismatch. I want to get the genes for a given GO ID, so there is probably a workaround. If anyone has a suggestion or idea, I'd be very grateful to know what to try.
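One possible workaround, sketched here under an assumption that is not confirmed in this post: the missing terms may have no genes annotated to them directly, only to their descendant terms. org.Mm.eg.db also provides org.Mm.egGO2ALLEGS, which maps a GO ID to the genes annotated to that term or to any of its child terms. The variable name Mm.egGO2ALLEGS below is just an illustrative choice, and whether a given GO ID is present depends on the annotation package version.

```r
library("org.Mm.eg.db")

# Same lookup as above, but against the map that includes genes
# annotated to child terms (an assumption about the cause of the mismatch).
Mm.egGO2ALLEGS <- as.list(org.Mm.egGO2ALLEGS)

goid <- "GO:0030522"

# TRUE if the term has at least one direct or inherited annotation.
goid %in% names(Mm.egGO2ALLEGS)

# Entrez Gene IDs for the term; a gene can appear once per evidence code,
# so duplicates are removed. Returns NULL if the term is absent.
unique(Mm.egGO2ALLEGS[[goid]])
```

If the GO ID shows up here but not in names(Mm.egGO2EG), that would be consistent with the term having only inherited annotations.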
Thanks very much,
Dick
Here is my session info:
sessionInfo()
R version 2.10.0 (2009-10-26)
x86_64-redhat-linux-gnu
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=C
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] limma_3.2.1 topGO_1.14.0 SparseM_0.83 graph_1.24.1 GO.db_2.3.5 org.Mm.eg.db_2.3.6 RSQLite_0.7-3
[8] DBI_0.2-4 AnnotationDbi_1.8.1 Biobase_2.6.0 biomaRt_2.2.0 gplots_2.7.4 caTools_1.10 bitops_1.0-4.1
[15] gdata_2.6.1 gtools_2.6.1
loaded via a namespace (and not attached):
[1] lattice_0.17-26 RCurl_1.3-0 tools_2.10.0 XML_2.6-0
*******************************************************************************
Richard P. Beyer, Ph.D. University of Washington
Tel.:(206) 616 7378 Env. & Occ. Health Sci. , Box 354695
Fax: (206) 685 4696 4225 Roosevelt Way NE, # 100
Seattle, WA 98105-6099
http://depts.washington.edu/ceeh/ServiceCores/FC5/FC5.html
http://staff.washington.edu/~dbeyer