[BioC] fastest way to get a gene list having certain GO term

Duke duke.lists at gmx.com
Tue Mar 6 23:24:46 CET 2012

On 3/6/12 3:47 PM, Duke wrote:
> Hi folks,
> I need some statistics for a certain GO term (for example, "DNA 
> binding"), and I wonder what is the fastest way to retrieve the latest 
> list of genes having that specific GO term. There are now a number of 
> GO packages and I would like to hear about your experience with the 
> different packages.

To achieve the above task, I split it into two steps:

* Get all GO IDs having specific term ("DNA binding")
* Then, get all the genes having the resulting GO IDs

I think I got the numbers now:


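library(GO.db)         # provides GOTERM
library(org.Hs.eg.db)  # provides org.Hs.egGO

# Return the GO IDs whose term text matches 'term'
GOTerm2GOID = function(term){
   GTL = eapply(GOTERM, function(x){grep(term, x@Term, value=TRUE)})
   GID = sapply(GTL, length)
   names(GTL[GID > 0])
}

# Count the Entrez gene entries mapped to those GO IDs
length(unlist(sapply(GOTerm2GOID("DNA binding"), function(x) mget(x, 
revmap(org.Hs.egGO), ifnotfound=NA))))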
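
For anyone who wants to try the two-step idea without the annotation packages installed, here is a minimal sketch on toy data. The objects go_terms and go_to_genes and the helper names term_to_goid and goid_to_genes are hypothetical stand-ins (for GOTERM and revmap(org.Hs.egGO) respectively), and the gene IDs are made up. Since one gene can be annotated with several matching GO IDs, the sketch counts unique genes.

```r
# Toy stand-in for GOTERM: GO ID -> term text (small illustrative subset).
go_terms <- c(
  "GO:0003677" = "DNA binding",
  "GO:0043565" = "sequence-specific DNA binding",
  "GO:0005515" = "protein binding"
)

# Toy stand-in for revmap(org.Hs.egGO): GO ID -> gene IDs (made-up IDs).
go_to_genes <- list(
  "GO:0003677" = c("g1", "g2"),
  "GO:0043565" = c("g2", "g3"),
  "GO:0005515" = c("g4")
)

# Step 1: GO IDs whose term text contains the given string.
term_to_goid <- function(pattern, terms) {
  names(terms)[grepl(pattern, terms, fixed = TRUE)]
}

# Step 2: unique genes over those GO IDs, ignoring GO IDs with no mapping.
goid_to_genes <- function(goids, mapping) {
  hits <- mapping[intersect(goids, names(mapping))]
  unique(unlist(hits, use.names = FALSE))
}

ids   <- term_to_goid("DNA binding", go_terms)
genes <- goid_to_genes(ids, go_to_genes)
length(genes)
```

Dropping unmapped GO IDs before counting, rather than filling them with NA, keeps placeholder entries out of the total.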

However, I am still stuck on how to get the gene symbols (Il22, Foxp3 
for example) as well as the RefSeq IDs for the resulting gene list.

Does anybody have any suggestions?



