[BioC] Gene Symbol to KEGG Pathway
Nianhua Li
nli at fhcrc.org
Wed Nov 15 01:08:35 CET 2006
Dear Ramachandran,
The process is: gene symbols --> Entrez Gene ID --> KEGG pathway ID --> KEGG pathway names. You can
get the first step done via Entrez Gene database, and get the rest done via KEGG database.
If you want to make it work in bioconductor, there is a far-from-elegant way:
## gene symbols --> Entrez Gene ID
library(biomaRt)
ensembl <- useMart("ensembl", dataset="hsapiens_gene_ensembl")
eg <- getBM(attributes=c("hgnc_symbol","entrezgene"), filters="hgnc_symbol", values=symbols,
mart=ensembl)
## clean up the result a bit
eg <- split(eg[,2], eg[,1])
eg <- lapply(eg, function(x) {
unique(x)[1]
})
## Entrez Gene ID --> KEGG pathway ID
library(KEGG)
kegg <- mget(as.character(eg), KEGGEXTID2PATHID, ifnotfound=list(NA))
## KEGG pathway ID --> KEGG pathway name
i <- which(!is.na(kegg))
kegg[i] <- lapply(kegg[i], function(x) {
x <- sub("hsa", "", x)
unlist(mget(x, KEGGPATHID2NAME, ifnotfound=list(NA)))
})
names(kegg) <- names(eg)
hope it is helpful
nianhua
=======================================
Dear Colleagues,
I have a list of Human Gene Symbols and I need to get the corresponding KEGG
pathways. Can anyone show me the Bioconductor function to accomplish this?
Many Thanks in Advance
Ramachandran
Dr. S. Ramachandran
Scientist E I
G.N. Ramachandran Knowledge Centre for Genome Informatics
Institute of Genomics and Integrative Biology
Mall Road, Delhi 110 007
Tel: 091-11-2766-6156
Fax: 091-11-2766-7471
More information about the Bioconductor
mailing list