[BioC] KEGGSOAP, hgu95av2.db: limited functionality

Ludwig Geistlinger Ludwig.Geistlinger at campus.lmu.de
Fri Feb 19 10:20:16 CET 2010


Dear BioC developers,

I intend to map gene expression data onto KEGG pathways.
More specifically, I performed a differential expression (DE) analysis on gene expression data from an hgu95av2 chip and want to color particular genes in the corresponding pathways.
I found that the KEGGSOAP package already provides excellent access to the KEGG API, and I sincerely appreciate the work that has been done here.
However, the function mark.pathway.by.objects requires KEGG gene IDs or at least KEGG orthology (KO) terms, while there is no way to map hgu95av2 probe IDs to KEGG gene IDs or KO terms (not in hgu95av2.db, keggorth, KEGG.db, etc.).
I wonder why only selected functions of the KEGG API are integrated in the KEGGSOAP package, and especially why the "bconv" utility, which maps foreign identifiers to KEGG identifiers, is not included.
With "bconv" it would be easy for me to map hgu95av2 probe IDs to ENSEMBL/UNIGENE/UNIPROT/etc. IDs (via hgu95av2.db) and then to KEGG IDs (via bconv).
In addition, the original mark.pathway.by.objects function from the KEGG API accepts EC numbers, which the corresponding KEGGSOAP function does not support.
Could you please explain why these limitations exist and how the KEGGSOAP package could be extended to cover all of the functions of the KEGG API?

Currently, my workaround is as follows:
(1) map the probe IDs onto ENSEMBL IDs (using hgu95av2.db) for the selected genes
(2) Meanwhile, I retrieve all KEGG entries for the particular pathway using "get.genes.by.pathway" and "bget" from KEGGSOAP
(3) Then, I have to parse each of these entries for ENSEMBL ID and KO ID to create a dictionary ENSEMBL -> KO
(4) I map the IDs from (1) onto KO using (3)

This works, but it is cumbersome and, above all, time-consuming (because of step (3)).
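
Steps (3) and (4) can be sketched roughly as follows. This is a minimal illustration in Python rather than R, only to show the parsing logic. It assumes that each entry returned by "bget" is flat-file text with an ORTHOLOGY field containing K numbers and a DBLINKS field containing an "Ensembl:" line. The function names and the sample entry are hypothetical and not part of KEGGSOAP or the KEGG API.

```python
import re

# Illustrative entry only; the exact field layout is an assumption.
SAMPLE_ENTRY = """\
ENTRY       7157              CDS       T01001
NAME        TP53
ORTHOLOGY   K04451  tumor protein p53
DBLINKS     NCBI-GeneID: 7157
            Ensembl: ENSG00000141510
///
"""


def parse_entry(entry_text):
    """Return (ensembl_ids, ko_ids) found in one flat-file gene entry."""
    ensembl_ids, ko_ids = [], []
    field = None
    for line in entry_text.splitlines():
        if not line.strip():
            continue
        # A field name starts in column 0; continuation lines are indented.
        if not line[0].isspace():
            field = line.split(None, 1)[0]
        if field == "ORTHOLOGY":
            ko_ids.extend(re.findall(r"\bK\d{5}\b", line))
        elif field == "DBLINKS" and "Ensembl:" in line:
            ensembl_ids.extend(line.split("Ensembl:", 1)[1].split())
    return ensembl_ids, ko_ids


def build_dictionary(entries):
    """Step (3): build a dictionary ENSEMBL ID -> sorted list of KO IDs."""
    collected = {}
    for entry_text in entries:
        ensembl_ids, ko_ids = parse_entry(entry_text)
        for ensembl_id in ensembl_ids:
            collected.setdefault(ensembl_id, set()).update(ko_ids)
    return {ens: sorted(kos) for ens, kos in collected.items()}


def map_selected(selected_ensembl_ids, ens2ko):
    """Step (4): keep only the selected ENSEMBL IDs that have a KO mapping."""
    return {e: ens2ko[e] for e in selected_ensembl_ids if e in ens2ko}
```

Calling build_dictionary on the list of entries for one pathway and then map_selected on the ENSEMBL IDs from step (1) gives the KO terms to pass on for coloring. The time-consuming part remains fetching every entry in step (2), which this sketch does not address.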

Yours faithfully,
Ludwig Geistlinger
(Research for an ongoing diploma thesis)
(University of Cape Town, Institute of Infectious Diseases)


