[BioC] KEGGSOAP, hgu95av2.db: limited functionality

Marc Carlson mcarlson at fhcrc.org
Fri Feb 19 18:46:31 CET 2010


Hi Ludwig,

Without a specific example of what you would like to see happen, it is
hard for me to verify that my suggestion actually answers your question.
Still, it might help you to look at the following few example genes and
see how the KEGG Gene ID looks compared to the Entrez Gene ID:

KEGG Gene ID    Entrez Gene ID
hsa:8355        8355
hsa:9081        9081
hsa:51054       51054

KEGG entries:
<http://www.genome.jp/dbget-bin/www_bget?hsa:8355>
<http://www.genome.jp/dbget-bin/www_bget?hsa:9081>
<http://www.genome.jp/dbget-bin/www_bget?hsa:51054>

Entrez Gene entries:
<http://www.ncbi.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&dopt=Graphics&list_uids=8355>
<http://www.ncbi.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&dopt=Graphics&list_uids=9081>
<http://www.ncbi.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&dopt=Graphics&list_uids=51054>

etc.

I am sure that you see a pattern here.  ;)   In these examples the KEGG
Gene ID is just the Entrez Gene ID with an "hsa:" prefix, so I suspect
that you might be making this more difficult than it needs to be.  You
can get the Entrez Gene ID for each probe from the hgu95av2ENTREZID
mapping.
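
A minimal sketch of that idea in R, assuming the "hsa:" prefix pattern
seen in the examples above holds for your genes.  The helper name
to_kegg_ids and the probe names in the toy input are made up for
illustration; only hgu95av2ENTREZID comes from the annotation package.

```r
# Hypothetical helper (not part of any package): build KEGG gene IDs by
# prefixing Entrez Gene IDs with the organism code, keeping NAs as NA.
to_kegg_ids <- function(entrez_ids, organism = "hsa") {
  ids <- as.character(entrez_ids)
  kegg <- ifelse(is.na(ids), NA_character_, paste0(organism, ":", ids))
  names(kegg) <- names(entrez_ids)  # keep probe IDs as names, if present
  kegg
}

# Toy input: the probe names are placeholders, not real hgu95av2 probe IDs.
entrez <- c(probe_a = "8355", probe_b = "9081", probe_c = NA)
print(to_kegg_ids(entrez))

# With hgu95av2.db installed, the Entrez Gene IDs would come from the
# hgu95av2ENTREZID map instead of the toy vector above.
if (requireNamespace("hgu95av2.db", quietly = TRUE)) {
  library(hgu95av2.db)
  probes <- head(mappedkeys(hgu95av2ENTREZID))
  entrez <- unlist(mget(probes, hgu95av2ENTREZID, ifnotfound = NA))
  print(to_kegg_ids(entrez))
}
```

Probes without an Entrez Gene ID stay NA, so they are easy to drop
before passing the result on to KEGGSOAP.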

Does this help?


  Marc



Ludwig Geistlinger wrote:
> Dear BioC developers,
>
> I intend to map gene expression data on KEGG pathways.
> In more detail, I performed a DE analysis on gene expression data from an hgu95av2 chip and want to color particular genes in the corresponding pathways.
> I found that the KEGGSOAP package already implements excellent access to the KEGG API, and I honestly appreciate the work that has been done here.
> However, the function mark.pathway.by.objects requires KEGG gene IDs or at least KEGG orthology (KO) terms, while there is no way to map hgu95av2 probe IDs to KEGG gene IDs or KO terms (not in hgu95av2.db, keggorth, KEGG.db, etc.).
> I wondered why only selected functions of the KEGG API are integrated in the KEGGSOAP package, and especially why the "bconv" utility, which maps foreign identifiers to KEGG identifiers, is not integrated.
> With "bconv" it would be easy for me to map hgu95av2 probe IDs to ENSEMBL/UNIGENE/UNIPROT/etc. IDs (via hgu95av2.db) and then to KEGG IDs (via bconv).
> In addition, the original mark.pathway.by.objects function from the KEGG API accepts EC numbers, which the corresponding KEGGSOAP function does not support.
> Could you please explain why these limitations exist and how the KEGGSOAP package could be extended to cover all of the functions of the KEGG API?
>
> Currently, my workaround is as follows:
> (1) Map the probe IDs to ENSEMBL IDs (using hgu95av2.db) for the selected genes.
> (2) Retrieve all KEGG entries for the particular pathway using "get.genes.by.pathway" and "bget" from KEGGSOAP.
> (3) Parse each of these entries for the ENSEMBL ID and KO ID to create a dictionary ENSEMBL -> KO.
> (4) Map the IDs from (1) to KO using (3).
>
> This works, but it is cumbersome and, above all, time-consuming (because of (3)).
>
> Yours faithfully,
> Ludwig Geistlinger
> (Research for an ongoing diploma thesis)
> (University of Cape Town, Institute of Infectious Diseases)
>


