[BioC] pathview puzzle
Oleg Moskvin
moskvin at wisc.edu
Fri Aug 23 18:19:09 CEST 2013
Hi Weijun,
Thank you for the response.
The problem seems to be deeper than that and is connected to KEGG's special handling of one particular species, E. coli.
I looked into the pathview() code and here is what I see:
1) gene.data is remapped internally via mol.sum() to have Entrez Gene IDs;
2) the remapped gene.data is used by node.map() to map onto KEGG nodes using node.data;
3) the node.data used in (2) was originally extracted from the KEGG XML by node.info().
This route implies that the "name" attributes of the type="gene" entries in the KEGG XML have the "speciesID:ENTREZ" format...
And in the case of E. coli this does not hold! See the XML entries for H. sapiens and E. coli from my message yesterday (below).
In fact, the "gene" records in the KEGG XML for E. coli use b-numbers as IDs!
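To make the mismatch concrete, here is a minimal base-R sketch that pulls the ID out of the "name" attribute of the two entries quoted below. The helper kegg_entry_ids() is made up for illustration and is not part of pathview; it also assumes one ID per entry.

```r
# Two <entry> opening tags copied from the KEGG XML quoted in this thread.
entries <- c(
  '<entry id="2" name="hsa:51343" type="gene"',
  '<entry id="4" name="eco:b1513" type="gene"'
)

# Hypothetical helper (not a pathview function): return the part of the
# name attribute that follows the "species:" prefix.
kegg_entry_ids <- function(x) {
  nm <- regmatches(x, regexpr('name="[^"]*"', x))  # e.g. name="hsa:51343"
  nm <- gsub('^name="|"$', "", nm)                 # strip the attribute wrapper
  sub("^[a-z]+:", "", nm)                          # drop the species prefix
}

ids <- kegg_entry_ids(entries)
print(ids)

# Purely numeric strings look like Entrez Gene IDs; the eco ID does not.
print(grepl("^[0-9]+$", ids))
```

The hsa entry yields a numeric ID, while the eco entry yields a b-number, so data keyed by Entrez Gene IDs cannot match the eco node names.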
So, for cases like this, where KEGG is inconsistent in the XML structure it supplies, I would suggest adding an "id.bypass" option to pathview(). It would take gene.data as is (with user-supplied IDs that match the KEGG XML IDs, for example b-numbers) and pass it directly to step 2 (node matching).
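Roughly, such a bypass would amount to something like the base-R sketch below. It is only an illustration under stated assumptions: match_by_kegg_id() is a hypothetical helper, "id.bypass" does not exist in pathview, and the expression values and the second node ID are toy data.

```r
# Toy gene.data named by b-numbers, i.e. by the IDs KEGG uses for eco.
# The values are invented for illustration.
gene.data <- c(b1513 = 1.8, b9999 = -0.4)

# Node IDs as they would appear in the KEGG XML after dropping "eco:".
# b1513 comes from the entry quoted in this thread; b0001 is a placeholder.
node.ids <- c("b1513", "b0001")

# Hypothetical helper: look up user data by KEGG ID directly, with no
# remapping to Entrez Gene IDs. Nodes without data get NA.
match_by_kegg_id <- function(gene.data, node.ids) {
  out <- gene.data[match(node.ids, names(gene.data))]
  names(out) <- node.ids
  out
}

mapped <- match_by_kegg_id(gene.data, node.ids)
print(mapped)
```

Data supplied for IDs that are not on the pathway (b9999 here) is simply ignored, and pathway nodes with no data stay NA.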
Thanks!
Oleg
On 08/22/13, Luo Weijun wrote:
> Hi Oleg,
> You are right, the problem is due to ID type inconsistency.
> You have to specify gene.idtype when calling the pathview function if your gene ID type is not Entrez Gene. I am not sure that b-numbers are recognized. For your gene name example, if by “gene name” you mean official gene symbols, you should specify gene.idtype="SYMBOL" (lower case is fine):
> eco2.out <- pathview(gene.data = T2.CEBF095.crt115.ASCH.DROP3.rel.gn, pathway.id = "02010", gene.idtype="SYMBOL", out.suffix = "T2ACSH", species = "eco", kegg.native=TRUE)
On 08/22/13, Oleg Moskvin wrote:
>
> <entry id="2" name="hsa:51343" type="gene"
> link="http://www.kegg.jp/dbget-bin/www_bget?hsa:51343">
> <graphics name="FZR1, CDC20C, CDH1, FZR, FZR2, HCDH, HCDH1" fgcolor="#000000" bgcolor="#BFFFBF"
> type="rectangle" x="919" y="536" width="46" height="17"/>
> </entry>
>
>
> <entry id="4" name="eco:b1513" type="gene"
> link="http://www.kegg.jp/dbget-bin/www_bget?eco:b1513">
> <graphics name="lsrA" fgcolor="#000000" bgcolor="#BFFFBF"
> type="rectangle" x="339" y="1882" width="46" height="17"/>
> </entry>