[BioC] KEGG REST: retrieving genes
Hooiveld, Guido
Guido.Hooiveld at wur.nl
Wed Jan 30 09:59:25 CET 2013
Hi Dan,
Actually, after a slight modification your suggestion does work!
I realized that the pathways referred to as 'mapxxxxx' are actually the reference pathways; to retrieve the genes for a specific organism 'map' has to be replaced by the abbreviation of that specific organism, e.g. 'hsa' or 'mmu'.
Thus, all human genes that are in the Arachidonic Acid Metabolism pathway:
> head(keggGet("hsa00590")[[1]]$GENE)
8399
"PLA2G10; phospholipase A2, group X [KO:K01047] [EC:3.1.1.4]"
26279
"PLA2G2D; phospholipase A2, group IID [KO:K01047] [EC:3.1.1.4]"
30814
"PLA2G2E; phospholipase A2, group IIE [KO:K01047] [EC:3.1.1.4]"
50487
"PLA2G3; phospholipase A2, group III [KO:K01047] [EC:3.1.1.4]"
64600
"PLA2G2F; phospholipase A2, group IIF [KO:K01047] [EC:3.1.1.4]"
81579
"PLA2G12A; phospholipase A2, group XIIA [KO:K01047] [EC:3.1.1.4]"
>
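The prefix swap described above, plus a way to tabulate the result, can be sketched in base R. The helper names `to_organism_pathway` and `split_gene_field` are hypothetical and not part of KEGGREST. The sample vector is copied from the output above so the snippet runs without a network connection, and it assumes the GENE field comes back as a named character vector in the "SYMBOL; description" layout printed here.

```r
# Hypothetical helper (not a KEGGREST function): turn a reference pathway ID
# such as "map00590" into an organism-specific one such as "hsa00590".
to_organism_pathway <- function(map_id, organism = "hsa") {
  stopifnot(all(grepl("^map[0-9]{5}$", map_id)))
  sub("^map", organism, map_id)
}

# Hypothetical helper: split a named GENE vector (names = Entrez Gene IDs,
# values = "SYMBOL; description") into a data frame.
# Assumption: the vector is laid out as in the output printed above.
split_gene_field <- function(gene) {
  data.frame(
    entrez_id   = names(gene),
    symbol      = unname(sub(";.*$", "", gene)),
    description = unname(sub("^[^;]*;[[:space:]]*", "", gene)),
    row.names   = NULL,
    stringsAsFactors = FALSE
  )
}

# Two entries copied from the output above.
gene <- c(
  "8399"  = "PLA2G10; phospholipase A2, group X [KO:K01047] [EC:3.1.1.4]",
  "26279" = "PLA2G2D; phospholipase A2, group IID [KO:K01047] [EC:3.1.1.4]"
)

pathway_id <- to_organism_pathway("map00590", "hsa")
genes_df <- split_gene_field(gene)

print(pathway_id)       # "hsa00590"
print(genes_df$symbol)  # "PLA2G10" "PLA2G2D"
```

With a live connection, the same helpers could be applied to the real result, e.g. split_gene_field(keggGet(to_organism_pathway("map00590", "mmu"))[[1]]$GENE) for mouse, provided the GENE field has the layout assumed above.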
Thanks,
Guido
-----Original Message-----
From: Dan Tenenbaum [mailto:dtenenba at fhcrc.org]
Sent: Wednesday, January 30, 2013 00:47
To: Hooiveld, Guido
Cc: bioconductor at r-project.org
Subject: Re: [BioC] KEGG REST: retrieving genes
Hi Guido,
On Tue, Jan 29, 2013 at 2:24 PM, Hooiveld, Guido <Guido.Hooiveld at wur.nl> wrote:
> Hi,
> I am exploring the KEGGREST package.
> I would like to retrieve the genes that belong to a specific pathway, e.g. all human genes that are in the Arachidonic Acid Metabolism pathway (= map00590). For now the topology of the pathway is not relevant to me.
> I have checked the KEGGREST vignette but could not find how to do this, so if this is possible a pointer would be appreciated.
>
Normally the answer would be:
keggGet("map00590")[[1]]$GENE
But it looks like KEGG does not have gene data for this particular pathway. See the underlying URL, http://rest.kegg.jp/get/path:map00590; we would expect a GENE section like the one in a different pathway, such as
http://rest.kegg.jp/get/path:hsa05200
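The same check can be made from R rather than by reading the raw URL. This is only a sketch: `has_gene_section` is a hypothetical helper, and it assumes each entry returned by keggGet() is a named list with one element per section, as the `[[1]]$GENE` call above implies. The two lists are hand-built stand-ins, not real KEGG responses, so the snippet runs offline.

```r
# Hypothetical helper (not a KEGGREST function): TRUE if a parsed KEGG entry
# carries a GENE section.
# Assumption: the entry is a named list with one element per section.
has_gene_section <- function(entry) {
  !is.null(entry[["GENE"]])
}

# Hand-built stand-ins for parsed entries, for illustration only.
reference_entry <- list(ENTRY = "map00590")
organism_entry <- list(
  ENTRY = "hsa00590",
  GENE  = c("8399" = "PLA2G10; phospholipase A2, group X [KO:K01047] [EC:3.1.1.4]")
)

ref_has_genes <- has_gene_section(reference_entry)
org_has_genes <- has_gene_section(organism_entry)
print(c(reference = ref_has_genes, organism = org_has_genes))

# With a live connection one could instead call, for example:
# has_gene_section(keggGet("map00590")[[1]])
```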
You can find some (possibly outdated) genes for this pathway by doing the following:
library(org.Hs.eg.db)
select(org.Hs.eg.db, keys="00590", columns=c("ENTREZID","SYMBOL"), keytype="PATH")
This is old KEGG data and I do not know why their REST interface doesn't contain this data.
> Thanks,
> Guido
>
> As a side note (for the maintainer): I noticed that the API has recently been updated (18 January 2013); among other things, KGML files can now be retrieved and the conversion options from/to KEGG IDs have been expanded.
Thanks! I will update the package.
Dan
>
>> sessionInfo()
> R Under development (unstable) (2012-11-21 r61136)
> Platform: i386-w64-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] KEGGREST_0.99.1
>
> loaded via a namespace (and not attached):
> [1] BiocGenerics_0.5.6 Biostrings_2.27.10 digest_0.6.2 httr_0.2
> [5] IRanges_1.17.30 parallel_2.16.0 png_0.1-4 RCurl_1.91-1.1
> [9] stats4_2.16.0 stringr_0.6.2 tools_2.16.0
>>