[BioC] Retrieving all entrez identifiers that are annotated in KEGG pathways
James W. MacDonald
jmacdon at uw.edu
Sat Mar 2 19:41:19 CET 2013
Hi Anirban,
> library(hgu133plus2.db)
> x <- select(hgu133plus2.db, Lkeys(hgu133plus2PATH), c("ENTREZID","PATH"))
Warning message:
In .generateExtraRows(tab, keys, jointype) :
'select' resulted in 1:many mapping between keys and return rows
> head(x)
    PROBEID ENTREZID  PATH
1 1007_s_at      780  <NA>
2   1053_at     5982 03030
3   1053_at     5982 03420
> egids <- unique(x$ENTREZID[!is.na(x$PATH)])
> length(egids)
[1] 5498
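
To make the filtering step concrete, here is a minimal sketch on a toy data frame standing in for the select() output. The first three rows are the ones shown by head(x); the fourth row (probe "hypothetical_at", Entrez ID "9999") is made up for illustration only, and the name genes_by_path is likewise just an assumption for this example.

```r
# Toy stand-in for the data frame returned by select().
# The last row is hypothetical and only there to show de-duplication.
x <- data.frame(
  PROBEID  = c("1007_s_at", "1053_at", "1053_at", "hypothetical_at"),
  ENTREZID = c("780", "5982", "5982", "9999"),
  PATH     = c(NA, "03030", "03420", "03030"),
  stringsAsFactors = FALSE
)

# Rows with a non-missing PATH are the probes annotated to a KEGG pathway
has_path <- !is.na(x$PATH)

# One combined, de-duplicated vector of Entrez IDs across all pathways
egids <- unique(x$ENTREZID[has_path])
print(egids)

# Optionally, the same IDs grouped by pathway (one unique vector per pathway)
genes_by_path <- lapply(split(x$ENTREZID[has_path], x$PATH[has_path]), unique)
print(genes_by_path)
```

Entrez ID 780 drops out because its only row has PATH equal to NA, and 5982 is counted once even though it maps to two pathways.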
Best,
Jim
On 3/2/2013 8:20 AM, Anirban [guest] wrote:
> Dear all,
>
> Is there any way to get all Entrez identifiers that are annotated to KEGG pathways? I am using the GOstats package in R to perform KEGG pathway enrichment analysis. In general, each KEGG pathway term has a list of annotated HGNC symbols or Entrez identifiers. Taken over all KEGG pathway terms, there must be one combined list of Entrez identifiers, and that is the list I want.
>
> What I am doing right now is as follows:
> library(biomaRt)
> library("GO.db")
> library("KEGG.db")
> library("GOstats")
> library("hgu133plus2.db")
> library("EMA")
> library("fdrtool")
> library("org.Hs.eg.db")
> ensembl = useMart("ensembl", dataset = "hsapiens_gene_ensembl")
> x<- hgu133plus2PATH
> mapped_probes<- mappedkeys(x)
>
> b<-getBM(attributes=c("hgnc_symbol"),filters="affy_hg_u133_plus_2",values=mapped_probes,mart=ensembl)
>
> Is it the correct way to do that?
>
> Thanks in advance. :)
>
> -- output of sessionInfo():
>
> R version 2.15.1 (2012-06-22)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=C LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> --
> Sent via the guest posting facility at bioconductor.org.
>
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099