[BioC] total gene number for a given species in reactome.db
Marc Carlson
mcarlson at fhcrc.org
Mon Aug 13 19:54:33 CEST 2012
Hi Gilbert,
If you are using the new reactome.db (the one in devel), then you can do
this:
library(reactome.db)
library(org.Hs.eg.db)
## get all Entrez IDs from the DB
rk = keys(reactome.db, keytype="ENTREZID")
## get all the Entrez IDs from the most recent org pkg
hk = keys(org.Hs.eg.db, keytype="ENTREZID")
## a cursory glance shows that both overlaps are the same size;
## the TRUE count should be the number of human Entrez IDs found in reactome.db
table(hk %in% rk)
table(rk %in% hk)
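As a minimal sketch of turning that comparison into a single number: the count you are after is the size of the intersection of the two key sets. The helper name count_shared_ids and the toy IDs below are hypothetical, made up for illustration; with the real packages loaded you would pass rk and hk instead.

```r
# Hypothetical helper: number of distinct IDs present in both vectors.
count_shared_ids <- function(a, b) {
  length(intersect(a, b))  # intersect() also drops duplicate IDs
}

# Toy character vectors standing in for rk and hk (made-up IDs).
rk_toy <- c("1", "2", "2", "3", "100")
hk_toy <- c("2", "3", "4")

n_shared <- count_shared_ids(rk_toy, hk_toy)
print(n_shared)
```

With the real data this would be count_shared_ids(rk, hk), which should agree with the TRUE count from the table() calls as long as the keys are unique.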
Or if you are a wiz at reactome.db, the latest reactome.db package has
the ENTIRE reactome database stashed inside. So you might be able to
just write a query to it and specify that you only want human Entrez IDs.
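A hedged starting point for that route, assuming the installed reactome.db exposes its connection through reactome_dbconn() in the usual annotation-package style: the table layout is not described in this message, so the sketch only lists the tables rather than assuming any table or column names. It is guarded so that it does nothing if the packages are not installed.

```r
# Sketch only: inspect the SQLite database shipped inside reactome.db.
# Skipped entirely when reactome.db or DBI is not installed.
tables <- character(0)
if (requireNamespace("reactome.db", quietly = TRUE) &&
    requireNamespace("DBI", quietly = TRUE)) {
  con <- reactome.db::reactome_dbconn()  # connection is managed by the package
  tables <- DBI::dbListTables(con)       # table names to explore before writing SQL
  print(tables)
}
```

The connection belongs to the package, so the sketch does not close it. Any actual query for human Entrez IDs would have to be written against whatever tables this lists.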
Marc
On 08/09/2012 10:02 PM, Gang Feng wrote:
> Hello,
>
> I am using reactome.db for an over-representation enrichment test, so I wonder how I can get the total gene number for a given species in reactome.db. For example, how many human genes (unique Entrez IDs) are annotated in reactome.db? Is there any simple way to get this number besides counting the shared genes between the annotated genes from "reactomeEXTID2PATHID" and records from "org.Hs.egUNIGENE2EG"? Or should I retrieve the human pathways in reactome.db and then count the unique annotated genes? Any comments?
>
> I know there is a Reactome Statistics webpage for some species at the official Reactome website, but reactome.db is only updated twice each year, not every day, so I guess those numbers are not accurate for reactome.db.
>
> Thanks
>
> Gilbert