[BioC] gene count in a GO term
Shi, Tao
shidaxia at yahoo.com
Wed Oct 1 19:50:18 CEST 2008
Thanks, Jim and Michael, for the speedy replies!
Following up on Michael's point, I tried biomaRt, but the number of genes seems far too low compared with what is reported on the Gene Ontology web page (see below). I'm not sure how they're mapped. "org.Hs.eg.db" gave a comparable number, which is not surprising as it is derived directly from the Gene Ontology site.
###=====================================================================
## GO:0005575 is the root term for CC; there are 9428 gene products according to AmiGO
##======================================================================
> library(biomaRt)
> mart <- useMart("ensembl", dataset="hsapiens_gene_ensembl")
Checking attributes and filters ... ok
> tmp <- getBM(attributes=c( "ensembl_gene_id"), filters="go",values="GO:0005575", mart = mart)
> dim(tmp)
[1] 641 1
> tmp <- getBM(attributes=c( "entrezgene"), filters="go",values="GO:0005575", mart = mart)
> dim(tmp)
[1] 565 1
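One possible explanation, which is an assumption and not something verified in this thread, is that the biomaRt "go" filter matches only genes annotated directly to the term, whereas AmiGO and org.Hs.egGO2ALLEGS also count genes annotated to any child term. The toy sketch below shows how far the two counts can diverge. All term names, gene names, and helper functions in it are hypothetical, and it uses base R only:

```r
# Toy ontology: every term and gene name here is made up for illustration.
# Direct annotations: genes attached to each term itself.
direct <- list(
  root   = c("g1"),
  childA = c("g2", "g3"),
  childB = c("g3", "g4"),
  leafA1 = c("g5")
)

# Parent -> children edges of the toy ontology.
children <- list(
  root   = c("childA", "childB"),
  childA = "leafA1",
  childB = character(0),
  leafA1 = character(0)
)

# Collect all descendants of a term by walking the edges recursively.
descendants <- function(term) {
  kids <- children[[term]]
  unique(c(kids, unlist(lapply(kids, descendants))))
}

# Genes annotated to the term itself only.
count_direct <- function(term) {
  length(unique(direct[[term]]))
}

# Genes annotated to the term or to any of its descendants,
# with each gene counted once.
count_all <- function(term) {
  terms <- c(term, descendants(term))
  length(unique(unlist(direct[terms])))
}

count_direct("root")
count_all("root")
```

If this is what is happening, the biomaRt numbers for a root term would reflect only genes annotated to the root itself, which would explain why they are so much lower than the AmiGO total.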
...Tao
----- Original Message ----
From: James W. MacDonald <jmacdon at med.umich.edu>
To: "Shi, Tao" <shidaxia at yahoo.com>
Cc: bioconductor at stat.math.ethz.ch
Sent: Wednesday, October 1, 2008 5:14:13 AM
Subject: Re: [BioC] gene count in a GO term
Hi Tao,
Shi, Tao wrote:
> Hi list,
>
> Please forgive if this was asked before.
>
> In R, is there a way to find out how many human gene products are in a GO
> term (including all its children) like those reported in AmiGO? I'm
> talking about ALL the gene products, not just those on an Affy chip.
> For example, for GO:0005921 and its children, the number
> is 6.
There are only 6 if you restrict to TAS and IDA. If you allow IEA then
there are 27:
> library(org.Hs.eg.db)
> get("GO:0005921", org.Hs.egGO2ALLEGS)
        TAS         TAS         IEA         IEA         IEA         TAS
     "1823"      "2697"      "2700"      "2701"      "2702"      "2703"
        TAS         TAS         IEA         IEA         IEA         IEA
     "2705"      "2706"      "2707"      "2709"      "4284"      "9742"
        IEA         IEA         IEA         IEA         IEA         IEA
    "10052"     "10804"     "24145"     "56666"     "57165"     "57369"
        IEA         IEA         IEA         IDA         IEA         IEA
    "81025"     "84694"    "116337"    "125111"    "127534"    "219770"
        IEA         IEA         IEA
   "349149"    "375519" "100126572"
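The two counts quoted above can be checked by tallying the evidence codes, which are the names of the returned vector. A minimal sketch in base R, with the IDs and codes copied by hand from the output above so that no annotation package is needed; the variable names (ids, evidence, genes, curated) are illustrative only:

```r
# Entrez Gene IDs and evidence codes, copied from the output above.
ids <- c("1823", "2697", "2700", "2701", "2702", "2703",
         "2705", "2706", "2707", "2709", "4284", "9742",
         "10052", "10804", "24145", "56666", "57165", "57369",
         "81025", "84694", "116337", "125111", "127534", "219770",
         "349149", "375519", "100126572")
evidence <- c("TAS", "TAS", "IEA", "IEA", "IEA", "TAS",
              "TAS", "TAS", "IEA", "IEA", "IEA", "IEA",
              "IEA", "IEA", "IEA", "IEA", "IEA", "IEA",
              "IEA", "IEA", "IEA", "IDA", "IEA", "IEA",
              "IEA", "IEA", "IEA")

# Rebuild the named vector in the same shape that get() returned.
genes <- setNames(ids, evidence)

# Number of annotations per evidence code.
table(names(genes))

# Keep only the TAS and IDA annotations, then count distinct genes.
curated <- genes[names(genes) %in% c("TAS", "IDA")]
length(unique(curated))

# Distinct genes when every evidence code, including IEA, is allowed.
length(unique(genes))
```

With the real package loaded, the same filtering should apply directly to the result of get("GO:0005921", org.Hs.egGO2ALLEGS). Using unique() guards against a gene appearing more than once under different evidence codes.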
Best,
Jim
>
> Many thanks!
>
> ...Tao
>
--
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
734-936-8662