[BioC] gene count in a GO term

Shi, Tao shidaxia at yahoo.com
Wed Oct 1 19:50:18 CEST 2008


Thanks, Jim and Michael, for the speedy replies!

Following up on Michael's point, I tried biomaRt, but the number of genes seems far too low compared with what is reported on the Gene Ontology web page (see below).  I'm not sure how they're mapped.  "org.Hs.eg.db" gave a comparable number, which is not surprising, as it is derived directly from the Gene Ontology site.


###=====================================================================
## GO:0005575 is the root term for CC; there are 9428 gene products according to AmiGO
##======================================================================
> library(biomaRt)
> mart <- useMart("ensembl", dataset="hsapiens_gene_ensembl")
Checking attributes and filters ... ok
> tmp <- getBM(attributes=c("ensembl_gene_id"), filters="go", values="GO:0005575", mart=mart)
> dim(tmp)
[1] 641   1
> tmp <- getBM(attributes=c("entrezgene"), filters="go", values="GO:0005575", mart=mart)
> dim(tmp)
[1] 565   1
 

...Tao






----- Original Message ----
From: James W. MacDonald <jmacdon at med.umich.edu>
To: "Shi, Tao" <shidaxia at yahoo.com>
Cc: bioconductor at stat.math.ethz.ch
Sent: Wednesday, October 1, 2008 5:14:13 AM
Subject: Re: [BioC] gene count in a GO term

Hi Tao,

Shi, Tao wrote:
> Hi list,
> 
> Please forgive if this was asked before.
> 
> In R, is there a way to find out how many human gene products are in a GO
> term (including all its children), like those reported in AmiGO?  I'm
> talking about ALL the gene products, not just those on an Affy chip.
> For example, for GO:0005921 and its children, the number
> is 6.

There are only 6 if you restrict to the TAS and IDA evidence codes. If you
also allow IEA (electronic annotation), then there are 27:

> library(org.Hs.eg.db)
> get("GO:0005921", org.Hs.egGO2ALLEGS)
         TAS         TAS         IEA         IEA         IEA         TAS
      "1823"      "2697"      "2700"      "2701"      "2702"      "2703"
         TAS         TAS         IEA         IEA         IEA         IEA
      "2705"      "2706"      "2707"      "2709"      "4284"      "9742"
         IEA         IEA         IEA         IEA         IEA         IEA
     "10052"     "10804"     "24145"     "56666"     "57165"     "57369"
         IEA         IEA         IEA         IDA         IEA         IEA
     "81025"     "84694"    "116337"    "125111"    "127534"    "219770"
         IEA         IEA         IEA
    "349149"    "375519" "100126572"

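The counts quoted above can be checked against the printed vector itself. What follows is only a minimal sketch in base R, assuming the Entrez Gene IDs and evidence codes are copied by hand from the output above; the variable names (`ids`, `codes`, `egs`) are made up for illustration. With the package loaded, the same named vector would come straight from the `get()` call.

```r
# Entrez Gene IDs for GO:0005921, copied from the output above.
ids <- c("1823", "2697", "2700", "2701", "2702", "2703",
         "2705", "2706", "2707", "2709", "4284", "9742",
         "10052", "10804", "24145", "56666", "57165", "57369",
         "81025", "84694", "116337", "125111", "127534", "219770",
         "349149", "375519", "100126572")

# Matching GO evidence codes, in the same order as the IDs.
codes <- c("TAS", "TAS", "IEA", "IEA", "IEA", "TAS",
           "TAS", "TAS", "IEA", "IEA", "IEA", "IEA",
           "IEA", "IEA", "IEA", "IEA", "IEA", "IEA",
           "IEA", "IEA", "IEA", "IDA", "IEA", "IEA",
           "IEA", "IEA", "IEA")

# Same shape as the printed result: values are IDs, names are evidence codes.
egs <- setNames(ids, codes)

# Tally annotations per evidence code.
evidence_counts <- table(names(egs))

# Distinct genes overall, and distinct genes backed by TAS or IDA evidence.
n_all <- length(unique(egs))
n_curated <- length(unique(egs[names(egs) %in% c("TAS", "IDA")]))

print(evidence_counts)
cat("all evidence codes:", n_all, "\n")
cat("TAS or IDA only:   ", n_curated, "\n")
```

Taking `unique()` before counting is a precaution: in general the same gene can be listed more than once under different evidence codes, so counting raw entries could overstate the number of genes.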

Best,

Jim


> 
> Many thanks!
> 
> ...Tao
> 
> _______________________________________________ Bioconductor mailing
> list Bioconductor at stat.math.ethz.ch 
> https://stat.ethz.ch/mailman/listinfo/bioconductor Search the
> archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
734-936-8662


