[BioC] GO analysis code snippet

Seth Falcon sfalcon at fhcrc.org
Tue Jul 31 16:25:05 CEST 2007


"Simon Lin" <simonlin at duke.edu> writes:

> Hello,
>
> I am looking for some GO analysis code-snippet for the following task,
>
> 1) list all the GO terms under "molecular function"

You can do this at the GO level, as Jan suggested, but here's a way
based on the hgu133a annotation data which makes sense given your
second question...

library("annotate")
library("hgu133a")
library("GO")

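## all GO IDs that have hgu133a probesets mapped to them
allGO = ls(hgu133aGO2ALLPROBES)
## logical vector: TRUE for terms in the molecular function (MF) ontology
isMF = filterGOByOntology(allGO, "MF")
mfGO = allGO[isMF]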
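
For comparison, a chip-independent, GO-level listing might look like the
sketch below.  It assumes the GOMFOFFSPRING environment from the GO
package and the MF root term GO:0003674, and it is not necessarily the
approach Jan had in mind.

```r
library("GO")

mfRoot = "GO:0003674"   # molecular_function root term
## every MF term is either the root or one of its offspring
allMF = c(mfRoot, get(mfRoot, GOMFOFFSPRING))
length(allMF)
```

Unlike mfGO above, allMF also includes terms that have no hgu133a
probesets annotated to them.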

> 2) For a given GO id, list the associated Entrez gene IDs (or Affymetrix 
> probe IDs for U133A)

At present, this takes two steps using the annotation data packages.
It will get easier soon when the DB-based annotation packages are
introduced.  But for now...

## map a GO ID to probeset IDs
hgu133aGO2ALLPROBES[[mfGO[12]]][1:5]
          TAS           NAS           NAS           NAS           TAS 
  "203072_at" "203215_s_at" "203216_s_at"   "204527_at"   "204631_at" 

## turn those into Entrez Gene IDs
psids = hgu133aGO2ALLPROBES[[mfGO[12]]][1:5]
unlist(mget(psids, hgu133aENTREZID))
  203072_at 203215_s_at 203216_s_at   204527_at   204631_at 
       4643        4646        4646        4644        4620 
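
If you need this for many GO IDs, the two steps can be wrapped up,
roughly as in the sketch below.  goToEntrez is a made-up helper name,
not part of any package, and it assumes the hgu133a environments loaded
above.

```r
## hypothetical helper: GO ID -> unique Entrez Gene IDs via hgu133a probesets
goToEntrez = function(goId) {
    psids = hgu133aGO2ALLPROBES[[goId]]
    if (is.null(psids)) return(NULL)      # GO ID has no probesets on this chip
    egs = unlist(mget(psids, hgu133aENTREZID))
    unique(egs[!is.na(egs)])              # drop unmapped probesets and duplicates
}

goToEntrez(mfGO[12])
```

Several probesets often map to the same gene (see 4646 above), hence
the call to unique.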

> A further question: is there a package analyzing GO hierarchically? I mean, 
> testing the parent nodes first; if it is not significant, then, testing the 
> children? I think most of the packages just test all the GO terms 
> simultaneously.

Have a look also at the GOstats package and the hyperGTest method.
There are details in the vignette of how to run a conditional analysis
that orders the computation bottom up (the concept is quite similar to
that provided by Adrian's topGO package).

You may find that the result object returned by hyperGTest in the case
of GO helps you avoid the annotation data package gyrations as it
gives you access to the resulting GO -> Entrez map, etc.  See the
vignette for details.
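
As a rough sketch of what a conditional run could look like:
selectedIds and universeIds below are placeholder names for character
vectors of Entrez Gene IDs that you supply, and the p-value cutoff is
arbitrary.  Check the vignette for the authoritative version.

```r
library("GOstats")

## selectedIds / universeIds: hypothetical vectors of Entrez Gene IDs
params = new("GOHyperGParams",
             geneIds = selectedIds,
             universeGeneIds = universeIds,
             annotation = "hgu133a",
             ontology = "MF",
             pvalueCutoff = 0.01,      # arbitrary, for illustration only
             conditional = TRUE,       # turn on the conditional analysis
             testDirection = "over")   # test for over-representation
hgCond = hyperGTest(params)

summary(hgCond)                     # table of significant GO terms
head(geneIdsByCategory(hgCond), 3)  # selected Entrez Gene IDs per GO term
```

The geneIdsByCategory call is what gives you the GO -> Entrez map
without going back through the annotation environments by hand.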

Best,

+ seth

-- 
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
BioC: http://bioconductor.org/
Blog: http://userprimary.net/user/


