[BioC] GoStats

Marc Carlson mcarlson at fhcrc.org
Wed Mar 30 02:04:53 CEST 2011


Hi David,

You need three things to do this analysis: 1) a list of 
"interesting" genes (presumably your list of microRNA target genes), 
2) a list of ALL the genes that you tested (aka the gene universe), 
and 3) the annotations that map GO terms to all genes (not just the 
ones that you think are interesting; we want ALL of the annotations).  
It is this third thing (the annotations of GO IDs to all your 
possible genes) that you need to make into a GOAllFrame object.  Once 
you have that, you will basically be testing the 1st thing (list of 
interesting genes) against your 2nd thing (gene universe).  The 
essence of the test is to use the annotations to ask whether there 
are GO terms that are over- or under-represented in your 
"interesting" list of genes relative to the list of all possible 
genes (the gene universe).
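
To make the idea concrete, here is a minimal base R sketch of the 
hypergeometric calculation for a single GO term.  The counts are made 
up purely for illustration (they are not from your data), and the 
variable names are mine.  GOstats repeats this kind of calculation for 
every GO term for you.

```r
# Toy counts, invented for illustration only.
universe_size       <- 10000  # all genes tested (the gene universe)
term_in_universe    <- 200    # universe genes annotated to the GO term
interesting_size    <- 150    # genes in the "interesting" list
term_in_interesting <- 12     # interesting genes annotated to the term

# Number of annotated genes expected in the interesting list by chance.
expected <- interesting_size * term_in_universe / universe_size

# Over-representation: P(X >= term_in_interesting) when drawing
# interesting_size genes from the universe without replacement.
p_over <- phyper(term_in_interesting - 1,
                 term_in_universe,
                 universe_size - term_in_universe,
                 interesting_size,
                 lower.tail = FALSE)

# Under-representation: P(X <= term_in_interesting).
p_under <- phyper(term_in_interesting,
                  term_in_universe,
                  universe_size - term_in_universe,
                  interesting_size,
                  lower.tail = TRUE)

print(c(expected = expected, p_over = p_over, p_under = p_under))
```

Note that the universe enters the calculation directly, which is why 
you cannot run the test on the interesting genes alone.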

How to do this is described in the vignette titled "Hypergeometric 
tests for less common model organisms" in the GOstats package, which 
you can find on the package page here:

http://www.bioconductor.org/help/bioc-views/devel/bioc/html/GOstats.html
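
For what it's worth, here is a rough sketch of how the script might 
look with the annotations built from ALL human genes rather than only 
the interesting ones.  It follows the pattern in that vignette but is 
untested against your data.  The four symbols are just the ones shown 
in your message and stand in for the full contents of genes.txt; the 
variable names (symbols, interesting, over) and the params name are 
mine; and the universe is assumed to be every Entrez Gene ID known to 
org.Hs.egGO, which you should replace with the genes you actually 
tested if that set is smaller.

```r
library("GOstats")
library("GSEABase")
library("org.Hs.eg.db")

# Placeholder input: the four symbols quoted in the original message
# stand in for the full gene list that would be read from genes.txt.
symbols <- c("MED13", "ENDOD1", "RAP2C", "ACSL1")

# Map symbols/aliases to Entrez Gene IDs; unmapped symbols give NA.
eg <- unlist(mget(symbols, org.Hs.egALIAS2EG, ifnotfound = NA))
interesting <- unique(as.character(eg[!is.na(eg)]))

# Thing 3: GO annotations for ALL genes, not only the interesting ones.
frame <- toTable(org.Hs.egGO)
goframeData <- data.frame(frame$go_id, frame$Evidence, frame$gene_id)
goFrame <- GOFrame(goframeData, organism = "Homo sapiens")
goAllFrame <- GOAllFrame(goFrame)
gsc <- GeneSetCollection(goAllFrame, setType = GOCollection())

# Thing 2: the gene universe (an assumption here: every Entrez Gene ID
# known to org.Hs.egGO, as in the quoted script).
universe <- Lkeys(org.Hs.egGO)

# Thing 1: the interesting genes, restricted to the universe.
interesting <- intersect(interesting, universe)

params <- GSEAGOHyperGParams(name = "miRNA target GO enrichment",
                             geneSetCollection = gsc,
                             geneIds = interesting,
                             universeGeneIds = universe,
                             ontology = "BP",
                             pvalueCutoff = 0.05,
                             conditional = FALSE,
                             testDirection = "over")

# Run the test and look at the most significant GO terms.
over <- hyperGTest(params)
head(summary(over))
```

The only real change from your version is that the GOFrame is built 
from toTable(org.Hs.egGO) for all genes, and your own genes are passed 
in only through geneIds.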


Does that help clear things up?


   Marc



On 03/29/2011 06:33 AM, David martin wrote:
> Hi,
> I'm a bit confused about how to use my data.
>
> My input is a list of genes (in fact, a list of target genes for 
> microRNAs). The first step is to get the GO terms associated with these 
> genes, and then I would like to do a hyperg test to obtain significantly 
> dysregulated GO terms. All the examples I went through use Affy data or 
> similar, so I'm not sure this is correct. I would appreciate your feedback.
>
>
> library("GOstats")
> library("GSEABase")
> library(org.Hs.eg.db)
>
>
> # A list of genes ("MED13" "ENDOD1" "RAP2C" "ACSL1" ...)
> data="genes.txt"
> g=read.table(file=data)
> genes <- as.character(g[,1])
>
> # Get Mapping to GO
> frame<-merge(toTable(org.Hs.egALIAS2EG[genes]), toTable(org.Hs.egGO), 
> by.x= "gene_id", by.y="gene_id")
>
> goframeData = data.frame(frame$go_id, frame$Evidence, frame$gene_id)
> goFrame = GOFrame(goframeData, organism = "Homo sapiens")
> goAllFrame = GOAllFrame(goFrame)
>
>
> # From here I'm a bit confused. Since I have my list of GO terms, do I 
> # need to use the universe data, or do I apply a hyperg on the above 
> # data? Thanks for your input.
>
> gsc <- GeneSetCollection(goAllFrame, setType = GOCollection())
> universe = Lkeys(org.Hs.egGO)
> params <- GSEAGOHyperGParams(name = "My Custom GSEA based annot Params",
>     geneSetCollection = gsc, geneIds = unique(frame$gene_id),
>     universeGeneIds = universe, ontology = "BP", pvalueCutoff = 0.05,
>     conditional = FALSE, testDirection = "over")
>
