[BioC] GoStats
Marc Carlson
mcarlson at fhcrc.org
Wed Mar 30 02:04:53 CEST 2011
Hi David,
You need three things to do this analysis: 1) a list of
"interesting" genes (presumably your list of genes targeted by
microRNAs), 2) a list of ALL the genes that you tested (aka the gene
universe), and 3) the annotations that map GO terms to all of those
genes (not just the ones that you think are interesting; we want ALL of
the annotations). It is this third thing (the annotations of GO IDs to
all your possible genes) that you need to make into a GOAllFrame
object. Once you have that, you will basically be testing the first
thing (your list of interesting genes) against the second thing (the
gene universe). The essence of the test is to use the annotations to ask
whether there are GO terms that are over- or under-represented in your
"interesting" list of genes relative to the list of all possible genes
(the gene universe).
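To make the idea concrete, here is a minimal base R sketch of the test for a single GO term. All four counts are made-up toy numbers, not taken from your data, and the variable names are only for illustration.

```r
# Toy counts, purely hypothetical.
universe_size <- 20000  # all genes tested (the gene universe)
interesting   <- 300    # genes in the "interesting" list
term_in_univ  <- 150    # universe genes annotated to one GO term
term_in_list  <- 12     # interesting genes annotated to that term

# Over-representation p-value: probability of seeing at least
# `term_in_list` annotated genes when drawing `interesting` genes
# at random from the universe.
p_over <- phyper(term_in_list - 1, term_in_univ,
                 universe_size - term_in_univ, interesting,
                 lower.tail = FALSE)

# The same question asked as a one-sided Fisher exact test on the
# 2x2 table (rows: in term / not in term, columns: in list / not in list).
tab <- matrix(c(term_in_list, interesting - term_in_list,
                term_in_univ - term_in_list,
                universe_size - term_in_univ - (interesting - term_in_list)),
              nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value
```

Both p-values should agree. Note that the size of the universe enters the calculation directly, which is why the second list matters.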
The usage of this is described in the vignette titled "Hypergeometric
tests for less common model organisms" in the GOstats package which you
can read here:
http://www.bioconductor.org/help/bioc-views/devel/bioc/html/GOstats.html
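As a rough sketch of how the three pieces could fit together for human genes (not a definitive implementation): it assumes GOstats, GSEABase and org.Hs.eg.db are installed, that your input is a vector of gene symbols or aliases, and that the universe is every Entrez Gene ID in org.Hs.egGO. The function name `go_enrichment` and its arguments are hypothetical.

```r
# Sketch only: `go_enrichment` is a hypothetical helper, and the choice
# of universe below is an assumption you should adapt to your experiment.
go_enrichment <- function(symbols, ontology = "BP", pvalue = 0.05) {
  library("GOstats")
  library("GSEABase")
  library("org.Hs.eg.db")

  # 3) Annotations for ALL genes, not only the interesting ones.
  ann <- toTable(org.Hs.egGO)
  goFrame <- GOFrame(data.frame(ann$go_id, ann$Evidence, ann$gene_id),
                     organism = "Homo sapiens")
  gsc <- GeneSetCollection(GOAllFrame(goFrame), setType = GOCollection())

  # 2) Gene universe: assumed here to be all Entrez IDs in org.Hs.egGO.
  universe <- Lkeys(org.Hs.egGO)

  # 1) Interesting genes: map symbols/aliases to Entrez IDs and keep
  #    only those that are part of the universe.
  ids <- unlist(mget(symbols, org.Hs.egALIAS2EG, ifnotfound = NA))
  ids <- intersect(unique(ids[!is.na(ids)]), universe)

  params <- GSEAGOHyperGParams(name = "custom GO annotation",
                               geneSetCollection = gsc, geneIds = ids,
                               universeGeneIds = universe,
                               ontology = ontology, pvalueCutoff = pvalue,
                               conditional = FALSE, testDirection = "over")
  summary(hyperGTest(params))
}
```

With a character vector of symbols such as the `genes` object in your script, the call would be `go_enrichment(genes)`.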
Does that help clear things up?
Marc
On 03/29/2011 06:33 AM, David martin wrote:
> Hi,
> I'm a bit confused about how to use my data.
>
> My input is a list of genes (in fact, a list of genes targeted by
> microRNAs). The first step is to get the GO terms associated with these
> genes, and then I would like to run a hypergeometric test to obtain
> significantly dysregulated GO terms. All the examples I went through use
> affy data or similar, so I'm not sure this is correct. I would
> appreciate your feedback.
>
>
> library("GOstats")
> library("GSEABase")
> library(org.Hs.eg.db)
>
>
> data = "genes.txt"  # A list of genes ("MED13" "ENDOD1" "RAP2C" "ACSL1" ...)
> g=read.table(file=data)
> genes <- as.character(g[,1])
>
> # Get Mapping to GO
> frame<-merge(toTable(org.Hs.egALIAS2EG[genes]), toTable(org.Hs.egGO),
> by.x= "gene_id", by.y="gene_id")
>
> goframeData = data.frame(frame$go_id, frame$Evidence, frame$gene_id)
> goFrame = GOFrame(goframeData, organism = "Homo sapiens")
> goAllFrame = GOAllFrame(goFrame)
>
>
> # From here I'm a bit confused. Since I have my list of GO terms, do I
> # need to use the universe data, or do I apply a hypergeometric test to
> # the above data? Thanks for your input.
>
> gsc <- GeneSetCollection(goAllFrame, setType = GOCollection())
> universe = Lkeys(org.Hs.egGO)
> params <- GSEAGOHyperGParams(name = "My Custom GSEA based annot Params",
>     geneSetCollection = gsc, geneIds = unique(frame$gene_id),
>     universeGeneIds = universe, ontology = "BP", pvalueCutoff = 0.05,
>     conditional = FALSE, testDirection = "over")