[BioC] List significant genes on a GO table
James W. MacDonald
jmacdon at med.umich.edu
Tue Feb 21 16:30:58 CET 2006
Hi Quentin,
Quentin Anstee wrote:
> Dear List,
>
> Can anyone advise me how to add a list of significant genes onto a gene
> ontology table so that I can see which of my differentially expressed genes
> belong to a given GO group?
>
> I would like to be able to output a table that looks like:
>
> GO_id Description p-value
> #Genes Gene_ids/symbols
> GO:12345 Glucose Metabolism 0.0001 34
> IDs of the *significant* probes from the affy chip that are in this GO
> pathway.
You can output tables like this using hyperGtable() in the affycoretools
package. The last column of your table will be a bit messy because there
will be variable numbers of Affy IDs. I prefer a two step approach; do
the above table, and then output the probesets for each row (e.g., each
significant GO term) in individual HTML or text tables using
hyperG2annaffy(), which is also in affycoretools.
Note that affycoretools is in the devel repository, so you need
R-2.3.0dev to automatically download using e.g., biocLite(). However,
there is no dependency on R-2.3.0dev, so you can download from the
website and install by hand into any reasonably recent version of R.
HTH,
Jim
> Having read the vignettes I have been able to generate most of this table
> but not the last column containing the Affy_Ids (or ideally gene symbols). I
> would be very grateful if someone could help me out with this. The script I
> have used so far is attached.
>
> Many thanks,
>
> Quentin
>
> 1. LOAD GENE EXPRESSION ANALYSIS DATA
> =========================================================
>
> a. This is a three way comparison. Data is normalised, filtered,
> limma/eBayes to give a MArrayLM package called fit2.
>
> 2. LOAD LIBRARIES
> =========================================================
>
> library(GO)
> library(GOstats)
> library(annotate)
> library(simpleaffy)
> library(genefilter)
> library(multtest)
> library(affy)
> library(limma)
> library(gcrma)
> library(xtable)
> library(mouse4302)
> library(mouse4302cdf)
> library(annaffy)
> library(Rgraphviz)
>
> 3. MAKE COMPARISONS FOR DFFERENTIAL EXPRESSION
> ==========================================================
> # B-CONTROL
> tab<-topTable(fit2,coef=1)
> # A-CONTROL
> tab<-topTable(fit2,coef=2)
> # A-B
> tab<-topTable(fit2,coef=3)
>
> # topTable contains a a default multadjust
>
> 4. Do GO ANALYSIS, MAKE FIGURE & MAKE TABLE
> ==========================================================
> gn<-as.character(tab$ID)
> gn
> LLID<-unlist(mget(gn,mouse4302LOCUSID,ifnotfound=NA))
> go<-makeGOGraph(as.character(LLID),"CC",removeRoot=FALSE)
> go
>
> # There are 3 choices for ontology: "MF", "BP" and "CC"
>
> a. Plot Graphic
> ----------------------------------------------------------
> att<-list()
> lab<-rep(nodes(go),length(nodes(go)))
> names(lab)<-nodes(go)
> att$label<-lab
> plot(go,nodeAttrs=att)
>
> # Are there more genes at one GO than expected?
> ----------------------------------------------------------
> hyp<-GOHyperG(unique(LLID),lib="mouse4302",what="CC")
> names(hyp)
> go.pv<-hyp$pvalues[nodes(go)]
> go.pv<-sort(go.pv)
>
> b. Create Table
> ----------------------------------------------------------
> sig<-go.pv[go.pv<0.05]
> counts<-hyp$goCounts[names(sig)]
> terms<-getGOTerm(names(sig))[["CC"]]
> nch<-nchar(unlist(terms))
> terms2<-substr(unlist(terms),1,50)
> terms3<-paste(terms2,ifelse(nch>50,"...",""),sep="")
> mat<-matrix(c(names(terms),terms3,round(sig,3),counts),ncol=4,dimnames=list(
> 1:length(sig),c("GO ID","Term","p-value","# Genes")))
> mat
> write.table(mat,"A_B_GO-Table_CC.txt")
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
More information about the Bioconductor
mailing list