[BioC] List significant genes on a GO table
Quentin Anstee
q.anstee at imperial.ac.uk
Thu Feb 23 11:57:08 CET 2006
Hi Jim,
That helps a great deal. Thank you very much.
Best wishes,
Quentin
> -----Original Message-----
> From: James W. MacDonald [mailto:jmacdon at med.umich.edu]
> Sent: 21 February 2006 15:31
> To: Quentin Anstee
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] List significant genes on a GO table
>
> Hi Quentin,
>
> Quentin Anstee wrote:
> > Dear List,
> >
> > Can anyone advise me how to add a list of significant genes onto a
> > gene ontology table so that I can see which of my differentially
> > expressed genes belong to a given GO group?
> >
> > I would like to be able to output a table that looks like:
> >
> > GO_id     Description         p-value  #Genes  Gene_ids/symbols
> > GO:12345  Glucose Metabolism  0.0001   34      IDs of the *significant* probes
> >                                                from the affy chip that are in
> >                                                this GO pathway.
>
> You can output tables like this using hyperGtable() in the
> affycoretools package. The last column of your table will be
> a bit messy because there will be variable numbers of Affy
> IDs. I prefer a two step approach; do the above table, and
> then output the probesets for each row (e.g., each
> significant GO term) in individual HTML or text tables using
> hyperG2annaffy(), which is also in affycoretools.
>
> Note that affycoretools is in the devel repository, so you
> need R-2.3.0dev to automatically download using e.g.,
> biocLite(). However, there is no dependency on R-2.3.0dev, so
> you can download from the website and install by hand into
> any reasonably recent version of R.
>
> HTH,
>
> Jim
>
>
>
> > Having read the vignettes I have been able to generate most of this
> > table but not the last column containing the Affy_Ids (or ideally gene
> > symbols). I would be very grateful if someone could help me out with
> > this. The script I have used so far is attached.
> >
> > Many thanks,
> >
> > Quentin
> >
> > 1. LOAD GENE EXPRESSION ANALYSIS DATA
> > =========================================================
> >
> > a. This is a three way comparison. Data are normalised and filtered,
> > then fitted with limma/eBayes to give an MArrayLM object called fit2.
> >
> > 2. LOAD LIBRARIES
> > =========================================================
> >
> > library(GO)
> > library(GOstats)
> > library(annotate)
> > library(simpleaffy)
> > library(genefilter)
> > library(multtest)
> > library(affy)
> > library(limma)
> > library(gcrma)
> > library(xtable)
> > library(mouse4302)
> > library(mouse4302cdf)
> > library(annaffy)
> > library(Rgraphviz)
> >
> > 3. MAKE COMPARISONS FOR DIFFERENTIAL EXPRESSION
> > ==========================================================
> > # B-CONTROL
> > tab<-topTable(fit2,coef=1)
> > # A-CONTROL
> > tab<-topTable(fit2,coef=2)
> > # A-B
> > tab<-topTable(fit2,coef=3)
> >
> > # topTable applies a default multiple-testing adjustment
> >
> > 4. DO GO ANALYSIS, MAKE FIGURE & MAKE TABLE
> > ==========================================================
> > gn<-as.character(tab$ID)
> > gn
> > LLID<-unlist(mget(gn,mouse4302LOCUSID,ifnotfound=NA))
> > go<-makeGOGraph(as.character(LLID),"CC",removeRoot=FALSE)
> > go
> >
> > # There are 3 choices for ontology: "MF", "BP" and "CC"
> >
> > a. Plot Graphic
> > ----------------------------------------------------------
> > att<-list()
> > lab<-nodes(go)   # one label per node; rep() is not needed here
> > names(lab)<-nodes(go)
> > att$label<-lab
> > plot(go,nodeAttrs=att)
> >
> > # Are there more genes at one GO than expected?
> > ----------------------------------------------------------
> > hyp<-GOHyperG(unique(LLID),lib="mouse4302",what="CC")
> > names(hyp)
> > go.pv<-hyp$pvalues[nodes(go)]
> > go.pv<-sort(go.pv)
> >
> > b. Create Table
> > ----------------------------------------------------------
> > sig<-go.pv[go.pv<0.05]
> > counts<-hyp$goCounts[names(sig)]
> > terms<-getGOTerm(names(sig))[["CC"]]
> > nch<-nchar(unlist(terms))
> > terms2<-substr(unlist(terms),1,50)
> > terms3<-paste(terms2,ifelse(nch>50,"...",""),sep="")
> >
> > mat<-matrix(c(names(terms),terms3,round(sig,3),counts),ncol=4,
> >   dimnames=list(1:length(sig),c("GO ID","Term","p-value","# Genes")))
> > mat
> > write.table(mat,"A_B_GO-Table_CC.txt")
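
The missing last column could be built roughly as sketched below. This is only a minimal base-R illustration on made-up data, not the affycoretools approach: the probe IDs, GO IDs and the names sig_probes, go2probes and tab5 are all hypothetical. In a real analysis the GO-to-probe mapping would have to come from the chip annotation package (assumed here, not verified), and sig_probes would play the role of gn in the script above.

```r
# Hypothetical toy data: none of these IDs come from the original analysis.
sig_probes <- c("p1_at", "p2_at", "p5_at")      # significant probes (like gn)
go2probes <- list(                              # GO term -> all probes on the chip
  "GO:0000001" = c("p1_at", "p2_at", "p3_at"),
  "GO:0000002" = c("p4_at", "p5_at"),
  "GO:0000003" = c("p6_at")
)

# For each GO term keep only the probes that are also in the significant set.
sig_by_go <- lapply(go2probes, intersect, sig_probes)

# Collapse each variable-length set of IDs into one ";"-separated string,
# so that every GO term occupies exactly one table cell.
gene_col <- vapply(sig_by_go, paste, character(1), collapse = ";")

# Assemble the table: GO ID, number of significant probes, and their IDs.
tab5 <- data.frame(
  GO_id  = names(go2probes),
  n_sig  = unname(lengths(sig_by_go)),
  probes = unname(gene_col),
  row.names = NULL,
  stringsAsFactors = FALSE
)
print(tab5)
```

Collapsing to a single delimited string keeps the table rectangular, at the cost of the "messy" variable-width last column mentioned in the reply; writing one small table per GO term avoids that.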
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> Affymetrix and cDNA Microarray Core
> University of Michigan Cancer Center
> 1500 E. Medical Center Drive
> 7410 CCGC
> Ann Arbor MI 48109
> 734-647-5623
>