[BioC] Quick start to linking GO terms and microarray data

Wed Mar 1 13:08:36 CET 2006

Hi Michael,

regarding your last question, you can also try the "biomaRt" package
(please use the development version 1.5.10), which lets you start from
whatever kind of ID you have. For task 2, I like the "Category" package,
in particular the cateGOry and applyByCategory functions. See their man
pages and the vignette of the Category package.
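
For task 1, the merge() route mentioned in the question can be sketched
in base R roughly as follows. This is only an illustration: the data
frames, gene names, IDs and category labels are all made up, the
ID-to-GO table would in practice come from an annotation source such as
biomaRt, and the code does not call the biomaRt or Category APIs.

```r
# Toy data: every name, ID and category label here is hypothetical.
array_data <- data.frame(gene = c("g1", "g2", "g3", "g4"),
                         t    = c(2.5, -0.3, 3.1, 0.2),
                         stringsAsFactors = FALSE)
gene2id <- data.frame(gene      = c("g1", "g2", "g3", "g4"),
                      locuslink = c("101", "102", "103", "104"),
                      stringsAsFactors = FALSE)
# Assumed to have been retrieved from an annotation resource beforehand.
id2go <- data.frame(locuslink = c("101", "101", "102", "103", "104"),
                    go        = c("GO:toyA", "GO:toyB", "GO:toyB",
                                  "GO:toyA", "GO:toyB"),
                    stringsAsFactors = FALSE)

# Chain the two lookups: gene name -> LocusLink ID -> GO category.
annotated <- merge(merge(array_data, gene2id, by = "gene"),
                   id2go, by = "locuslink")

# Task 1: GO categories attached to a subset of genes.
subset_genes  <- c("g1", "g3")
go_for_subset <- unique(annotated$go[annotated$gene %in% subset_genes])

# Category-by-gene membership matrix (rows: categories, columns: genes).
membership <- table(annotated$go, annotated$gene) > 0

# Per-category summary of a gene-level score, here the mean t statistic.
mean_t <- tapply(annotated$t, annotated$go, mean)
```

The membership matrix and the per-category summary are the same kind of
objects that the Category functions work with, but their exact
arguments are best checked in the man pages.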

The Category package asks a related, but slightly different question 
from GOstats: not whether a certain GO category is overrepresented in a 
set of genes, but rather whether a score (e.g. t statistic, mean 
expression level difference) tends to be higher in a GO category of 
genes than in all genes.
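
To make the two questions concrete, here is a minimal base-R sketch on
simulated data. It is an illustration under stated assumptions, not the
GOstats or Category implementation: the scores, the category size and
the cutoff of 1.5 are invented, and the t-test is just one possible way
to compare scores inside and outside a category.

```r
set.seed(1)

# Simulated data: 200 genes, the first 20 belong to one hypothetical category.
n_genes <- 200
in_cat  <- rep(c(TRUE, FALSE), c(20, 180))
# Gene-level score (think t statistic); category genes are shifted upwards.
score   <- rnorm(n_genes) + ifelse(in_cat, 1, 0)

# Question A (over-representation): pick a gene list by a cutoff, then ask
# whether the category occurs in that list more often than expected by chance.
selected <- score > 1.5            # arbitrary cutoff, for illustration only
overlap  <- sum(selected & in_cat)
# Upper tail of the hypergeometric distribution: P(X >= overlap).
p_overrep <- phyper(overlap - 1,
                    m = sum(in_cat),    # category genes among all genes
                    n = sum(!in_cat),   # non-category genes
                    k = sum(selected),  # size of the selected list
                    lower.tail = FALSE)

# Question B (score shift): no cutoff; ask whether scores inside the
# category tend to be higher than scores of the remaining genes.
p_shift <- t.test(score[in_cat], score[!in_cat],
                  alternative = "greater")$p.value
```

Question A depends on where the cutoff for the gene list is placed,
while question B uses the scores of all genes directly.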

  Cheers
  Wolfgang

michael watson (IAH-C) wrote:
> Hi
> 
> I want to investigate the GO terms associated with my microarray data
> (normally, a list of genes from topTable() in limma)
> 
> I have read the vignettes for goTools and GOStats, and to be honest, I
> am still a little unclear what the overall process is, particularly if I
> am working with a custom array and not with affy or operon.
> 
> Let's say, for example, I have my array data in a data.frame containing
> gene names.  In a separate data frame I have a link between my gene
> names and LocusLink IDs.  How do I:
> 
> 1) Find the GO terms associated with subsets of my genes? (I realise I
> can use merge() to link my array data to the LocusLink ids, but what do
> I do then?)
> 
> 2) Find out if a particular GO term is statistically over-represented in
> a particular group?
> 
> Finally, is the only way to link into GO through LocusLink identifiers?
> 
> Many thanks
> Mick

-------------------------------------
Wolfgang Huber
European Bioinformatics Institute
European Molecular Biology Laboratory
Cambridge CB10 1SD
England
Phone: +44 1223 494642
Fax:   +44 1223 494486
Http:  www.ebi.ac.uk/huber