[BioC] Quick start to linking GO terms and microarray data

michael watson (IAH-C) michael.watson at bbsrc.ac.uk
Wed Mar 1 13:04:44 CET 2006


Thanks Sean, but I really wanted to demonstrate this in Bioconductor :-S

I tried running the vignettes in goTools. The first time, it froze up my
PC for about 30 minutes and then gave a cryptic message about
coercing x to a list; the second time, it froze up my PC and then R
crashed with no warning :-S

As far as I can tell, GOstats doesn't have any clear examples of simple
mapping of microarray data to GO terms.

Given that one of the major, fundamental tasks biologists want to do is
find out functional information for significantly differentially
expressed genes, shouldn't this be a little easier, and a little more
transparent, in Bioconductor?

Again, I ask, does anyone have any simple examples of going from a list
of LocusLink IDs to a list of GO Terms?  (i.e. GO identifiers and the
biological function/term associated with those identifiers)

Many thanks
Mick

-----Original Message-----
From: Sean Davis [mailto:sdavis2 at mail.nih.gov] 
Sent: 01 March 2006 11:44
To: michael watson (IAH-C); Bioconductor
Subject: Re: [BioC] Quick start to linking GO terms and microarray data




On 3/1/06 6:20 AM, "michael watson (IAH-C)" <michael.watson at bbsrc.ac.uk>
wrote:

> Hi
> 
> I want to investigate the GO terms associated with my microarray data
> (normally, a list of genes from topTable() in limma)
> 
> I have read the vignettes for goTools and GOstats, and to be honest, I
> am still a little unclear what the overall process is, particularly if I
> am working with a custom array and not with affy or operon.
> 
> Let's say, for example, I have my array data in a data.frame containing
> gene names.  In a separate data frame I have a link between my gene
> names and LocusLink IDs.  How do I:
> 
> 1) Find the GO terms associated with subsets of my genes? (I realise I
> can use merge() to link my array data to the LocusLink ids, but what do
> I do then?)
>
> 2) Find out if a particular GO term is statistically over-represented in
> a particular group?

Hi, Mick.

I would take your locuslink IDs for your genes and dump out two lists to a
text file:

1)  All LocusIDs on your array.
2)  All LocusIDs in your genelist.

Then use an external program or web tool such as DAVID/EASE to do the
analysis.

That said, there was some discussion on using straight locusIDs (rather than
requiring a metadata package) in GOHyperG.  I don't know where that
conversion stands.
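
For readers wondering what the over-representation step computes: as its
name suggests, GOHyperG is built around a hypergeometric test, and the two
lists above (all IDs on the array, IDs in the genelist) supply its counts.
Below is a minimal sketch in plain Python, not Bioconductor code; the
function name and its arguments are made up for illustration.

```python
from math import comb


def hypergeom_upper_tail(universe_size, term_in_universe, list_size, term_in_list):
    """Probability of seeing at least `term_in_list` annotated genes when
    `list_size` genes are drawn without replacement from a universe of
    `universe_size` genes, `term_in_universe` of which carry the GO term."""
    # The number of annotated genes in the list cannot exceed either bound.
    max_hits = min(term_in_universe, list_size)
    # All ways of choosing the gene list from the array.
    total = comb(universe_size, list_size)
    # Ways of choosing a list with k annotated genes, summed over k >= observed.
    favourable = sum(
        comb(term_in_universe, k)
        * comb(universe_size - term_in_universe, list_size - k)
        for k in range(term_in_list, max_hits + 1)
    )
    return favourable / total
```

With hypothetical counts, `hypergeom_upper_tail(1000, 40, 50, 8)` would give
the p-value for finding 8 or more genes with a term in a 50-gene list, when
40 of the 1000 genes on the array carry that term. In a real analysis this
would be repeated per GO term and corrected for multiple testing.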

As to your question about linking genes to GO, that is actually done at the
transcript/protein level.  Merging to entrez gene (locuslink) happens after
the fact.  Using various data sources, you can link by refseq, locuslink,
ensembl ids, ucsc knowngenes, human invitational ids (human), and probably
several others in species other than human.
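
Once an ID-to-GO table has been obtained from one of those sources, going
from a list of gene IDs to GO identifiers and term names is a two-step
lookup. Here is a rough sketch in plain Python, not Bioconductor code; the
tables, gene IDs, GO IDs and term names are all placeholders, not real
annotation data.

```python
# Toy annotation tables. Every ID and term name here is a placeholder
# invented for illustration, not real annotation data.
gene_to_go = {
    "1001": ["GO:9000001", "GO:9000002"],
    "1002": ["GO:9000002"],
    "1003": [],  # on the array but with no GO annotation
}
go_term_names = {
    "GO:9000001": "placeholder term A",
    "GO:9000002": "placeholder term B",
}


def go_terms_for(gene_ids, gene_to_go, go_term_names):
    """Return (gene_id, go_id, term_name) rows for every annotated gene.
    Genes missing from the table, or with no annotation, yield no rows."""
    rows = []
    for gene_id in gene_ids:
        for go_id in gene_to_go.get(gene_id, []):
            rows.append((gene_id, go_id, go_term_names.get(go_id, "unknown")))
    return rows


rows = go_terms_for(["1001", "1003", "9999"], gene_to_go, go_term_names)
for row in rows:
    print(*row, sep="\t")
```

Genes without annotation drop out silently here; whether to keep them as
empty rows instead depends on what the downstream test expects.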

Sean


