[BioC] GSEA, topGO, GOstats...? what's a good way to look at GO over-representation?

Mon Feb 8 22:05:04 CET 2010

Hi Jose,

If you would rather use a more universal ID such as an Entrez Gene ID,
you can use GOstats, and just pass in one of our "org" packages.  So for
example, if you wanted to do it with human and use Entrez Gene IDs, you
would do something like this:

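library("GOstats")
library("org.Hs.eg.db")  # human "org" package, keyed by Entrez Gene ID
# yourList: Entrez Gene IDs of interest; yourUniverse: all IDs assayed
params = new("GOHyperGParams", geneIds=yourList,
   universeGeneIds=yourUniverse, annotation="org.Hs.eg.db",
   pvalueCutoff=0.05, testDirection="over", ontology="BP",
   conditional=FALSE)
hOver = hyperGTest(params)
report = summary(hOver)  # table of over-represented GO terms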
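
For intuition, the over-representation test in a call like the one above
comes down, for a single category, to a hypergeometric tail probability.
Here is a minimal base-R sketch of that calculation. The IDs, the category
membership and the helper name overRepP are all made up for illustration,
and this is not the GOstats code itself:

```r
# Toy over-representation test for ONE gene set, base R only.
# All IDs are made-up placeholders, not real Entrez Gene IDs.
universe <- as.character(1:1000)             # every gene that was assayed
selected <- as.character(1:50)               # the gene list of interest
geneSet  <- as.character(c(1:10, 501:540))   # one hypothetical category

# overRepP is a hypothetical helper, not part of any package.
overRepP <- function(selected, geneSet, universe) {
  universe <- unique(universe)
  selected <- intersect(unique(selected), universe)
  geneSet  <- intersect(unique(geneSet), universe)
  x <- length(intersect(selected, geneSet))  # selected genes in the category
  m <- length(geneSet)                       # category genes in the universe
  n <- length(universe) - m                  # universe genes outside the category
  k <- length(selected)                      # size of the selected list
  # P(X >= x) under the hypergeometric null (upper tail, hence x - 1)
  phyper(x - 1, m, n, k, lower.tail = FALSE)
}

p <- overRepP(selected, geneSet, universe)
print(p)
```

With real data, selected and universe would be your own Entrez Gene IDs.
You would repeat this for each GO term and think about multiple testing;
GOstats handles the mapping from genes to GO terms for you.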

  Marc

J.delasHeras at ed.ac.uk wrote:
>
> Dear Wolfgang,
>
> thanks for your reply. Yes, the annotation support part in some of the
> vignettes I looked at was throwing me off a bit.
>
> Jose (no, still haven't changed my name ;-)
>
> Quoting Wolfgang Huber <whuber at embl.de>:
>
>> Dear Javier
>>
>> Chapter 13 of the 'Bioconductor Case Studies' book is a good start, as
>> are the vignette of the GSEABase package and
>> http://www.bioconductor.org/workshops/2009/SeattleApr09/gsea/GSEA_Lecture.pdf
>>
>> Do not let yourself be confused by the fact that some packages (e.g.
>> GOstats) include support that makes it easier to work with the
>> Bioconductor annotation packages (which are provided, among others, for
>> Affymetrix genechips). The concept of gene set enrichment analysis
>> itself is independent of where you get the gene sets from, and the
>> software above works with general gene lists.
>>
>> And if you do not care so much about automation, reproducibility and
>> flexibility of your workflow, then copy-pasting your gene lists into
>> websites like those mentioned by Michael might be the way to go.
>>
>>     Best wishes
>>     Wolfgang
>>
>>
>> J.delasHeras at ed.ac.uk scripsit 02/08/2010 05:29 PM:
>>>
>>> Dear list,
>>>
>>> I have a few gene lists derived from a human Illumina expression
>>> array. I have Illumina IDs, I have gene names, and I have the
>>> Entrez Gene IDs I obtained for them.
>>>
>>> I would like to analyse the list to look for over-representation of
>>>  some category, probably using gene ontologies.
>>> I see there are several packages that seem to address this, 
>>> although when I look at the examples I get the feeling they were 
>>> designed with Affy arrays in mind and depend on an Affy array 
>>> design...
>>>
>>> I am sure I am not the only one wanting to do this type of work on
>>> non-Affy arrays... I would appreciate a nudge towards the right
>>> package, or a way to "persuade" it to work with non-Affy array
>>> data. After all, I imagine that all the array design is used for is
>>> the definition of the gene lists/universe and retrieval of the
>>> relevant GO IDs.
>>>
>>> Thank you for any helpful comments.
>>>
>>> Jose
>>>
>>
>>
>> -- 
>> Wolfgang Huber
>> EMBL
>> http://www.embl.de/research/units/genome_biology/huber/contact