[BioC] gene set enrichment analysis of RNA-Seq data

Julie Leonard julie.leonard at syngenta.com
Thu Apr 12 23:06:54 CEST 2012


I was wondering if anyone is aware of a gene 
set enrichment algorithm for RNA-Seq data that:

1) does not require a specification of differentially 
expressed (DE) genes (i.e., no need to use a hard 
p-value cutoff for determining the DE gene 
list)

2) uses subject sampling instead of gene sampling 
to obtain the p-value (i.e., this would maintain 
gene-gene correlations)
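
To make point 2 concrete, here is a minimal sketch of subject sampling, assuming a two-group design with 0/1 sample labels. The statistic (sum of squared differences in mean log2 count over the genes in the set) is only a simple stand-in, not SAM-GS itself, and all function names are hypothetical. Because only the sample labels are shuffled, each sample's full expression profile stays intact and the gene-gene correlations are preserved.

```python
import math
import random


def set_statistic(counts, labels, gene_set):
    """Sum over the set of squared between-group differences in mean log2(count + 1)."""
    total = 0.0
    for gene in gene_set:
        values = [math.log2(c + 1) for c in counts[gene]]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        diff = sum(group1) / len(group1) - sum(group0) / len(group0)
        total += diff * diff
    return total


def subject_sampling_pvalue(counts, labels, gene_set, n_perm=1000, seed=0):
    """Permutation p-value obtained by shuffling sample labels, not genes."""
    rng = random.Random(seed)
    observed = set_statistic(counts, labels, gene_set)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # group sizes stay the same, only assignment changes
        if set_statistic(counts, shuffled, gene_set) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data: counts maps a gene name to one read count per sample.
counts = {
    "geneA": [10, 12, 11, 100, 120, 110],
    "geneB": [5, 6, 5, 50, 60, 55],
    "geneC": [7, 7, 7, 7, 7, 7],
}
labels = [0, 0, 0, 1, 1, 1]
p_de = subject_sampling_pvalue(counts, labels, {"geneA", "geneB"})
p_flat = subject_sampling_pvalue(counts, labels, {"geneC"})
```

Note that no DE gene list is needed here, which also covers point 1. With only three samples per group there are just 20 distinct label assignments, so the smallest attainable p-value is coarse.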

Basically, I'm looking for a 
self-contained/subject sampling method (e.g.
SAM-GS for microarray data) or a "hybrid" method 
(e.g. GSEA for microarray data).  The only gene set 
enrichment algorithm that I am aware of for RNA-Seq
data is GOSeq, but it uses a competitive/gene 
sampling method (i.e. Fisher's Exact Test).  
Note, the ideas of self-contained vs competitive and 
subject sampling vs gene sampling come from the 
following paper:  Goeman JJ, Bühlmann P. Analyzing 
gene expression data in terms of gene sets: 
methodological issues. Bioinformatics. 2007 Apr 15;23(8)
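
For contrast, the competitive/gene sampling approach can be sketched as a one-sided Fisher's exact test on a 2x2 table (gene in set or not, gene DE or not), which is the same as a hypergeometric upper tail. This is only an illustration of the general idea and is not taken from the GOSeq code; the function name and arguments are hypothetical. Note that it needs a DE gene list as input and treats genes as the independent sampling unit, which are exactly the two properties I would like to avoid.

```python
from math import comb


def overrepresentation_pvalue(n_genes, n_de, set_size, de_in_set):
    """P(X >= de_in_set) when set_size genes are drawn at random from n_genes,
    of which n_de are DE (one-sided Fisher's exact test)."""
    upper = min(n_de, set_size)
    total = comb(n_genes, set_size)
    tail = sum(
        comb(n_de, k) * comb(n_genes - n_de, set_size - k)
        for k in range(de_in_set, upper + 1)
    )
    return tail / total


# Toy example: 10 genes, 3 called DE, a gene set of size 3 containing all 3 DE genes.
p_all = overrepresentation_pvalue(10, 3, 3, 3)
p_none = overrepresentation_pvalue(10, 3, 3, 0)
```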

Something like GSEA-SNP is close to what I want.  
It uses a test statistic that is suitable for discrete data
and uses subject sampling to calculate the p-values.  
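
A rough sketch of the kind of "hybrid" method described above, under my own assumptions: the per-gene statistic is a rank sum with midranks (so ties in discrete counts are handled), the set-level score compares genes in the set against all other genes, and the p-value comes from permuting sample labels. This is not the GSEA-SNP or GSEA statistic; the names and the scoring rule are hypothetical placeholders.

```python
import random


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def gene_score(gene_counts, labels):
    """Absolute deviation of the group-1 rank sum from its null expectation."""
    ranks = midranks(gene_counts)
    n1 = sum(labels)
    rank_sum = sum(r for r, lab in zip(ranks, labels) if lab == 1)
    return abs(rank_sum - n1 * (len(gene_counts) + 1) / 2)


def enrichment(counts, labels, gene_set):
    """Mean score of genes in the set minus mean score of all other genes."""
    inside = [gene_score(c, labels) for g, c in counts.items() if g in gene_set]
    outside = [gene_score(c, labels) for g, c in counts.items() if g not in gene_set]
    return sum(inside) / len(inside) - sum(outside) / len(outside)


def hybrid_pvalue(counts, labels, gene_set, n_perm=1000, seed=0):
    """Competitive-style score, but the null comes from shuffling sample labels."""
    rng = random.Random(seed)
    observed = enrichment(counts, labels, gene_set)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if enrichment(counts, shuffled, gene_set) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data: g1 and g2 differ between groups, g3 and g4 do not.
counts = {
    "g1": [0, 1, 2, 30, 40, 50],
    "g2": [3, 3, 4, 20, 25, 22],
    "g3": [5, 9, 6, 7, 8, 5],
    "g4": [2, 2, 3, 2, 3, 2],
}
labels = [0, 0, 0, 1, 1, 1]
score = enrichment(counts, labels, {"g1", "g2"})
p_value = hybrid_pvalue(counts, labels, {"g1", "g2"})
```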

Thanks,
Julie


