[BioC] gene set enrichment analysis of RNA-Seq data
Gordon K Smyth
smyth at wehi.EDU.AU
Fri Apr 13 09:20:21 CEST 2012
Dear Julie,
A good question. As far as I know, there is as yet no such method. What
I am doing for this purpose for the time being is to use voom() in the
limma package to transform the RNA-Seq counts to a scale on which
microarray methods can be used, and then to apply roast(). See page 104 of the
limma User's Guide for examples of this:
http://bioconductor.org/packages/2.10/bioc/vignettes/limma/inst/doc/usersguide.pdf
Note that roast() is a self-contained gene set test with the ability to
use linear models and weights:
http://www.ncbi.nlm.nih.gov/pubmed/20610611
Another gene set enrichment option that works fine with RNA-Seq data is
camera(). This is a competitive test, but without the usual disadvantage
of gene sampling in that it estimates and adjusts for inter-gene
correlation. camera() is currently set up to automatically use the weights
that come out of voom(), meaning that camera() respects the mean-variance
relationship of RNA-Seq data. We have used it successfully on RNA-Seq
data.
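
As a rough illustration of the workflow described above, here is a minimal sketch, assuming the limma and edgeR packages are installed. The counts are simulated, and the group labels, design matrix and gene set indices are hypothetical, chosen only for illustration. Argument names follow recent limma releases and may differ from the version current at the time of this message.

```r
library(limma)
library(edgeR)

set.seed(1)
ngenes <- 1000
nsamples <- 6

# Simulated negative binomial counts (hypothetical data, not from this thread)
counts <- matrix(rnbinom(ngenes * nsamples, mu = 50, size = 5),
                 nrow = ngenes, ncol = nsamples)
rownames(counts) <- paste0("gene", seq_len(ngenes))

# Two groups of three samples each (assumed design)
group <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~ group)

# Normalize library sizes, then transform to log2-cpm with precision weights
dge <- DGEList(counts = counts)
dge <- calcNormFactors(dge)
v <- voom(dge, design)

# Hypothetical gene set: the first 50 rows of the count matrix
idx <- 1:50

# Self-contained rotation test for the group B vs A coefficient
r <- roast(v, index = idx, design = design, contrast = 2, nrot = 999)
print(r)

# Competitive test that adjusts for inter-gene correlation
cam <- camera(v, index = list(set1 = idx), design = design, contrast = 2)
print(cam)
```

Because `v` is the object returned by voom(), both roast() and camera() pick up its observation weights without any extra arguments.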
Best wishes
Gordon
------------ original message ------------------
[BioC] gene set enrichment analysis of RNA-Seq data
Julie Leonard julie.leonard at syngenta.com
Thu Apr 12 23:06:54 CEST 2012
I was wondering if anyone is aware of a gene
set enrichment algorithm for RNA-Seq data that:
1) does not require a specification of differentially
expressed (DE) genes (i.e. no need to use a hard
p-value threshold cutoff for determining the DE gene
list)
2) uses subject sampling instead of gene sampling
to obtain the p-value (i.e. this would maintain
gene-gene correlations)
Basically, I'm looking for a
self-contained/subject sampling method (e.g.
SAM-GS for microarray data) or a "hybrid" method
(e.g. GSEA for microarray data). The only gene set
enrichment algorithm that I am aware of for RNA-Seq
data is GOSeq, but it uses a competitive/gene
sampling method (i.e. Fisher's Exact Test).
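
To make the two requirements above concrete, here is a minimal base R sketch of a self-contained test with subject sampling. It is an illustration only: the expression matrix is simulated, the set statistic (sum of squared group mean differences) is a simplified stand-in rather than SAM-GS or GSEA, and the names `expr`, `labels` and `set_stat` are hypothetical.

```r
set.seed(42)

# Hypothetical expression values for one gene set: 20 genes x 10 samples
expr <- matrix(rnorm(20 * 10), nrow = 20)
labels <- rep(c(0, 1), each = 5)

# Set-level statistic: sum over genes of the squared difference in group means.
# No per-gene DE cutoff is applied, so every gene in the set contributes.
set_stat <- function(x, lab) {
  d <- rowMeans(x[, lab == 1, drop = FALSE]) -
       rowMeans(x[, lab == 0, drop = FALSE])
  sum(d^2)
}

obs <- set_stat(expr, labels)

# Subject sampling: permute the sample labels, never the genes. Each sample's
# column stays intact, so gene-gene correlations are preserved in every
# permuted data set.
nperm <- 999
perm <- replicate(nperm, set_stat(expr, sample(labels)))

# Permutation p-value, counting the observed statistic as one of the draws
p <- (sum(perm >= obs) + 1) / (nperm + 1)
print(p)
```

A gene sampling method would instead compare the set against randomly drawn genes, which treats genes as independent units.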
Note, the ideas of self-contained vs competitive and
subject sampling vs gene sampling come from the
following paper: Goeman JJ, Bühlmann P. Analyzing
gene expression data in terms of gene sets:
methodological issues. Bioinformatics. 2007 Apr 15;23(8)
Something like GSEA-SNP is close to what I want.
It uses a test-statistic that is suitable for discrete data
and uses subject sampling to calculate the p-values.
Thanks,
Julie