[Bioc-sig-seq] roast/romer for count data (edgeR)?

Gordon K Smyth smyth at wehi.EDU.AU
Mon Jun 13 01:47:45 CEST 2011


Hi Cei,

It is definitely on our to-do list but, no, we don't yet have any means to 
do gene set analyses within the edgeR framework.

At this stage, I think the best bet is simply to analyse the counts as 
approximately normal and use limma.  For example, compute 
log-counts-per-million,

    y <- log2( 1e6* (counts+0.5) / (lib.size+0.5) )

then quantile normalize, then analyse as usual in limma.  Note the use of 
an offset of half-a-count to avoid infinite values.
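
The recipe above might be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested pipeline: the simulated counts, the two-group design and the ten-gene set are all hypothetical, and the limma calls assume a reasonably recent limma in which roast() takes the expression matrix as its first argument. Because lib.size has one entry per sample (column), the sketch transposes so that the division is applied column by column rather than recycled down the rows.

```r
# Hypothetical data: 100 genes x 6 samples, two groups of three
set.seed(1)
counts <- matrix(rnbinom(600, mu = 50, size = 5), nrow = 100, ncol = 6)
lib.size <- colSums(counts)

# Log-counts-per-million with a half-count offset.
# t(t(x) / v) divides each column of x by the matching element of v.
y <- log2(1e6 * t(t(counts + 0.5) / (lib.size + 0.5)))

# Zero counts stay finite because of the offset
stopifnot(all(is.finite(y)))

# The limma steps run only if limma is installed
if (requireNamespace("limma", quietly = TRUE)) {
  group <- factor(rep(c("A", "B"), each = 3))
  design <- model.matrix(~ group)

  # Quantile normalize, then fit the linear model as usual
  y.norm <- limma::normalizeBetweenArrays(y, method = "quantile")
  fit <- limma::eBayes(limma::lmFit(y.norm, design))

  # Gene set test for a hypothetical set made of the first ten genes
  res <- limma::roast(y.norm, index = 1:10, design = design, contrast = 2)
}
```

romer() could be substituted for roast() in the last step in the same way, given a suitable list of gene set indices.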

Alternatively, use the effective library sizes estimated by edgeR in place 
of actual library sizes and skip the quantile normalization.
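
A rough sketch of this alternative, again on hypothetical simulated counts: it assumes the effective library size is the actual library size multiplied by the normalization factor returned by edgeR's calcNormFactors(). The helper name log.cpm is made up for the illustration and is not an edgeR or limma function.

```r
# Hypothetical helper: log-counts-per-million with a half-count offset.
# lib.size must have one entry per column of counts.
log.cpm <- function(counts, lib.size) {
  log2(1e6 * t(t(counts + 0.5) / (lib.size + 0.5)))
}

# Hypothetical data: 100 genes x 6 samples
set.seed(2)
counts <- matrix(rnbinom(600, mu = 50, size = 5), nrow = 100, ncol = 6)

# The edgeR step runs only if edgeR is installed
if (requireNamespace("edgeR", quietly = TRUE)) {
  d <- edgeR::calcNormFactors(edgeR::DGEList(counts = counts))

  # Assumed definition: effective size = actual size x normalization factor
  eff.lib.size <- d$samples$lib.size * d$samples$norm.factors

  # Use y directly in limma; no quantile normalization afterwards
  y <- log.cpm(counts, eff.lib.size)
}
```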

This normal-based approach will work well for high-variability human data. 
If your RNA-Seq data is low variability, close to Poisson, then the 
normal-based approach is a bit further from being optimal, although 
probably still serviceable.

Best wishes
Gordon

---------------------------------------------
Professor Gordon K Smyth,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research,
1G Royal Parade, Parkville, Vic 3052, Australia.
Tel: (03) 9345 2326, Fax (03) 9347 0852,
smyth at wehi.edu.au
http://www.wehi.edu.au
http://www.statsci.org/smyth

> Date: Sat, 11 Jun 2011 10:38:45 -0500
> From: Cei Abreu-Goodger <cei at ebi.ac.uk>
> To: bioc-sig-sequencing at r-project.org
> Subject: [Bioc-sig-seq] roast/romer for count data (edgeR)?
>
> Hello Davis, Gordon, et al.,
>
> Is it possible to perform focused or competitive gene-set analysis for
> experiments with count data and linear models? Like what is available in
> limma, with the roast and romer functions, but for edgeR?
>
> Any tips or suggestions would be great!
>
> Thanks,
>
> Cei
