[BioC] Is normalization in edgeR required for small RNA sequencing data?
Mark Robinson
mark.robinson at imls.uzh.ch
Sat Sep 22 09:54:14 CEST 2012
Hi Daniela,
> Do I need to normalize my input data using calcNormFactors() once I have set
> up my DGEList, or could I proceed without any normalization? I assume in this
> case that edgeR performs a default normalization when it is "calculating
> library sizes from column totals"?
Yes, by default edgeR uses the column totals as library sizes to "normalize". You don't strictly *need* to do additional normalization (e.g. by calling calcNormFactors()), but generally it does no harm and it often helps. That is, if there are no additional biases (beyond library size) to correct for, these additional correction factors will be near 1 anyway. As a trivial (uninteresting) example:
> library(edgeR)
> y <- matrix(rnbinom(300, mu=5, size=2), nrow=150)
> d <- DGEList(y)
Calculating library sizes from column totals.
> d$samples
        group lib.size norm.factors
Sample1     1      720            1
Sample2     1      635            1
> d <- calcNormFactors(d)
> d$samples
        group lib.size norm.factors
Sample1     1      720    0.9663861
Sample2     1      635    1.0347831
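To make the role of these factors concrete: edgeR uses the product of lib.size and norm.factors as the effective library size, and calcNormFactors() scales the factors so that they multiply to 1. The following is only a minimal base-R sketch using the numbers printed above; the name eff.lib.size is illustrative and is not an edgeR object.

```r
# Values copied from the d$samples output above
lib.size     <- c(Sample1 = 720, Sample2 = 635)
norm.factors <- c(Sample1 = 0.9663861, Sample2 = 1.0347831)

# Effective library size = library size scaled by the normalization factor
# (eff.lib.size is a hypothetical name used only for this illustration)
eff.lib.size <- lib.size * norm.factors
print(round(eff.lib.size, 1))   # about 695.8 and 657.1

# The factors are scaled to multiply to 1, so factors close to 1
# mean there is little to correct beyond library size
print(prod(norm.factors))
```

With factors this close to 1, the effective library sizes differ from the raw column totals by only a few percent.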
Of course, it doesn't hurt to look through a few MA-style plots for your data to see that your samples are comparable and that normalization is operating well.
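One possible way to draw such an MA-style plot for a pair of samples is sketched below. It is only an illustration on simulated counts like those above, not a prescribed workflow: the seed, the prior.count value and the variable names logcpm, M and A are assumptions made for this example, and it relies on cpm() using the normalized library sizes by default.

```r
library(edgeR)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
y <- matrix(rnbinom(300, mu = 5, size = 2), nrow = 150)
d <- calcNormFactors(DGEList(y))

# log2 counts per million; a small prior count avoids log(0)
logcpm <- cpm(d, log = TRUE, prior.count = 1)

# M = log-ratio between the two samples, A = their average log-abundance
M <- logcpm[, 1] - logcpm[, 2]
A <- (logcpm[, 1] + logcpm[, 2]) / 2

plot(A, M, pch = 16, cex = 0.6,
     xlab = "A (average log2 CPM)", ylab = "M (log2 ratio)")
abline(h = 0, col = "red")  # comparable samples should scatter around M = 0
```

If the bulk of the points sits clearly above or below the horizontal line, that suggests a bias that library size alone does not account for.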
Best, Mark
----------
Prof. Dr. Mark Robinson
Bioinformatics
Institute of Molecular Life Sciences
University of Zurich
Winterthurerstrasse 190
8057 Zurich
Switzerland
v: +41 44 635 4848
f: +41 44 635 6898
e: mark.robinson at imls.uzh.ch
o: Y11-J-16
w: http://tiny.cc/mrobin
----------
http://www.fgcz.ch/Bioconductor2012
On 22.09.2012, at 00:23, Daniela Lopes Paim Pinto wrote:
> Dear All,
>
> I am a PhD student, currently working on differential expression analysis of
> my small RNA library deep sequencing data and trying to identify
> differentially expressed miRNAs, using the edgeR package. I have 24 different
> samples with 2 biological replicates (48 libraries). I am performing
> multiple group comparisons using the GLM method and also an ANOVA-like test to
> identify DE miRNAs among the different groups of my samples.
> My question is :
>
> Do I need to normalize my input data using calcNormFactors() once I have set
> up my DGEList, or could I proceed without any normalization? I assume in this
> case that edgeR performs a default normalization when it is "calculating
> library sizes from column totals"?
>
>
> I would really appreciate any suggestion on this!
>
>
> Thanks in advance,
>
>
> Daniela