[BioC] total number of reads? mapped reads? or total counts?
Mark Robinson
mark.robinson at imls.uzh.ch
Tue Dec 13 09:22:18 CET 2011
Hi Shan,
We typically use a fourth concept: the 'effective' library size.
The idea is quite simple, and is spelled out here:
http://genomebiology.com/2010/11/3/R25
It is implemented in edgeR's calcNormFactors() function.
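
As a minimal sketch of how this looks in practice: in edgeR the effective library size is the original library size multiplied by the normalization factor that calcNormFactors() estimates. The count matrix below is simulated purely for illustration, and the object names (counts, y, eff.lib.size) are assumptions, not something from this thread.

```r
library(edgeR)

# Simulated counts for illustration only: 1000 genes x 4 samples
set.seed(1)
counts <- matrix(rnbinom(4000, mu = 50, size = 5), ncol = 4,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:4)))

# With no lib.size supplied, DGEList() sets it to colSums(counts)
y <- DGEList(counts = counts)

# Estimate scaling factors (TMM is the default method)
y <- calcNormFactors(y)

# Effective library size = library size * normalization factor
eff.lib.size <- y$samples$lib.size * y$samples$norm.factors
print(y$samples)
print(eff.lib.size)
```

If you have the number of sequenced or mapped reads per sample, you could pass it as lib.size to DGEList() instead of relying on the colSums(counts) default; the normalization factors are then applied on top of whichever library size you supplied.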
HTH,
Mark
On 12.12.2011, at 17:44, wang peter wrote:
> Hello all,
>
> In the edgeR package, lib.size is documented as:
>
>   vector of length ncol(counts) giving the total number of reads
>   sequenced for each sample. If not separately provided, will be set
>   to colSums(counts).
>
> But there are three different concepts here: the total number of reads,
> the number of mapped reads, and the total counts.
>
> Usually, number of mapped reads < total reads < counts, because not all
> of the reads can be mapped, and one mapped read can have more than one hit.
>
> So which one should be used in the NB model?
>
> --
> shan gao
> Room 231(Dr.Fei lab)
> Boyce Thompson Institute
> Cornell University
> Tower Road, Ithaca, NY 14853-1801
> Office phone: 1-607-254-1267(day)
> Official email:sg839 at cornell.edu
> Facebook:http://www.facebook.com/profile.php?id=100001986532253
>
----------
Prof. Dr. Mark Robinson
Bioinformatics
Institute of Molecular Life Sciences
University of Zurich
Winterthurerstrasse 190
8057 Zurich
Switzerland
v: +41 44 635 4848
f: +41 44 635 6898
e: mark.robinson at imls.uzh.ch
o: Y32-J-34
w: http://tiny.cc/mrobin