[BioC] Single nucleotide based RNAseq normalization with edgeR

Mon Feb 7 11:46:05 CET 2011

Hi Gordon,
thank you for your reply. The resolution of our ~100 nt Solexa reads is 
too low to detect individual processing sites, so we want to 
investigate every nucleotide individually ("single nucleotide 
based normalization"). That means we count how often each individual 
nucleotide is covered by sequence reads. Of course, this approach 
effectively inflates the lib.size by a factor that depends on the 
length of the Solexa reads. As the lib.size is critical for the 
normalization, I am not sure whether I should use the original read 
numbers for each library or the read numbers multiplied by the read 
length to adjust for the single-nucleotide analysis.
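
The counting scheme described above can be sketched as follows. This is only a minimal illustration in Python, not part of edgeR: the function name `per_base_coverage`, the toy read positions, and the fixed read length of 100 nt are all assumptions made up for the example. It shows that, when every read has the same length, the summed per-nucleotide coverage is simply the read count times the read length, so the ratio between the two library sizes is the same whichever of the two numbers is used.

```python
from collections import Counter

def per_base_coverage(reads):
    """Count how often each position is covered.

    `reads` is an iterable of (start, length) tuples with 0-based starts
    (a hypothetical input format chosen for this sketch).
    """
    coverage = Counter()
    for start, length in reads:
        for pos in range(start, start + length):
            coverage[pos] += 1
    return coverage

# Toy data (assumed): two small libraries with a fixed read length.
READ_LEN = 100
lib_a = [(0, READ_LEN), (10, READ_LEN), (50, READ_LEN)]
lib_b = [(0, READ_LEN), (30, READ_LEN)]

cov_a = per_base_coverage(lib_a)
cov_b = per_base_coverage(lib_b)

# Library size counted in covered nucleotides instead of reads.
total_a = sum(cov_a.values())
total_b = sum(cov_b.values())

# Relative library size, computed from read counts and from base counts.
ratio_reads = len(lib_a) / len(lib_b)
ratio_bases = total_a / total_b

print(total_a, total_b, ratio_reads, ratio_bases)
```

If the read lengths differ between reads or between libraries (for example after trimming), the two ratios no longer have to agree, and the choice of library size then matters.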

I have two more questions regarding the normalization:
1. Are the norm factors calculated by the calcNormFactors() function 
automatically used in further steps such as the estimateCommonDisp() 
function?
2. Are the pseudocounts calculated by estimateCommonDisp() the 
normalized read counts?

Many thanks

Jens

> Hi Jens,
>
> I don't know what you mean by single nucleotide based normalization, 
> however the following comments may be helpful.
>
> edgeR automatically adjusts for library sizes, whether you include an 
> explicit normalization step or not.  Normalization is a separate 
> issue, and is intended to deal with more subtle issues.
>
> Normalization, as edgeR does it, does not require replicates.
>
> Best wishes
> Gordon
>
>> Date: Fri, 04 Feb 2011 11:28:15 +0100
>> From: Jens Georg <jens.georg at biologie.uni-freiburg.de>
>> To: bioconductor at r-project.org
>> Subject: [BioC] Single nucleotide based RNAseq normalization with edgeR?
>>
>> Dear edgeR users and developers,
>>
>> we used Solexa sequencing in order to detect RNase E processing sites.
>> Therefore we split an RNA sample and treated one half with RNase E
>> prior to cDNA synthesis and sequencing. The libraries differ in size
>> (1,918,953 and 1,208,586 reads, respectively), which clearly
>> necessitates a normalization step. Furthermore, we expect site-specific
>> differences rather than differences in the accumulation of the
>> full-length RNAs.
>>
>> Is it appropriate to do a single nucleotide based normalization with
>> edgeR, and do you think a reliable basic normalization is possible
>> without replicates?
>>
>> Thank you for your comments.
>>
>> Best regards
>>
>> Jens