[BioC] effect of normalization on analysis of differential knockdown

Naomi Altman naomi at stat.psu.edu
Mon Jul 20 14:59:17 CEST 2009


Dear Wolfgang,
Sorry for any misunderstanding.  I was responding to the original 
post - not to your useful comments.

--Naomi

At 06:12 AM 7/20/2009, Wolfgang Huber wrote:

>Hi Naomi,
>
>of course normalisation is useful. I want to point out the 
>importance of complementing it by quality assessment & control.
>
>Just comparing different normalisation 'black boxes' on the basis of 
>resulting hit lists (of which there seemed a hint in the original 
>post, and which has all too often been done with microarray data in 
>this community) is less advisable.
>
>Best wishes
>      Wolfgang
>
>-------------------------------------------------------
>Wolfgang Huber
>EMBL
>http://www.embl.de/research/units/genome_biology/huber
>-------------------------------------------------------
>
>
>
>
>Naomi Altman wrote:
>>Why would you bother to normalize if it did not affect the results
>>of the analysis?  The purpose of normalization is to dampen some of
>>the noise so that the signal (i.e. differential expression) is
>>clearer.  The normalization method can have a huge effect, depending
>>on how much noise there was in the experiment, and whether the
>>assumptions underlying the normalization are met.
>>I am not familiar with B-score normalization.  Normalization to the 
>>median of a particular treatment or control makes sense if you 
>>expect the median of all the samples to be the same except for 
>>noise.  If not, e.g. if there is down-regulation but no 
>>up-regulation, then you are inducing signal by normalizing.
>>--Naomi
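
The point about inducing signal can be illustrated with a small sketch. It is not from the original message: the well values are hypothetical log-scale signals, and `center_on_median` is an illustrative helper, not a function from any package. The dosed plate has down-regulation only, in more than half of its wells, so centring on the plate's own median shifts the unaffected wells upward.

```python
from statistics import median

def center_on_median(values):
    """Subtract the plate median from every well (log-scale signals assumed)."""
    m = median(values)
    return [v - m for v in values]

# Hypothetical log signals: the vehicle plate shows no effect anywhere.
vehicle = [0.0] * 10
# Dosed plate: six wells truly down-regulated by 1 unit, four wells unchanged.
dosed = [-1.0] * 6 + [0.0] * 4

vehicle_norm = center_on_median(vehicle)
dosed_norm = center_on_median(dosed)

# Apparent dosed-minus-vehicle change per well after normalization.
apparent_change = [d - v for d, v in zip(dosed_norm, vehicle_norm)]
print(apparent_change)
```

Under these assumptions the six truly down-regulated wells end up with an apparent change of 0 and the four unchanged wells with an apparent change of +1, i.e. the normalization has created up-regulation that was never in the data.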
>>At 05:14 PM 7/18/2009, Wolfgang Huber wrote:
>>
>>>Hi Rajarshi
>>>
>>>your t, p, q value computation seems reasonable to me. You may 
>>>want to choose a regularised version of the t-test (like in 
>>>limma's eBayes) since with only 4 samples, you may otherwise get 
>>>an unnecessarily large fraction of false discoveries due to the 
>>>sample variance being small (and t large) by chance.
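
A minimal sketch of why a regularised t-statistic helps with 4 samples. This is not limma itself: limma's eBayes estimates the prior variance and prior degrees of freedom from all genes, whereas here `prior_var` and `prior_df` are hypothetical fixed inputs, and the paired differences are made-up numbers chosen to have a very small sample variance.

```python
from math import sqrt
from statistics import mean, variance

def ordinary_t(diffs):
    """One-sample t-statistic on paired differences."""
    n = len(diffs)
    return mean(diffs) / sqrt(variance(diffs) / n)

def moderated_t(diffs, prior_var, prior_df):
    """t-statistic with the sample variance shrunk toward a prior variance.

    The shrunk variance is the weighted average
    (prior_df * prior_var + df * s^2) / (prior_df + df), the same form as
    the posterior variance used for moderated t-statistics.
    """
    n = len(diffs)
    df = n - 1
    shrunk_var = (prior_df * prior_var + df * variance(diffs)) / (prior_df + df)
    return mean(diffs) / sqrt(shrunk_var / n)

# Hypothetical paired differences for one gene (4 siRNAs): small mean effect,
# but the four values happen to agree closely, so the sample variance is tiny.
diffs = [0.10, 0.11, 0.09, 0.10]

t_plain = ordinary_t(diffs)
t_mod = moderated_t(diffs, prior_var=0.04, prior_df=4)  # assumed prior values
print(t_plain, t_mod)
```

With these inputs the ordinary statistic is very large purely because the variance estimate is small by chance, while the moderated statistic is pulled back toward a value consistent with the assumed typical variance.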
>>>
>>>As for your question about the choice of normalisation method, one
>>>possible answer (perhaps not too constructive, but not one to
>>>ignore) is that the technical or biological variability ("noise")
>>>in your data is stronger than the biological signal.
>>>
>>>         Best wishes
>>>         Wolfgang
>>>
>>>
>>>Rajarshi Guha wrote:
>>>>Hi, I am analysing the results from a drug sensitization siRNA 
>>>>screen and am trying to determine which genes are being 
>>>>differentially knocked down (between a vehicle only run and a dosed run).
>>>>Each gene is targeted by 4 siRNAs and my initial strategy has 
>>>>been to consider the signals from the 4 siRNAs to be individual 
>>>>samples for that gene. Then I perform a paired t-test on the 4 
>>>>signals for a given gene across the two conditions. I then 
>>>>calculate Storey's q-values based on the resultant p-values.
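
The q-value step described above can be sketched as follows. This is an illustration, not the qvalue package: Storey's method estimates pi0 (the proportion of true nulls) from the p-value distribution, whereas here `pi0` is simply a parameter, and with the default `pi0=1.0` the result equals the Benjamini-Hochberg adjusted p-values. The p-values in the example are hypothetical.

```python
def q_values(p_values, pi0=1.0):
    """q-values as pi0 times the step-up (BH-style) adjusted p-values.

    pi0 is taken as given here; estimating it from the data is the part
    of Storey's procedure that this sketch leaves out.
    """
    m = len(p_values)
    # Indices of the p-values in increasing order.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        candidate = pi0 * p_values[i] * m / rank
        running_min = min(running_min, candidate)
        q[i] = running_min
    return q

# Hypothetical per-gene p-values from the paired t-tests.
p_values = [0.01, 0.04, 0.03, 0.50]
q = q_values(p_values)
print(q)
```

Because every q-value is a function of the whole set of p-values, a normalization that changes many p-values at once can reorder the gene ranking substantially.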
>>>>The question: does/should the normalization of the plates have an 
>>>>effect on the results of the above analysis? For example, I 
>>>>considered two normalization schemes - 1) normalizing each plate 
>>>>to the median of a separate negative control plate and 2) B-score 
>>>>normalization.
>>>>If I rank the genes based on their q-values I get 2 very 
>>>>different rankings for the two normalization schemes. 
>>>>Furthermore, the q- & p-values differ greatly. In the case of 
>>>>median normalization I get a number of q-values < 0.05 but when 
>>>>using B-score I get a single gene with a q-value < 0.05 (and the 
>>>>next closest value is 0.58).
>>>>Thinking that this study is analogous to differential expression 
>>>>studies in microarrays, I tried running my dataset through the 
>>>>SAM method (via siggenes). Using this method, the B-score 
>>>>normalized data leads to no hits (and a pi0 = 1) whereas the 
>>>>median normalization method leads to lots of hits.
>>>>I can see that B-score normalized data would differ in character 
>>>>from median normalized data (seeing that the actual signals are 
>>>>replaced with scaled residuals) - but is it to be expected that 
>>>>normalization schemes would lead to such different results in 
>>>>this type of analysis?
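
To make "scaled residuals" concrete, here is a rough sketch of a B-score style calculation on one plate: a two-way median polish removes row and column effects, and the residuals are divided by their median absolute deviation. The plate values, the iteration count, and the 1.4826 scaling constant are assumptions for illustration, and this is not the code of any particular screening package.

```python
from statistics import median

def median_polish(plate, n_iter=10):
    """Residuals after alternately removing row and column medians."""
    resid = [row[:] for row in plate]
    n_rows, n_cols = len(resid), len(resid[0])
    for _ in range(n_iter):
        for r in range(n_rows):
            m = median(resid[r])
            resid[r] = [v - m for v in resid[r]]
        for c in range(n_cols):
            m = median([resid[r][c] for r in range(n_rows)])
            for r in range(n_rows):
                resid[r][c] -= m
    return resid

def b_scores(plate):
    """Median-polish residuals divided by their scaled MAD."""
    resid = median_polish(plate)
    flat = [v for row in resid for v in row]
    centre = median(flat)
    mad = 1.4826 * median([abs(v - centre) for v in flat])
    if mad == 0:
        raise ValueError("MAD of residuals is zero; cannot scale")
    return [[v / mad for v in row] for row in resid]

# Hypothetical 4x4 plate: row effect + column effect + small noise,
# with one strong hit (-5.0) in row 1, column 2.
row_eff = [0.0, 1.0, 2.0, 3.0]
col_eff = [0.0, 10.0, 20.0, 30.0]
noise = [[0.1, -0.1, 0.0, 0.2],
         [-0.2, 0.1, -5.0, 0.0],
         [0.0, 0.2, -0.1, -0.1],
         [0.1, 0.0, 0.1, -0.2]]
plate = [[row_eff[r] + col_eff[c] + noise[r][c] for c in range(4)]
         for r in range(4)]

scores = b_scores(plate)
```

In this sketch the row and column effects vanish from the scores and only the hit well stands out. The scores no longer carry the original signal scale, which is one reason the same t-test can behave quite differently on B-scores than on median-normalized values.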
>>>>Any pointers would be appreciated.
>>>>Thanks,
>>>>-------------------------------------------------------------------
>>>>Rajarshi Guha  <rajarshi.guha at gmail.com>
>>>>GPG Fingerprint: D070 5427 CC5B 7938 929C  DD13 66A1 922C 51E7 9E84
>>>>-------------------------------------------------------------------
>>>>Q:  What's polite and works for the phone company?
>>>>A:  A deferential operator.
>>>>_______________________________________________
>>>>Bioconductor mailing list
>>>>Bioconductor at stat.math.ethz.ch
>>>>https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>Search the archives: 
>>>>http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>>
>>>--
>>>
>>>Best wishes
>>>      Wolfgang
>>>
>>>-------------------------------------------------------
>>>Wolfgang Huber
>>>EMBL
>>>http://www.embl.de/research/units/genome_biology/huber
>>>
>>Naomi S. Altman                                814-865-3791 (voice)
>>Associate Professor
>>Dept. of Statistics                              814-863-7114 (fax)
>>Penn State University                         814-865-1348 (Statistics)
>>University Park, PA 16802-2111
>
>

Naomi S. Altman                                814-865-3791 (voice)
Associate Professor
Dept. of Statistics                              814-863-7114 (fax)
Penn State University                         814-865-1348 (Statistics)
University Park, PA 16802-2111


