[BioC] effect of normalization on analysis of differential knockdown
Naomi Altman
naomi at stat.psu.edu
Mon Jul 20 14:59:17 CEST 2009
Dear Wolfgang,
Sorry for any misunderstanding. I was responding to the original
post - not to your useful comments.
--Naomi
At 06:12 AM 7/20/2009, Wolfgang Huber wrote:
>Hi Naomi,
>
>of course normalisation is useful. I want to point out the
>importance of complementing it by quality assessment & control.
>
>Just comparing different normalisation 'black boxes' on the basis of
>the resulting hit lists (which the original post seemed to hint at,
>and which has all too often been done with microarray data in this
>community) is less advisable.
>
>Best wishes
> Wolfgang
>
>-------------------------------------------------------
>Wolfgang Huber
>EMBL
>http://www.embl.de/research/units/genome_biology/huber
>-------------------------------------------------------
>
>Naomi Altman wrote:
>>Why would you bother to normalize if it did not affect the results
>>of the analysis? The purpose of normalization is to dampen some of
>>the noise so that the signal (i.e. differential expression) is
>>clearer. The normalization method can have a huge effect, depending
>>on how much noise there was in the experiment, and whether the
>>assumptions underlying the normalization are met.
>>I am not familiar with B-score normalization. Normalization to the
>>median of a particular treatment or control makes sense if you
>>expect the median of all the samples to be the same except for
>>noise. If not, e.g. if there is down-regulation but no
>>up-regulation, then you are inducing signal by normalizing.
>>--Naomi
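
A minimal sketch of the point about inducing signal, assuming each sample is divided by its own median (one possible median scheme, not necessarily the one used in the screen). The signal values are invented and noise-free so that the artefact is easy to see:

```python
from statistics import median


def median_normalize(values):
    """Divide every value by the sample median (assumed scheme)."""
    m = median(values)
    return [v / m for v in values]


# Hypothetical signals for 5 genes, no noise.
control = [100.0, 100.0, 100.0, 100.0, 100.0]
# Genes 0-2 are truly halved; genes 3-4 are truly unchanged.
treated = [50.0, 50.0, 50.0, 100.0, 100.0]

norm_control = median_normalize(control)
norm_treated = median_normalize(treated)

# Treated/control ratio per gene after normalization.
ratios = [t / c for t, c in zip(norm_treated, norm_control)]
print(ratios)
```

Because three of the five genes are down-regulated and none is up-regulated, the treated median itself drops to 50. After normalization the knocked-down genes look unchanged and the two unchanged genes look two-fold up.
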
>>At 05:14 PM 7/18/2009, Wolfgang Huber wrote:
>>
>>>Hi Rajarshi
>>>
>>>your t, p, q value computation seems reasonable to me. You may
>>>want to choose a regularised version of the t-test (like in
>>>limma's eBayes) since with only 4 samples, you may otherwise get
>>>an unnecessarily large fraction of false discoveries due to the
>>>sample variance being small (and t large) by chance.
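
The shrinkage idea can be sketched as follows. This is not limma's implementation: eBayes estimates the prior variance and prior degrees of freedom from all genes, whereas here both (prior_var, prior_df) are hypothetical values set by hand, and the function names are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def paired_t(diffs):
    """Ordinary paired t statistic on within-pair differences."""
    n = len(diffs)
    return mean(diffs) / sqrt(variance(diffs) / n)


def moderated_t(diffs, prior_var, prior_df):
    """Paired t with the sample variance shrunk towards prior_var.

    Posterior variance = (prior_df * prior_var + df * s^2) / (prior_df + df),
    the same form of weighted average that a regularised t-test uses.
    """
    n = len(diffs)
    df = n - 1
    post_var = (prior_df * prior_var + df * variance(diffs)) / (prior_df + df)
    return mean(diffs) / sqrt(post_var / n)


# Hypothetical differences for one gene (4 siRNAs): a small effect whose
# sample variance happens to be tiny.
diffs = [0.10, 0.11, 0.09, 0.10]

t_ordinary = paired_t(diffs)
# Assumed prior: typical variance 0.04, worth 4 degrees of freedom.
t_moderated = moderated_t(diffs, prior_var=0.04, prior_df=4)
print(t_ordinary, t_moderated)
```

The ordinary statistic comes out above 20 purely because the four differences happen to agree closely, while the moderated one stays below 2 once the variance is pulled towards a more typical value.
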
>>>
>>>As for your question about the choice of normalisation method, one
>>>(perhaps not too constructive, but not ignorable) possible answer
>>>is that the technical or biological variability ("noise") in your
>>>data is stronger than the biological signal.
>>>
>>> Best wishes
>>> Wolfgang
>>>
>>>
>>>Rajarshi Guha wrote:
>>>>Hi, I am analysing the results from a drug sensitization siRNA
>>>>screen and am trying to determine which genes are being
>>>>differentially knocked down (between a vehicle only run and a dosed run).
>>>>Each gene is targeted by 4 siRNAs and my initial strategy has
>>>>been to consider the signals from the 4 siRNAs to be individual
>>>>samples for that gene. Then I perform a paired t-test on the 4
>>>>signals for a given gene across the two conditions. I then
>>>>calculate Storey's q-values based on the resultant p-values.
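
A rough sketch of the q-value step, assuming the per-gene p-values are already available. Here pi0 is passed in rather than estimated (Storey's method estimates it from the p-value distribution); with pi0 = 1 the result coincides with Benjamini-Hochberg adjusted p-values. The function name and the example p-values are hypothetical.

```python
def q_values(p_values, pi0=1.0):
    """q-value for each p-value: min over t >= p of pi0 * m * t / #{p_j <= t}."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q is monotone in p.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * p_values[i] * m / rank)
        q[i] = running_min
    return q


# Invented p-values for 5 genes.
p = [0.001, 0.02, 0.03, 0.5, 0.9]
q = q_values(p)
print(q)
```

Note that the q-value of a gene depends on the whole set of p-values, so a normalization that changes many p-values at once can shift every q-value in the list.
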
>>>>The question: does/should the normalization of the plates have an
>>>>effect on the results of the above analysis? For example, I
>>>>considered two normalization schemes - 1) normalizing each plate
>>>>to the median of a separate negative control plate and 2) B-score
>>>>normalization.
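
For reference, a sketch of the B-score as it is commonly defined: the residuals of a two-way (row/column) median polish of one plate, divided by the plate's median absolute deviation. The plate layout, the fixed number of polish iterations, and the 1.4826 scaling constant are assumptions of this sketch, not details taken from the screen.

```python
from statistics import median


def median_polish(plate, n_iter=10):
    """Tukey's two-way median polish; returns the residual matrix."""
    resid = [row[:] for row in plate]
    for _ in range(n_iter):
        # Sweep out row medians.
        for row in resid:
            m = median(row)
            for j in range(len(row)):
                row[j] -= m
        # Sweep out column medians.
        for j in range(len(resid[0])):
            m = median([row[j] for row in resid])
            for row in resid:
                row[j] -= m
    return resid


def b_score(plate):
    """Median-polish residuals scaled by the plate MAD."""
    resid = median_polish(plate)
    flat = [r for row in resid for r in row]
    center = median(flat)
    mad = 1.4826 * median([abs(r - center) for r in flat])
    if mad == 0:
        raise ValueError("MAD is zero; B-score undefined for this plate")
    return [[r / mad for r in row] for row in resid]


# Hypothetical 4x4 plate: row and column gradients plus a +/-1 pattern,
# with one strong knockdown in well (1, 2).
plate = [[100.0 + 10.0 * i + 5.0 * j + (1.0 if (i + j) % 2 == 0 else -1.0)
          for j in range(4)] for i in range(4)]
plate[1][2] -= 40.0

scores = b_score(plate)
hit = min((scores[i][j], i, j) for i in range(4) for j in range(4))
print(hit)
```

The row and column gradients are removed along with the plate's overall level, so what remains is a unitless residual rather than a signal relative to a negative control. That alone makes the two schemes answer somewhat different questions.
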
>>>>If I rank the genes based on their q-values I get 2 very
>>>>different rankings for the two normalization schemes.
>>>>Furthermore, the q- & p-values differ greatly. In the case of
>>>>median normalization I get a number of q-values < 0.05 but when
>>>>using B-score I get a single gene with a q-value < 0.05 (and the
>>>>next closest value is 0.58).
>>>>Thinking that this study is analogous to differential expression
>>>>studies in microarrays, I tried running my dataset through the
>>>>SAM method (via siggenes). Using this method, the B-score
>>>>normalized data leads to no hits (and a pi0 = 1) whereas the
>>>>median normalization method leads to lots of hits.
>>>>I can see that B-score normalized data would differ in character
>>>>from median normalized data (seeing that the actual signals are
>>>>replaced with scaled residuals) - but is it to be expected that
>>>>normalization schemes would lead to such different results in
>>>>this type of analysis?
>>>>Any pointers would be appreciated.
>>>>Thanks,
>>>>-------------------------------------------------------------------
>>>>Rajarshi Guha <rajarshi.guha at gmail.com>
>>>>GPG Fingerprint: D070 5427 CC5B 7938 929C DD13 66A1 922C 51E7 9E84
>>>>-------------------------------------------------------------------
>>>>Q: What's polite and works for the phone company?
>>>>A: A deferential operator.
>>>>_______________________________________________
>>>>Bioconductor mailing list
>>>>Bioconductor at stat.math.ethz.ch
>>>>https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>Search the archives:
>>>>http://news.gmane.org/gmane.science.biology.informatics.conductor
>>Naomi S. Altman 814-865-3791 (voice)
>>Associate Professor
>>Dept. of Statistics 814-863-7114 (fax)
>>Penn State University 814-865-1348 (Statistics)
>>University Park, PA 16802-2111