[BioC] VSN, RMA, dCHIP, etc....

Tue Oct 16 17:58:29 CEST 2007

trying to add my semi-philosophical, semi-biological cents:

- i agree with your concerns, it is a jungle. and it's very difficult  
to decide where to go.

- if you are in doubt about your strategy you might want to apply it  
to a gold standard data set with maximum prior knowledge (if there  
is any - depends on the application).

- your results should be fairly similar when applying different data  
analysis strategies (this basically means that if you have 'good'  
input, the output is usually not severely compromised by different  
data processing strategies). if you get different results with  
different strategies then maybe your primary data is not good enough,  
or you do not have enough data points or enough replicates, etc.

- if your results are plausible you might be on the right track! try  
to confirm your results with different experiments/technologies.

- i think that in general one can assume that less data manipulation  
(normalization etc.) is less likely to do harm, and more manipulation  
is more likely to.

- as far as i am concerned, normalization is usually not the problem;  
what comes after it is: things like filtering and significance  
testing.
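
to make the point about comparing strategies a bit more concrete, here is a minimal sketch of such a check. it is written in plain python rather than R, runs on made-up log2 intensities, and uses two deliberately simplified normalizations (a median shift and a basic quantile normalization) as stand-ins; it is not VSN, RMA or dChip, and every function name and number in it is an assumption for illustration only. the idea is just: normalize the same data in two ways, rank the genes by group difference each time, and see how much the top lists overlap.

```python
import random
import statistics


def median_scale(arrays):
    """Shift each array (log scale) so that all arrays share one median."""
    medians = [statistics.median(a) for a in arrays]
    target = statistics.median(medians)
    return [[x - m + target for x in a] for a, m in zip(arrays, medians)]


def quantile_normalize(arrays):
    """Give every array the same empirical distribution (ties not handled)."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # reference distribution: mean of the k-th smallest value across arrays
    reference = [statistics.fmean(s[k] for s in sorted_arrays) for k in range(n)]
    result = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])
        normalized = [0.0] * n
        for rank, i in enumerate(order):
            normalized[i] = reference[rank]
        result.append(normalized)
    return result


def top_genes(arrays, groups, k):
    """Indices of the k genes with the largest absolute group mean difference."""
    n = len(arrays[0])
    diffs = []
    for g in range(n):
        a = [arr[g] for arr, grp in zip(arrays, groups) if grp == 0]
        b = [arr[g] for arr, grp in zip(arrays, groups) if grp == 1]
        diffs.append(abs(statistics.fmean(a) - statistics.fmean(b)))
    return set(sorted(range(n), key=lambda g: diffs[g], reverse=True)[:k])


def demo(seed=1, n_genes=500, n_changed=20):
    """Simulate 3 vs 3 arrays and compare the top lists of both strategies."""
    rng = random.Random(seed)
    groups = [0, 0, 0, 1, 1, 1]
    offsets = [0.0, 0.5, -0.3, 0.8, -0.6, 0.2]  # hypothetical array effects
    base = [rng.gauss(8.0, 2.0) for _ in range(n_genes)]
    arrays = []
    for grp, off in zip(groups, offsets):
        arr = []
        for g in range(n_genes):
            effect = 3.0 if (grp == 1 and g < n_changed) else 0.0
            arr.append(base[g] + effect + off + rng.gauss(0.0, 0.2))
        arrays.append(arr)
    top_a = top_genes(median_scale(arrays), groups, n_changed)
    top_b = top_genes(quantile_normalize(arrays), groups, n_changed)
    # fraction of genes shared between the two top lists
    return len(top_a & top_b) / n_changed


agreement = demo()
print(f"overlap of top lists between strategies: {agreement:.2f}")
```

with clean simulated data like this the two lists should agree almost completely; if the overlap on your real data is low, that is the situation described above, where the primary data or the number of replicates deserves a second look.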

best regards
T.

On Oct 16, 2007, at 2:35 PM, Stefan Thomsen wrote:

> Dear all,
>
> currently evaluating the performance of different normalization strategies
> on an Affymetrix data set, I have some semi-technical, semi-philosophical
> questions.
>
> Given (i) the jungle of possible normalization strategies implemented in R
> and other platforms, (ii) the fact that most authors describe which
> normalization strategy they used but not why they chose this and no other,
> (iii) the sparse literature on how to find the strategy most suitable for a
> given design/experiment/data set, I would be very grateful for any comments
> on the following questions:
>
> 1) Are there written or silently accepted guidelines to evaluate, choose,
> and justify the choice of normalization strategies?
>
> 2) What could be sensible "readouts" for the performance of a given
> normalization strategy? (Personally, I am looking at the performance on
> spike-in controls and a handful of known gene profiles. I am very interested
> in complementary approaches.)
>
> 3) Is there some literature on this issue that may have escaped my notice?
>
>
> Any comments on this issue would be highly appreciated.
>
> Kind regards,
>
> Stefan
>
> -- 
> Dr. Stefan Thomsen
> Research Associate
>
> Department of Zoology
> University of Cambridge
> Downing Street
> Cambridge CB2 3EJ
>
> Tel.: +44 1223 336623
> Fax:  +44 1223 336679
>
> stt26 at cam.ac.uk
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

======================================================================
Dr. Tobias Straub         Adolf-Butenandt-Institute, Molecular Biology
tel: +49-89-2180 75 439         Schillerstr. 44, 80336 Munich, Germany