[BioC] Normalization quality assessment tools?
Wolfgang Huber
whuber at embl.de
Fri Feb 25 15:38:01 CET 2011
Dear Giulio,
it will probably help you to phrase your question more precisely. You
referred to "the normalization step" without ever saying what you mean
by that. There are many different ways of "normalization" for such a
dataset, and, of course, the choice cannot be made a priori, but rather,
requires data quality assessment, identifying what the undesired
technical effects are that you want to "normalise" away, what property
of the data you want to keep high (e.g., the number of differentially
binding peptides?) and choosing an appropriate computational method.
The 'arrayQualityMetrics' package might provide some relevant plots. See
also https://stat.ethz.ch/pipermail/bioconductor/2011-February/037915.html
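
To make the idea of a per-array diagnostic concrete, here is a minimal sketch of the pseudo-MA idea mentioned in the original message. It is not code from 'arrayQualityMetrics' or 'affy', and it is in plain Python rather than R so that it runs without any packages. Each array is compared with a pseudo-reference array formed from the per-peptide median across all arrays. The function names and the toy matrix are hypothetical, and the input is assumed to be already on the log2 scale.

```python
from statistics import median, quantiles


def pseudo_ma(log_intensities):
    """Compute (A, M) pairs for each array against a median reference.

    log_intensities: list of rows, one row per peptide, each row holding
    one log2 intensity per array (assumed layout, for illustration only).
    Returns one list of (A, M) tuples per array.
    """
    # Pseudo-reference array: per-peptide median across all arrays.
    ref = [median(row) for row in log_intensities]
    n_arrays = len(log_intensities[0])
    per_array = []
    for j in range(n_arrays):
        pairs = [((row[j] + r) / 2, row[j] - r)
                 for row, r in zip(log_intensities, ref)]
        per_array.append(pairs)
    return per_array


def ma_summary(pairs):
    """Return (median of M, interquartile range of M) for one array."""
    m_values = [m for _, m in pairs]
    q1, _, q3 = quantiles(m_values, n=4)
    return median(m_values), q3 - q1


# Hypothetical toy data: 4 peptides x 3 arrays; the third array is
# shifted upwards by one log2 unit relative to the other two.
toy = [
    [8.0, 8.0, 9.0],
    [10.0, 10.0, 11.0],
    [12.0, 12.0, 13.0],
    [9.0, 9.0, 10.0],
]

for j, pairs in enumerate(pseudo_ma(toy)):
    centre, spread = ma_summary(pairs)
    print(f"array {j}: median M = {centre:.2f}, IQR of M = {spread:.2f}")
```

Running the same summary on the data before and after normalization shows how far each array's M values are shifted or spread relative to the reference. With a mass response like the one described, a non-zero median M in one group may be biological rather than technical, so such a summary only describes the data; it cannot decide by itself whether the shift should be removed.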
Best wishes
Wolfgang
Giulio Di Giovanni scripsit 25/02/11 12:10:
>
> Hi all,
> I have an experiment with almost 300 single-channel arrays (not Affymetrix, but in-house peptide arrays). Unlike in all my past experiments, this time I suspect that the normalization step is not necessary, or is even harmful.
> 1) The qqplot of the unnormalized intensities looks fairly normal, and normalizing improves the situation only slightly.
> 2) After normalization I lose some signal (of course), and I lose ALL of what seem to be differentially recognized peptides in a two-group comparison. Before normalization they stand out quite consistently in 200 vs 100 arrays.
> 3) We are not talking about genes, so most of the usual assumptions behind normalization are not valid here. For example, in this case we have a mass response, where 90% of the spots have higher intensity in one group than in the other. So I cannot use many of the most common normalization methods. I use a linear-model-based method instead, which in the past, on smaller experiments, gave good results. But now even this seems to have too drastic an impact on the data.
> Besides the qqplot, or the boxplot of the slide intensities (the latter gives no information at all in this case: the 300 boxes are not on the same line either before or after normalization), could you please suggest:
> - some diagnostic tools, plots or packages to assess the quality of the normalization procedure;
> - some plots or tools used for counter-examples where normalization is not only ineffective, but even has a negative impact in terms of data loss?
> Right now the only thing I can think of is to convert my data matrix into an expression set and to apply affy's pseudo-MAplot to the various arrays, but I don't have big hopes ... :(
> Any help will be highly appreciated.
> Thanks and regards,
> Giulio
--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber