[BioC] Affy: gene filtering before or after normalization??
fcollin at sbcglobal.net
Fri Oct 10 19:55:39 MEST 2003
Thanks for the reference to our work, Wolfgang.
Hope it isn't considered spamming if I put a link to a poster presentation:
I concur with Wolfgang that residuals from fits provide great opportunities for quality assessment. I disagree, however, that chips with obvious scratches and what-nots should be summarily dismissed. On arrays where the probes of a probe set are scattered across the chip, which includes all of the new designs, glaring image artifacts can sometimes have very little effect on the summaries. To make this judgment, it helps to summarize the residuals in a way that reflects the precision of the probe set summaries. We recommend pooling unscaled standard errors over the probe sets on a chip to judge the relative quality of the chips in a set. These are relative measures within a chip set; to assess the overall quality of a set of chips, the scales of the probe set model residuals must be examined.
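As a rough illustration of what pooling standard errors per chip could look like, here is a minimal sketch in plain Python. It rests on assumptions that are not spelled out in this message: each probe set's standard errors are first divided by that probe set's median across chips so that probe sets become comparable, and the pooled value per chip is the median of those ratios. The function name and the numbers are hypothetical.

```python
from statistics import median

def pooled_relative_se(se_by_probe_set):
    """Pool standard errors per chip (illustrative sketch only).

    se_by_probe_set: one row per probe set; each row holds the unscaled
    standard error of that probe set's summary on every chip.
    Returns one pooled value per chip; values near 1 are typical for the set.
    """
    n_chips = len(se_by_probe_set[0])
    per_chip = [[] for _ in range(n_chips)]
    for row in se_by_probe_set:
        ref = median(row)          # assumed reference: median SE across chips
        if ref <= 0:
            continue               # skip degenerate probe sets
        for chip, se in enumerate(row):
            per_chip[chip].append(se / ref)
    # Pool over probe sets with a median (an assumption, chosen for robustness).
    return [median(values) for values in per_chip]

# Hypothetical numbers: 4 probe sets x 3 chips; the third chip is noisier.
se = [
    [0.10, 0.12, 0.30],
    [0.20, 0.18, 0.50],
    [0.05, 0.05, 0.11],
    [0.40, 0.44, 0.90],
]
scores = pooled_relative_se(se)
print(scores)
```

Because each probe set is scaled against the other chips in the same set, the pooled values only rank chips relative to one another; they say nothing about whether the set as a whole is good.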
We will soon have a link to a more detailed presentation. A paper and software are due out shortly.
w.huber at dkfz-heidelberg.de wrote:
In my experience (and it seems to be that of others), quality control,
including gene filtering, should be done during and after normalization,
and there is not too much to be done before. I.e., if you use a model-based
normalization, you can use the residuals for QC, you can use the
reproducibility of the per-probe intensities across chips for QC, and
since with Affy GeneChips you have multiple probes per gene you should use
that, too. I think Francois Collin has a nice method for QC based on the
residuals of the probe-set-summary model.
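To make the idea of using residuals for QC concrete, here is a minimal sketch in plain Python. It assumes an additive probe-plus-chip model on log2 intensities for a single probe set, fitted by median polish; the thread does not say which model is meant, so treat the model choice, the function name, and the numbers as illustrative assumptions.

```python
from statistics import median

def median_polish(y, n_iter=10):
    """Fit y[i][j] ~ overall + probe[i] + chip[j] by median polish.

    y: rows are probes, columns are chips (log2 intensities).
    Returns (chip_level, residuals), where chip_level[j] = overall + chip[j].
    """
    n_rows, n_cols = len(y), len(y[0])
    resid = [row[:] for row in y]
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(n_iter):
        # Sweep row (probe) medians out of the residuals.
        for i in range(n_rows):
            m = median(resid[i])
            row_eff[i] += m
            resid[i] = [v - m for v in resid[i]]
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Sweep column (chip) medians out of the residuals.
        for j in range(n_cols):
            m = median(resid[i][j] for i in range(n_rows))
            col_eff[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
    chip_level = [overall + c for c in col_eff]
    return chip_level, resid

# Hypothetical probe set: 4 probes x 3 chips, additive except for one
# probe on the third chip that reads 3 log2 units too high.
y = [
    [8.0, 9.0, 8.5],
    [9.0, 10.0, 12.5],
    [7.0, 8.0, 7.5],
    [8.5, 9.5, 9.0],
]
chip_level, resid = median_polish(y)
# Largest absolute residual per chip: a simple per-chip QC summary.
worst = [max(abs(resid[i][j]) for i in range(len(y))) for j in range(len(y[0]))]
print(chip_level, worst)
```

A large residual confined to one chip points at that chip rather than at the probe, which is the kind of signal that is only available once the model has been fitted.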
All this assuming that there are no obvious scratches, fingerprints,
gradients etc on the chip image itself, in which case you should probably
send them back to the lab...
Division of Molecular Genome Analysis
German Cancer Research Center
Phone: +49 6221 424709
Fax: +49 6221 42524709
On Thu, 9 Oct 2003, Jing Shen wrote:
> I am going to work on affymetrix data analysis using Bioconductor Affy
> package. In my understanding, the procedure for data analysis should be:
> (1) import data (CEL files)
> (2) data filtering ?? --- get rid of bad or false intensities (e.g., something
> like filtering on flags, present or absent, or expression values in
> (3) data normalization --- several different methods based on probe cell
> level or probe set level...
> (4) now you have the data for statistical analysis...
> I am wondering if anybody can give me some suggestions on data filtering
> before (or after??) my data normalization if I use rma() or expresso()?
> Or what kind of gene filtering criteria do you guys use? or I don't need
> to do that at all?
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch