[BioC] Question about patchwork affy pre-processing

Tue Jun 12 04:50:41 CEST 2007

Note that threestep() is actually part of affyPLM, rather than gcrma.

I want to point out that P/M/A calls from MAS5 are derived from a
slightly different algorithm than the MAS5 expression values. The P/M/A
calls seem to do a reasonable job for what they were designed to do, as
opposed to the expression values which are (in my opinion) less
desirable.

That said, if you ask about the wisdom of pre-filtering (and how you
should do it) you'll get many different answers, and searching the
archives of the mailing list will bring up a number of discussion
threads on it.

My personal feeling is that you don't need to do it with RMA (or, for
that matter, GCRMA). But I get asked this question often enough that I
tell people who insist on P/A-type filtering that using the MAS5
calls with RMA (or GCRMA) is OK if you must.
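
As a rough sketch of the filtering step described above, assuming the
RMA expression values and the MAS5 P/M/A calls are already available
as two matrices with identical row and column names: the helper
filter_by_present(), the min_present threshold, and the toy values
below are all hypothetical and only illustrate the idea. The comments
show how the two matrices would typically be obtained with the affy
package.

```r
# Toy stand-ins for the real objects (all values are made up).
# With real data one would typically obtain them via the affy package:
#   expr  <- exprs(rma(abatch))         # RMA expression values (log2 scale)
#   calls <- exprs(mas5calls(abatch))   # "P"/"M"/"A" detection calls
probes  <- c("probe_1", "probe_2", "probe_3", "probe_4")
samples <- c("s1", "s2", "s3")

expr <- matrix(c(8.1, 7.9, 8.3,
                 4.2, 4.0, 4.1,
                 6.5, 6.7, 3.9,
                 5.0, 5.2, 5.1),
               nrow = 4, byrow = TRUE,
               dimnames = list(probes, samples))

calls <- matrix(c("P", "P", "P",
                  "A", "A", "A",
                  "P", "M", "A",
                  "A", "P", "A"),
                nrow = 4, byrow = TRUE,
                dimnames = list(probes, samples))

# Hypothetical rule: keep a probeset if it is called present ("P")
# in at least min_present samples; marginal calls do not count.
filter_by_present <- function(expr, calls, min_present = 1) {
  stopifnot(identical(dimnames(expr), dimnames(calls)))
  keep <- rowSums(calls == "P") >= min_present
  expr[keep, , drop = FALSE]
}

filtered <- filter_by_present(expr, calls, min_present = 1)
```

The threshold is a judgment call: min_present = 1 drops only probesets
that are never called present, while larger values filter more
aggressively.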

Best,

Ben

On Mon, 2007-06-11 at 16:01 -0400, Grant Izmirlian wrote:
> Hi:
> 
> I'm involved in an experiment using affy hgu133 plus2 arrays.
> I have affy, gcrma, and other relevant libraries up and running 
> on my linux system. 
> 
> I preprocessed using the 'threestep' function in the 
> gcrma library, using the following settings
> 
> normalize.method = "quantile.robust"
> summary.method = "median.polish"
> background.method = "GCRMA"
> 
> My question is this.  Someone suggested that their biostatistician
> usually preprocesses via RMA and then merges MAS-5 present/absent
> calls into the resulting dataframe, which are used to omit genes with MAS-5 
> absent calls from any further analysis.  
> 
> My feeling is that MAS-5.0 is inferior on the three steps mentioned above,
> and if present/absent calls are based upon inferior techniques they should not 
> be used.  I also believe that people are moving away from what I view as 
> a hidden level of filtering.  It is my belief that the best way to do 
> filtering is once at the stage of the analysis.
> 
> Am I right in thinking that this is a bad idea?
> 
> 
> Grant Izmirlian
> 
> 
> 
> 
> I have followed the debate on PM-only and, in my mind, the development of GCRMA 
> now allows an efficient way to model MMs so that background correction can be 
> done without doubling the per-gene noise. 
> 
> So normalization 
> 
> Definitely the normalization, background correction and summary methods of
> 'three-step' are all the result of research that has applied the best 
> statistical principles in lieu of rather ad-hoc techniques contained in 
> MAS-5.
> 
> 
> 
> succeeded in refining the methods of MAS-5
> My read of the literature and best practice tells me that this is not really a 
> preferable way to do things
> -- 
> Հրանդ Իզմիրլյան
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor