[BioC] GCRMA-induced correlations?
Henrik Bengtsson
hb at stat.berkeley.edu
Wed Feb 20 12:16:24 CET 2008
Hi,
another reason for adding "some noise" is to help the estimation
algorithm to converge when the discreteness of the data dominates at
lower intensities.
Details: By default, Affymetrix takes the 75% quantile of the pixel
intensities to be the probe signal, which means that if you've got 9 pixels
(common with new chip types) it becomes *exactly* the 7th pixel
value (in sorted order). In other words, the pixel intensities observed
in a CEL file are often "integers" (although they are stored as floats).
At low intensities this discreteness dominates, which you can see as a
"peacock tail" if you do a log-ratio versus log-intensity plot.
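To make the arithmetic concrete, here is a minimal Python sketch. It assumes a quantile defined by linear interpolation between order statistics, which may not be the exact definition used by the Affymetrix software; the function name and the pixel values are made up for illustration. With 9 pixels the 75% position is 0.75 * (9 - 1) = 6 (0-based), so no interpolation occurs and the result is one of the observed integer values.

```python
def quantile_linear(values, q):
    """Quantile by linear interpolation between order statistics.

    Illustrative only: the exact quantile rule used by the scanner
    software is an assumption here.
    """
    xs = sorted(values)
    pos = q * (len(xs) - 1)        # fractional 0-based position
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    frac = pos - lo
    return xs[lo] + frac * (xs[hi] - xs[lo])


# Hypothetical integer pixel values for one probe with 9 pixels.
pixels = [41, 38, 45, 40, 39, 52, 43, 47, 44]

signal = quantile_linear(pixels, 0.75)
seventh = sorted(pixels)[6]        # 7th smallest value (0-based index 6)

# The probe signal coincides with an observed pixel value, so it
# inherits the integer grid of the pixel data.
print(signal, seventh)
```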
We observed convergence problems for the RMA norm+exp background model
for some data sets (exon arrays; 9 pixels/probe, low intensities)
because of the above. In order to help out, we have the option to add
"jitter" before fitting the model (in the 'RmaBackgroundCorrection' of
aroma.affymetrix), which seems to help.
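As a rough illustration of the idea (not the actual aroma.affymetrix code), the Python sketch below adds small uniform noise to discrete intensities before a model would be fitted. The function name `add_jitter`, the amplitude of 0.5 and the sample values are assumptions made for this example. Fixing the seed keeps the jittered values reproducible across runs.

```python
import random


def add_jitter(intensities, amplitude=0.5, seed=None):
    """Return a copy of `intensities` with uniform noise added.

    Each value is shifted by a draw from [-amplitude, amplitude].
    Hypothetical helper; the amplitude is an illustrative choice.
    """
    rng = random.Random(seed)      # private RNG, so a seed gives repeatable output
    return [x + rng.uniform(-amplitude, amplitude) for x in intensities]


# Hypothetical low-intensity probe signals with many ties.
probes = [45.0, 45.0, 46.0, 45.0, 47.0, 46.0]

jittered = add_jitter(probes, amplitude=0.5, seed=42)
print(jittered)
```

With an amplitude of half the grid spacing, ties are broken while each value stays within its original integer bin.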
Cheers
Henrik
On Feb 19, 2008 11:56 PM, Pierre Neuvial <pierre.neuvial at curie.fr> wrote:
> Hi Zhijin,
>
> In Lim's paper they also suggest to add some noise to truncated probes: I believe (and this is my experience as well) that otherwise they would have exactly the same signal values for truncated probes, and correlations between low intensity probes would remain...
>
> Quoting Lim's paper,
>
> "To test our speculations, we reimplemented the GCRMA procedure without adjusting GSB for uninformative probes-i.e. probes that are truncated to m after NSB adjustment. To ensure the lowest intensity rank of these probes, any other probes with GSB-adjusted value less than m were also truncated at m. Finally, an infinitesimal amount of uniformly distributed noise was added to truncated probes to avoid rank-order correlation issues."
>
> Do you plan to add this "noise" as well? If so, how should the noise level be chosen? And what about the reproducibility of GCRMA results? I think this particular issue is related to the recent thread about set.seed() in GCRMA.
>
> Best wishes,
>
> Pierre.
>
>
> Zhijin Wu wrote:
>
> > Yes, to eliminate this artifact, the truncated values will no longer be
> > adjusted in the next release of GCRMA.
> >
> > Jenny Drnevich wrote:
> >> Hi Zhijin,
> >>
> >> A client pointed out a July 2007 article by Lim et al. testing different
> >> normalization/pre-processing methods for their effects on pairwise
> >> correlations between probesets (Bioinformatics 2007 23(13):i282-i288;
> >> doi:10.1093/bioinformatics/btm201; full link below). They reported that
> >> GCRMA introduced severe artificial correlations between probesets; they
> >> looked for a cause and think it's due to truncation of low-intensity values
> >> after Non-Specific Binding adjustment and then the Gene-Specific Binding
> >> adjustment on these truncated values. They also tested a specific
> >> correction to the GCRMA algorithm that appears to prevent the artificial
> >> correlation and suggest that it become an option or even a default in
> >> the R implementation of GCRMA.
> >>
> >> What do you think of this article? Are there any plans to implement
> >> their suggestion?
> >>
> >> Thanks,
> >> Jenny
> >>
> >> Comparative analysis of microarray normalization procedures: effects on
> >> reverse engineering gene networks
> >>
> >> http://bioinformatics.oxfordjournals.org/cgi/content/full/23/13/i282?maxtoshow=&HITS=10&hits=10&RESULTFORMAT=1&andorexacttitle=and&andorexacttitleabs=and&andorexactfulltext=and&searchid=1&FIRSTINDEX=0&sortspec=relevance&volume=23&firstpage=i282&resourcetype=HWCIT&eaf
> >>
> >> Jenny Drnevich, Ph.D.
> >>
> >> Functional Genomics Bioinformatics Specialist
> >> W.M. Keck Center for Comparative and Functional Genomics
> >> Roy J. Carver Biotechnology Center
> >> University of Illinois, Urbana-Champaign
> >>
> >> 330 ERML
> >> 1201 W. Gregory Dr.
> >> Urbana, IL 61801
> >> USA
> >>
> >> ph: 217-244-7355
> >> fax: 217-265-5066
> >> e-mail: drnevich at uiuc.edu
> >>
> >
> >
>
>