[BioC] affy package RMA results difference from R2.5 to R2.8

He, Yiwen (NIH/CIT) [C] heyiwen at mail.nih.gov
Wed Dec 24 17:20:25 CET 2008


Ben,

Thank you very much! That really clarified things.

I have one last question regarding the rma2.c code: will removal of the vestigial references to MM change RMA results as well? (See the Oct 28, 2007 comment.)

Thanks again.
And Happy Holidays!

Yiwen

-----Original Message-----
From: bmb at bmbolstad.com [mailto:bmb at bmbolstad.com]
Sent: Tuesday, December 23, 2008 1:50 PM
To: He, Yiwen (NIH/CIT) [C]
Cc: Davis, Sean (NCI); Ben Bolstad; bioconductor at stat.math.ethz.ch
Subject: RE: [BioC] affy package RMA results difference from R2.5 to R2.8

A lot of the C code that was previously in affy (and affyPLM) got moved
off into preprocessCore for easier maintainability. That is where you will
find the quantile normalization code (qnorm.c). In this movement much of
how rma() did its underlying processing was re-factored. But none of those
changes should have made any appreciable difference to the output of the
algorithm.
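
To make the tie question concrete, here is a minimal base-R sketch of
quantile normalization. It is not the qnorm.c code from preprocessCore: the
function name quantile_normalize and its ties_method argument are
hypothetical, and the two tie rules shown ("average" and "first", both
passed to base R's rank()) are assumptions chosen only to show that the
treatment of ties can shift normalized values. It makes no claim about what
either affy version actually does.

```r
# Illustrative quantile normalization in base R (NOT the qnorm.c code).
# 'ties_method' is a hypothetical switch forwarded to rank().
# Assumes a numeric matrix with no missing values.
quantile_normalize <- function(x, ties_method = c("average", "first")) {
  ties_method <- match.arg(ties_method)
  # Reference distribution: mean of each order statistic across columns.
  target <- rowMeans(apply(x, 2, sort))
  out <- apply(x, 2, function(col) {
    r <- rank(col, ties.method = ties_method)
    # Interpolate so that fractional (averaged) ranks fall between
    # neighbouring reference quantiles.
    approx(seq_along(target), target, xout = r)$y
  })
  dimnames(out) <- dimnames(x)
  out
}

# Toy data: column "a" contains a tie (two values equal to 2).
x <- cbind(a = c(5, 2, 2, 9), b = c(4, 1, 6, 8))
qn_avg   <- quantile_normalize(x, "average")
qn_first <- quantile_normalize(x, "first")

# Largest change caused purely by the tie rule.
max(abs(qn_avg - qn_first))
```

In this toy matrix only the two tied entries in column "a" change between
the two rules; every untied value is identical.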

My experience with the aforementioned change to the quantile normalization
code is that changes are typically in the fourth or fifth significant digit
(on the log scale) in any expression values generated.
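
As a rough check of this kind of difference, the sketch below takes the
four linear-scale values quoted in the original report further down the
thread, converts them back to the log2 scale that rma() returns, and looks
at the size of the change. The variable names and the 1e-4 tolerance are
arbitrary choices for illustration, not part of affy.

```r
# Linear-scale values quoted in the original report (older vs newer version).
old_linear <- c(259.2365, 244.2026)
new_linear <- c(259.2308, 244.2079)

# rma() reports log2 expression values, so compare on that scale.
old_log2 <- log2(old_linear)
new_log2 <- log2(new_linear)

# Absolute and relative differences on the log2 scale.
abs_diff <- abs(new_log2 - old_log2)
rel_diff <- abs_diff / abs(old_log2)

print(data.frame(old_log2, new_log2, abs_diff, rel_diff), digits = 8)

# TRUE if every value agrees to within the (arbitrary) tolerance of 1e-4.
all(abs_diff < 1e-4)
```

For whole expression matrices the same comparison could be made with
something like max(abs(exprs(eset_new) - exprs(eset_old))), where eset_new
and eset_old are hypothetical names for the two rma() results.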

Ben





> Thanks Sean, I didn't see the C code in the source at first.
>
> So I was looking at the rma2.c files from affy 1.14 (R 2.5, BioC 2.0),
> 1.16 (R 2.6, BioC 2.1) and 1.18 (R 2.7, BioC 2.2).
> Between 1.14 and 1.16, there is this comment:
> ** May 24, 2007 - median_polish code is now from preprocessCore package
>
> Between 1.16 and 1.18, there are these comments:
>  ** Oct 26, 2007 - add verbose flag
>  ** Oct 28, 2007 - remove any vestigial references to MM
>  ** Mar 31, 2008 - use rma background correction from preprocessCore
>
> I didn't see any mention of how quantile normalization handles ties. Am I
> looking in the right place? It looks to me as though these documented changes
> would not affect the results. Ben, could you please provide some insight?
>
> Thank you so much!
>
> Yiwen
> ________________________________
> From: Davis, Sean (NCI)
> Sent: Tuesday, December 23, 2008 1:12 PM
> To: He, Yiwen (NIH/CIT) [C]
> Cc: Ben Bolstad; bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] affy package RMA results difference from R2.5 to R2.8
>
>
> On Tue, Dec 23, 2008 at 1:02 PM, He, Yiwen (NIH/CIT) [C]
> <heyiwen at mail.nih.gov> wrote:
> Hi Ben,
>
> I tried to look at the rma code, but it looks like rma.R is a wrapper for
> the C code. How do I go further to look at the code where quantile
> normalization actually happens?
>
> That is available in the package source download.
>
> Sean
>
>
> Thank you very much!
>
> Yiwen
>
> -----Original Message-----
> From: Ben Bolstad [mailto:bmb at bmbolstad.com]
> Sent: Tuesday, December 23, 2008 9:08 AM
> To: balag Ganesan
> Cc: He, Yiwen (NIH/CIT) [C];
> bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] affy package RMA results difference from R2.5 to R2.8
>
> The only change to the functionality of the rma() algorithmic code in
> the last 18 months or so was in how the quantile normalization
> handles ties (looking in code comments this occurred around Jul 2007).
> This should only cause small changes in expression values.
>
> Ben
>
>
> On Mon, 2008-12-22 at 12:44 -0700, balag Ganesan wrote:
>> Interesting. We shifted from R2.2 to 2.6 in the middle of this year for one
>> of our systems and noticed no such difference at all.
>> BALA
>>
>> On Mon, Dec 22, 2008 at 12:07 PM, He, Yiwen (NIH/CIT) [C] <
>> heyiwen at mail.nih.gov> wrote:
>>
>> > Hi,
>> >
>> > We have been using R 2.5 and affy 1.14.0 from BioConductor 2.0 release.
>> > Recently, we updated our R/BioC versions to R 2.8/BioC2.3, and I noticed
>> > that the RMA results from affy package rma() are slightly different.
>> >
>> > For example, I have a gene whose summarized values (in linear space) were
>> > 259.2365 and 244.2026 in the older version, but are 259.2308 and 244.2079
>> > in the newer version.
>> >
>> > Although the difference for this gene is not big, other genes have
>> > differences at a much smaller scale.
>> >
>> > I haven't tested R2.6 and R2.7, but I know that R2.4 and R2.5 gave me
>> > identical results.
>> >
>> > I'm wondering if there is any change in the way rma is calculated in the
>> > new affy packages.
>> >
>> > Here are my code and sessionInfo:
>> >
>> > > eset <- rma(myData)
>> > > exprs(eset) <- 2^exprs(eset)
>> >
>> > > sessionInfo()
>> > R version 2.5.0 (2007-04-23)
>> > x86_64-unknown-linux-gnu
>> >
>> > locale:
>> >
>> > LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=C;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>> >
>> > attached base packages:
>> > [1] "tools"     "stats"     "graphics"  "grDevices" "utils"     "datasets"
>> > [7] "methods"   "base"
>> >
>> > other attached packages:
>> >    affy   affyio  Biobase
>> > "1.14.0"  "1.4.0" "1.14.0"
>> >
>> >
>> > > sessionInfo()
>> > R version 2.8.0 (2008-10-20)
>> > x86_64-unknown-linux-gnu
>> >
>> > locale:
>> >
>> > LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=C;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>> >
>> > attached base packages:
>> > [1] tools     stats     graphics  grDevices utils     datasets  methods
>> > [8] base
>> >
>> > other attached packages:
>> > [1] affy_1.20.0   Biobase_2.2.1
>> >
>> > loaded via a namespace (and not attached):
>> > [1] affyio_1.10.1        preprocessCore_1.4.0
>> >
>> >
>> > Thank you for your help!
>> >
>> > Yiwen
>> >
>> > _______________________________________________
>> > Bioconductor mailing list
>> > Bioconductor at stat.math.ethz.ch
>> > https://stat.ethz.ch/mailman/listinfo/bioconductor
>> > Search the archives:
>> > http://news.gmane.org/gmane.science.biology.informatics.conductor
>> >
>>
>


