[BioC] Normalized data in expresso and Expression Console differ

cstrato cstrato at aon.at
Tue Jun 23 18:35:46 CEST 2009


Dear Oliver

Please note that Expression Console scales the mean expression levels to 
a pre-defined target intensity, so you need to scale your data 
accordingly or use the function mas5(..., sc=500) from package affy. 
Furthermore, the MAS5 algorithm from Affymetrix does not use quantile 
normalization, so the normalize.method = "quantiles" step in your 
expresso call is not part of MAS5.
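
To illustrate the scaling step mentioned above, here is a minimal sketch in base R. It is not the code used by Expression Console or by affy. The toy matrix, the helper name scale_to_target, the target of 500 (taken from sc=500 above) and the 2% trimmed mean are all assumptions for illustration; please check which target intensity your Expression Console analysis actually uses.

```r
# Toy expression matrix on the linear scale (rows = probe sets, columns = arrays).
# The values are simulated and stand in for exprs(normalized) before taking log2.
set.seed(1)
expr <- matrix(rlnorm(6000, meanlog = 5, sdlog = 1), ncol = 6,
               dimnames = list(NULL, paste0("array", 1:6)))

# Hypothetical helper: rescale every array so that its trimmed mean equals
# the target intensity. target = 500 and trim = 0.02 are assumptions.
scale_to_target <- function(x, target = 500, trim = 0.02) {
  sf <- target / apply(x, 2, mean, trim = trim)  # one scaling factor per array
  sweep(x, 2, sf, `*`)                           # multiply each column by its factor
}

scaled <- scale_to_target(expr)

# Take log2 only after scaling, so that the boxplots are comparable.
log2_scaled <- log2(scaled)
```

After this step every column of scaled has the same trimmed mean, so boxplot(log2_scaled) should show the arrays centred at a similar level. With real data, the same idea would be applied to the linear-scale values from expresso run without quantile normalization.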

Regarding the apparent outliers: to my knowledge there are four 
different implementations of the MAS5 algorithm, namely GCOS, APT, affy 
and xps, which all yield slightly different expression levels, as you 
can see, for example, in Figure 4 of the vignette APTvsXPS.pdf from package xps:
http://www.bioconductor.org/packages/release/bioc/vignettes/xps/inst/doc/APTvsXPS.pdf
I must admit that I do not know why the different implementations 
differ slightly.

Best regards
Christian
_._._._._._._._._._._._._._._._._._
C.h.r.i.s.t.i.a.n   S.t.r.a.t.o.w.a
V.i.e.n.n.a           A.u.s.t.r.i.a
e.m.a.i.l:        cstrato at aon.at
_._._._._._._._._._._._._._._._._._


Oliver Stolpe wrote:
> Hello list,
>
> I currently use the expresso function from the Bioconductor package affy
> to analyze Affymetrix data:
>
> normalized <- expresso(data, bgcorrect.method = "mas",
>            normalize.method = "quantiles",
>            pmcorrect.method = "mas",
>            summary.method = "mas")
> matrix <- log2(exprs(normalized))
>
> As a reference I use the Expression Console by Affymetrix. My goal is
> to reproduce the normalized data (and therefore the resulting boxplot)
> from the Expression Console in R. I took the log2 after normalization
> and correction because the Expression Console delivered relatively small
> values (they seemed to be log-transformed), whereas the expresso data
> had a very large range. Unfortunately, the results differ.
>
>
> Does anyone know why they differ so noticeably (different mean,
> many outliers)? You may have a look at the boxplots I attached.
>
>
> Even when I leave out the normalization in expresso, the result looks
> nearly the same.
>
> I would be glad of any suggestions.
>
> Thanks in advance,
> best regards,
> Oliver
>
> Some helpful data:
>
>  > head(matrix_expresso)
>   data1.cel.gz data2.cel.gz data3.cel.gz data4.cel.gz
>       67.16587     72.66765     73.49201     74.00240
>       72.03782     95.80303     97.60087     64.60356
>      117.65746    142.88926    138.01063    159.64211
>      185.33413    292.81031    232.82629    259.88629
>      164.88572    260.95710    243.47892    247.80303
>     1238.80516   1674.33256   1525.44652   1490.71100
>   data5.cel.gz data6.cel.gz
>        73.5097     67.97570
>        93.9136     84.26307
>       145.7278    124.94947
>       250.9573    235.76545
>       235.0867    251.55364
>      1486.8813   1523.14721
>
>  > head(matrix_expresso_log2)
>   data1.cel.gz data2.cel.gz data3.cel.gz data4.cel.gz
>       6.069657     6.183241     6.199515     6.209500
>       6.170683     6.581999     6.608822     6.013542
>       6.878449     7.158754     7.108636     7.318697
>       7.533985     8.193823     7.863110     8.021737
>       7.365323     8.027669     7.927653     7.953050
>      10.274734    10.709370    10.575016    10.541785
>   data5.cel.gz data6.cel.gz
>       6.199863     6.086947
>       6.553262     6.396829
>       7.187132     6.965201
>       7.971298     7.881209
>       7.877049     7.974722
>      10.538074    10.572840
>
>  > sessionInfo()
> R version 2.9.0 (2009-04-17)
> i686-redhat-linux-gnu
>
> locale:
> LC_CTYPE=de_DE@euro;LC_NUMERIC=C;LC_TIME=de_DE@euro;LC_COLLATE=de_DE@euro;LC_MONETARY=C;LC_MESSAGES=de_DE@euro;LC_PAPER=de_DE@euro;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=de_DE@euro;LC_IDENTIFICATION=C
>
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
>  [1] zebrafishcdf_2.4.0 marray_1.22.0      limma_2.18.0       RdbiPgSQL_1.18.1
>  [5] Rdbi_1.18.0        multtest_2.0.0     class_7.2-47       MASS_7.2-47
>  [9] affy_1.22.0        Biobase_2.4.1
>
> loaded via a namespace (and not attached):
> [1] affyio_1.12.0        preprocessCore_1.6.0 splines_2.9.0
> [4] survival_2.35-4      tools_2.9.0
>


