[BioC] method for removing consistent technical bias?
k. brand
k.brand at erasmusmc.nl
Mon Sep 18 15:25:22 CEST 2006
Dear BioCers,
I have a consistent, reproducible technical discrepancy between two
different hybridisations. The two biological replicates in hyb A were run
on expired arrays and have significantly lower intensities than the two
biological replicates in hyb B, which behave normally. I thus have 4
biological replicates of each of three different tissues, and they
cluster (K-means) more strongly by hyb than by tissue.
RMA does a courageous job of normalising the discrepancy (see the
summaries of unnormalised and normalised data below), but if anyone has
experience or suggestions they care to share on getting the most out of
this flawed dataset, I'd be very grateful.
Of note: since the 3 tissues are 'paired', i.e. come from the same animal,
I was also considering a paired ANOVA of the 3 tissues, reducing, if not
eliminating, the need to overcome inter-hyb variation, but I lack the
experience to know whether there is an appropriate R implementation and,
if so, which. Furthermore, I have no idea how to use the replicates (2 per
hyb, 4 in total) to increase the statistical power with such an approach.
Any guidance appreciated.
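For concreteness, the kind of paired analysis meant here might look like
the base-R sketch below. It is only an illustration under assumptions:
the digit in each CEL file name is assumed to identify the animal, the
helper name paired_tissue_p is made up, and the expression matrix is
simulated rather than taken from the real arrays.

```r
# Sample annotation parsed from the CEL file names listed in this post.
arrays <- c("Tco1A", "Tco2A", "Tco3B", "Tco4B",
            "Tmi1A", "Tmi2A", "Tmi3B", "Tmi4B",
            "Tsh1A", "Tsh2A", "Tsh3B", "Tsh4B")
tissue <- factor(substr(arrays, 1, 3))  # Tco, Tmi, Tsh
animal <- factor(substr(arrays, 4, 4))  # ASSUMPTION: digit 1-4 identifies the animal
hyb    <- factor(substr(arrays, 5, 5))  # A or B; each animal sits in exactly one hyb

# Hypothetical helper: per-probeset F-test for tissue, blocking on animal
# (a two-way ANOVA without interaction, i.e. a "paired" ANOVA).
paired_tissue_p <- function(expr, animal, tissue) {
  apply(expr, 1, function(y) {
    fit <- lm(y ~ animal + tissue)
    anova(fit)["tissue", "Pr(>F)"]
  })
}

# Simulated stand-in for exprs(eset); NOT the real data.
set.seed(1)
expr <- matrix(rnorm(100 * 12, mean = 5), nrow = 100,
               dimnames = list(paste("probe", 1:100, sep = ""), arrays))
expr[, hyb == "A"] <- expr[, hyb == "A"] - 1.5                  # hyb A reads lower
expr[1:10, tissue == "Tsh"] <- expr[1:10, tissue == "Tsh"] + 4  # true tissue effect

p_paired <- paired_tissue_p(expr, animal, tissue)
head(sort(p_paired))  # smallest p-values for a tissue difference
```

Because each animal belongs to a single hyb, a constant per-probeset
shift between hyb A and hyb B is absorbed by the animal term in this
model, so it should not alter the tissue test. The same design,
model.matrix(~ animal + tissue), could presumably be handed to limma's
lmFit() and eBayes() for a moderated version, though that is not shown
here.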
thanks in advance,
Karl
#Unnormalized data:
> dat <- ReadAffy()
> dat.rma <- rma(dat, normalize=FALSE)
Background correcting
Calculating Expression
> apply(exprs(dat.rma),2,summary)
        Tco1A.CEL Tco2A.CEL Tco3B.CEL Tco4B.CEL Tmi1A.CEL Tmi2A.CEL
Min.        1.633     1.713     1.869     2.431     2.027     1.736
1st Qu.     2.627     2.751     3.234     3.554     2.831     2.882
Median      3.329     3.320     4.933     4.952     3.433     3.588
Mean        4.153     3.902     5.572     5.741     4.330     4.261
3rd Qu.     5.147     4.541     7.534     7.556     5.254     5.151
Max.       14.180    13.640    14.370    14.280    14.240    13.880
        Tmi3B.CEL Tmi4B.CEL Tsh1A.CEL Tsh2A.CEL Tsh3B.CEL Tsh4B.CEL
Min.        1.771     2.575     1.607     1.902     1.771     2.360
1st Qu.     3.280     3.716     2.541     2.850     3.197     3.940
Median      4.986     4.978     3.183     3.357     4.833     5.488
Mean        5.629     5.762     3.983     4.032     5.493     6.092
3rd Qu.     7.606     7.449     4.882     4.629     7.426     7.892
Max.       14.300    14.180    14.070    13.990    14.230    14.340
#RMA normalized data:
> eset <- justRMA(filenames=list.celfiles())
Background correcting
Normalizing
Calculating Expression
> apply(exprs(eset),2,summary)
        Tco1A.CEL Tco2A.CEL Tco3B.CEL Tco4B.CEL Tmi1A.CEL Tmi2A.CEL
Min.        1.945     2.077     2.014     2.070     1.970     2.050
1st Qu.     3.394     3.574     3.114     3.135     3.381     3.483
Median      4.469     4.533     4.552     4.435     4.474     4.472
Mean        5.191     5.110     5.221     5.192     5.199     5.133
3rd Qu.     6.539     6.114     6.947     6.866     6.589     6.260
Max.       14.180    14.170    14.120    14.170    14.170    14.180
        Tmi3B.CEL Tmi4B.CEL Tsh1A.CEL Tsh2A.CEL Tsh3B.CEL Tsh4B.CEL
Min.        1.930     2.056     2.015     2.028     1.815     1.919
1st Qu.     3.106     3.190     3.436     3.524     3.132     3.134
Median      4.520     4.426     4.497     4.484     4.548     4.450
Mean        5.219     5.194     5.203     5.140     5.224     5.198
3rd Qu.     6.949     6.824     6.537     6.241     6.938     6.887
Max.       14.130    14.070    14.180    14.130    14.110    14.150
> sessionInfo()
Version 2.3.0 (2006-04-24)
i386-pc-mingw32
attached base packages:
[1] "tools" "methods" "stats" "graphics" "grDevices" "utils"
"datasets" "base"
other attached packages:
     affyPLM        gcrma  matchprobes     affydata mouse4302cdf
     "1.8.0"      "2.4.1"      "1.4.0"      "1.8.0"     "1.12.0"
         vsn        limma         affy       affyio      Biobase
    "1.10.0"      "2.7.3"     "1.10.0"      "1.0.0"     "1.10.0"
--
Karl Brand <k.brand at erasmusmc.nl>
Department of Cell Biology and Genetics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
lab +31 (0)10 408 7409 fax +31 (0)10 408 9468