[BioC] Interpreting mdqc output
Mark Dunning
md392 at cam.ac.uk
Fri Oct 24 11:27:42 CEST 2008
Hello all,
I have been looking at the mdqc package for automatic quality assessment of
a large set of Affy SNP 6.0 data. I have already generated a set of QC stats
using Affy's own software, which excludes outlier arrays using a fixed
cut-off on the contrast QC scores (basically a measure of how well separated
the three genotype clouds are). I wanted to see if mdqc would give me the
same answers.
Here are some of the contrast QC scores for the first 6 arrays (out of 140).
A value less than 0.4 in any of these columns could be a quality problem
according to Affy.
> allQC[1:6,]
  Contrast.QC Contrast.QC..Random. Contrast.QC..Nsp. Contrast.QC..Sty. Contrast.QC..Nsp.Sty.Overlap.
1        0.72                 0.72              0.79              1.00                          1.38
2        0.42                 0.42              0.72              0.35                          0.99
3        1.08                 1.08              0.97              1.28                          1.30
4        0.50                 0.50              0.75              0.79                          0.64
5        0.00                 0.00              0.00             -0.22                          0.00
6        0.47                 0.47              0.76              0.49                          0.71
As you can see, array 5 is clearly an outlier (<0.4) in all 5 columns and we
flagged it as such originally. However, when running mdqc, array 5 is flagged
at the 90% and 95% cut-offs but not at the most stringent 99% cut-off.
Intuitively I would expect this array to have the most extreme quality
measure.
> mout=mdqc(allQC)
> mout
Method used: nogroups    Number of groups: 1
Robust estimator: S-estimator
MDs exceeding the square root of the 90 % percentile of the Chi-Square distribution
[1] 5 8 14 16 48 63 75 78 81 86 91 114 117 122 126 131 132 134 137 138
MDs exceeding the square root of the 95 % percentile of the Chi-Square distribution
[1] 5 8 14 48 75 78 81 86 91 114 122 126 131 132 137 138
MDs exceeding the square root of the 99 % percentile of the Chi-Square distribution
[1] 48 78 81 86 122 126 131 137 138
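For reference, the cut-offs named in that printout can be reproduced in base
R. This is only a rough sketch: it assumes the degrees of freedom equal the
number of QC columns (5 here), which I have not confirmed against the mdqc
source, and the variable names are my own.

```r
# Sketch: square roots of chi-square percentiles, as named in the mdqc output.
# Assumption: degrees of freedom = number of QC columns (5 in allQC).
n_measures <- 5
levels <- c(0.90, 0.95, 0.99)

# An array is flagged at a given level when its Mahalanobis distance
# exceeds the corresponding cut-off.
cutoffs <- sqrt(qchisq(levels, df = n_measures))
names(cutoffs) <- c("90%", "95%", "99%")
print(round(cutoffs, 2))
```

If that assumption holds, array 5 would have a distance above the 95% cut-off
but below the 99% one.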
Which leads me (finally!) to my questions:
- Is mdqc getting confused by the fact that array 5 is consistently low in
  all QC measures?
- Does mdqc automatically assume that higher values indicate lower array
  quality, or vice versa?
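On the second question, a Mahalanobis distance is a quadratic form, so in
principle it should not care whether a value is unusually high or unusually
low. The sketch below checks this with base R's mahalanobis() on simulated
data. Note the assumptions: it uses the classical mean and covariance rather
than mdqc's robust S-estimator, and the matrix "qc" and the two artificial
arrays are hypothetical, not taken from allQC.

```r
# Sketch with simulated (hypothetical) QC scores, not the real allQC data.
set.seed(1)
qc <- matrix(rnorm(140 * 3, mean = 0.8, sd = 0.2), ncol = 3)

# Classical estimates; mdqc itself uses a robust S-estimator instead.
centre <- colMeans(qc)
covmat <- cov(qc)

# Two artificial arrays, equally far from the centre in opposite directions.
low  <- centre - 0.6
high <- centre + 0.6

# mahalanobis() returns squared distances, so take the square root.
md <- sqrt(mahalanobis(rbind(low, high), center = centre, cov = covmat))
print(unname(md))
```

Both artificial arrays get the same distance here, so at least with classical
estimates the measure is symmetric about the centre.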
Many thanks in advance for any input,
Cheers,
Mark
PS here is my sessionInfo()
> sessionInfo()
R version 2.8.0 alpha (2008-10-04 r46598)
i386-pc-mingw32
locale:
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] mdqc_1.4.0 MASS_7.2-44 cluster_1.11.11