rgentlem at fhcrc.org
Tue Sep 26 19:34:34 CEST 2006
Not much to say - a few notes, but this seems very similar to what I
have proposed - so I like it. The differences don't seem to be that
Gordon K Smyth wrote:
>> Date: Fri, 22 Sep 2006 15:36:10 -0700
>> From: Robert Gentleman <rgentlem at fhcrc.org>
>> Subject: [Bioc-devel] affyQA/QC
>> To: bioc-devel at stat.math.ethz.ch
>> I am trying to put together a set of what one might regard as
>> standard plots and summary statistics that should be collected on any
>> set of Affymetrix microarrays (at least ones for gene expression). The
>> first pass is attached; I would appreciate any comments on it,
>> especially with regard to things that I have missed, or things I have
>> suggested that don't seem to be quite correct, or could be improved.
>> On an implementation note - I will be making use of existing software
>> and intend to work with Craig Parman to put this into the existing
>> affyQCReport package - users of that might want to let me know what
>> functionality they are relying on, but the changes should be strictly additions.
>> Robert Gentleman, PhD
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M2-B876
>> PO Box 19024
>> Seattle, Washington 98109-1024
>> rgentlem at fhcrc.org
> I'd also be interested in reactions on a standard set of plots and summaries. Below is the set of
> tests I've been using recently (based on advice from Ken Simpson). Keith Satterley is
> implementing something close to this in the affylmGUI package for the BioC 1.9 release.
> Best wishes
> Set of affy QA plots and summaries:
> Boxplots of chip-wise intensities:
> > x <- ReadAffy(filenames = targets$FileName, celfile.path = "cel")
> > narrays <- ncol(exprs(x))
> Empirical distributions of chip-wise intensities:
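The two intensity summaries above can be sketched in base R as follows. This is only an illustration on simulated data, not the code from the original post: the matrix `intensity` and its dimensions are made up, and with a real AffyBatch `x` the affy package's own `boxplot(x)` and `hist(x)` methods would probably be the more natural choice.

```r
# Simulated raw intensities (hypothetical data): 5000 probes x 4 arrays.
set.seed(3)
intensity <- matrix(2^rnorm(5000 * 4, mean = 8, sd = 1.5), ncol = 4,
                    dimnames = list(NULL, paste("Chip", 1:4)))
log.int <- log2(intensity)

# Boxplots of chip-wise log2 intensities, one box per array.
boxplot(log.int, ylab = "log2 intensity", main = "Chip-wise intensities")

# Empirical distributions: one kernel density estimate per array, overlaid.
dens <- lapply(seq_len(ncol(log.int)), function(j) density(log.int[, j]))
y.max <- max(sapply(dens, function(d) max(d$y)))
plot(dens[[1]], xlim = range(log.int), ylim = c(0, y.max),
     main = "Empirical distributions", xlab = "log2 intensity")
for (j in seq_along(dens)[-1]) lines(dens[[j]], col = j)
```

Arrays whose box or density curve sits well apart from the others are the ones worth a closer look.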
> RNA digestion plot:
> > deg <- AffyRNAdeg(x)
> Affy QC parameters:
> The bioB spike-ins should be present.
> All the other measures should be consistent across chips.
> > qc <- qc.affy(x)
> > qc.tab <- rbind(
> +   Percent.present = qc@percent.present,
> +   Scale.factor = qc@scale.factors,
> +   Average.background = qc@average.background,
> +   bioBCalls = qc@bioBCalls == "P",
> +   t(qc@spikes),
> +   t(qc@qc.probes))
> > colnames(qc.tab) <- paste("Chip", 1:narrays)
> Image plots of probe level robust residuals.
> Larger residuals are darker and indicate deviations from the additive model used to summarise
> probes within each probe-set.
> > pset <- fitPLM(x)
> > oldpar <- par(mfrow = c(4, 2), mar = c(1, 1, 2, 1))
> > image(pset, type = "resids") # red=positive resids, blue=negative
I thought about this, but for a lot of arrays it seems like it would
be better to come back and concentrate on those that were indicated for
> Normalized Unscaled Standard Errors (NUSE) plot.
> The standard error estimates obtained for each gene on each array from fitPLM
> are standardized across arrays so that the median standard error for each
> gene is 1 across all arrays.
> An array with elevated SEs relative to other arrays is typically of
> lower quality.
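The NUSE standardization described above amounts to dividing each gene's standard errors by that gene's median across arrays. A minimal sketch in base R, assuming a made-up matrix `se` of standard errors (rows are probesets, columns are arrays) rather than real fitPLM output; with an actual PLMset, affyPLM's `NUSE()` should do this directly.

```r
# Simulated standard errors (hypothetical data): 1000 probesets x 6 arrays.
set.seed(2)
n.probesets <- 1000
n.arrays <- 6
base.se <- rgamma(n.probesets, shape = 5, rate = 50)  # probeset-specific SE level
noise <- matrix(exp(rnorm(n.probesets * n.arrays, sd = 0.1)),
                nrow = n.probesets, ncol = n.arrays)
se <- base.se * noise                                  # each row scaled by its own level
colnames(se) <- paste("Chip", 1:n.arrays)
se[, 6] <- 1.3 * se[, 6]                               # pretend Chip 6 is of lower quality

# NUSE: divide each probeset's SEs by its median SE across arrays,
# so that every probeset has median 1.
nuse <- sweep(se, 1, apply(se, 1, median), "/")

boxplot(nuse, ylab = "NUSE", main = "Normalized Unscaled Standard Errors")
abline(h = 1, lty = 2)
```

In this simulation the box for Chip 6 sits visibly above 1, which is the pattern one would flag on real data.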
> Relative Log Expression (RLE) values.
> RLE values are computed for each probeset by comparing the expression value
> on each array against the median expression value for that probeset across all arrays.
> Assuming that most genes are not changing in expression across arrays, most of
> these RLE values should ideally be near 0.
> When examining this plot, focus on the shape and position of each of the boxes.
> Typically arrays with poorer quality
> show up with boxes that are not centered about 0 and/or are more spread out.
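The RLE calculation described above can be sketched the same way. Again this is an illustration on simulated data, not the original code: the matrix `expr` of log2 expression values is an assumption, and for a real PLMset affyPLM's `RLE()` is likely the function to reach for.

```r
# Simulated log2 expression values (hypothetical data): 1000 probesets x 6 arrays.
set.seed(1)
n.probesets <- 1000
n.arrays <- 6
probeset.level <- rnorm(n.probesets, mean = 8, sd = 2)  # true level per probeset
expr <- probeset.level +
  matrix(rnorm(n.probesets * n.arrays, sd = 0.2), nrow = n.probesets)
colnames(expr) <- paste("Chip", 1:n.arrays)
# Pretend Chip 6 is of lower quality: shifted and noisier.
expr[, 6] <- probeset.level + rnorm(n.probesets, mean = 0.5, sd = 0.6)

# RLE: subtract each probeset's median across arrays from its value on each array.
rle.values <- sweep(expr, 1, apply(expr, 1, median), "-")

boxplot(rle.values, ylab = "RLE", main = "Relative Log Expression")
abline(h = 0, lty = 2)
```

Here the Chip 6 box is both off-centre and wider than the rest, matching the description of a poorer-quality array.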
Seems very similar -
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
rgentlem at fhcrc.org