[BioC] Questions about Affymetrix E. Coli arrays
Jenny Drnevich
drnevich at uiuc.edu
Wed Aug 22 17:15:29 CEST 2007
Hi Michael,
I'm posting this back to the Bioconductor list in case others search
on E. coli, too. From my experience with that one experiment, the
bimodality in the probe-level density plots (plotHist() in
affycoretools is great!) does indicate the degree of degradation in
your samples, especially the height of the first peak compared to the
second peak. The one sample we knew was degraded but hybridized anyway had
a very small first peak and a very large second peak, and the samples
that retained a lot of signal after GC background correction (see
below) had larger first peaks and smaller second peaks. I also would
be particularly worried if there was a lot of variation in your
samples in the relative heights of the peaks. As for QC, using the
fitPLM() function from the affyPLM package and then plotting the
resulting weights (example code below) does help to show spatial
effects on the arrays.
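To put a rough number on the relative peak heights described above, something like the following base-R sketch could be used. It is only an illustration: peak_height_ratio() is a hypothetical helper (not part of affy or affycoretools), the default density() bandwidth is an assumption, and the data are simulated rather than taken from any real array.

```r
# Sketch: ratio of first-peak height to second-peak height in a bimodal
# density. peak_height_ratio() is a hypothetical helper, not a package function.
peak_height_ratio <- function(log_intensity) {
  d <- density(log_intensity)
  # Interior local maxima: the slope changes from positive to negative
  turns <- which(diff(sign(diff(d$y))) == -2) + 1
  if (length(turns) < 2) return(NA_real_)  # not bimodal at this bandwidth
  # Keep the two tallest peaks, then order them from left to right
  top2 <- turns[order(d$y[turns], decreasing = TRUE)[1:2]]
  top2 <- top2[order(d$x[top2])]
  d$y[top2[1]] / d$y[top2[2]]
}

# Simulated log2 intensities, for illustration only (not real array data)
set.seed(1)
sim <- c(rnorm(7000, mean = 7, sd = 0.5), rnorm(3000, mean = 11, sd = 0.5))
ratio <- peak_height_ratio(sim)
print(ratio)

# With real data, assuming an AffyBatch called raw as returned by ReadAffy():
# ratios <- apply(log2(pm(raw)), 2, peak_height_ratio)
```

A ratio well below 1 would correspond to the small-first-peak, large-second-peak pattern of the degraded sample, and a wide spread of ratios across arrays would be the variation to worry about.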
For the pre-processing method, I would strongly suggest GCRMA,
because the GC-based background correction removed almost ALL the
signal from the one sample we knew was degraded. RT for prokaryotic
samples uses random primers because there's no polyA tail, so all
degraded fragments get reverse-transcribed and then labeled.
Non-specific binding increases with GC content, which is why the
degraded sample had the HIGHEST raw signal values. For QC, I would
first do GCRMA without normalization and do a boxplot on the
resulting values to see which arrays still have signal and how much
variation there is among arrays (example code below). I would
probably exclude arrays that don't have much signal left, and if
there's still a lot of variation in the distributions, I would think
long and hard about using these unnormalized values directly in the
stat model and skipping the quantile normalization completely.
As I said, I've only ever worked with this one experiment on E. coli,
but we knew we were having problems with samples being degraded, and
we decided to hybridize one sample that was completely degraded, so I think
my conclusions for this experiment were valid. I assume that they
would be applicable to other E. coli experiments, but you'll have to
decide that for yourself.
HTH,
Jenny
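> library(affy)      # ReadAffy()
> library(affyPLM)   # fitPLM() and image() for the fitted PLM object
> library(gcrma)     # gcrma()
> raw <- ReadAffy()  # reads all CEL files in the working directory
> PLM.data <- fitPLM(raw)  # this will fit RMA, but that's fine for spatial effects
> image(PLM.data, which=1, type="weights")
> image(PLM.data, which=2, type="weights")  # etc. for all arrays
> gcrma.nonorm <- gcrma(raw, normalize=FALSE)  # GCRMA without quantile normalization
> boxplot(exprs(gcrma.nonorm))  # one box per array, on the expression matrix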
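As a possible complement to eyeballing the boxplot, the per-array summaries behind it can be tabulated and arrays with little remaining signal flagged. This is a minimal sketch under stated assumptions: flag_low_signal() is a hypothetical helper, flagging on the upper quartile is my own choice rather than anything from the experiment described above, and the cutoff of 5 is an arbitrary placeholder. The matrix is simulated for illustration only.

```r
# Sketch: per-array median and IQR, plus a flag for arrays whose upper
# quartile falls below a user-chosen cutoff.
# flag_low_signal() is a hypothetical helper; the cutoff is a placeholder.
flag_low_signal <- function(expr, min_upper_quartile) {
  # Rows of q are the 25%, 50% and 75% quantiles; columns are arrays
  q <- apply(expr, 2, quantile, probs = c(0.25, 0.5, 0.75), na.rm = TRUE)
  data.frame(
    array = colnames(expr),
    median = unname(q[2, ]),
    iqr = unname(q[3, ] - q[1, ]),
    low_signal = unname(q[3, ] < min_upper_quartile),
    row.names = NULL
  )
}

# Simulated log2-scale values, for illustration only (not real array data)
set.seed(2)
sim <- cbind(good1 = rnorm(1000, mean = 8, sd = 2),
             good2 = rnorm(1000, mean = 8.2, sd = 2),
             degraded = rnorm(1000, mean = 3, sd = 0.3))
qc <- flag_low_signal(sim, min_upper_quartile = 5)
print(qc)

# With real data, assuming gcrma.nonorm from the code above and a cutoff
# chosen by looking at the boxplot first:
# qc <- flag_low_signal(exprs(gcrma.nonorm), min_upper_quartile = cutoff)
```

The table is only meant to make the exclusion decision easier to document; the boxplot remains the primary check, and any cutoff should be chosen after looking at it.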
At 10:26 PM 8/21/2007, you wrote:
>Hello Jenny,
>I came across a discussion you posted (May, 2006) on the
>Bioconductor forum about Affy E. coli chips. I have recently been
>looking at data from these chips for the first time, and observed
>some of the same problems that you mention there (bimodality in
>probe-level density plots, inapplicability of usual Affy QC tools),
>and was wondering if you were able to arrive at anything like a
>preferred analysis protocol, and if you might have some advice that
>you could share. Information seems to be pretty sparse on these
>arrays, so finding your submissions on the subject was very
>encouraging. To be specific, I'm interested in any particular
>methods you might have found most useful for array or RNA quality
>assessment and a preferred normalization method (I observed a much
>larger difference in RMA- vs. GCRMA-normalized data with these chips
>than I have seen before). Also, can you tell me how you concluded
>that the bimodality in the density plots was due to RNA degradation?
>Thanks for any help,
>Michael
>
>-----
>Michael Slifker, MS
>Biostatistics Facility
>Fox Chase Cancer Center
>333 Cottman Ave.
>Philadelphia, PA 19111
>215-728-5345
>
>
Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign
330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA
ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at uiuc.edu