[BioC] degraded RNA and background correction

Jenny Drnevich drnevich at uiuc.edu
Tue Feb 28 20:43:53 CET 2006


Hi everyone,

I have an interesting situation which involved samples with degraded RNA. I 
would like to get some comments on it, and a search of the archives 
indicated that an example of what degraded RNA looks like on an Affy chip 
would be useful for others.

The samples are from E. coli, which has notoriously unstable RNA with 
half-lives on the order of minutes. When grown in anaerobic conditions, the 
RNA becomes even more unstable. Many of the total RNA samples that our core 
facility was receiving from the researchers were degraded, even though they 
were supposedly fine after extraction. For a variety of reasons, our 
core decided to label and hybridize a sample that was completely degraded 
according to a Bioanalyzer. We eventually were able to get non-degraded 
total RNA for all of the samples. When I compare the degraded sample to 
the other samples, it is indeed an outlier, but it has HIGHER PM and MM 
signals than the non-degraded samples. The density plots show an 
interesting bimodal distribution for all the samples, and a disturbing 
trend towards the shape of the degraded sample. I say disturbing because 
the samples grown in anaerobic conditions are closer to the degraded sample 
than the samples grown in aerobic conditions. The PM, MM, and combined 
density plots can be seen here (the degraded sample is the green line with 
the largest right-hand peak): ftp://ftp.biotec.uiuc.edu/pub/Ecoli_figures/

Oddly, when I look at the RNA digestion plot (which may or may not show 
degradation, according to what I found in the archives), all of the 
samples, including the degraded sample, have flat slopes; only a couple had 
p<0.05, and they had slightly negative slopes. See the above link for the 
plot (degraded sample in green, anaerobic samples in red; plotting was done 
with transform="neither").
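
For readers unfamiliar with the idea behind an RNA digestion plot, here is a 
minimal Python sketch of the slope calculation: average the log2 intensities 
by probe position (5' to 3') across probesets, then fit a straight line to 
those means. It is only an illustration on synthetic data, not the affy 
package's implementation. The function name, the 11 probes per probeset, and 
the simulated values are all assumptions, and no p-value is computed.

```python
import numpy as np

def degradation_slope(log_intensity):
    """Slope of mean log2 intensity versus probe position for one array.

    log_intensity: 2-D array of shape (n_probesets, n_probes_per_set),
    with the probes in each row ordered from the 5' end to the 3' end.
    """
    log_intensity = np.asarray(log_intensity, dtype=float)
    # Average over probesets to get one mean intensity per probe position.
    position_means = log_intensity.mean(axis=0)
    positions = np.arange(position_means.size)
    # Least-squares line through the position means; keep only the slope.
    slope, _intercept = np.polyfit(positions, position_means, deg=1)
    return slope

# Synthetic example (hypothetical numbers): 500 probesets, 11 probes each.
rng = np.random.default_rng(0)
n_sets, n_probes = 500, 11
flat = 8.0 + rng.normal(0.0, 0.5, size=(n_sets, n_probes))
# Same array with a 5'-to-3' intensity trend of 0.3 log2 units per position.
trend = flat + 0.3 * np.arange(n_probes)

flat_slope = degradation_slope(flat)
trend_slope = degradation_slope(trend)
print("flat slope:", flat_slope)
print("trend slope:", trend_slope)
```

A flat slope like the first case is what all of the samples showed here, 
the degraded one included.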

What also surprises me is the huge difference between the GC-based 
background correction of GCRMA and the background corrections used by RMA 
and MAS5 (I've traced the difference to the background correction step). 
When the GC-based background correction is followed by median polish 
summarization (no normalization), the degraded sample has, as expected, 
almost no signal, but using either RMA or MAS5 (without normalization) 
results in the degraded sample having the HIGHEST signals. And the closer a 
sample's raw distribution was to that of the degraded sample, the more it 
followed the same pattern under background correction (see the above link 
for boxplots; the degraded sample is the last one on the right, #423; even 
numbers are aerobic samples, odd are anaerobic samples).
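
Since median polish is the summarization step in the comparison above, here 
is a minimal Python sketch of Tukey's median polish for a single probeset. 
It assumes the input is a probes-by-arrays matrix of log2 
background-corrected PM values. The function name and the toy numbers are 
hypothetical, and this is not the code used by the affy or gcrma packages.

```python
import numpy as np

def median_polish(x, max_iter=10, tol=1e-8):
    """Tukey's median polish of a probes-by-arrays matrix.

    Returns (overall, row_effects, col_effects, residuals) such that
    x[i, j] == overall + row_effects[i] + col_effects[j] + residuals[i, j].
    """
    x = np.asarray(x, dtype=float)
    residuals = x.copy()
    overall = 0.0
    row = np.zeros(x.shape[0])  # probe effects
    col = np.zeros(x.shape[1])  # array effects
    for _ in range(max_iter):
        # Sweep row medians out of the residuals into the row effects.
        row_med = np.median(residuals, axis=1)
        residuals -= row_med[:, None]
        row += row_med
        # Move the median of the column effects into the overall term.
        delta = np.median(col)
        col -= delta
        overall += delta
        # Sweep column medians out of the residuals into the column effects.
        col_med = np.median(residuals, axis=0)
        residuals -= col_med[None, :]
        col += col_med
        # Move the median of the row effects into the overall term.
        delta = np.median(row)
        row -= delta
        overall += delta
        if np.abs(row_med).max() < tol and np.abs(col_med).max() < tol:
            break
    return overall, row, col, residuals

# Toy probeset (hypothetical values): 5 probes on 3 arrays, purely additive.
probe_effects = np.array([0.0, 1.0, -1.0, 0.5, -0.5])
array_levels = np.array([8.0, 9.0, 3.0])
probeset = probe_effects[:, None] + array_levels[None, :]

overall, row, col, residuals = median_polish(probeset)
# Per-array expression summary: overall level plus each array's effect.
expression = overall + col
print(expression)
```

Because the summary for each array is built only from the 
background-corrected values, a difference in the background step carries 
straight through to the summarized signal.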

Based on the behavior of the degraded sample, I would say that the GC-based 
background correction is the one to use. It also appears that all of the 
aerobic samples, but only 3 of the anaerobic samples (417, 419, 423B), have 
relatively little degradation, and the rest of the anaerobic samples are 
severely degraded. Re-doing these samples is unfortunately NOT an option at 
this point. The researchers want me to go ahead with the statistical 
analysis, even though I have told them that any observed differences will 
be driven primarily by degradation rather than by real expression levels. I 
probably should not use any normalization method because the assumption 
that few genes change is not met.

Would you agree with my conclusions above, or do you have alternative 
interpretations and suggestions? Is it possible that prokaryotic RNA and 
its terminal labeling method are different enough from eukaryotic RNA and 
the biotin-labeled nucleotide labeling method that these functions 
(particularly the GC-based background correction) should not be used?

Thanks in advance,
Jenny


Jenny Drnevich, Ph.D.

Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign

330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA

ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at uiuc.edu


