[BioC] Testing biased microarray data
Simon Anders
anders at embl.de
Mon Mar 28 17:49:13 CEST 2011
Hi January
On 03/28/2011 05:02 PM, January Weiner wrote:
> the following problem: samples are either RNA, or RNA with selective
> depletion of some forms of RNA. In short, the relative abundance in
> the second group of samples should always be equal to or smaller than
> that in the control, but never higher. The difference in abundance
> might concern a substantial fraction of mRNAs (10-50%).
>
> Naturally, when the samples are normalised, since the total transcript
> abundance in the experimental group is significantly lower, the
> relative abundance of transcripts with no change will be higher in the
> experimental group, and artifacts will occur: we will observe genes
> that are apparently up-regulated, although in reality their levels
> remain stable.
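The artifact described above can be reproduced with a few lines of
simulation. The following is only an illustrative sketch in Python with
NumPy, not code from the project discussed here; the gene count, the 30%
depleted fraction and the tenfold depletion are made-up assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes = 1000

# Hypothetical true abundances in the control sample (arbitrary units).
control = rng.lognormal(mean=6.0, sigma=1.0, size=n_genes)

# Assumption: the first 30% of transcripts are depleted tenfold,
# all other transcripts keep exactly the same abundance.
is_depleted = np.arange(n_genes) < 300
treated = control.copy()
treated[is_depleted] *= 0.1

# Global scaling: force both samples to the same total signal.
scale = control.sum() / treated.sum()
treated_scaled = treated * scale

# After scaling, every unchanged gene is shifted up by log2(scale).
log2_ratio = np.log2(treated_scaled / control)
print("scale factor:", scale)
print("apparent log2 fold change of unchanged genes:",
      log2_ratio[~is_depleted].mean())
```

Because the depleted sample has less total signal, the scale factor is
larger than 1, so every gene whose level did not change appears
up-regulated by the same amount.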
We faced the same problem a while ago in a project comparing mRNA from
fertilized vs unfertilized Drosophila eggs. In Drosophila eggs, the mRNA
degradation machinery is activated when the egg is laid, and many
maternally deposited transcripts get degraded within a couple of hours.
We had three time points, and in the unfertilized eggs, the transcript
levels could only be lower but not higher in the later compared to the
earlier time points, similar to your setting.
We solved the issue by first using VSN (with an increased trimming
quantile), followed by LOESS and then RMA, and this worked very well.
Have a look at this image:
http://www.embl.de/~anders/misc_pub/FlyEggs_mod_vs_loess.png
Each panel is an MA plot, comparing the indicated array with an average
over all arrays. The two lines are the LOESS fit lines (with two
slightly different settings).
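For readers who want to build this kind of plot themselves, here is a
minimal sketch of how the M and A values can be computed against the
all-array average. It assumes a log2 expression matrix with genes in
rows and arrays in columns, and it assumes that A is taken as the mean
of the array and the reference; the function name ma_vs_average is made
up for this illustration.

```python
import numpy as np

def ma_vs_average(log2_expr):
    """Return (M, A) for each array compared with the mean over all arrays.

    log2_expr: 2-D array, genes in rows, arrays in columns, log2 scale.
    """
    log2_expr = np.asarray(log2_expr, dtype=float)
    # Reference "array": per-gene average over all arrays.
    reference = log2_expr.mean(axis=1, keepdims=True)
    m = log2_expr - reference          # log ratio versus the average
    a = (log2_expr + reference) / 2.0  # mean log intensity
    return m, a

# Tiny example: two genes, two arrays.
m, a = ma_vs_average([[8.0, 10.0],
                      [6.0, 6.0]])
print(m)
print(a)
```

Plotting m[:, j] against a[:, j] for array j gives one panel; genes that
are unchanged relative to the average scatter around M = 0.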
Look, for example, at the four arrays for the late unfertilized time
point ("unf.3"): The triangle towards the bottom left corner consists of
the decayed genes. They are lower than average (i.e., below y=0) and,
as their signal is gone, their average intensity is lower too, which
puts them further to the left -- hence the triangle. The LOESS line
clearly follows the bulk of non-decayed genes and is not deterred by the
triangle. Other normalization techniques such as RMA only (without
preceding LOESS) or quantile normalization did not do the job.
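The reason a robust fit can ignore the triangle can be sketched in a
simplified form. The code below is not LOESS and not the code used for
the paper: it estimates a single constant offset instead of an
intensity-dependent curve, using the bisquare down-weighting of large
residuals that robust LOESS fits commonly apply. The simulated numbers
(70% unchanged genes at M = 0.4, 30% decayed genes around M = -3) are
assumptions chosen only for illustration.

```python
import numpy as np

def bisquare_location(m, n_iter=4):
    """Robust centre of m via iteratively reweighted bisquare weights."""
    center = np.median(m)
    for _ in range(n_iter):
        resid = m - center
        s = np.median(np.abs(resid))  # robust scale of the residuals
        # Bisquare weights: points further than 6*s from the centre get 0.
        w = np.clip(1.0 - (resid / (6.0 * s)) ** 2, 0.0, None) ** 2
        center = np.sum(w * m) / np.sum(w)
    return center

rng = np.random.default_rng(1)
true_offset = 0.4  # assumed shift of the unchanged bulk
bulk = true_offset + rng.normal(0.0, 0.1, size=700)
decayed = -3.0 + rng.normal(0.0, 0.5, size=300)
m_values = np.concatenate([bulk, decayed])

robust_offset = bisquare_location(m_values)
print("plain mean:   ", m_values.mean())
print("robust offset:", robust_offset)
```

The plain mean is dragged far down by the decayed genes, while the
reweighted estimate stays close to the offset of the unchanged bulk,
which is the value one wants to subtract when normalizing.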
I can send you a code example if you need it. For further details,
please see our paper and especially page 4 of the supplement:
Thomsen S, Anders S, Chandra Janga S, Huber W, Alonso CR. Genome-wide
analysis of mRNA decay patterns during early Drosophila development.
Genome Biology, 11 (2010) R93.
http://genomebiology.com/2010/11/9/R93
Simon
+---
| Dr. Simon Anders, Dipl.-Phys.
| European Molecular Biology Laboratory (EMBL), Heidelberg
| office phone +49-6221-387-8632
| preferred (permanent) e-mail: sanders at fs.tum.de