[BioC] RNA degradation trends & options for analysis
Juanma Vaquerizas
jvaquerizas at ebi.ac.uk
Thu Feb 23 10:31:16 CET 2006
Dear list,
I'm trying to analyse some Affy arrays for my PhD thesis but I'm a
little bit stuck, so any comments on the following would be very
welcome.
Basically I'm analysing a set of Affy arrays coming from 10 different
labs (3 biological replicates per lab) where each lab is using a
different RNA source. I've done some quality control using affyPLM
and the chips seem to be ok.
If I have a look at the RNA digestion plot, 2 different trends are
clearly visible (half of the arrays follow one trend with a slope
around 1 and the other half with a slope around 3).
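To make the grouping by slope concrete, here is a minimal sketch of how
arrays could be assigned to one of the two trends. It assumes a
per-array profile of mean intensity by probe position is already
available (for instance exported from the output of affy's AffyRNAdeg).
The profiles, the array names and the threshold of 2 are hypothetical
values chosen for illustration, and the sketch skips any rescaling that
the real plot may apply.

```python
def ls_slope(ys):
    """Least-squares slope of ys against probe position 0..n-1."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    return sxy / sxx


def group_by_slope(profiles, threshold=2.0):
    """Split arrays into 'low' and 'high' slope groups.

    The threshold is an assumption: midway between the two observed
    trends (slopes around 1 and around 3).
    """
    groups = {"low": [], "high": []}
    for name, ys in profiles.items():
        key = "low" if ls_slope(ys) < threshold else "high"
        groups[key].append(name)
    return groups


# Hypothetical profiles: mean intensity at each probe position (5' to 3').
profiles = {
    "array_A": [0.0, 1.0, 2.0, 3.0, 4.0],    # slope of exactly 1
    "array_B": [0.0, 3.0, 6.0, 9.0, 12.0],   # slope of exactly 3
}
groups = group_by_slope(profiles)
print(groups)
```

The resulting group labels are what the "trend" factor in the options
below would encode.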
I want to make some contrasts between the different RNA sources that
have been used, but as I've read in (Bolstad et al., 2005,
Bioinformatics and Computational Biology Solutions Using R and
Bioconductor, Springer) and in some previous messages in this list,
mixing arrays with very different slopes in the RNA digestion plots
is not a very good idea.
The options I'm thinking about at the moment are the following:
Option 1:
1.- Split the arrays by the lab of origin.
2.- Preprocess them separately using GCRMA.
3.- Combine the resulting esets into one eset.
4.- Analyse using limma, modeling for 3 factors (RNA type, lab
effect, trend in the RNA digestion plot)
5.- Extract the contrasts I am interested in (the RNA type ones)
Option 2:
1.- Split the arrays by the trend of the RNA digestion plot.
2.- Preprocess them separately using GCRMA.
3.- Combine the resulting esets into one eset.
4.- Analyse using limma, modeling for 3 factors (RNA type, lab
effect, trend in the RNA digestion plot)
5.- Extract the contrasts I am interested in (the RNA type ones)
Option 3:
1.- Do not split the arrays in groups.
2.- Preprocess all of them using GCRMA.
3.- Analyse using limma, modeling for 3 factors (RNA type, lab
effect, trend in the RNA digestion plot)
4.- Extract the contrasts I am interested in (the RNA type ones)
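All three options end with the same 3-factor linear model, so it may be
worth checking whether that model is estimable at all. The sketch below
builds a treatment-coded design matrix and computes its rank exactly. It
rests on two assumptions that are not confirmed above: that each lab
uses exactly one RNA source (so RNA type and lab coincide), and that the
trend is constant within a lab (labs 0-4 "low", labs 5-9 "high", which
is purely hypothetical). The helper names are made up for illustration.

```python
from fractions import Fraction


def dummy_columns(levels):
    """Treatment-coded indicator columns; the first level is the baseline."""
    distinct = sorted(set(levels))
    return [[1 if v == lvl else 0 for v in levels] for lvl in distinct[1:]]


def matrix_rank(rows):
    """Rank of a matrix via exact Gaussian elimination on fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column: it depends on earlier ones
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# 10 labs x 3 biological replicates = 30 arrays.
labs = [lab for lab in range(10) for _ in range(3)]
rna = list(labs)                               # assumption: one RNA source per lab
trend = [0 if lab < 5 else 1 for lab in labs]  # hypothetical trend assignment

columns = ([[1] * len(labs)] + dummy_columns(rna)
           + dummy_columns(labs) + dummy_columns(trend))
design = [list(row) for row in zip(*columns)]  # one row per array

n_cols = len(columns)
rank = matrix_rank(design)
print(n_cols, rank)
```

If the rank is smaller than the number of columns, as it is under these
assumptions, the RNA type coefficients cannot be separated from the lab
effect, whichever preprocessing option is chosen. Whether this applies
depends on the real layout of labs, RNA sources and trends.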
Unfortunately I can't figure out which would be the best way to
proceed, or even if modeling for the trend is something that would be
acceptable. I've seen in the vignette of the affycoretools package
that the arrays coming from different RNA protocols are preprocessed
separately and then combined for the linear model, although it is not
clear to me why this option is better than any of the others.
On the other hand, some messages to the list last week argued for
preprocessing all the experiments at once...
My understanding is that there is no clear consensus about what to
do in such cases and I don't really know the consequences and the
differences between following the different approaches, so any
comments would be very much appreciated.
Thank you very much for your help.
Best wishes,
Juanma.
Juanma Vaquerizas
PhD Student
Regulation Group
EMBL-EBI
Wellcome Trust Genome Campus
Cambridge CB10 1SD
UK