[BioC] Normalization of Agilent miRNA arrays using spike-ins

Tue Oct 21 08:44:53 CEST 2008

Hello all,

As the subject says, I am trying to use spike-ins for normalization.
I ran two Agilent miRNA microarrays, and when I plot the raw data I see a
slight shift between the two arrays: the values of Array 2 are around
1.3-fold higher than those of Array 1. Therefore I decided to normalize
the data.
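
To make the 1.3-fold figure concrete, the comparison of the raw arrays can
be sketched roughly as below. This is only an illustration on simulated
intensities; the matrix raw and its column names are placeholders and not
my actual data objects.

```r
# Minimal sketch with simulated data: 'raw' is a hypothetical
# probes x arrays matrix of raw intensities (not the real data).
set.seed(1)
n_probes <- 1000
array1 <- 2^rnorm(n_probes, mean = 8, sd = 2)
# Array 2 is simulated as about 1.3-fold brighter, plus a little noise
array2 <- 1.3 * array1 * 2^rnorm(n_probes, mean = 0, sd = 0.1)
raw <- cbind(Array1 = array1, Array2 = array2)

# Per-probe log2 ratio of Array 2 over Array 1
log_ratio <- log2(raw[, "Array2"]) - log2(raw[, "Array1"])

# The median log2 ratio gives a robust estimate of the overall shift
fold_shift <- 2^median(log_ratio)
fold_shift
```

The median is used instead of the mean so that a few strongly changed
probes do not dominate the estimate.
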
I have already read a lot about loading the data into BioC and about
preprocessing strategies. Since there is still a lot of debate about the
right preprocessing strategy, I decided to find out which one works best
for me. So far I have used limma for loading the data and for some
preprocessing, but I have also loaded the data with read.table() and then
used vsn() and other methods.
Unfortunately, after normalization with the implemented strategies
(quantile, vsn) all differentially expressed genes are gone: values that
differed by 10-fold or more in the raw data set are now almost equal.
Therefore I decided to try spike-in normalization.
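
The kind of before/after check I mean can be sketched as follows. Please
note the assumptions: this uses a hand-written base-R quantile
normalization as a stand-in for the limma routine I actually used, the
data are simulated on the log2 scale, and probe1 is a hypothetical probe
set to differ 10-fold between the arrays. Simulated data may well behave
differently from my real arrays.

```r
# Base-R stand-in for quantile normalization (ties are not treated
# specially); this is NOT the limma implementation.
quantile_normalize <- function(x) {
  ranks  <- apply(x, 2, rank, ties.method = "first")
  sorted <- apply(x, 2, sort)
  target <- rowMeans(sorted)   # mean of each quantile across arrays
  normalized <- apply(ranks, 2, function(r) target[r])
  dimnames(normalized) <- dimnames(x)
  normalized
}

# Simulated log2 intensities; all names here are placeholders
set.seed(2)
n_probes <- 500
log_raw <- cbind(Array1 = rnorm(n_probes, 8, 2),
                 Array2 = rnorm(n_probes, 8, 2) + log2(1.3))
rownames(log_raw) <- paste0("probe", seq_len(n_probes))
# Hypothetical probe set to be 10-fold different between the arrays
log_raw["probe1", ] <- c(6, 6 + log2(10))

log_norm <- quantile_normalize(log_raw)

# log2 ratio (Array2 - Array1) of that probe before and after
ratio_before <- log_raw["probe1", "Array2"] - log_raw["probe1", "Array1"]
ratio_after  <- log_norm["probe1", "Array2"] - log_norm["probe1", "Array1"]
c(before = ratio_before, after = ratio_after)
```
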

The array has several "control" spots as well as negative and positive
controls. I have read a lot about normalization with spike-ins, but I just
can't figure out how to do it.
I know that vsn2() offers this by fitting a transformation to only a small
number of probes and then normalizing the bulk data "towards" this
transformed data. However, if I extract the supposed spike-ins from my
data, store them in a new matrix, fit the transformation on these alone
and keep the result in a vsn object, I cannot then transform the remaining
genes using this vsn object.

 > fit=vsn2(mrawObj_TGS.spikeins, lts.quantile = 0.7)
vsn: 912 x 32 matrix (1 stratum).   0% done.
100% done.
Please use 'meanSdPlot' to verify the fit.
 > fit2=vsn2(mrawObj_TGS.genesubset, fit, lts.quantile = 0.7)
Error in vsnMatrix(y, reference, strata, ...) :
  'nrow(reference)' must be equal to 'nrow(x)'.

I understand that the matrix containing the transformed values
(reference) must have the same number of rows as the object I want to
transform. But there is no way for me to get to that point, since I only
have around 200 spikes but more than 12,000 genes.
I also fitted a vsn transformation on the data from Array 1 and used this
transformation for Array 2 to remove the between-array differences.
But I don't know if this is the right thing to do!

So my question is whether anybody already has experience with
normalization using spike-ins, and what you would advise me to do now.

Thanks a lot in advance!

Christian Eisen