[BioC] Normalization of Agilent miRNA arrays using spike-ins

Wed Oct 29 15:44:24 CET 2008

Dear Christian
Two points:

1.) Please have a look at the "predict" method for "vsn" objects; this
does what you want: you fit the transformation on the spike-ins only and
then apply it to all probes.

 library("vsn")
 method ? predict("vsn")

E.g.:

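data("kidney")
# random rows as a stand-in for the rows of your spike-in probes
spikeInOnly = sample(nrow(kidney), 100)

# fit the transformation on the spike-in rows only
fit = vsn2(kidney[spikeInOnly, ])
# apply the fitted transformation to all rows
nkid = predict(fit, kidney)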
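
Applied to your own data, the same idea could be sketched roughly as
below. This is only an illustration under assumptions: `x` stands for
your full raw intensity matrix (probes in rows, arrays in columns) and
`isSpikeIn` for a logical vector flagging the spike-in rows. Both names
are hypothetical, and here they are filled in from the "kidney" example
data with a random selection so that the code runs as it stands.

```r
library("vsn")
library("Biobase")   # for exprs()

data("kidney")
x <- exprs(kidney)   # hypothetical stand-in for your raw intensity matrix

# Hypothetical spike-in flag: with real data this would come from the
# probe annotation, not from a random draw.
set.seed(1)
isSpikeIn <- rep(FALSE, nrow(x))
isSpikeIn[sample(nrow(x), 200)] <- TRUE

# Estimate the vsn parameters from the spike-in rows only.
fit <- vsn2(x[isSpikeIn, ])

# Apply the fitted transformation to all probes. The result has one row
# per probe in x, so no 'nrow(reference)' mismatch arises.
xnorm <- predict(fit, newdata = x)
dim(xnorm)
```

With real arrays, the flag would presumably be built from whatever
control annotation your data files provide; which column that is
depends on your files.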

2.) Perhaps you need to face the possibility that the arrays just do not
detect any differential expression (e.g. that it is swamped out by
background signal?) - let us know!

Best wishes
 Wolfgang

------------------------------------------------------------------
Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber

21/10/2008 07:44 Christian Eisen wrote:
> Hello all,
> 
> like the topic already says, I am trying to use spike-ins for
> normalization purposes.
> I ran two Agilent miRNA microarrays and when I plot the raw data I see
> slight variation
> between the two arrays, meaning the values of Array 2 are around
> 1.3-fold higher than
> the ones from Array 1. Therefore I decided to do normalization.
> I already read a lot about loading the data into BioC and some
> preprocessing strategies.
> Since there is still a large debate about the right preprocessing
> strategy I decided
> to figure out which of them is working best for me.
> I used limma so far for loading the data and for some preprocessing, but
> I also
> loaded the data by read.table() and then used vsn() and others.
> Unfortunately after normalization using the implemented strategies
> (quantile, vsn)
> all differentially expressed genes are gone, meaning that values which
> have been
> different by 10-fold and more in the raw dataset, are now almost equal.
> Therefore I decided to do spike-in normalization.
> 
> The array has several "control" spots as well as negative and positive
> controls.
> I read a lot about normalization with spike-ins but I just can't figure
> out how to do it.
> I know that vsn2() offers this by fitting a transformation only to a
> small number of
> probes and then normalizing the bulk data "towards" this transformed data.
> However, if I extract the supposed spike-ins from my data and store it
> in a new
> matrix, doing the transformation just on these and storing the
> transformed values in
> a vsn object, I can't do the transformation of the remaining genes on
> this vsn object.
> 
>> fit=vsn2(mrawObj_TGS.spikeins, lts.quantile = 0.7)
> vsn: 912 x 32 matrix (1 stratum).   0% done.
> 100% done.
> Please use 'meanSdPlot' to verify the fit.
>> fit2=vsn2(mrawObj_TGS.genesubset, fit, lts.quantile = 0.7)
> Error in vsnMatrix(y, reference, strata, ...) :
>  'nrow(reference)' must be equal to 'nrow(x)'.
> 
> I understand that the matrix containing the transformed values
> (reference) must have the same
> number of rows as the object I want to be transformed.
> But there is just no way for me to get to that point since I only have
> around 200 spikes but more than 12,000 genes.
> I also fitted a vsn transformation on the data from Array 1 and used
> this transformation
> for the vsn transformation of Array 2 to take care of the between-array
> normalization.
> But I don't know if this is the right thing to do!
> 
> So my question is whether anybody already has experience with normalization
> using spike-ins and
> what you would advise me to do now.
> 
> Thanks a lot in advance!
> 
> Christian Eisen
>