[BioC] info su illumina [info on Illumina]

Mon Mar 6 15:26:14 CET 2006

Sean,

It seems to me that the problem with normalising via BeadStudio is that 
all of the normalisation methods they implement (cubic spline, average 
scaling, rank invariant etc.) are automatically accompanied by a 
"background normalisation", which subtracts an average of the negative 
controls. The subtraction is done on the unlogged scale, so lower 
intensity genes are affected much more than brighter ones, and we see a 
characteristic fanning effect. I don't recall seeing an explanation of 
why Illumina prefer this method. However, in the BeadStudio manual they 
do say "Applying the technique [of background normalisation] allows for 
more quantitative assessments of fold-change differences, especially 
for genes with dim signals". Note that this background normalisation is 
in addition to a local background correction which is done to obtain 
the bead level intensities.
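
As a rough illustration of why subtracting a constant on the unlogged 
scale hits dim genes hardest, here is a minimal Python sketch. The 
intensities and the background value of 100 are made-up numbers, and 
log2_ratio is a hypothetical helper, not anything BeadStudio or 
beadarray provides.

```python
import math

def log2_ratio(a, b, background=0.0):
    """log2 fold change between two raw intensities after subtracting
    a constant background (both must stay above the background)."""
    return math.log2(a - background) - math.log2(b - background)

background = 100.0  # hypothetical average of the negative controls

# Two replicate measurements that differ by 10% at each intensity level
dim = (150.0, 165.0)
bright = (10000.0, 11000.0)

for label, (x, y) in (("dim", dim), ("bright", bright)):
    before = log2_ratio(y, x)
    after = log2_ratio(y, x, background)
    print(f"{label}: log2 ratio before={before:.3f}, after={after:.3f}")
```

Before subtraction both pairs have the same log ratio. Afterwards the 
dim pair's log ratio is inflated while the bright pair's barely moves, 
which is the kind of low-intensity spread that shows up as fanning.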

Within BeadStudio it is possible to output data without any 
normalisation (this is the only way to avoid Illumina's background 
subtraction), and this output can be read into R using beadarray or 
other methods. The non-normalised data is generally of high quality, 
and we see good reproducibility between arrays. 
So far I have used quantile and qspline methods (on both bead level and 
bead summary data) and I don't think there is a need for anything more 
complicated. Of course, normalisation on bead level data is probably 
going to be preferable, but I don't know what effect this actually has 
(yet!). I haven't yet incorporated the detection scores into the 
analysis anywhere.
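
For readers unfamiliar with quantile normalisation, the idea can be 
sketched in a few lines of Python. This is only an illustration of the 
general technique on bead summary values, not the code used in 
beadarray or any Bioconductor package; quantile_normalise is a 
hypothetical name, and ties are handled naively (tied values are not 
averaged).

```python
def quantile_normalise(arrays):
    """Quantile-normalise a list of equal-length intensity lists,
    one list per array, so all arrays share the same distribution."""
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same number of probes")

    # Reference distribution: mean across arrays at each rank
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [sum(col) / len(arrays) for col in zip(*sorted_arrays)]

    normalised = []
    for a in arrays:
        # Indices of the probes in increasing order of intensity
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]  # replace value by the rank's mean
        normalised.append(out)
    return normalised
```

Each array keeps its own probe ranking but takes its values from the 
shared reference distribution, so no background subtraction is 
involved at any point.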

Regards,

Mark

On 6 Mar 2006, at 13:03, Sean Davis wrote:

>
> On 3/4/06 1:58 PM, "Mark Dunning" <md392 at cam.ac.uk> wrote:
>
>> btw do you know how the data in geneprofile.csv was normalised by
>> BeadStudio? Illumina recommend a rank invariant normalisation plus
>> background correction, but that seems like a very bad idea to me
>> looking at some of the data it produces!
>
> Mark,
>
> I'd be curious to hear what you have found to be the best way to
> normalize these arrays.  We have also seen some problems with the
> invariant set method, but haven't come up with a good alternative,
> given that a majority of the data on the array isn't even measured
> (according to the detection p-values).
>
> Sean
>
>