michael watson (IAH-C)
michael.watson at bbsrc.ac.uk
Wed Oct 8 17:31:27 MEST 2003
I assume you mean you have lots of negative values after subtracting background?
There are many options, none ideal. The problem is that microarrays really don't handle data very well where the gene is off in one channel and on in another. By definition, off is zero, and we are obsessed by calculating ratios, where a zero value really screws things up :-(
Anyway, your options are:
- set all neg. values to zero (makes most sense, but this will screw up ratios)
- set all neg. values to one, or some other nominally small figure (this won't screw up ratios but it is, after all, simply inventing numbers)
- adjust your whole distribution such that 95% of spots are > 0 (adjust by adding/subtracting the 5th percentile value from your distribution) - this is quite popular, though again, dubious in its validity
- do not subtract background - after all, no one has proved that the relationship of background to foreground intensity is an additive one, nor that it has any effect whatsoever. So if you have what appears to be a nice uniform background, both within and across your slides, then why bother subtracting background at all?
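For anyone who wants to see the first three options side by side, here is a rough Python sketch. It is only an illustration under my own assumptions, not anything from Bioconductor: the function names are made up, the percentile uses a simple nearest-rank rule, and the "set to one" variant also replaces exact zeros, since those break ratios just as badly.

```python
import math


def clamp_to_zero(values):
    """Option 1: set every negative value to zero."""
    return [max(v, 0.0) for v in values]


def floor_non_positive(values, floor=1.0):
    """Option 2: replace non-positive values with a small nominal figure.

    Assumption: zeros are replaced as well as negatives.
    """
    return [v if v > 0 else floor for v in values]


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile; one of several common definitions."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def shift_by_percentile(values, pct=5.0):
    """Option 3: shift the whole distribution so the pct-th percentile sits at zero."""
    offset = percentile_nearest_rank(values, pct)
    return [v - offset for v in values]


# Hypothetical background-subtracted intensities
spots = [-30.0, -10.0, 5.0, 20.0, 100.0]
zeroed = clamp_to_zero(spots)
floored = floor_non_positive(spots)
shifted = shift_by_percentile(spots)
```

Note that the shift in option 3 moves every spot, including the ones that were perfectly fine, which is part of why its validity is debatable.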
This is a problem that troubles me greatly too and I have yet to find a suitable answer. Personally, I set all negative values to 1 and then create a flag that basically says "don't trust the magnitude of this ratio" :-)
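A minimal sketch of that flagging idea in Python, for illustration only (I actually do this in Perl/SQL). The function name, the channel arguments and the choice of a log2 ratio are all assumptions, and zeros are treated like negatives so the ratio is always defined.

```python
import math


def floored_log_ratio(ch1, ch2, floor=1.0):
    """Return (log2 ratio, untrusted flag) for one spot.

    Non-positive intensities are replaced by `floor`, and the flag
    records that the magnitude of the ratio should not be trusted.
    """
    untrusted = ch1 <= 0 or ch2 <= 0
    a = ch1 if ch1 > 0 else floor
    b = ch2 if ch2 > 0 else floor
    return math.log2(a / b), untrusted


# Hypothetical background-subtracted intensities for two spots
ok_spot = floored_log_ratio(8.0, 2.0)      # both channels positive
bad_spot = floored_log_ratio(-5.0, 16.0)   # channel 1 went negative
```

The flag travels with the ratio, so downstream filtering can keep the direction of the change while ignoring its size.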
I do none of this in Bioconductor by the way. I use Perl and/or a SQL database and munge it before putting it in BioC.
From: Kaushik, Narendra K [mailto:n.kaushik at imperial.ac.uk]
Sent: 08 October 2003 15:23
To: 'bioconductor at stat.math.ethz.ch'
Subject: [BioC] Normalization
I have lots of negative values. What is the best way to get rid of them or
to normalize the data? I am working with Avg_Diff values.
Imperial College of Medicine,
London SW3 6NP