[BioC] replace negative values in Agilent miRNA data

Mon Oct 27 16:57:16 CET 2008

Hi Christian,

On Oct 24, 2008, at 11:33 AM, Christian Eisen wrote:

> Hello everyone,
>
> As some of you who know Agilent's miRNA array may be aware, the
> intensity files that the Agilent Feature Extraction software delivers
> contain negative values.
> These result from the background being subtracted from the spot
> intensity: if the background is higher than the spot (i.e. no gene
> detected), the spot gets a negative value.
> Needless to say, log-transformation does not work on these.
>
> I therefore came up with a workaround to make log-transformation work:
> I replaced all negative values in my data with the smallest positive
> value in my data, something like 0.00103.
> However, as you might already expect, upon log2-transformation this
> value becomes really small (almost -10).
> Still, there are non-manipulated intensity values in my data that are
> just as small.
> So my question is: is it correct to replace negative values with a
> very small positive value, or not?
> In the differential expression analysis, I get genes that are
> enriched in these manipulated values in either of the two groups I
> consider.
> My justification for doing this is that if a gene has a negative
> value, i.e. is not detected, there should be a significant difference
> between the two groups, provided the intensity in the other group is
> large enough.
>
> Am I wrong on this?

I don't have a definitive answer for this, but wanted to respond with
a suggestion in the hope that this message lands on the radar of someone
who is more experienced with this kind of data.

You might want to read through this paper, referenced in the
limmaUserGuide, on different aspects of background correction:

http://bioinformatics.oxfordjournals.org/cgi/content/full/23/20/2700

From a brief skim, it appears that they are arguing that a "smarter"
method than plain bg-subtraction is in order, one that better deals with
the variance/intensity dependence seen in typical MA data. They present a
comparison of several different bg-correction techniques (all of
which are accessible via the limma package, I believe).
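
To make the concern concrete, here is a minimal sketch in plain Python
(not limma, and not one of the methods from the paper). It compares your
floor replacement with simply adding a constant offset before taking
log2. Apart from 0.00103, the intensities are made up, and the offset of
16 is an arbitrary value chosen for illustration.

```python
import math

def floor_replace(values):
    """Replace non-positive values with the smallest positive value, then take log2."""
    floor = min(v for v in values if v > 0)
    return [math.log2(v if v > 0 else floor) for v in values]

def offset_shift(values, offset):
    """Add a constant offset to every value, then take log2."""
    shifted = [v + offset for v in values]
    if min(shifted) <= 0:
        raise ValueError("offset too small to make all values positive")
    return [math.log2(s) for s in shifted]

# Hypothetical background-subtracted intensities; only 0.00103 comes from the post.
intensities = [-3.2, -0.5, 0.00103, 12.0, 250.0]

floored = floor_replace(intensities)
shifted = offset_shift(intensities, offset=16.0)  # offset value is an assumption

print([round(x, 2) for x in floored])
print([round(x, 2) for x in shifted])
```

With the floor, every undetected spot collapses to the same log2 value
of about -9.92, far away from the detected spots, which exaggerates
differences between groups. With the offset, the same spots stay close
to log2(16) = 4, so low-intensity values are compressed rather than
stretched.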

I know that typical Agilent data has the normalized signals in the
r/gProcessedSignal columns, which you can compare against (I'm not sure
about the normalization details, though -- do those have negative
values?).

If you'd like to use the background correction methods listed in the
paper, you can load your raw data into an RGList object (using the
limma::read.maimages function) and test them out.

Perhaps someone with more experience can better list some pros and
cons they've come across when using these different techniques in
practice.

Hope that helps,
-steve

--
Steve Lianoglou
Graduate Student: Physiology, Biophysics and Systems Biology
Weill Medical College of Cornell University

http://cbio.mskcc.org/~lianos