[BioC] replace negative values in Agilent miRNA data
Steve Lianoglou
mailinglist.honeypot at gmail.com
Mon Oct 27 16:57:16 CET 2008
Hi Christian,
On Oct 24, 2008, at 11:33 AM, Christian Eisen wrote:
> Hello everyone,
>
> As some of you may know from Agilent's miRNA arrays, the intensity
> files that the Agilent Feature Extraction Software delivers contain
> negative values. These result from the background being subtracted
> from the spot intensity: if the background is higher than the spot
> (i.e. no gene detected), the result is negative.
> Needless to say, log-transformation does not work on these.
>
> So I came up with a workaround to make log-transformation possible:
> I replaced all negative values in my data with the smallest positive
> value in my data, something like 0.00103.
> However, as you might expect, upon log2-transformation this value
> becomes really small (almost -10). Still, there are unmanipulated
> intensity values in my data that are just as small.
> So my question is: is it correct to replace negative values with a
> very small positive value, or not?
> Upon analysis of differential expression, I get genes that are
> enriched for these manipulated values in either of the two groups I
> consider.
> My justification for doing this is that if a gene has a negative
> value, i.e. is not detected, in one group and a large enough
> intensity in the other, there should be a significant difference
> between the two.
>
> Am I wrong on this?
I don't have a definitive answer for this, but wanted to respond with
a suggestion in hopes that this message lands on the radar of someone
who is more experienced with dealing with this type of stuff.
You might want to read through this paper, referenced in the
limmaUserGuide, on different aspects of background correction:
http://bioinformatics.oxfordjournals.org/cgi/content/full/23/20/2700
From a brief skim, they appear to argue that something "smarter" than
plain bg-subtraction is needed, one that better handles the
variance/intensity dependence seen in typical MA data. They compare
several bg-correction techniques (all of which are accessible via the
limma package, I believe).
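To make the concern concrete, here is a toy sketch in plain Python
(not R or limma) of what the 0.00103 floor quoted above does on the
log2 scale. The intensities and the offset of 50 are made up for
illustration, and adding a constant offset is only one simple
alternative, not necessarily what any of the paper's methods do.

```python
import math

# Hypothetical background-subtracted intensities for one probe.
# Group A: not detected (negative after subtraction).
# Group B: weakly expressed.
group_a = [-3.2, -0.8, -1.5]
group_b = [12.0, 9.5, 14.1]

FLOOR = 0.00103  # smallest positive value, as quoted in the question
OFFSET = 50.0    # arbitrary illustrative offset; must exceed |most negative value|


def log2_floored(values, floor=FLOOR):
    """Replace non-positive values with `floor`, then take log2."""
    return [math.log2(v if v > 0 else floor) for v in values]


def log2_offset(values, offset=OFFSET):
    """Add a constant offset to every value, then take log2."""
    return [math.log2(v + offset) for v in values]


def mean(xs):
    return sum(xs) / len(xs)


# Difference of group means on the log2 scale (B minus A).
diff_floored = mean(log2_floored(group_b)) - mean(log2_floored(group_a))
diff_offset = mean(log2_offset(group_b)) - mean(log2_offset(group_a))

print(f"log2 of floor value:      {math.log2(FLOOR):.2f}")
print(f"log2 difference, floored: {diff_floored:.2f}")
print(f"log2 difference, offset:  {diff_offset:.2f}")
```

With the floor, every undetected spot is pinned near -9.92, so the
size of the apparent difference is set by the choice of floor rather
than by the measurements. With an offset, the same two groups end up
much closer together on the log2 scale.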
I know that typical Agilent data has the normalized signals in the
r/gProcessedSignal columns, which you can compare against (I'm not
sure about the normalization details, though -- do those have negative
values?).
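One quick way to check that would be to count the negative entries in
the column directly. The sketch below is plain Python and assumes the
Feature Extraction text export is tab-delimited, with a header row
whose first field is "FEATURES" followed by rows whose first field is
"DATA". The function name count_negatives and the sample text are
hypothetical, so adjust them to whatever your files actually contain.

```python
import csv
import io


def count_negatives(fe_text, column="gProcessedSignal"):
    """Return (negatives, total) for `column` in the FEATURES table.

    Assumed layout: a tab-delimited header row starting with "FEATURES",
    then one row per spot starting with "DATA". Rows that appear before
    the FEATURES header (parameter and statistics blocks) are skipped.
    """
    header = None
    negatives = 0
    total = 0
    for row in csv.reader(io.StringIO(fe_text), delimiter="\t"):
        if not row:
            continue
        if row[0] == "FEATURES":
            header = row
        elif row[0] == "DATA" and header is not None:
            value = float(row[header.index(column)])
            total += 1
            if value < 0:
                negatives += 1
    return negatives, total


# Hypothetical miniature file, only to show the call.
sample = (
    "FEPARAMS\tProtocol_Name\n"
    "DATA\thypothetical_protocol\n"
    "*\n"
    "FEATURES\tFeatureNum\tProbeName\tgProcessedSignal\n"
    "DATA\t1\tprobe_a\t12.4\n"
    "DATA\t2\tprobe_b\t-3.1\n"
    "DATA\t3\tprobe_c\t0.7\n"
)
print(count_negatives(sample))  # prints (1, 3)
```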
If you'd like to use the background-correction methods listed in the
paper, you can load your raw data into an RGList object (via the
limma::read.maimages function) and test them out.
Perhaps someone with more experience can better list some pros and
cons they've come across when using these different techniques in
practice.
Hope that helps,
-steve
--
Steve Lianoglou
Graduate Student: Physiology, Biophysics and Systems Biology
Weill Medical College of Cornell University
http://cbio.mskcc.org/~lianos