[BioC] M vs A plot

Naomi Altman naomi at stat.psu.edu
Tue Feb 17 15:11:01 MET 2004


Dear Richard and other participants in this discussion,

Loess uses a kernel weight to downweight the effects of more distant data 
values in the local regression.  For bandwidth greater than 1, all of the 
data values are used, but the more distant values are still 
downweighted.  As the bandwidth increases, there is less downweighting.
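
To make the weighting concrete, here is a minimal Python sketch of tricube kernel weights as a function of the span.  The function name tricube_weights and the exact neighbourhood rule are assumptions made for illustration, modelled loosely on the convention that R's loess documents for spans greater than 1 (the maximum distance is stretched by the span).  It is not the code that marray runs.

```python
def tricube_weights(xs, x0, span):
    """Tricube kernel weights for a local fit centred at x0.

    Assumed convention (one explanatory variable):
      span <= 1: the bandwidth h is the distance to the q-th nearest
                 point, with q = int(span * n);
      span  > 1: every point is used and h = span * (largest distance).
    """
    dists = [abs(x - x0) for x in xs]
    n = len(xs)
    if span <= 1:
        q = max(2, int(span * n))      # number of neighbours considered
        h = sorted(dists)[q - 1]
    else:
        h = span * max(dists)          # bandwidth stretched beyond the data
    # Points at or beyond the bandwidth get zero weight.
    return [(1 - (d / h) ** 3) ** 3 if d < h else 0.0 for d in dists]


xs = [float(i) for i in range(11)]     # 0, 1, ..., 10
for span in (0.5, 2.0, 10.0):
    w = tricube_weights(xs, 0.0, span)
    print(span, [round(v, 3) for v in w])
```

With span 0.5 the distant points drop out entirely; with span 2 every point has a positive weight but the far ones count less; with span 10 the weights are all close to 1.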

If you increase the bandwidth during the normalization process, the curve 
used for normalization gets smoother and straighter, until at very high 
bandwidth you are just doing ordinary linear regression.  The normalized 
values are the residuals from this curve.  As a result, if there is 
curvature on the MvA plots, large bandwidths lead to normalized data that 
still have curvature (since only the linear trend is removed).  Small 
bandwidths lead to normalized data that are flatter.
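
The effect can be sketched on made-up data.  The Python below is a bare-bones local linear smoother with tricube weights, written only for illustration: the names loess_fit and leftover, the neighbourhood rule, and the synthetic curved M-versus-A trend are all assumptions, and it omits the robustness iterations and per-print-tip grouping of a real print-tip loess.  It normalizes the same curved data with a small span and with a very large span, and reports the largest residual left after each.

```python
def loess_fit(xs, ys, span):
    """Local linear fit with tricube weights, evaluated at every x."""
    n = len(xs)
    fitted = []
    for x0 in xs:
        dists = [abs(x - x0) for x in xs]
        if span <= 1:
            q = max(3, int(span * n))          # neighbours in the window
            h = sorted(dists)[q - 1]
        else:
            h = span * max(dists)              # all points, mild weighting
        ws = [(1 - (d / h) ** 3) ** 3 if d < h else 0.0 for d in dists]
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        my = sum(w * y for w, y in zip(ws, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))  # weighted line, read off at x0
    return fitted


# Synthetic, noise-free MvA trend with curvature (hypothetical values).
A = [6 + 0.2 * i for i in range(41)]
M = [0.05 * (a - 10) ** 2 for a in A]


def leftover(span):
    """Largest absolute normalized value (residual) for a given span."""
    fit = loess_fit(A, M, span)
    return max(abs(m - f) for m, f in zip(M, fit))


small_span_leftover = leftover(0.3)
large_span_leftover = leftover(100.0)
print("span 0.3:", small_span_leftover)
print("span 100:", large_span_leftover)
```

With the small span the fitted curve follows the bend, so little is left in the residuals.  With the very large span the fit is essentially a straight line and the bend stays in the normalized values.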

My understanding is that Richard then visualized the curvature in the 
normalized data also using loess.  On these plots we are looking at the 
curve fitted to the normalized data.  So, if a large bandwidth is used to 
fit these curves, they should look flat.  However, if a large bandwidth 
is used for normalization and the default bandwidth is used to visualize 
the normalized data, the plotted curves will show the curvature that the 
normalization left behind.

--Naomi


At 05:03 PM 2/9/2004, Richard Friedman wrote:
>Dear Sean (Wolfgang, Naomi, and Everybody),
>
>         The original command that I used was
>
> > ira.norm <- maNorm(ira.raw, norm = "p")
>
>The command that I used with the altered span is
>
>
> > ira.f8.norm <- maNormMain(ira.raw, f.loc = list(maNormLoess(x = "maA",
>+               y = "maM", z = "maPrintTip", w = NULL, subset = TRUE,
>+               span = 0.8)),
>+               Mloc = TRUE, Mscale = TRUE, echo = FALSE)
>
>This command still gave pronounced curvature in the middle of one of the
>print-tip blocks and at the ends of several print-tip blocks.
>I did not use a span greater than 0.8 because that was contraindicated
>in either the microarray or the loess literature.
>Thank you all for your suggestion of going to vsn. However, as this
>program is new to me, I ask if anyone knows a rule of thumb as to how
>flat the print-tip loess line should be in order to be acceptable? I
>would prefer not to change horses unless necessary.
>
>Thanks and best wishes,
>Rich
>
>On Jan 30, 2004, at 2:48 PM, Sean Davis wrote:
>
>>Richard,
>>
>>The print-tip-loess lines should (I think) be straight and on the x-axis
>>(y=0) after print-tip-normalization.  If that isn't the case, perhaps you
>>could post exactly the commands you used to do your normalization.  That
>>may help people determine better what is going on.
>>
>>In reference to ridding you of intensity-dependent variability,
>>loess-normalization is designed to locally center the data but does not, in
>>itself, deal with the variability that may be intensity-dependent.  For
>>that problem, you may need to look into something like vsn or another
>>scaling method.
>>
>>Sean
>>
>>
>>On 1/30/04 2:35 PM, "Richard Friedman" <friedman at cancercenter.columbia.edu>
>>wrote:
>>
>>>Mick,
>>>
>>>Thanks for the help. What concerns me, however, is not a single
>>>point being an outlier, but the whole loess fit to all the points leading
>>>the lowess curve for a few print-tips to deviate significantly from being
>>>a straight line practically collinear with the x-axis (abscissa). The two
>>>test cases on which I learned to use marray - the apoE data that comes
>>>with spot, and the swirl data that comes with marray - both had
>>>significantly expressed genes; however, they also had flat normalized
>>>lowess curves. Significant curvature in the lowess curve leads me
>>>to be concerned that the spots associated with that region of
>>>the curve are improperly normalized.
>>>
>>>Can anyone out there give me:
>>>
>>>1. Guidelines as to how flat the lowess curve should be for the
>>>  data to be considered normalized.
>>>
>>>2. Advice as to what to do if the print-tip normalization option
>>>  in marray did not remove intensity dependence.
>>>
>>>If anyone is willing to look at the M vs A curve, I would be grateful.
>>>
>>>Thanks and best wishes,
>>>Rich
>>>
>>>
>>>
>>>On Fri, 30 Jan 2004, michael watson (IAH-C) wrote:
>>>
>>>>Richard
>>>>
>>>>The nature of any normalisation means that we will always have outliers -
>>>>those spots that deviate from all the rest.  There could be two reasons -
>>>>that spot represents a differentially expressed gene or the spot is
>>>>unreliable and comes from a "bad" spot.
>>>>
>>>>I'd take the common sense approach to these outliers:
>>>>
>>>>i) Check any replicate spots - if all replicate spots are outliers then you
>>>>have evidence that it's a differentially expressed gene.  However, if the
>>>>replicates disagree, this is evidence that the outlier comes from an
>>>>unreliable / bad measurement
>>>>
>>>>ii) Go take a look at the spot on the original image.  Does it look "good"?
>>>>
>>>>You are likely always to find outliers after normalisation.  This is, after
>>>>all, what we are looking for, isn't it?  The key is to be able to say, when
>>>>you see an outlier, if that spot is of reliable quality or not.
>>>>
>>>>Thanks
>>>>Mick
>>>>
>>>>-----Original Message-----
>>>>From: Richard Friedman [mailto:friedman at cancercenter.columbia.edu]
>>>>Sent: 29 January 2004 22:26
>>>>To: 'Bioconductor Mail List'
>>>>Cc: IRA A TABAS
>>>>Subject: [BioC] M vs A plot
>>>>
>>>>
>>>>Dear Bioconductors,
>>>>
>>>>I have normalized a series of arrays using print-tip normalization.
>>>>Whereas the systematic error in the unnormalized data was pronounced,
>>>>the systematic error on the normalized array was reduced greatly.
>>>>The M vs. A curve was flat for most of the 48 print-tips. However, for a
>>>>few print-tips, M deviates from close to zero for A>12. In one case, M
>>>>rises as high as M=1/2 at A=15. This only involves a small fraction of
>>>>the spots (it is hard to estimate what proportion).
>>>>
>>>>Does this sound serious?
>>>>
>>>>If so, what should I do about it?
>>>>
>>>>Is anyone willing to look at the JPEG file? (I did not attach it
>>>>because I don't know if I am allowed to do so.)
>>>>
>>>>Thanks and best wishes,
>>>>Rich
>>>>------------------------------------------------------------
>>>>Richard A. Friedman, PhD
>>>>Associate Research Scientist
>>>>Herbert Irving Comprehensive Cancer Center
>>>>Oncoinformatics Core
>>>>Lecturer
>>>>Department of Biomedical Informatics
>>>>Box 95, Room 130BB or P&S 1-420C
>>>>Columbia University Medical Center
>>>>630 W. 168th St.
>>>>New York, NY 10032
>>>>(212)305-6901 (5-6901) (voice)
>>>>friedman at cancercenter.columbia.edu
>>>>http://cancercenter.columbia.edu/~friedman/
>>>>
>>>>"Spring, Summer, and Winter.
>>>>Then Fall came along,
>>>>and that's the end of our song,
>>>>and the pigeons never hibernate at all".
>>>>-Rose Friedman, age 7
>>>>(These are the correct lyrics and supersede
>>>>the version previously at the end of my sig)
>>>>
>>>>_______________________________________________
>>>>Bioconductor mailing list
>>>>Bioconductor at stat.math.ethz.ch
>>>>https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
>>>
>>
>>Naomi S. Altman                                814-865-3791 (voice)
>>Associate Professor
>>Bioinformatics Consulting Center
>>Dept. of Statistics                              814-863-7114 (fax)
>>Penn State University                         814-865-1348 (Statistics)
>>University Park, PA 16802-2111


