[BioC] M vs A plot

Naomi Altman naomi at stat.psu.edu
Wed Feb 25 04:06:24 MET 2004


If the residuals are smoothed using the same bandwidth as the original 
curve, the result should be flat.  I am puzzled as well.

Smoothing twice is the same as using a larger bandwidth and a somewhat 
different kernel weight.  It should be fine.
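
The bandwidth argument can be checked on simulated data. The following is a minimal sketch in base R, not the marray code path: the data, the span values, and the helper name curve.range are all assumptions made for illustration. It uses plain loess() with degree = 1 in place of maNormLoess.

```r
# Simulated M-vs-A data with a curved, intensity-dependent bias (made-up values).
set.seed(1)
n <- 500
A <- runif(n, 6, 16)
M <- 0.05 * (A - 11)^2 - 0.4 + rnorm(n, sd = 0.3)

# Hypothetical helper: vertical range of the loess curve fitted to y against x.
# A small range means the displayed curve is nearly flat.
curve.range <- function(y, x, span) {
  diff(range(fitted(loess(y ~ x, span = span, degree = 1))))
}

# Normalize by taking residuals from a loess curve, once with a moderate
# span and once with a very large span (close to a straight-line fit).
M.small <- residuals(loess(M ~ A, span = 0.4, degree = 1))
M.large <- residuals(loess(M ~ A, span = 100, degree = 1))

# Display all three versions with the same span of 0.4.
raw.range   <- curve.range(M, A, 0.4)
small.range <- curve.range(M.small, A, 0.4)
large.range <- curve.range(M.large, A, 0.4)

print(c(raw = raw.range, small.span = small.range, large.span = large.range))
```

If the reasoning in this thread holds, the curve for M.small should be close to flat because it is displayed with the span used to normalize it, while the curve for M.large should keep most of the curvature because only the linear trend was removed.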

--Naomi

At 10:43 AM 2/17/2004, Richard Friedman wrote:
>Dear Naomi (and everybody),
>
>         Thank you for your reply.
>         Since the normalized curve is displayed by the same overall
>command that performed the normalization, it is not clear to me why you
>suggest that the displayed curve is fitted with a bandwidth different from
>the one that was input. Also, once the data are normalized, shouldn't the
>default parameters yield a flat curve?
>         In any event, I flattened the curve by two successive
>loess normalizations with default parameters.  Do you see any
>disadvantage to that procedure?
>
>Thanks and best wishes,
>Rich
>------------------------------------------------------------
>Richard A. Friedman, PhD
>Associate Research Scientist
>Herbert Irving Comprehensive Cancer Center
>Oncoinformatics Core
>Lecturer
>Department of Biomedical Informatics
>Box 95, Room 130BB or P&S 1-420C
>Columbia University Medical Center
>630 W. 168th St.
>New York, NY 10032
>(212)305-6901 (5-6901) (voice)
>friedman at cancercenter.columbia.edu
>http://cancercenter.columbia.edu/~friedman/
>
>In Memoriam, Julius Schwartz
>
>On Feb 17, 2004, at 9:11 AM, Naomi Altman wrote:
>
>>Dear Richard and other participants in this discussion,
>>
>>Loess uses a kernel weight to downweight the effects of more distant data 
>>values in the local regression.  For bandwidth greater than 1, all of the 
>>data values are used, but the more distant values are still 
>>downweighted.  As the bandwidth increases, there is less downweighting.
>>
>>If you increase the bandwidth during the normalization process, the curve 
>>used for normalization gets flatter, until at very high bandwidth you are 
>>just doing ordinary linear regression.  The normalized values are the 
>>residuals from this curve.  As a result, if there is curvature on the MvA 
>>plots, large bandwidths lead to normalized data that still have curvature 
>>(since only the linear trend is removed.)  Small bandwidths lead to 
>>normalized data that are flatter.
>>
>>My understanding is that Richard then visualized the curvature in the 
>>normalized data also using loess.  On these plots we are looking at the 
>>curve fitted to the normalized data.  So, if a large bandwidth is used to 
>>fit the curves, these curves should be flat.  However, if a large 
>>bandwidth is used for normalization, and the default bandwidth is used to 
>>visualize the normalized data, there will be excess curvature in the 
>>normalized data.
>>
>>--Naomi
>>
>>
>>At 05:03 PM 2/9/2004, Richard Friedman wrote:
>>>Dear Sean (Wolfgang, Naomi, and Everybody),
>>>
>>>         The original command that I used was
>>>
>>> > ira.norm <- maNorm(ira.raw, norm = "p")
>>>
>>>The command that I used with the altered span is
>>>
>>>
>>> > ira.f8.norm <- maNormMain(ira.raw, f.loc = list(maNormLoess(x = "maA",
>>>+               y = "maM", z = "maPrintTip", w = NULL, subset = TRUE,
>>>+               span = 0.8)),
>>>+               Mloc = TRUE, Mscale = TRUE, echo = FALSE)
>>>
>>>This command still gave pronounced curvature in the middle of one of
>>>the print-tip blocks and at the ends of several print-tip blocks.
>>>I did not use a span greater than 0.8 because that was contraindicated
>>>in either the microarray or the loess literature.
>>>Thank you all for your suggestion of going to vsn. However,
>>>as this program is new to me, I ask if anyone knows a rule of thumb as
>>>to how flat the print-tip loess line should be in order to be acceptable.
>>>I would prefer not to change horses unless necessary.
>>>
>>>Thanks and best wishes,
>>>Rich
>>>
>>>On Jan 30, 2004, at 2:48 PM, Sean Davis wrote:
>>>
>>>>Richard,
>>>>
>>>>The print-tip-loess lines should (I think) be straight and on the x-axis
>>>>(y=0) after print-tip-normalization.  If that isn't the case, perhaps you
>>>>could post exactly the commands you used to do your normalization.
>>>>That may
>>>>help people determine better what is going on.
>>>>
>>>>In reference to ridding you of intensity-dependent variability,
>>>>loess-normalization is designed to locally center the data but does not, in
>>>>itself, deal with the variability that may be intensity-dependent.
>>>>For that
>>>>problem, you may need to look into something like vsn or other scaling
>>>>method.
>>>>
>>>>Sean
>>>>
>>>>
>>>>On 1/30/04 2:35 PM, "Richard Friedman" <friedman at cancercenter.columbia.edu>
>>>>wrote:
>>>>
>>>>>Mick,
>>>>>
>>>>>Thanks for the help. What concerns  me however is not a single
>>>>>point being an outlier, but the whole loess fit to all the points leading
>>>>>the lowess curve for a few print-tips to deviate significantly from being
>>>>>a straight line practically collinear with the x-axis (abscissa). The two
>>>>>test cases on which I learned to use marray - the apoE data that comes
>>>>>with spot, and the swirl data that comes with marray, all had
>>>>>significantly expressed genes - however they also had flat normalized
>>>>>lowess curves. Significant curvature in the lowess curve leads me
>>>>>to be concerned that the spots associated with that region of
>>>>>the curve are improperly normalized.
>>>>>
>>>>>Can anyone out there give me:
>>>>>
>>>>>1. Guidelines as to how flat the lowess curve should be for the
>>>>>  data to be considered normalized.
>>>>>
>>>>>2. Advice as to what to do if the printtip normalization option
>>>>>  in marray did not remove intensity dependence.
>>>>>
>>>>>If anyone is willing to look at the M vs A curve, I would be grateful.
>>>>>
>>>>>Thanks and best wishes,
>>>>>Rich
>>>>>
>>>>>
>>>>>
>>>>>On Fri, 30 Jan 2004, michael watson (IAH-C) wrote:
>>>>>
>>>>>>Richard
>>>>>>
>>>>>>The nature of any normalisation means that we will always have outliers -
>>>>>>those spots that deviate from all the rest.  There could be two reasons -
>>>>>>that spot represents a differentially expressed gene or the spot is
>>>>>>unreliable and comes from a "bad" spot.
>>>>>>
>>>>>>I'd take the common sense approach to these outliers:
>>>>>>
>>>>>>i) Check any replicate spots - if all replicate spots are outliers then
>>>>>>you have evidence that it's a differentially expressed gene.  However, if
>>>>>>the replicates disagree, this is evidence that the outlier comes from an
>>>>>>unreliable / bad measurement.
>>>>>>
>>>>>>ii) Go take a look at the spot on the original image.  Does it look
>>>>>>"good"?
>>>>>>
>>>>>>You are likely always to find outliers after normalisation.  This is,
>>>>>>after all, what we are looking for, isn't it?  The key is to be able to
>>>>>>say, when you see an outlier, whether that spot is of reliable quality
>>>>>>or not.
>>>>>>
>>>>>>Thanks
>>>>>>Mick
>>>>>>
>>>>>>-----Original Message-----
>>>>>>From: Richard Friedman [mailto:friedman at cancercenter.columbia.edu]
>>>>>>Sent: 29 January 2004 22:26
>>>>>>To: 'Bioconductor Mail List'
>>>>>>Cc: IRA A TABAS
>>>>>>Subject: [BioC] M vs A plot
>>>>>>
>>>>>>
>>>>>>Dear Bioconductors,
>>>>>>
>>>>>>I have normalized a series of arrays using print-tip normalization.
>>>>>>Whereas the systematic error in the unnormalized data was pronounced,
>>>>>>the systematic error on the normalized arrays was greatly reduced.
>>>>>>The M vs. A curve was flat for most of the 48 print-tips. However, for a
>>>>>>few print-tips, M deviates from close to zero for A > 12. In one case,
>>>>>>M rises as high as M = 1/2 at A = 15. This involves only a small
>>>>>>fraction of the spots (it is hard to estimate what proportion).
>>>>>>
>>>>>>Does this sound serious?
>>>>>>
>>>>>>If so, what should I do about it?
>>>>>>
>>>>>>Is anyone willing to look at the JPEG file? (I did not attach it
>>>>>>because I don't know if I am allowed to do so.)
>>>>>>
>>>>>>Thanks and best wishes,
>>>>>>Rich
>>>>>>------------------------------------------------------------
>>>>>>Richard A. Friedman, PhD
>>>>>>Associate Research Scientist
>>>>>>Herbert Irving Comprehensive Cancer Center
>>>>>>Oncoinformatics Core
>>>>>>Lecturer
>>>>>>Department of Biomedical Informatics
>>>>>>Box 95, Room 130BB or P&S 1-420C
>>>>>>Columbia University Medical Center
>>>>>>630 W. 168th St.
>>>>>>New York, NY 10032
>>>>>>(212)305-6901 (5-6901) (voice)
>>>>>>friedman at cancercenter.columbia.edu
>>>>>>http://cancercenter.columbia.edu/~friedman/
>>>>>>
>>>>>>"Spring, Summer, and Winter.
>>>>>>Then Fall came along,
>>>>>>and that's the end of our song,
>>>>>>and the pigeons never hibernate at all".
>>>>>>-Rose Friedman, age 7
>>>>>>(These are the correct lyrics and supersede
>>>>>>the version previously at the end of my sig)
>>>>>>
>>>>>>_______________________________________________
>>>>>>Bioconductor mailing list
>>>>>>Bioconductor at stat.math.ethz.ch
>>>>>>https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
>>>>>
>>>>
>>>>Naomi S. Altman                                814-865-3791 (voice)
>>>>Associate Professor
>>>>Bioinformatics Consulting Center
>>>>Dept. of Statistics                              814-863-7114 (fax)
>>>>Penn State University                         814-865-1348 (Statistics)
>>>>University Park, PA 16802-2111
>>
>
>_______________________________________________
>Bioconductor mailing list
>Bioconductor at stat.math.ethz.ch
>https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor

Naomi S. Altman                                814-865-3791 (voice)
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics                              814-863-7114 (fax)
Penn State University                         814-865-1348 (Statistics)
University Park, PA 16802-2111


