[BioC] M vs A plot
Naomi Altman
naomi at stat.psu.edu
Wed Feb 25 04:06:24 MET 2004
If the residuals are smoothed using the same bandwidth as the original
curve, the result should be flat. I am puzzled as well.
Smoothing twice is the same as using a larger bandwidth and a somewhat
different kernel weight. It should be fine.
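
A minimal sketch in R of the bandwidth effect discussed in this thread, run on simulated data rather than on Richard's arrays. It uses base R loess() with degree = 1 as a stand-in for the marray normalization. The simulated A and M values, the helper curve_range(), and span = 50 as a "very large" bandwidth are all assumptions made for illustration.

```r
# Simulated M-vs-A data with curvature (illustrative only, not real array data).
set.seed(1)
n <- 2000
A <- runif(n, min = 6, max = 16)
M <- 0.03 * (A - 11)^2 - 0.25 + rnorm(n, sd = 0.15)

# Hypothetical helper: smooth y against x with loess and report how far the
# fitted curve moves vertically (0 would be a perfectly flat line).
curve_range <- function(y, x, span) {
  fit <- loess(y ~ x, span = span, degree = 1)
  diff(range(fitted(fit)))
}

# "Normalize" by taking residuals from a loess curve at two bandwidths.
norm_default <- residuals(loess(M ~ A, span = 0.4, degree = 1))
norm_wide    <- residuals(loess(M ~ A, span = 50,  degree = 1))

# Re-smooth the residuals to visualize any remaining trend.
same_default <- curve_range(norm_default, A, span = 0.4)  # same span both times
same_wide    <- curve_range(norm_wide,    A, span = 50)   # same span both times
mixed        <- curve_range(norm_wide,    A, span = 0.4)  # wide fit, narrow display

print(c(same_default = same_default, same_wide = same_wide, mixed = mixed))
```

In this setup the two "same span" values should come out close to zero, while the mixed case (large bandwidth for normalization, smaller bandwidth for display) should show clearly more leftover curvature.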
--Naomi
At 10:43 AM 2/17/2004, Richard Friedman wrote:
>Dear Naomi (and everybody),
>
>Thank you for your reply.
>Since the normalized curve is displayed by the same overall command that
>performed the normalization, it is not clear to me why you suggest that the
>displayed curve is fitted with a different bandwidth than the one that was
>input. Also, once the data are normalized, shouldn't the default parameters
>yield a flat curve?
>In any event, I flattened the curve by two successive loess normalizations
>with default parameters. Do you see any disadvantage to that procedure?
>
>Thanks and best wishes,
>Rich
>------------------------------------------------------------
>Richard A. Friedman, PhD
>Associate Research Scientist
>Herbert Irving Comprehensive Cancer Center
>Oncoinformatics Core
>Lecturer
>Department of Biomedical Informatics
>Box 95, Room 130BB or P&S 1-420C
>Columbia University Medical Center
>630 W. 168th St.
>New York, NY 10032
>(212)305-6901 (5-6901) (voice)
>friedman at cancercenter.columbia.edu
>http://cancercenter.columbia.edu/~friedman/
>
>In Memoriam, Julius Schwartz
>
>On Feb 17, 2004, at 9:11 AM, Naomi Altman wrote:
>
>>Dear Richard and other participants in this discussion,
>>
>>Loess uses a kernel weight to downweight the effects of more distant data
>>values in the local regression. For bandwidth greater than 1, all of the
>>data values are used, but the more distant values are still
>>downweighted. As the bandwidth increases, there is less downweighting.
>>
>>If you increase the bandwidth during the normalization process, the curve
>>used for normalization gets flatter, until at very high bandwidth you are
>>just doing ordinary linear regression. The normalized values are the
>>residuals from this curve. As a result, if there is curvature on the MvA
>>plots, large bandwidths lead to normalized data that still have curvature
>>(since only the linear trend is removed). Small bandwidths lead to
>>normalized data that are flatter.
>>
>>My understanding is that Richard then visualized the curvature in the
>>normalized data also using loess. On these plots we are looking at the
>>curve fitted to the normalized data. So, if a large bandwidth is used to
>>fit the curves, these curves should be flat. However, if a large
>>bandwidth is used for normalization, and the default bandwidth is used to
>>visualize the normalized data, there will be excess curvature in the
>>normalized data.
>>
>>--Naomi
>>
>>
>>At 05:03 PM 2/9/2004, Richard Friedman wrote:
>>>Dear Sean (Wolfgang, Naomi, and Everybody),
>>>
>>> The original command that I used was
>>>
>>> > ira.norm <- maNorm(ira.raw, norm = "p")
>>>
>>>The command that I used with the altered span is
>>>
>>>
>>> > ira.f8.norm <- maNormMain(ira.raw,
>>>+     f.loc = list(maNormLoess(x = "maA", y = "maM", z = "maPrintTip",
>>>+                              w = NULL, subset = TRUE, span = 0.8)),
>>>+     Mloc = TRUE, Mscale = TRUE, echo = FALSE)
>>>
>>>This command still gave pronounced curvature in the middle of one of the
>>>print-tip blocks and at the ends of several print-tip blocks.
>>>I did not use a span greater than 0.8 because that was contraindicated in
>>>either the microarray or the loess literature.
>>>Thank you all for your suggestion of going to vsn. However, as this
>>>program is new to me, I ask whether anyone knows a rule of thumb for how
>>>flat the print-tip loess line should be in order to be acceptable. I would
>>>prefer not to change horses unless necessary.
>>>
>>>Thanks and best wishes,
>>>Rich
>>>
>>>On Jan 30, 2004, at 2:48 PM, Sean Davis wrote:
>>>
>>>>Richard,
>>>>
>>>>The print-tip-loess lines should (I think) be straight and on the x-axis
>>>>(y=0) after print-tip normalization. If that isn't the case, perhaps you
>>>>could post exactly the commands you used to do your normalization. That
>>>>may help people determine better what is going on.
>>>>
>>>>In reference to ridding you of intensity-dependent variability,
>>>>loess normalization is designed to locally center the data but does not,
>>>>in itself, deal with the variability that may be intensity-dependent. For
>>>>that problem, you may need to look into something like vsn or another
>>>>scaling method.
>>>>
>>>>Sean
>>>>
>>>>
>>>>On 1/30/04 2:35 PM, "Richard Friedman" <friedman at cancercenter.columbia.edu>
>>>>wrote:
>>>>
>>>>>Mick,
>>>>>
>>>>>Thanks for the help. What concerns me, however, is not a single point
>>>>>being an outlier, but the loess fit to all the points leading the lowess
>>>>>curve for a few print-tips to deviate significantly from a straight line
>>>>>practically collinear with the x-axis (abscissa). The two test cases on
>>>>>which I learned to use marray, the apoE data that comes with spot and the
>>>>>swirl data that comes with marray, both had significantly expressed
>>>>>genes; however, they also had flat normalized lowess curves. Significant
>>>>>curvature in the lowess curve leads me to be concerned that the spots
>>>>>associated with that region of the curve are improperly normalized.
>>>>>
>>>>>Can anyone out there give me:
>>>>>
>>>>>1. Guidelines as to how flat the lowess curve should be for the
>>>>> data to be considered normalized.
>>>>>
>>>>>2. Advice as to what to do if the printtip normalization option
>>>>> in marray did not remove intensity dependence.
>>>>>
>>>>>If anyone is willing to look at the M vs A curve, I would be grateful.
>>>>>
>>>>>Thanks and best wishes,
>>>>>Rich
>>>>>
>>>>>
>>>>>
>>>>>On Fri, 30 Jan 2004, michael watson (IAH-C) wrote:
>>>>>
>>>>>>Richard
>>>>>>
>>>>>>The nature of any normalisation means that we will always have outliers -
>>>>>>those spots that deviate from all the rest. There could be two reasons -
>>>>>>that spot represents a differentially expressed gene or the spot is
>>>>>>unreliable and comes from a "bad" spot.
>>>>>>
>>>>>>I'd take the common sense approach to these outliers:
>>>>>>
>>>>>>i) Check any replicate spots - if all replicate spots are outliers then
>>>>>>you have evidence that it's a differentially expressed gene. However, if
>>>>>>the replicates disagree, this is evidence that the outlier comes from an
>>>>>>unreliable / bad measurement.
>>>>>>
>>>>>>ii) Go take a look at the spot on the original image. Does it look "good"?
>>>>>>
>>>>>>You are likely always to find outliers after normalisation. This is, after
>>>>>>all, what we are looking for, isn't it? The key is to be able to say, when
>>>>>>you see an outlier, if that spot is of reliable quality or not.
>>>>>>
>>>>>>Thanks
>>>>>>Mick
>>>>>>
>>>>>>-----Original Message-----
>>>>>>From: Richard Friedman [mailto:friedman at cancercenter.columbia.edu]
>>>>>>Sent: 29 January 2004 22:26
>>>>>>To: 'Bioconductor Mail List'
>>>>>>Cc: IRA A TABAS
>>>>>>Subject: [BioC] M vs A plot
>>>>>>
>>>>>>
>>>>>>Dear Bioconductors,
>>>>>>
>>>>>>I have normalized a series of arrays using print-tip normalization.
>>>>>>Whereas the systematic error in the unnormalized data was pronounced,
>>>>>>the systematic error on the normalized arrays was reduced greatly.
>>>>>>The M vs. A curve was flat for most of the 48 print-tips. However, for a
>>>>>>few print-tips, M deviates from close to zero for A > 12. In one case, M
>>>>>>rises as high as M = 1/2 at A = 15. This involves only a small fraction
>>>>>>of the spots (it is hard to estimate what proportion).
>>>>>>
>>>>>>Does this sound serious?
>>>>>>
>>>>>>If so, what should I do about it?
>>>>>>
>>>>>>Is anyone willing to look at the JPEG file? (I did not attach it because
>>>>>>I don't know if I am allowed to do so.)
>>>>>>
>>>>>>Thanks and best wishes,
>>>>>>Rich
>>>>>>
>>>>>>"Spring, Summer, and Winter.
>>>>>>Then Fall came along,
>>>>>>and that's the end of our song,
>>>>>>and the pigeons never hibernate at all".
>>>>>>-Rose Friedman, age 7
>>>>>>(These are the correct lyrics and supersede
>>>>>>the version previously at the end of my sig)
>>>>>>
>>>>>>_______________________________________________
>>>>>>Bioconductor mailing list
>>>>>>Bioconductor at stat.math.ethz.ch
>>>>>>https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
>>>>>
>>>>
>>>>Naomi S. Altman 814-865-3791 (voice)
>>>>Associate Professor
>>>>Bioinformatics Consulting Center
>>>>Dept. of Statistics 814-863-7114 (fax)
>>>>Penn State University 814-865-1348 (Statistics)
>>>>University Park, PA 16802-2111
>>
>