[BioC] Issue with limma and normalization of Agilent data generated with a 20-bit scan
White, Peter
Peter.White at nationwidechildrens.org
Tue Mar 16 03:26:01 CET 2010
One more PNG showing the difference between using span values of 0.1 or 0.01.
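
To illustrate why the span can matter for a sparse high-intensity tail, here is a minimal sketch on simulated data. It does not use the Agilent arrays from this thread: the bend beyond A = 16, the spot counts and the noise level are all made-up assumptions, and it uses base R lowess() rather than limma. It compares how much trend is left in the tail after fitting with spans of 0.3, 0.1 and 0.01.

```r
set.seed(42)
n_body <- 5000   # spots in the dense part of the array (assumed)
n_tail <- 100    # sparse high-intensity spots (assumed)
A <- c(runif(n_body, 6, 16), runif(n_tail, 16, 19))

# Hypothetical intensity-dependent bias: flat up to A = 16, then bending down
trend <- ifelse(A > 16, -0.8 * (A - 16), 0)
M <- trend + rnorm(length(A), sd = 0.2)

# Mean residual M for spots with A > 18 after subtracting a lowess curve
tail_bias <- function(span) {
  fit <- lowess(A, M, f = span, iter = 4)
  fitted_M <- approx(fit$x, fit$y, xout = A, ties = mean)$y
  mean((M - fitted_M)[A > 18])
}

spans <- c(0.3, 0.1, 0.01)
bias <- sapply(spans, tail_bias)
names(bias) <- spans
print(round(bias, 2))
```

A smaller span uses fewer neighbouring spots per local fit, so the curve can follow a tail that holds only a small fraction of the spots, at the cost of a noisier fit elsewhere. In limma the corresponding argument is span in normalizeWithinArrays().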
> -----Original Message----
> From: michael watson (IAH-C) [mailto:michael.watson at bbsrc.ac.uk]
> Sent: Monday, March 15, 2010 6:30 PM
> To: Wolfgang Huber; White, Peter
> Cc: 'Gordon K Smyth'; 'Bioconductor mailing list'
> Subject: RE: [BioC] Issue with limma and normalization of Agilent data
> generated with a 20-bit scan
>
> I think what Wolfgang is saying is that the data is so affected by
> technical bias at the tail that even if you could get loess
> normalisation to get that tail straight, you might not want to believe
> anything that comes from there as the data is unreliable.
>
>
> I have no idea why the ready-built functions don't touch your tail, but
> loess normalisation isn't *that* complicated a procedure -
> you should be able to fit a model using the loess() function and do the
> normalisation yourself.
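
A minimal sketch of what doing the normalisation by hand could look like, on simulated intensities. R, G and the shape of the dye bias are invented for illustration, and the span, degree and family settings are just one reasonable choice, not necessarily what limma uses.

```r
set.seed(1)
n <- 2000
A_true <- runif(n, 6, 19)                # average log2 intensity (simulated)
dye_bias <- 0.015 * (A_true - 12)^2      # hypothetical intensity-dependent bias
M_obs <- dye_bias + rnorm(n, sd = 0.2)   # log-ratio with no true differential expression

# Back-transform to red and green intensities, as they would be read from a file
R <- 2^(A_true + M_obs / 2)
G <- 2^(A_true - M_obs / 2)

# M and A values as shown in an MvA plot
M <- log2(R) - log2(G)
A <- (log2(R) + log2(G)) / 2

# Fit M as a smooth function of A, then subtract the fitted curve
fit <- loess(M ~ A, span = 0.3, degree = 1, family = "symmetric")
M_norm <- M - predict(fit)

print(c(before = sd(M), after = sd(M_norm)))
```

Plotting M_norm against A should then show whether the fitted curve has followed the tail or not.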
> ________________________________________
> From: bioconductor-bounces at stat.math.ethz.ch [bioconductor-
> bounces at stat.math.ethz.ch] On Behalf Of Wolfgang Huber [whuber at embl.de]
> Sent: 15 March 2010 22:03
> To: White, Peter
> Cc: 'Gordon K Smyth'; 'Bioconductor mailing list'
> Subject: Re: [BioC] Issue with limma and normalization of Agilent data
> generated with a 20-bit scan
>
> Dear Peter
>
> what is the "saturation point"?
>
> Non-linear response / saturation may occur even well below the nominal
> maximal value (2^20-1) of the detector, and perhaps this need not even
> be related to the detector, but rather to other steps in the process.
> How else do you explain the shape of the data before normalisation?
> (Try also looking at the data in the normal scatterplot.)
>
> Best wishes
> Wolfgang
>
>
> On Mar 15, 2010, at 10:47 PM, White, Peter wrote:
>
> Hi Wolfgang,
>
> So with the new scanner from Agilent this data is not saturated. The
> scanner went from 16-bit (0-65,535) to 20-bit (0-1,048,575). All of
> these values are well below the new saturation point, yet they are not
> being normalized.
>
> Thanks,
>
> Peter
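
For reference, the A = 16 boundary mentioned in this thread is exactly where the old 16-bit range ends, since A is the mean of the two log2 intensities. A small sketch of the arithmetic follows (A_value is a helper name made up here, not a limma function).

```r
# A is the average log2 intensity of the two channels
A_value <- function(R, G) (log2(R) + log2(G)) / 2

A_16bit <- A_value(2^16, 2^16)   # both channels at the 16-bit ceiling
A_20bit <- A_value(2^20, 2^20)   # both channels at the 20-bit ceiling

# Intensity range quoted in the original post, taken as equal in both channels
A_range <- A_value(c(65500, 475100), c(65500, 475100))

print(c(A_16bit, A_20bit))
print(A_range)
```

Spots with A > 16 are therefore those whose geometric-mean intensity exceeds 65536, which could not occur in a 16-bit scan.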
>
> > -----Original Message-----
> > From: Wolfgang Huber [mailto:whuber at embl.de]
> > Sent: Monday, March 15, 2010 5:25 PM
> > To: White, Peter
> > Cc: 'Gordon K Smyth'; 'Bioconductor mailing list'
> > Subject: Re: [BioC] Issue with limma and normalization of Agilent data
> > generated with a 20-bit scan
> >
> >
> > Dear Peter
> >
> > have you tried with different (i.e. smaller) values of the "span"
> > parameter for the loess fit?
> >
> > The data seem badly saturated... I'd prefer avoiding the kind of
> > saturation seen in the data you posted through better scanner
> > settings, rather than doing post hoc loess normalisation.
> >
> > Best wishes
> > Wolfgang
> >
> >
> > White, Peter wrote on 15/03/10 15:53:
> >> Dear Gordon,
> >>
> >> The plots are visible in the blog view on gmane.org:
> >>
> >>
> >> http://permalink.gmane.org/gmane.science.biology.informatics.conductor/27731
> >>
> >> I thought you might be on to something with the weights, but I tried it
> >> with and without a flag function (I also double-checked the Agilent file,
> >> and the high-intensity spots are not flagged). It really does look as if
> >> the loess is just not fitted for elements with an A value > 16. These
> >> 20-bit scans from Agilent are quite new, and I suspect most folks will
> >> just use the Agilent-normalized data rather than starting with the raw
> >> data, so maybe this just hasn't been observed before now?
> >>
> >> Thanks,
> >>
> >> Peter
> >>
> >> Below is the code I used:
> >>
> >> library(limma)
> >> agilentFiles <- list.files(pattern = "U")
> >> rawObj <- read.maimages(agilentFiles,
> >>     columns = list(G = "gMedianSignal", Gb = "gBGMedianSignal",
> >>                    R = "rMedianSignal", Rb = "rBGMedianSignal"),
> >>     annotation = c("ProbeName", "SystematicName", "ControlType"))
> >> # Remove spike-in controls and drop the background signals
> >> bgObj <- rawObj
> >> posControls <- which(rawObj$genes$ControlType == 1)
> >> bgObj$G[posControls, ] <- NA
> >> bgObj$R[posControls, ] <- NA
> >> bgObj$Gb <- bgObj$Rb <- NULL
> >> # Loess normalize
> >> normObj <- normalizeWithinArrays(bgObj, method = "loess", weights = NULL)
> >> # Plot MvA before and after normalization, one PNG per array
> >> for (i in seq_len(ncol(normObj))) {
> >>     figureName <- paste(i, "MvA Plots")
> >>     mat <- matrix(c(3, 1, 2), nrow = 3, ncol = 1)
> >>     layout(mat, heights = c(1, 10, 10))
> >>     plotMA(rawObj, array = i, main = "Pre-Normalization MvA",
> >>            ylim = c(-3.5, 3.5), zero.weights = TRUE)
> >>     abline(0, 0)
> >>     plotMA(normObj, array = i, main = "Normalized MvA",
> >>            ylim = c(-3.5, 3.5), zero.weights = TRUE)
> >>     abline(0, 0)
> >>     layout(1)
> >>     mtext(figureName, cex = 1.25, line = 3)
> >>     savePlot(filename = figureName, type = "png", device = dev.cur())
> >> }
> >>
> >>> sessionInfo()
> >> R version 2.10.1 (2009-12-14)
> >> i386-pc-mingw32
> >>
> >> locale:
> >> [1] LC_COLLATE=English_United States.1252
> >> [2] LC_CTYPE=English_United States.1252
> >> [3] LC_MONETARY=English_United States.1252
> >> [4] LC_NUMERIC=C
> >> [5] LC_TIME=English_United States.1252
> >>
> >> attached base packages:
> >> [1] grDevices datasets splines graphics stats tcltk utils
> >> [8] methods base
> >>
> >> other attached packages:
> >> [1] limma_3.2.2 svSocket_0.9-48 TinnR_1.0.3 R2HTML_1.59-1
> >> [5] Hmisc_3.7-0 survival_2.35-9
> >>
> >> loaded via a namespace (and not attached):
> >> [1] cluster_1.12.1 grid_2.10.1 lattice_0.18-3 svMisc_0.9-56 tools_2.10.1
> >>
> >>> -----Original Message-----
> >>> From: Gordon K Smyth [mailto:smyth at wehi.EDU.AU]
> >>> Sent: Saturday, March 13, 2010 6:39 PM
> >>> To: White, Peter
> >>> Cc: Bioconductor mailing list
> >>> Subject: [BioC] Issue with limma and normalization of Agilent data
> >>> generated with a 20-bit scan
> >>>
> >>> Dear Peter,
> >>>
> >>> You can't send attachments to the Bioconductor mailing list, so I have
> >>> not seen your plots. However, I am not aware of any issue such as you
> >>> describe. The limma function normalizeWithinArrays includes all spots in
> >>> the normalization, regardless of how large the A-value is. You haven't
> >>> shown us any code, or any problem we can reproduce, so we can't tell
> >>> whether or not you're doing something wrong. We don't know whether you're
> >>> using probe weights, whether you've filtered control spots, etc.
> >>>
> >>> Best wishes
> >>> Gordon
> >>>
> >>>> Date: Fri, 12 Mar 2010 10:21:41 -0500
> >>>> From: "White, Peter" <Peter.White at nationwidechildrens.org>
> >>>> To: "'bioconductor at stat.math.ethz.ch'"
> >>>> <bioconductor at stat.math.ethz.ch>
> >>>> Subject: [BioC] Issue with limma and normalization of Agilent data
> >>>> generated with a 20-bit scan
> >>>> Content-Type: text/plain
> >>>>
> >>>> I have noticed an issue with the limma normalizeWithinArrays function
> >>>> (and also with marray and maNorm). When normalizing two-color data
> >>>> generated with an Agilent 20-bit scanner, it fails to normalize the
> >>>> high-intensity data (i.e. any points with an A value > 16). In our
> >>>> dataset we have in excess of 400 elements with red and green
> >>>> intensities ranging from 65500 to 475100. When we loess normalize the
> >>>> data, any points beyond A=16 appear to be untouched by the
> >>>> normalization. If the attached figures come through this should be
> >>>> clear: when using maNorm and maPlot, the loess line is plotted and you
> >>>> can see it stop at 16.
> >>>>
> >>>> Is it possible for loess normalization to be extended to this
> >>>> higher-intensity data? Or am I just doing something wrong?
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Peter
> >>>>
> >>>>
> >>>> Peter White, Ph.D.
> >>>> Director, Biomedical Genomics Core <http://genomics.nchresearch.org/>
> >>>> Research Assistant Professor of Pediatrics
> >>>> The Research Institute at
> >>>> Nationwide Children's Hospital and
> >>>> The Ohio State University
> >>>>
> >>>> Mailing Address:
> >>>>
> >>>> The Research Institute at
> >>>> Nationwide Children's Hospital
> >>>> 700 Children's Drive, W510
> >>>> Columbus, OH 43205
> >>>>
> >>>> Assistant (Jennifer Neelans): (614) 722-2915
> >>>> Office: (614) 355-2671
> >>>> Lab: (614) 355-5252
> >>>> Fax: (614) 722-2818
> >>>> Web: http://genomics.nchresearch.org/
> >> _______________________________________________
> >> Bioconductor mailing list
> >> Bioconductor at stat.math.ethz.ch
> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >> Search the archives:
> > http://news.gmane.org/gmane.science.biology.informatics.conductor
> >
> >
> > --
> >
> > Best wishes
> > Wolfgang
> >
> >
> > --
> > Wolfgang Huber
> > EMBL
> > http://www.embl.de/research/units/genome_biology/huber/contact
> >
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Agilent Span 0.1 vs 0.01 MvA Plots.png
Type: image/png
Size: 9854 bytes
Desc: Agilent Span 0.1 vs 0.01 MvA Plots.png
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20100315/2f849782/attachment.png>