[BioC] Re: Limma: four suggestions related to weights and
background correction
Gordon K Smyth
smyth at wehi.EDU.AU
Sat Jan 24 00:39:38 MET 2004
Dear Ramon,
Thanks for your constructive suggestions. I leave today for holiday and
so won't be able to reply substantially for a while. Quick response to 2.
below.
> Dear Gordon and other bioconductor-users,
>
> We have encountered a few recurrent problem when using limma. If I may,
> I'd like to make four suggestions for changes which might benefit other
> people:
>
>
> 1. When "backgroundCorrect(temp1, method = "none")" is used columns for
> Rb and Gb are eliminated from the RGList object. We have found this
> problematic, because, even if one does not want to do background
> subtraction, one might still want to look at image plots of background,
> just to check nothing funny has happened with the background. The
> problem is that the subsequent step of using "normalizeWithinArrays"
> calls MA.RG, which itself uses the presence or absence of the Rb and Gb
> to do (or not) the subtraction. Then, if we still want to plot
> background and do the normalization, we either have to make sure we are
> done with all background business before we do the normalization (since
> after that all background info will be lost) or we need to keep the
> normalized and the unnormalized objects around. None of these seem
> ideal. We wonder if it would be possible to pass the type of
> subtraction to "normalizeWithinArrays" and to "MA.RG" as a parameter,
> so that the original object preservers all the information, but still
> we can do no background subtraction. (We are hacking the code here, for
> our internal use).
>
>
> 2. We have found it slightly confussing that the M value of a point with
> weight 0 for the normalization is not an NA. Sure, the points with
> weight = 0 do not get used to fit the loess curve, but they still have
> a residual. But, so far, with the wet lab people we have worked, it has
> always been the case that if a point was deemed unsuitable for the
> normalization it was also unsuitable for further analyses (though,
> occasionally, points suitable for fitting the normalization curve have
> been deemed unsuitable for further analyses).
I think the current behavior is correct. I often give spike-in ratio
controls zero weight in the normalization step, but they're definitely not
missing! There are many reasons why zero weights are not the same as
missing.
> Are we missing something obvious here?
>
> 3. Occasionally, a complet print-tip group might be missing which will
> make "normalizeWithinArrays" to fail. We have added the following to
> that function, in the print-tip loess case, right before the line
> "spots <- spots + nspots" (and substituting the direct call to
> loessFit):
>
> ******************************************
> num.fill <- length(object$M[spots, j])
> object$M[spots, j] <- rep(NA, num.fill)
> no.objects <-
> try(object$M[spots, j]
> <- loessFit(y, x, w, span = span,
> iterations = iterations)$residuals)
> if(class(no.objects) == "try-error")
> print(paste("WARNING: in print-tip loess normalization",
> "no samples in array", j, "grid row", gridr,
> "grid column", gridc))
> ********************************************
>
>
> 4. When all weights are either 0 or 1, there are slight differences in
> the results of the normalization (of points with weight = 1) depending
> on whether or not the cases with weight of 0 are passed to subsequent
> calls. Sure, these differences are irrelevant, but it can be
> disconcerting. Wouldn't it be appropriate to check for the binary
> situation of weights either 0 or 1, and if so, exclude points with
> weight 0, which would allow us to call directly ".C("lowess"", instead
> of ".vsimpleLoess" (inside "loessFit")?
The problem isn't so much with limma, as with the different interpolation
methods used by lowess and loess. The change that you suggest would make
0, 1 identical to no weights which is nice, but would then make weight
1e-6 discontinuous from weight 0 which is undesirable. So there is no
perfect solution.
Gordon
> ***************
> The following is an example.
> (The data sets are at:
> http://bioinfo.cnio.es/~rdiaz/00G66.txt)
>
>
> temp01 <- read.maimages(files = "00G66.txt",
> source = "genepix",
> wt.fun=wtflags(0))
>
> temp01.b <- temp01
> set.to.na <- which(temp01.b$weights == 0)
> temp01.b$R[set.to.na] <- NA
> temp01.b$G[set.to.na] <- NA
>
> temp01.c <- temp01.b
> temp01.c$weights <- NULL
>
> n1c <- normalizeWithinArrays(temp01.c, layout = temp.layout,
> iterations = 5,
> method = "loess")
> n1b <- normalizeWithinArrays(temp01.b, layout = temp.layout,
> iterations = 5,
> method = "loess")
> n1 <- normalizeWithinArrays(temp01, layout = temp.layout,
> iterations = 5,
> method = "loess")
>
> pairs(cbind(n1$M, n1b$M, n1c$M))
> summary(n1$M - n1b$M)
>
>
> ******************
>
> Best,
>
> R.
>
> --
> Ramón Díaz-Uriarte
> Bioinformatics Unit
> Centro Nacional de Investigaciones Oncológicas (CNIO)
> (Spanish National Cancer Center)
> Melchor Fernández Almagro, 3
> 28029 Madrid (Spain)
> Fax: +-34-91-224-6972
> Phone: +-34-91-224-6900
>
> http://bioinfo.cnio.es/~rdiaz
> PGP KeyID: 0xE89B3462
> (http://bioinfo.cnio.es/~rdiaz/0xE89B3462.asc)
More information about the Bioconductor
mailing list