[BioC] Re: Limma: four suggestions related to weights and background correction

Sat Jan 24 00:39:38 MET 2004

Dear Ramon,

Thanks for your constructive suggestions.  I leave today for holiday and
so won't be able to reply substantially for a while.  Quick response to 2.
below.

> Dear Gordon and other bioconductor-users,
>
> We have encountered a few recurrent problem when using limma. If I may,
> I'd  like to make four suggestions for changes which might benefit other
> people:
>
>
> 1. When "backgroundCorrect(temp1, method = "none")" is used columns for
> Rb and  Gb are eliminated from the RGList object. We have found this
> problematic,  because, even if one does not want to do background
> subtraction, one might  still want to look at image plots of background,
> just to check nothing funny  has happened with the background. The
> problem is that the subsequent step of  using "normalizeWithinArrays"
> calls MA.RG, which itself uses the presence or  absence of the Rb and Gb
> to do (or not) the subtraction. Then, if we still  want to plot
> background and do the normalization, we either have to make sure  we are
> done with all background business before we do the normalization  (since
> after that all background info will be lost) or we need to keep the
> normalized and the unnormalized objects around. None of these seem
> ideal. We  wonder if it would be possible to pass the type of
> subtraction to  "normalizeWithinArrays" and to "MA.RG" as a parameter,
> so that the original  object preservers all the information, but still
> we can do no background  subtraction. (We are hacking the code here, for
> our internal use).
>
>
> 2. We have found it slightly confussing that the M value of a point with
>  weight 0 for the normalization is not an NA. Sure, the points with
> weight = 0  do not get used to fit the loess curve, but they still have
> a residual. But,  so far, with the wet lab people we have worked, it has
> always been the case  that if a point was deemed unsuitable for the
> normalization it was also  unsuitable for further analyses (though,
> occasionally, points suitable for  fitting the normalization curve have
> been deemed unsuitable for further  analyses).

I think the current behavior is correct.  I often give spike-in ratio
controls zero weight in the normalization step, but they're definitely not
missing!  There are many reasons why zero weights are not the same as
missing.

> Are we missing something obvious here?
>
> 3. Occasionally, a complet print-tip group might be missing which will
> make  "normalizeWithinArrays" to fail. We have added the following to
> that  function, in the print-tip loess case, right before the line
> "spots <- spots  + nspots" (and substituting the direct call to
> loessFit):
>
> ******************************************
> num.fill <- length(object$M[spots, j])
> object$M[spots, j] <- rep(NA, num.fill)
> no.objects <-
>     try(object$M[spots, j]
>         <- loessFit(y, x, w, span = span,
>                     iterations = iterations)$residuals)
> if(class(no.objects) == "try-error")
>     print(paste("WARNING: in print-tip loess normalization",
>                 "no samples in array", j, "grid row", gridr,
>                 "grid column", gridc))
> ********************************************
>
>
> 4. When all weights are either 0 or 1, there are slight differences in
> the  results of the normalization (of points with weight = 1) depending
> on whether  or not the cases with weight of 0 are passed to subsequent
> calls. Sure, these  differences are irrelevant, but it can be
> disconcerting. Wouldn't it be  appropriate to check for the binary
> situation of weights either 0 or 1, and  if so, exclude points with
> weight 0, which would allow us to call directly  ".C("lowess"", instead
> of ".vsimpleLoess" (inside "loessFit")?

The problem isn't so much with limma, as with the different interpolation
methods used by lowess and loess.  The change that you suggest would make
0, 1 identical to no weights which is nice, but would then make weight
1e-6 discontinuous from weight 0 which is undesirable.  So there is no
perfect solution.

Gordon

> ***************
> The following is an example.
> (The data sets are at:
> http://bioinfo.cnio.es/~rdiaz/00G66.txt)
>
>
> temp01 <- read.maimages(files = "00G66.txt",
>                        source = "genepix",
>                        wt.fun=wtflags(0))
>
> temp01.b <- temp01
> set.to.na <- which(temp01.b$weights == 0)
> temp01.b$R[set.to.na] <- NA
> temp01.b$G[set.to.na] <- NA
>
> temp01.c <- temp01.b
> temp01.c$weights <- NULL
>
> n1c <- normalizeWithinArrays(temp01.c, layout = temp.layout,
>                              iterations = 5,
>                              method = "loess")
> n1b <- normalizeWithinArrays(temp01.b, layout = temp.layout,
>                              iterations = 5,
>                              method = "loess")
> n1 <- normalizeWithinArrays(temp01, layout = temp.layout,
>                              iterations = 5,
>                             method = "loess")
>
> pairs(cbind(n1$M, n1b$M, n1c$M))
> summary(n1$M - n1b$M)
>
>
> ******************
>
> Best,
>
> R.
>
> --
> Ramón Díaz-Uriarte
> Bioinformatics Unit
> Centro Nacional de Investigaciones Oncológicas (CNIO)
> (Spanish National Cancer Center)
> Melchor Fernández Almagro, 3
> 28029 Madrid (Spain)
> Fax: +-34-91-224-6972
> Phone: +-34-91-224-6900
>
> http://bioinfo.cnio.es/~rdiaz
> PGP KeyID: 0xE89B3462
> (http://bioinfo.cnio.es/~rdiaz/0xE89B3462.asc)