[BioC] normalizeBetweenArrays gives an error

Gordon K Smyth smyth at wehi.EDU.AU
Sat Nov 11 00:37:05 CET 2006


Dear Michael,

First of all, an incidental but important point.  The subject of your email says
"normalizeBetweenArrays gives an error" but this is not true.  Your error came when you tried to
convert an marrayNorm object into an MAList object.  You never got to the point of using
normalizeBetweenArrays.  You would have a better chance of getting help from the right people if
your email subject said "converting marrayNorm to MAList gives an error" or similar.

The purpose of the maSub slot is to record incomplete gpr files.  The fact that maSub has length
7200 with 6912 TRUE values means that the number of print-tip groups on your arrays implies 7200
spots but actually 288 of these spots are omitted from your gpr files, probably because they are
"empty" or "blank" spots.  Since you are in a Bioinformatics group, you're probably a
computer-savvy person, so you can probably confirm easily that your gpr files contain only 6912
data rows.
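
To make the arithmetic concrete, here is a minimal base-R sketch of what the error message suggests
is happening.  It does not use marray or your data: the vector `sub` and the data frame `genes`
are hypothetical stand-ins for maSub(gn) and for a per-spot table with one row per gpr data row,
and the two counts are taken from your error message.

```r
# Counts taken from the error message in the original post.
n_layout <- 7200   # spots implied by the print-tip layout
n_rows   <- 6912   # data rows actually present in the gpr files

# Hypothetical stand-in for maSub(gn): TRUE for spots present in the files.
sub <- rep(c(TRUE, FALSE), times = c(n_rows, n_layout - n_rows))

length(sub)   # spots implied by the layout
sum(sub)      # spots present in the files
sum(!sub)     # spots omitted from the files

# Hypothetical stand-in for a per-spot table with one row per data row.
genes <- data.frame(ID = seq_len(n_rows))

# Adding the full-length flag vector as a column fails, because the
# vector is longer than the table has rows.
msg <- tryCatch({
  genes$Sub <- sub
  NULL
}, error = function(e) conditionMessage(e))
print(msg)

# Flags restricted to the spots that actually exist do fit.
genes$Sub <- sub[sub]
```

If this is indeed the mechanism, any maSub value whose length matches the number of data rows (or
a single TRUE, which is recycled) would avoid the mismatch.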

I was the original author of the coerce function which is called by as() to convert an marrayNorm
object created by the marray package into an MAList object used by the limma package, although I
haven't been the maintainer for several years.  It would appear that the coerce function is not
handling the maSub slot correctly.  This needs some care because limma doesn't use the maSub slot.

In your case, you have already completed print-tip loess normalization on your data object, so
you no longer have any need to keep track of which spots are from which print-tip group.  You
could therefore simply set maSub(gn) <- TRUE without any problems.

Best wishes
Gordon

> Date: Thu, 9 Nov 2006 14:48:52 +0100
> From: "Michael Nuhn" <nuhn at rhrk.uni-kl.de>
> Subject: [BioC] normalizeBetweenArrays gives an error
> To: <bioconductor at stat.math.ethz.ch>
>
> Hi everyone!
>
> I was hoping that someone here might have some insight on this problem:
>
> I do normalization between arrays. For this I have written a little program
> which creates an R/Bioconductor program which does this for me. It usually
> works fine but now I have a set of GPR files where it fails. This is how it
> goes:
>
> First it loads the libraries:
>
> library(marray)
> library(convert)
>
> Then it loads the GPR files:
>
> files <- c("gpr_file__0", "gpr_file__1", "gpr_file__2")
>
> Rf_label <- "F633 Median"
> Rb_label <- "B633 Median"
> Gf_label <- "F543 Median"
> Gb_label <- "B543 Median"
>
> g <- read.GenePix(files, name.Gf = Gf_label, name.Gb = Gb_label, name.Rf =
> Rf_label, name.Rb = Rb_label)
>
> A print tip-loess normalization for every slide:
>
> gn <- maNorm(g, norm="p")
>
> Then comes the normalization between arrays. I use the
> "normalizeBetweenArrays" command here. I basically just copied the commands
> from the book "Bioinformatics and Computational Biology Solutions Using R
> and Bioconductor" on page 65:
>
> gn@maW <- matrix(0,0,0)
> gn.MA <- as(gn, "MAList")
> g.nbta <- normalizeBetweenArrays(gn.MA, method="quantile")
>
> This has always worked. But in this case I get the following message after
> the second command:
>
>> gn.MA <- as(gn, "MAList")
> Error in "$<-.data.frame"(`*tmp*`, "Sub", value = c(TRUE, TRUE, TRUE,  :
>  replacement has 7200 rows, data has 6912
> Execution halted
>
> This, at first glance rather puzzling message, seems to have something to do
> with the "maSub" Slot of the GPR files. According to the docs, this field is
> "indicating which spots are currently being considered." It is of type
> boolean and 6912 values are set to TRUE and the others are FALSE. Huh? Why
> aren't all spots being considered?
>
> If I insert
>
> maSub(gn)<-TRUE
>
> before the command that fails, the normalization seems to work just fine.
>
> However, I am wondering where the values from maSub come from and if it is a
> good idea to just override them.
>
> Does anybody know this?
>
> Thanks in advance for hints,
> Michael.
>
> --
> -----------------------------------------------------------
> Dipl.-Inform. Michael Nuhn
> Bioinformatik
> Zentrum für Nanostrukturtechnologie und
> Molekularbiologische Technologie
>
> +49 (0)631 - 205 4334
> nuhn at rhrk.uni-kl.de
> http://nbc3.biologie.uni-kl.de/
>