[BioC] marrayLayout difficulties

Jeremy Gollub jgollub at genome.stanford.edu
Fri Oct 1 07:47:29 CEST 2004


Hi, all -

I'm experiencing very poor performance using the marray package (20
minutes to normalize a single <32,000 spot microarray).  Can someone
tell me whether this is normal, or what I'm doing wrong?

In the process of hunting down some errors, I also noticed some odd (to
me) behavior in the marrayLayout maSub slot assignment method, described
below.  An attempt to "correct" this results in a much faster
normalization (~1 minute), which looks good according to the MA plot
but produces different numbers in maM than the slower calculation.
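
One way to put a number on "different" is sketched below in base R. The vectors m_slow and m_fast are hypothetical stand-ins for the maM values from the slow and fast runs (simulated here, not real results); with real data they would be extracted from the two normalized objects.

```r
# Hypothetical stand-ins for the two sets of normalized log-ratios.
set.seed(1)
m_slow <- rnorm(1000)                       # pretend: maM from the 20-minute run
m_fast <- m_slow + rnorm(1000, sd = 0.01)   # pretend: maM from the 1-minute run

d <- m_fast - m_slow

# Largest absolute disagreement between the two runs.
max_abs_diff <- max(abs(d), na.rm = TRUE)

# TRUE only if the two vectors agree within all.equal's default tolerance.
same <- isTRUE(all.equal(m_slow, m_fast))

# Correlation shows whether the runs differ only by small perturbations.
r <- cor(m_slow, m_fast, use = "complete.obs")

print(summary(d))
print(c(max_abs_diff = max_abs_diff, correlation = r))
print(same)
```

A high correlation with small differences would suggest the two runs fit slightly different curves to much the same data; large or patterned differences would point to different subsets of spots being used.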

It seems unlikely that either result is correct (I can choose between
suspiciously bad performance, or messing with the marrayLayout object's
internals).

Thanks for any suggestions - details follow.

I'm using R version 1.9.0 on a sparc system running Solaris 2.9.  My
marray version is 1.5.14.

I have a text file, "dat.txt," containing the data I want to normalize.
10 columns, all numeric: in order,
	FEATURE		spot number 1 - 31736
	SECTOR		unnecessary and unused
	ROW		"
	COL		"
	PLATE		ID of printing plate
	Gf		green channel foreground
	Rf		red channel foreground
	Gb		green channel background
	Rb		red channel background
	W		spot weights, either 0 or 1

Array parameters are: Ngr = 8, Ngc = 4, Nsr = 31, Nsc = 32, Nspots =
31744.  Not all spots are printed (ragged ends to each block).  Only
printed spots are included in the data file, so there are gaps in the
FEATURE column sequence but no blank lines in the file.
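
The layout arithmetic can be checked in base R before any marray object is built. This is a minimal sketch: the feature vector is a hypothetical stand-in for dat$FEATURE, and it assumes that the last 8 positions of every block are the unprinted ones (consistent with the counts in this message, but only visible for the first block in the session).

```r
# Grid and block dimensions as stated above.
Ngr <- 8; Ngc <- 4; Nsr <- 31; Nsc <- 32

block_size <- Nsr * Nsc              # spots per block
n_spots    <- Ngr * Ngc * block_size # total spot positions on the array

# Hypothetical stand-in for dat$FEATURE: assume the last 8 positions of
# each block were not printed.
pos_in_block <- (seq_len(n_spots) - 1) %% block_size + 1
feature <- which(pos_in_block <= block_size - 8)

# Logical mask over all spot positions: TRUE where a spot was printed.
printed <- seq_len(n_spots) %in% feature

print(n_spots)
print(sum(printed))
print(as.integer(printed[980:1000]))
```

Building the mask as a logical vector from the start avoids any ambiguity about whether 0/1 values are meant as flags or as positions.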

The session:

> library(marray)
>
> # Read file.
>
> dat <- read.table('dat.txt', header = TRUE)
> 
> # Construct maSub: 1 for each printed spot, 0 for absent spots.
>
> seq <- c(1:31744)
> int <- intersect(seq, as.numeric(dat[,1]))
> sub <- rep(0, 31744)
> sub[int] <- 1
> 
> # Note contents of sub around the end of the first block and beginning
> # of the second:
>
> print(sub[980:1000])
[1] 1 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
> # total of 31488 present spots
> sum(sub)
[1] 31488
> 
> # Construct marrayLayout object.
>
> ml <- new("marrayLayout", maNgr = 8, maNgc = 4, maNsr = 31, maNsc = 32,
+           maNspots = 31744)
> maSub(ml) <- sub
> maPlate(ml) <- as.factor(dat[,5])
> 
> # Note contents of maSub:
>
> sum(ml@maSub)
[1] 1
> length(ml@maSub)
[1] 31744
> print(sub[1:20])
 [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
> print(ml@maSub[1:20])
 [1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> print(ml@maSub[980:1000])
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> 
> # Now meddle with ml@maSub (set it back the way I think it should be).
> # Or don't - see comment on maNormMain step, below.
>
> maSub(ml)[int] <- TRUE
> 
> # construct marrayRaw object.
>
> mr <- new("marrayRaw",
+         maGf = matrix(dat[,6], ncol = 1),
+         maRf = matrix(dat[,7], ncol = 1),
+         maGb = matrix(dat[,8], ncol = 1),
+         maRb = matrix(dat[,9], ncol = 1),
+         maW =  matrix(dat[,10], ncol = 1),
+         maLayout = ml)
>
> # This step takes about one minute if I do maSub(ml)[int] <- TRUE
> # as indicated above.  If I don't, it takes about 20 minutes.
> # The results differ, although the MA plot looks normalized either way.
>
> mn <- maNormMain(mr, f.loc = list(maNormLoess(x="maA", y="maM",
+                         z="maPrintTip", w=NULL, subset=TRUE, span = 0.4)),
+                  f.scale = list(maNormMAD(x = "maPrintTip", y = "maM",
+                         geo = FALSE, subset = TRUE)),
+                 Mloc = TRUE, Mscale = TRUE)
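
The maSub contents printed in the session (a single TRUE in position 1) are what base R produces when a 0/1 numeric vector is used as a set of indices rather than as a mask. Whether the maSub assignment method in marray really treats numeric input this way is an assumption here, not something confirmed from the package source; the sketch below only shows the base-R behavior, on a toy vector named sub01 (hypothetical).

```r
# Base R only; marray is not loaded here.
sub01 <- c(1, 1, 1, 0, 0, 1)   # toy 0/1 vector, same idea as `sub` in the session

# Reading 1: the 0/1 values are flags, one per spot.
as_mask <- as.logical(sub01)

# Reading 2 (assumed, unconfirmed for marray): the values are spot indices.
as_index <- rep(FALSE, length(sub01))
as_index[sub01] <- TRUE        # index 0 is dropped; index 1 is set repeatedly

print(as_mask)
print(as_index)
print(sum(as_index))
```

If that reading is right, assigning a logical vector (for example as.logical(sub)) or the vector of printed spot indices might give the intended maSub without touching the slot afterwards; this is a suggestion to try, not a verified fix.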

--
Jeremy Gollub, Ph.D.
jgollub at genome.stanford.edu
(W) 650/736-0075


