[BioC] marrayLayout difficulties

Jean Yee Hwa Yang jean at biostat.ucsf.edu
Fri Oct 1 09:52:18 CEST 2004


Hi Jeremy,

That sounds very slow in my experience.  Which image analysis software
did you get your data from?  If you send me an example file off-line, I
will take a look at it for you.  I need to check whether maSub was set
properly, as this makes a big difference in print-tip normalization.
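
As a quick illustration of why the maSub setting deserves a check, here
is a minimal sketch in base R.  It does not call marray at all.  It
assumes (this is not confirmed here) that a numeric value assigned to
maSub is interpreted as a set of spot indices rather than as a 0/1 mask,
which would be consistent with the maSub sum of 1 in the quoted session.
The names n.spots, present and flags are hypothetical.

```r
# Base R only: mimics (does not call) an index-style assignment.
n.spots <- 10
present <- c(1, 2, 3, 7, 8)      # hypothetical printed-spot indices
flags <- rep(0, n.spots)
flags[present] <- 1              # 0/1 indicator, like 'sub' in the session

# Using the 0/1 vector as an index: zeros are dropped and every 1
# points at element 1, so only the first element is set.
by.index <- rep(FALSE, n.spots)
by.index[flags] <- TRUE

# Using it as a logical mask, or using the indices themselves.
by.mask <- as.logical(flags)
by.pos <- rep(FALSE, n.spots)
by.pos[present] <- TRUE

sum(by.index)                # 1
sum(by.mask)                 # 5
identical(by.mask, by.pos)   # TRUE
```

If that assumption holds, passing a logical vector (or the indices of
the printed spots) rather than a numeric 0/1 vector would avoid the
problem.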
 
Alternatively, try the latest version, 1.5.17, temporarily available at
http://arrays.ucsf.edu/software/

maNorm was previously very slow for global lowess normalization with a
large number of spots, but in the new version we have sped up the code
with sampling.  However, I don't think this was your problem.
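
To show the general idea of speeding up lowess with sampling, here is a
rough sketch in base R.  It is not the marray implementation: the
simulated A and M values, the subsample size of 2000 and the use of
approx() to interpolate are all assumptions made for illustration.

```r
set.seed(1)
n <- 30000
A <- runif(n, 6, 16)                                 # simulated intensities
M <- 0.05 * (A - 11)^2 - 0.4 + rnorm(n, sd = 0.3)    # simulated log-ratios

# Fit the lowess curve on a random subsample of spots only.
idx <- sample(n, 2000)
fit <- lowess(A[idx], M[idx], f = 0.4)

# Interpolate the fitted curve to every spot.  rule = 2 reuses the end
# values for spots outside the sampled range of A.
loc <- approx(fit$x, fit$y, xout = A, rule = 2, ties = mean)$y

# Location-normalized log-ratios for all spots.
M.norm <- M - loc
```

The fit only ever sees 2000 points, so its cost no longer grows with the
number of spots on the array; only the cheap interpolation step does.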

I would also suggest trying the swirl data within the marray package to
see how long that takes on your computer:

library(marray)
data(swirl)
# system.time() reports how long the normalization takes
system.time(norm <- maNorm(swirl))

If that takes a minute or so, then there is something wrong with your
data setup.

Cheers

Jean


On Thu, 30 Sep 2004, Jeremy Gollub wrote:

> Hi, all -
> 
> I'm experiencing very poor performance using the marray package (20
> minutes to normalize a single <32,000 spot microarray).  Can someone
> tell me whether this is normal, or what I'm doing wrong?
> 
> In the process of hunting down some errors, I also noticed some odd (to
> me) behavior in the marrayLayout maSub slot assignment method, described
> below.  An attempt to "correct" this results in a much faster
> normalization (~1 minute), which looks good according to the MA plot
> but produces different numbers in maM than the slower calculation.
> 
> It seems unlikely that either result is correct (I can choose between
> suspiciously bad performance, or messing with the marrayLayout object's
> internals).
> 
> Thanks for any suggestions - details follow.
> 
> I'm using R version 1.9.0 on a sparc system running Solaris 2.9.  My
> marray version is 1.5.14.
> 
> I have a text file, "dat.txt," containing the data I want to normalize.
> 10 columns, all numeric: in order,
> 	FEATURE		spot number 1 - 31736
> 	SECTOR		unnecessary and unused
> 	ROW		"
> 	COL		"
> 	PLATE		ID of printing plate
> 	Gf		green channel foreground
> 	Rf		red channel foreground
> 	Gb		green channel background
> 	Rb		red channel background
> 	W		spot weights, either 0 or 1
> 
> Array parameters are: Ngr = 8, Ngc = 4, Nsr = 31, Nsc = 32, Nspots =
> 31744.  Not all spots are printed (ragged ends to each block).  Only
> printed spots are included in the data file, so there are gaps in the
> FEATURE column sequence but no blank lines in the file.
> 
> The session:
> 
> > library(marray)
> >
> > # Read file.
> >
> > dat <- read.table('dat.txt', header = TRUE)
> > 
> > # Construct maSub: 1 for each printed spot, 0 for absent spots.
> >
> > seq <- c(1:31744)
> > int <- intersect(seq, as.numeric(dat[,1]))
> > sub <- rep(0, 31744)
> > sub[int] <- 1
> > 
> > # Note contents of sub around the end of the first block and beginning
> > # of the second:
> >
> > print(sub[980:1000])
> [1] 1 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
> > # total of 31488 present spots
> > sum(sub)
> [1] 31488
> > 
> > # Construct marrayLayout object.
> >
> > ml <- new("marrayLayout", maNgr = 8, maNgc = 4, maNsr = 31, maNsc = 32,
> +           maNspots = 31744)
> > maSub(ml) <- sub
> > maPlate(ml) <- as.factor(dat[,5])
> > 
> > # Note contents of maSub:
> >
> > sum(ml@maSub)
> [1] 1
> > length(ml@maSub)
> [1] 31744
> > print(sub[1:20])
>  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
> > print(ml@maSub[1:20])
>  [1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> > print(ml@maSub[980:1000])
>  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> > 
> > # Now meddle with ml@maSub (set it back the way I think it should be).
> > # Or don't - see comment on maNormMain step, below.
> >
> > maSub(ml)[int] <- TRUE
> > 
> > # construct marrayRaw object.
> >
> > mr <- new("marrayRaw",
> +         maGf = matrix(dat[,6], ncol = 1),
> +         maRf = matrix(dat[,7], ncol = 1),
> +         maGb = matrix(dat[,8], ncol = 1),
> +         maRb = matrix(dat[,9], ncol = 1),
> +         maW =  matrix(dat[,10], ncol = 1),
> +         maLayout = ml)
> >
> > # This step takes about one minute if I do maSub(ml)[int] <- TRUE
> > # as indicated above.  If I don't, it takes about 20 minutes.
> > # The results differ, although the MA plot looks normalized either way.
> >
> > mn <- maNormMain(mr, f.loc = list(maNormLoess(x="maA", y="maM",
> +                         z="maPrintTip", w=NULL, subset=TRUE, span = 0.4)),
> +                  f.scale = list(maNormMAD(x = "maPrintTip", y = "maM",
> +                         geo = FALSE, subset = TRUE)),
> +                 Mloc = TRUE, Mscale = TRUE)
> 
> --
> Jeremy Gollub, Ph.D.
> jgollub at genome.stanford.edu
> (W) 650/736-0075


