[BioC] marrayLayout difficulties
Jean Yee Hwa Yang
jean at biostat.ucsf.edu
Fri Oct 1 09:52:18 CEST 2004
Hi Jeremy,
That sounds very slow in my experience. Which image analysis software
did you get your data from? If you send me an example file off-line, I
will take a look at it for you. I need to check whether maSub was
set properly, as this makes a big difference in print-tip
normalization.
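As a point of reference, here is a minimal base-R sketch of a spot-presence vector built directly as logical values rather than 0/1 numbers. It assumes maSub is meant to hold one TRUE/FALSE per spot (the printout of the slot in the quoted session below shows logical values of length 31744). The `features` vector is a hypothetical stand-in for the FEATURE column, with a gap at spots 985 to 992 to mirror the printout of sub[980:1000] in that session. Whether assigning a logical vector avoids the behaviour described is not something this sketch establishes.

```r
# Total spots from the stated layout: Ngr * Ngc * Nsr * Nsc.
n_spots <- 8 * 4 * 31 * 32            # 31744

# Hypothetical stand-in for dat[, 1]: every spot except 985-992,
# the unprinted tail of the first 992-spot block.
features <- setdiff(seq_len(n_spots), 985:992)

# TRUE for printed spots, FALSE for absent ones.
sub <- seq_len(n_spots) %in% features

# Sanity checks before handing the vector to a layout object.
stopifnot(is.logical(sub),
          length(sub) == n_spots,
          sum(sub) == length(features))

print(as.integer(sub[980:1000]))
```

With marray loaded, the vector would then be assigned with maSub(ml) <- sub, as in the quoted session.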
Alternatively, try the latest version, 1.5.17, which is temporarily available at
http://arrays.ucsf.edu/software/
maNorm was previously very slow for global lowess normalization with a large
number of spots, but in the new version we have sped up the code with
sampling. However, I don't think this was your problem.
I would also suggest trying the swirl data within the marray package to
see how long that takes on your computer:
data(swirl)
norm <- maNorm(swirl)
If that takes a minute or so, then there is something wrong with your data
setup.
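To put a number on "how long", the call can be wrapped in system.time(). The sketch below uses a hypothetical helper, time_call (not part of marray), and a trivial base-R loop as a stand-in so that it runs without the package installed. The swirl call is shown only as a comment.

```r
# Hypothetical helper (not part of marray): elapsed wall-clock
# seconds for evaluating one expression.
time_call <- function(expr) {
  system.time(expr)[["elapsed"]]
}

# Base-R stand-in workload so the sketch runs on its own.
elapsed <- time_call(for (i in 1:1000) sqrt(i))
print(elapsed)

# With marray loaded, the check suggested above would be:
#   data(swirl)
#   time_call(norm <- maNorm(swirl))
```

Comparing that figure with the time taken on your own array separates a slow machine from a problem in the data setup.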
Cheers
Jean
On Thu, 30 Sep 2004, Jeremy Gollub wrote:
> Hi, all -
>
> I'm experiencing very poor performance using the marray package (20
> minutes to normalize a single <32,000 spot microarray). Can someone
> tell me whether this is normal, or what I'm doing wrong?
>
> In the process of hunting down some errors, I also noticed some odd (to
> me) behavior in the marrayLayout maSub slot assignment method, described
> below. An attempt to "correct" this results in a much faster
> normalization (~1 minute), which looks good according to the MA plot
> but produces different numbers in maM than the slower calculation.
>
> It seems unlikely that either result is correct (I can choose between
> suspiciously bad performance, or messing with the marrayLayout object's
> internals).
>
> Thanks for any suggestions - details follow.
>
> I'm using R version 1.9.0 on a sparc system running Solaris 2.9. My
> marray version is 1.5.14.
>
> I have a text file, "dat.txt," containing the data I want to normalize.
> 10 columns, all numeric: in order,
> FEATURE   spot number, 1 - 31736
> SECTOR    unnecessary and unused
> ROW       "
> COL       "
> PLATE     ID of printing plate
> Gf        green channel foreground
> Rf        red channel foreground
> Gb        green channel background
> Rb        red channel background
> W         spot weights, either 0 or 1
>
> Array parameters are: Ngr = 8, Ngc = 4, Nsr = 31, Nsc = 32, Nspots =
> 31744. Not all spots are printed (ragged ends to each block). Only
> printed spots are included in the data file, so there are gaps in the
> FEATURE column sequence but no blank lines in the file.
>
> The session:
>
> > library(marray)
> >
> > # Read file.
> >
> > dat <- read.table('dat.txt', header = TRUE)
> >
> > # Construct maSub: 1 for each printed spot, 0 for absent spots.
> >
> > seq <- c(1:31744)
> > int <- intersect(seq, as.numeric(dat[,1]))
> > sub <- rep(0, 31744)
> > sub[int] <- 1
> >
> > # Note contents of sub around the end of the first block and beginning
> > # of the second:
> >
> > print(sub[980:1000])
> [1] 1 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
> > # total of 31488 present spots
> > sum(sub)
> [1] 31488
> >
> > # Construct marrayLayout object.
> >
> > ml <- new("marrayLayout", maNgr = 8, maNgc = 4, maNsr = 31, maNsc = 32,
> + maNspots = 31744)
> > maSub(ml) <- sub
> > maPlate(ml) <- as.factor(dat[,5])
> >
> > # Note contents of maSub:
> >
> > sum(ml@maSub)
> [1] 1
> > length(ml@maSub)
> [1] 31744
> > print(sub[1:20])
> [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
> > print(ml@maSub[1:20])
> [1] TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> > print(ml@maSub[980:1000])
> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> >
> > # Now meddle with ml@maSub (set it back the way I think it should be).
> > # Or don't - see comment on maNormMain step, below.
> >
> > maSub(ml)[int] <- TRUE
> >
> > # construct marrayRaw object.
> >
> > mr <- new("marrayRaw",
> + maGf = matrix(dat[,6], ncol = 1),
> + maRf = matrix(dat[,7], ncol = 1),
> + maGb = matrix(dat[,8], ncol = 1),
> + maRb = matrix(dat[,9], ncol = 1),
> + maW = matrix(dat[,10], ncol = 1),
> + maLayout = ml)
> >
> > # This step takes about one minute if I do maSub(ml)[int] <- TRUE
> > # as indicated above. If I don't, it takes about 20 minutes.
> > # The results differ, although the MA plot looks normalized either way.
> >
> > mn <- maNormMain(mr, f.loc = list(maNormLoess(x="maA", y="maM",
> + z="maPrintTip", w=NULL, subset=TRUE, span = 0.4)),
> + f.scale = list(maNormMAD(x = "maPrintTip", y = "maM",
> + geo = FALSE, subset = TRUE)),
> + Mloc = TRUE, Mscale = TRUE)
>
> --
> Jeremy Gollub, Ph.D.
> jgollub at genome.stanford.edu
> (W) 650/736-0075
>