[BioC] normalization data with ..txt or ..xls file by marray or limma
Matthew Ritchie
mritchie at wehi.edu.au
Fri Mar 5 03:58:38 MET 2004
Hi Darwin,
>Once I get my cDNA microarray data, when should I delete the poor quality spots: before or after the normalization? I think it needs to be done before the normalization. In that case, what rule should I use?
>
A good question, and not an easy one to answer. Firstly, poor quality
spots can be defined in many ways (there are a number of papers on the
subject, and I can send you the references if you're interested). Most
involve coming up with a spot-specific quality measure and filtering
(removing) spots with an unfavourable value of this measure from
subsequent analysis.
In limma, spot quality weights can be used in the normalization and
linear models to do this. Log-ratios from spots which are assigned low
weights (close to 0) have less influence on the normalization and linear
model fit than those from spots with high weights (around 1). Spots with
a weight of 0 are ignored.
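To see what the weights do in a fit, here is a minimal base R sketch (not
limma itself; the log-ratios and the helper name fit_mean are made up for
illustration). It fits an intercept-only weighted linear model to one
gene's log-ratios across six arrays, where array 1 is an outlier:

```r
# Hypothetical log-ratios (M-values) for one gene on 6 replicate arrays.
# The value from array 1 is an outlier from a poor quality spot.
M <- c(5.0, 1.1, 0.9, 1.0, 1.2, 0.8)

# Hypothetical helper: weighted intercept-only fit, i.e. a weighted mean
fit_mean <- function(M, w) unname(coef(lm(M ~ 1, weights = w)))

est_full <- fit_mean(M, rep(1, 6))          # all spots at full weight
est_low  <- fit_mean(M, c(0.1, rep(1, 5)))  # array 1 down-weighted
est_zero <- fit_mean(M, c(0, rep(1, 5)))    # array 1 ignored
print(c(full = est_full, low = est_low, zero = est_zero))
```

The estimate moves away from the outlier as its weight shrinks, and with a
weight of 0 it equals the plain mean of arrays 2 to 6.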
These relative weights can be automatically determined from data coming
out of the image analysis programs Spot and GenePix. The weights for
Spot are based on the ideal spot size (spots smaller and larger than
ideal are down-weighted), and for GenePix, they are derived from the
quality flags (good spot: flag of 0, full weight; bad spot: negative
flag, low weight). Specifying the 'weights' argument in
normalizeWithinArrays() and lmFit() makes use of the weights in the
normalization and linear model analysis. At the end of this message is
an example which might be helpful.
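The GenePix rule described above can be sketched in base R as follows.
This is only an illustration of the idea, not limma's own code: the
function name flags_to_weights, the low weight of 0.1 and the flag values
are all assumptions. In limma itself, I believe a weight function can be
supplied through the wt.fun argument of read.maimages() when the image
analysis output is read in, but please check the help pages for details.

```r
# Hypothetical helper: turn GenePix-style quality flags into spot weights.
# Spots with a negative flag get a low weight; all others get full weight.
flags_to_weights <- function(flags, low = 0.1) {
  w <- rep(1, length(flags))
  w[flags < 0] <- low
  dim(w) <- dim(flags)  # keep the genes x arrays shape if a matrix was given
  w
}

# Made-up flags for 4 spots on 2 arrays (0 = good, negative = bad)
flags <- matrix(c(0, -50, 0, 0,
                  0, 0, -100, 0), nrow = 4, ncol = 2)
weights <- flags_to_weights(flags)
print(weights)
```

A matrix like this could then be passed as the 'weights' argument to
normalizeWithinArrays() and lmFit().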
Does the image analysis package you're using provide any quality flags
that you might be able to use?
Sorry I don't have a more definite answer to your question. Best wishes,
Matt Ritchie
# Set up a random dataset of 6 replicate arrays with 100 genes on each array
library(limma)
RG <- new("RGList", list(R=matrix(rnorm(100*6, 1000, 300), 100, 6),
    G=matrix(rnorm(100*6, 1000, 300), 100, 6), Rb=NULL, Gb=NULL))
# specify the array grid layout
RG$printer <- list(nspot.r=5, nspot.c=4, ngrid.r=1, ngrid.c=5)
# define the weights: all spots are given full weight (1), except
RG$weights <- matrix(1, 100, 6)
RG$weights[1,] <- 0   # the observations for gene 1 (deemed to be poor quality)
RG$weights[,1] <- 0   # and the observations from array 1 (bad array)
# spots with 0 weights (from array 1, and gene 1, in this example) are
# ignored in the normalization and linear model fit
MA <- normalizeWithinArrays(RG, weights=RG$weights)
fit <- lmFit(MA, weights=RG$weights)
fit <- eBayes(fit)
>Thanks in advance!
>
>Darwin
>