[BioC] Normalizing single-channel data [was: is my normalization right?]
Gordon Smyth
smyth at wehi.edu.au
Wed Jun 2 01:44:25 CEST 2004
Dear Xiaopeng,
You are raising the issue of normalizing single-channel (non-Affy)
microarray data. This is not yet documented, but it is not difficult using
the between-array normalization methods provided in limma or vsn.
Firstly, let me point out that your text file doesn't contain the "raw
data" from Genepix since it doesn't contain background intensities. Have
you already subtracted the background or have you just ignored it? What did
you do with the Genepix flags?
1. Given a text file like the one you describe, you can read it into R using
the basic function read.table():
Data <- read.table("myfile.txt", header=TRUE, sep="\t") # I assume your file is tab-delimited with a header row
y <- as.matrix(Data[,-1])
rownames(y) <- as.character(Data[,1])
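To check the reading step before using your real file, here is a minimal sketch on a small made-up file. The gene names, column names and intensities are invented for illustration, and header = TRUE assumes the first line of the file holds the column names, as in the layout you describe.

```r
# Write a small, made-up tab-delimited file in the layout described in the question.
tmp <- tempfile(fileext = ".txt")
writeLines(c("Gene\tControl\tTreat1\tTreat2\tTreat3",
             "g1\t120\t200\t90\t310",
             "g2\t340\t610\t260\t800",
             "g3\t80\t150\t70\t95"), tmp)

# Read it back; the first column holds the gene names, the rest are intensities.
Data <- read.table(tmp, header = TRUE, sep = "\t")
y <- as.matrix(Data[, -1])
rownames(y) <- as.character(Data[, 1])

# y should be a numeric matrix with one row per gene and one column per array.
str(y)
```

If str(y) reports a character matrix instead of a numeric one, the header line has probably been read as data.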
Now you have two major normalization choices, quantile or vsn normalization.
library(limma)
y2 <- normalizeBetweenArrays(log2(y), method="quantile")
or
y2 <- normalizeBetweenArrays(y, method="vsn")
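For intuition about what quantile normalization does, here is a minimal base-R sketch on made-up intensities. It is not limma's implementation, which also handles ties and missing values; this version assumes neither occurs.

```r
# Toy intensity matrix: 5 probes (rows) x 3 arrays (columns); values are invented.
y <- matrix(c(120, 340, 80, 2000, 560,
              200, 610, 150, 3900, 1000,
              90, 260, 70, 400, 1500),
            nrow = 5,
            dimnames = list(paste0("gene", 1:5),
                            c("Control", "Treat1", "Treat2")))
ly <- log2(y)

# Quantile normalization by hand (assumes no ties and no missing values):
sorted <- apply(ly, 2, sort)      # sort the values within each array
target <- rowMeans(sorted)        # average each quantile across arrays
ranks  <- apply(ly, 2, rank)      # rank of each probe within its array
y2 <- apply(ranks, 2, function(r) target[r])  # replace each value by the target quantile
dimnames(y2) <- dimnames(ly)

# After normalization every array has exactly the same set of values;
# only the assignment of values to probes differs between arrays.
print(apply(y2, 2, sort))
```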
Now you are ready to go straight into differential expression analysis
using limma, for example
fit <- lmFit(y2, design)
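The design matrix is not defined above, so here is a minimal sketch of how one might be built. The layout (two arrays each for Control and Treat1) and the simulated data are assumptions for illustration only; with a single array per condition, as in the file you describe, there are no replicates from which to estimate variability. The limma calls are run only if limma is installed.

```r
# Hypothetical layout: two arrays per condition.
condition <- factor(c("Control", "Control", "Treat1", "Treat1"),
                    levels = c("Control", "Treat1"))
design <- model.matrix(~ condition)
colnames(design) <- c("Intercept", "Treat1vsControl")

# Simulated, already-normalized log2 intensities: 100 genes x 4 arrays.
set.seed(1)
y2 <- matrix(rnorm(100 * 4, mean = 8), nrow = 100,
             dimnames = list(paste0("gene", 1:100), paste0("array", 1:4)))

# Fit the linear model and rank genes for the treatment effect.
if (requireNamespace("limma", quietly = TRUE)) {
  fit <- limma::lmFit(y2, design)
  fit <- limma::eBayes(fit)
  print(limma::topTable(fit, coef = "Treat1vsControl", number = 5))
}
```

The second column of the design matrix codes the Treat1 versus Control difference, so that is the coefficient to test.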
If you use quantile normalization, you must make sure that all your
intensities are positive before taking logs and normalizing, for example by
y <- pmax(y, 1) # matrix first, so that its dimensions and names are kept
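A minimal sketch of why the floor matters, on invented values such as might arise after background subtraction. Putting the matrix first in pmax() is a choice made here so that the result keeps the matrix dimensions and names.

```r
# Made-up intensities containing a zero, a negative and a sub-unit value.
y <- matrix(c(0, 250, -15, 1200, 35, 0.4), nrow = 3,
            dimnames = list(c("g1", "g2", "g3"), c("Control", "Treat1")))

# Without a floor, log2 gives -Inf for zero and NaN for negative values.
n_bad <- sum(!is.finite(suppressWarnings(log2(y))))

# Floor at 1 so that every log2 value is finite and non-negative.
y_floor <- pmax(y, 1)
ly <- log2(y_floor)

print(n_bad)
print(ly)
```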
2. You never did need to extract the intensity data from the Genepix gpr
files in the first place. You could have proceeded in limma as
targets <- readTargets() # Always good practice to make a targets file
RG <- read.maimages(targets$FileName, source="genepix",
columns=list(Rf="F532 Mean",Gf="F532 Mean",Rb="B532 Median",Gb="B532 Median"))
y2 <- normalizeBetweenArrays(RG$G, method="quantile")
Or you might choose to apply backgroundCorrect() before
normalizeBetweenArrays().
Gordon
>xpzhang xpzhang at genetics.ac.cn
>Sat May 29 09:21:55 CEST 2004
>
>
>Thank you for your answer!
>
>My raw data came from GenePix. Because I used only Cy3 in my whole
>microarray experiment, I only extracted the data with the software, and I am
>trying to normalize the data with Bioconductor.
>
>I made a .txt file for the raw data, it was just like this:
>
>Gene Name Contrl(intensity) Treat1(intensity) Treat2(intensity)
>Treat3(intensity)
>1
>2
>3
>4
>5
>...
>
>I want to use intensity-dependent normalization across multiple slides; is
>it appropriate? And could you tell me how to do it? I have been trying to
>find a way by reading Bioconductor's documentation and help files, but I
>find it really difficult.
>
>Thank you very much!