[BioC] HELP! lmFit and duplicateCorrelation

Tue Oct 9 07:20:43 CEST 2007

Hello,

I have arrays with 4 replicate spots per gene, and I am using the limma
package for data analysis.

> targets
       SlideNumber FileName Cy3 Cy5   Name
Field1           1 13617731 WBM  WC Field1
Field2           2 13617730 WBM  WC Field2
Field3           3 13617724  WC WBM Field3
Field4           4 13617627  WC WBM Field4
Field5           5 13617626 WBM  WC Field5

After reading in the data and normalizing, I used the following code to
handle the within-array replicate spots.

design <- modelMatrix(targets, ref="WC")
corfit <- duplicateCorrelation(MA, design, ndups=4) # A slow computation!
fit <- lmFit(MA, design, ndups=4, correlation=corfit$consensus,
             method="ls")
fit2 <- lmFit(MA, design, ndups=4, correlation=corfit$consensus,
              method="robust")
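
For reference, my understanding of the limma documentation is that
`spacing` is the distance between duplicate spots, and that `spacing=1`
(the default, which the calls above rely on) means the replicates sit in
consecutive rows of the data. The sketch below uses base R only. The helper
`duplicate_groups` is hypothetical and is not part of limma; it only
illustrates, under that assumption, which row indices would be grouped
together as one gene for a given `ndups` and `spacing`.

```r
# Hypothetical helper (NOT a limma function): for a layout with `nspots` rows,
# return a matrix with one row per gene and one column per replicate, holding
# the row indices that would be treated as duplicates of each other.
duplicate_groups <- function(nspots, ndups = 2, spacing = 1) {
  stopifnot(nspots %% (ndups * spacing) == 0)
  ngroups <- nspots / (ndups * spacing)
  idx <- seq_len(nspots)
  # Arrange indices as (position within spacing) x (replicate) x (group) ...
  dim(idx) <- c(spacing, ndups, ngroups)
  # ... then move the replicate dimension last and flatten to genes x replicates.
  idx <- aperm(idx, c(1, 3, 2))
  dim(idx) <- c(spacing * ngroups, ndups)
  idx
}

# Toy layout with 16 rows and 4 replicates per gene.
adjacent <- duplicate_groups(16, ndups = 4, spacing = 1)  # replicates side by side
spaced   <- duplicate_groups(16, ndups = 4, spacing = 4)  # replicates 4 rows apart
print(adjacent)
print(spaced)
```

If the replicates on the array are not laid out the way `ndups` and
`spacing` describe, rows belonging to different genes would be grouped
together under this convention.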

# eBayes
fit5 <- eBayes(fit)
fit6 <- eBayes(fit2)

Then topTable(fit5, number=30, adjust="BH") gives me a list of
differentially expressed genes. However, some genes show up multiple
times in the list, such as 37A-C02.g. According to section 11.6 of the
limma User's Guide, each gene name should appear only once.

  	Block	Row	Column	ID	Name
597	4	15	15	37B-B10.g	P25782
519	4	2	17	37A-C02.g	P00765
531	4	4	17	37A-C02.g	P00765
314	2	24	18	37A-B08.g	AC186398
308	2	23	19	35B-F12.g	AC122261
513	4	1	17	37A-C02.g	P00765
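
To make the repetition explicit, here is a small base R sketch that
tabulates the ID column of the six rows shown above. The data frame name
`top` is only for illustration; with the real output it would be the result
of the topTable call.

```r
# The six rows shown above, copied by hand from the topTable output.
top <- data.frame(
  Block  = c(4, 4, 4, 2, 2, 4),
  Row    = c(15, 2, 4, 24, 23, 1),
  Column = c(15, 17, 17, 18, 19, 17),
  ID     = c("37B-B10.g", "37A-C02.g", "37A-C02.g",
             "37A-B08.g", "35B-F12.g", "37A-C02.g"),
  Name   = c("P25782", "P00765", "P00765",
             "AC186398", "AC122261", "P00765"),
  row.names = c("597", "519", "531", "314", "308", "513"),
  stringsAsFactors = FALSE
)

# Count how often each ID occurs and keep the ones listed more than once.
id_counts     <- table(top$ID)
repeated_ids  <- names(id_counts)[as.vector(id_counts) > 1]
repeated_rows <- top[top$ID %in% repeated_ids, ]
print(repeated_rows)
```

In these rows the repeated entries for 37A-C02.g all come from block 4,
column 17, but from different rows (1, 2 and 4).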

Another question: is there any difference between the differentially
expressed genes found with method="ls" and method="robust" in lmFit?
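
As background for this question: as far as I understand, method="ls" is an
ordinary least squares fit, while method="robust" relies on rlm from the
MASS package. The toy sketch below is not limma code, and the five
log-ratios are made-up values for a single gene, chosen only to show how
the two kinds of fit can react differently to one outlying array.

```r
library(MASS)  # provides rlm(); MASS ships with standard R installations

# Made-up log-ratios for one gene on five arrays; the last value is an outlier.
y <- c(1.0, 1.1, 0.9, 1.05, 4.0)

# Intercept-only fits: least squares gives the plain mean,
# rlm() downweights observations with large residuals.
ls_est     <- coef(lm(y ~ 1))[["(Intercept)"]]
robust_est <- coef(rlm(y ~ 1))[["(Intercept)"]]
print(c(ls = ls_est, robust = robust_est))
```

Whether this matters for my data presumably depends on how many spots have
outlying values.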

> sessionInfo()
R version 2.6.0 (2007-10-03)
i486-pc-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] MASS_7.2-37   statmod_1.3.1 limma_2.12.0

loaded via a namespace (and not attached):
[1] rcompgen_0.1-15

Best wishes,

Tiandao