[BioC] HELP! lmFit and duplicateCorrelation
Tiandao Li
Tiandao.Li at usm.edu
Tue Oct 9 07:20:43 CEST 2007
Hello,
My arrays have 4 replicate spots per gene. I used the limma package for
the data analysis.
> targets
       SlideNumber FileName Cy3 Cy5   Name
Field1           1 13617731 WBM  WC Field1
Field2           2 13617730 WBM  WC Field2
Field3           3 13617724  WC WBM Field3
Field4           4 13617627  WC WBM Field4
Field5           5 13617626 WBM  WC Field5
After reading in the data and normalizing it, I used the following code to
handle the within-array replicate spots.
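library(limma)
# Design matrix with WC as the common reference
design <- modelMatrix(targets, ref="WC")
# Estimate the correlation between the 4 within-array replicate spots
corfit <- duplicateCorrelation(MA, design, ndups=4) # A slow computation!
# Least-squares fit
fit <- lmFit(MA, design, ndups=4, correlation=corfit$consensus,
             method="ls")
# Robust fit
fit2 <- lmFit(MA, design, ndups=4, correlation=corfit$consensus,
              method="robust")
# eBayes
fit5 <- eBayes(fit)
fit6 <- eBayes(fit2)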
Then topTable(fit5, number=30, adjust="BH") gives me a list of
differentially expressed genes. However, some genes, such as 37A-C02.g,
show up multiple times in the list. According to section 11.6 of the limma
user's guide, each gene name should appear only once.
    Block Row Column        ID     Name
597     4  15     15 37B-B10.g   P25782
519     4   2     17 37A-C02.g   P00765
531     4   4     17 37A-C02.g   P00765
314     2  24     18 37A-B08.g AC186398
308     2  23     19 35B-F12.g AC122261
513     4   1     17 37A-C02.g   P00765
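
In the output above, the repeated ID 37A-C02.g sits in different rows of the same block, so the replicate spots may not be adjacent in the data. Both duplicateCorrelation and lmFit take a spacing argument (default 1) alongside ndups, so it may be worth checking how the replicates are actually laid out. A minimal base-R sketch of such a check follows. The toy ids vector is a hypothetical stand-in for the ID column of the gene annotation (assumed here to be something like MA$genes$ID), and the gene names are made up.

```r
# Toy ID vector (hypothetical): 3 genes, each printed 4 times, 3 spots apart.
# In a real analysis this would be the ID column of the gene annotation.
ids <- rep(c("geneA", "geneB", "geneC"), times = 4)

# Positions of every spot belonging to each ID, in data order.
pos <- split(seq_along(ids), ids)

# Number of replicate spots per ID (should all equal ndups).
n_reps <- lengths(pos)

# Distances between consecutive replicates of each ID.
# A regular layout gives one single value across all IDs.
gaps <- lapply(pos, diff)
spacing <- unique(unlist(gaps))

print(n_reps)
print(spacing)
```

If the check returns a single constant gap other than 1, that value is a candidate for the spacing argument. If the gaps are irregular, the rows may need to be reordered so that replicates are regularly spaced before ndups is used.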
Another question: is there any difference between the differentially
expressed genes found with method="ls" and method="robust" in lmFit?
> sessionInfo()
R version 2.6.0 (2007-10-03)
i486-pc-linux-gnu
locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] MASS_7.2-37 statmod_1.3.1 limma_2.12.0
loaded via a namespace (and not attached):
[1] rcompgen_0.1-15
Best wishes,
Tiandao