[BioC] Collation of differentially expressed genes in Limma
Donald Dunbar
donald.dunbar at ed.ac.uk
Tue Aug 23 17:36:00 CEST 2005
Hello all,
I'd be grateful some advice on the best way to collate my significantly
differentially expressed gene data from Limma.
I have three sets of five Affy chips arranged as one factor with three
levels (two treatments and a control).
I build the design matrix:
mydesign <- model.matrix(~ -1+factor(c(1,1,1,2,2,2,3,3,3,1,1,2,2,3,3)))
colnames(mydesign) <- c("Control", "T1", "T2")
to give:
Control T1 T2
1 1 0 0
2 1 0 0
3 1 0 0
4 0 1 0
5 0 1 0
6 0 1 0
7 0 0 1
8 0 0 1
9 0 0 1
10 1 0 0
11 1 0 0
12 0 1 0
13 0 1 0
14 0 0 1
15 0 0 1
then fit the model:
fit <- lmFit(eset, mydesign)
then I make the contrasts:
contrast.matrix <- makeContrasts(T1-Control,T2-Control,T1-T2,
levels=mydesign)
and fit the contrast to the model:
fit2 <- contrasts.fit(fit, contrast.matrix)
apply empirical Bayes:
fit2 <- eBayes(fit2)
now I have three sets of p.values in fit2$p.value, but of course I need to
adjust for multiple testing. I can do this in topTable:
topTable_T1_Control <- topTable(fit2, coef=1, adjust='BH')
topTable_T2_Control <- topTable(fit2, coef=2, adjust='BH')
topTable_T1_T2 <- topTable(fit2, coef=3, adjust='BH')
but that does it individually and returns genes for each contrast.
I'd like the list of genes significant below a particular adjusted p-value
threshold then a score for each contrast (-1, 0, 1).
I can use mt.rawp2adjp from multtest:
multadjust <- mt.rawp2adjp(fit2$p.value, proc=c("BH"))
eBpvalues <- multadjust$adjp[order(multadjust$index),]
but I think I'm applying this wrongly: I think I need to apply it to each
contrast's p.value column separately, but I'm not sure how.
The other option I thought to use was decideTests, but that doesn't seem to
allow me to use BH as an adjustment option:
groups <- decideTests(fit2, method="separate",
adjust.method="fdr",p.value=0.05)
vennDiagram(groups)
So if anyone would give some advice on the best way to collate my
significant genes, I'd like to output a matrix with a row for each gene that
is significantly differentially expressed in any of the three contrasts,
then columns for adjusted p.value, M (ratio), and score (-1, 0, 1) for each
gene:
p.value T1-C M T1-C scoreT1-C p.valueT2-C etc...
gene1
gene2
etc...
I'm using:
platform powerpc-apple-darwin7.9.0
arch powerpc
os darwin7.9.0
system powerpc, darwin7.9.0
status
major 2
minor 1.1
year 2005
month 06
day 20
language R
Hope that's clear. Thanks in advance.
Donald
--
Dr Donald Dunbar
Head, Bioinformatics Team
Centres for Cardiovascular Science and Inflammation Research,
University of Edinburgh,
Queen¹s Medical Research Institute,
47 Little France Crescent
Edinburgh EH16 4TJ
United Kingdom
T: +44 131 242 6700
E: donald.dunbar at ed.ac.uk
W: www.bioinf.mvm.ed.ac.uk
More information about the Bioconductor
mailing list