[BioC] 2fold sigchanges from limma
Ivan Baxter
ibaxter at purdue.edu
Sat Dec 2 00:04:51 CET 2006
Greetings-
I am using limma to analyze a fairly simple multiple treatment
experiment and I would like to pull out all the genes which are
significantly expressed and change at least two fold. I have figured out
a way to do this, but it is a little complicated and I am running into
problems downstream when I try to format output tables using annaffy
(described below). Is there a simple way to filter the output of
topTable to only get the genes which are significant and change more
than a give fold-change cutoff?
The way that I am going about this is....
cont.matrix <- makeContrasts(
EGvsC = EG.C.C.C - C.C.C.C,
ECvsC = C.EC.C.C - C.C.C.C,
EGECvsC = EG.EC.C.C - C.C.C.C,
GTvsC = C.C.G.C - C.C.C.C,
GT_APvsC = C.C.G.A - C.C.C.C,
GT_APvsGT = C.C.G.A - C.C.G.C,
levels=design)
fit2 <- contrasts.fit(fit, cont.matrix)
fit2 <- eBayes(fit2)
results <- decideTests(fit2)
idx <- abs(fit2$coefficients) > 1
#make a dataframe like results where 1/-1 indicates sigchange greater
than 2 fold
comb <- data.frame(gene = rownames(results))
for(i in 1:length(rownames(results))){
for(j in 1:length(colnames(results))){
if(results[i,j] != 0 & idx[i,j] == "TRUE"){
comb[i,j+1] <- results[i,j]
}
else{comb[i,j+1] <-0}
}
}
#but then I want to make an output table with gene names, gene
annotations, fold change, pvalue and the expression values #across the
arrays......
cax1_genes <- unlist(as.list(comb$gene[comb$GTAPvsC != 0]))
syms <- unlist(mget(cax1_genes, hgu133a2GENENAME))
test <- match(cax1_genes, geneNames(human.eps2))
anncols <- aaf.handler()[c(1:4,7)]
anntable <- aafTableAnn(geneNames(human.eps2)[test], "hgu133a2", anncols)
#now I need to get the fold- change and adj.p.value for each gene, and
here is where I run into trouble.
#I tried pulling out all the significant changes using topTable and then
pulling out the list of genes that met my criteria, but...
contp <- 5 # this is the contrast which is being tested
caxsig <- length(which(p.adjust(fit2$p.value[,contp], method = "BH") <
0.05))
cax1sig <- topTable(fit2, coef =contp, number =caxsig, adjust.method =
"none", sort.by = "p", resort.by = "M")
testtable <- aafTable("log2 change" = format(cax1sig$M[cax1sig$ID ==
cax1_genes], digits = 2),
"pval" =
format(cax1sig$adj.P.Val[cax1sig$ID == cax1_genes], digits = 2))
anntablep <- merge(anntable, testtable)
# doesn't work, I get the following error:
> testtable <- aafTable("log2 change" = format(cax1sig$M[cax1sig$ID ==
cax1_genes], digits = 2), "pval" = format(cax1sig$adj.P.Val[cax1sig$ID
== cax1_genes], digits = 2))
Warning messages:
1: longer object length
is not a multiple of shorter object length in: cax1sig$ID ==
cax1_genes
2: longer object length
is not a multiple of shorter object length in: cax1sig$ID ==
cax1_genes
If this is actually a good way to pull out the two fold genes, could
anyone tell me what I am doing wrong with this last step?
thanks in advance.
Ivan
> sessionInfo()
Version 2.3.0 (2006-04-24)
i386-pc-mingw32
attached base packages:
[1] "grid" "splines" "tools" "methods" "stats"
"graphics" "grDevices" "utils" "datasets" "base"
other attached packages:
annaffy KEGG GO hgu133a2 RColorBrewer
geneplotter annotate hexbin colorspace lattice
genefilter
"1.4.0" "1.12.0" "1.12.0" "1.12.0" "0.2-3"
"1.10.0" "1.10.0" "1.6.0" "0.9" "0.13-8" "1.10.1"
survival limma gcrma matchprobes affy
affyio Biobase
"2.24" "2.7.3" "2.4.1" "1.4.0" "1.10.0"
"1.0.0" "1.10.0"
--
**************************************************************
Ivan Baxter
Research Scientist
Bindley Bioscience Center
Purdue University
765-543-7288
ibaxter at purdue.edu
More information about the Bioconductor
mailing list