[BioC] Extracting genes within venn diagram

Celine Carret ckc at sanger.ac.uk
Thu Nov 6 14:21:36 CET 2008


Hi all,

I have a venn diagram obtained from Limma using decideTests() default
and I would like to extract the gene lists specific for each contrasts. 

Here is what I did:

>matrix <- read.csv("cells-HumanDB.csv") ##columns 1 and 2 are gene
names and uniprot labels, numerical values start on column 3

>design <- model.matrix(~
-1+factor(c(1,1,1,2,2,2,3,3,4,4,4,5,5,5,6,6,6)))
>colnames(design) <- c("GFPneg12h",
"GFPpos12h","GFPneg2h","GFPpos2h","DEXpos2h","NI")
>contrast.matrix <-
makeContrasts((GFPpos12h-GFPneg12h)-NI,(GFPpos2h-GFPneg2h)-NI,(DEXpos2h-
GFPneg2h)-NI,((GFPpos2h-GFPneg2h)-NI)-((DEXpos2h-GFPneg2h)-NI),((GFPpos1
2h-GFPneg12h)-NI)-((GFPpos2h-GFPneg2h)-NI), levels=design)
>fit <- lmFit(matrix[,3:19], design,weights=arrayw)
>fit2 <- contrasts.fit(fit, contrast.matrix)
>fit2 <- eBayes(fit2)
>results <- decideTests(fit2)
>vennDiagram(results[,2:3],cex=0.7)

and I get as numbers: 43 (682) 58 
and outside circles 712

Looking at the mail list archive I found a line of code from Gordon to
extract gene list within a venn diagram such as:

>alls <- apply(results[,2:3],1,all)
There were 50 or more warnings (use warnings() to see the first 50)
> fit2$genes[alls,]
And I get 682 genes corresponding to the intersection

>alls <- apply(!results[,2:3],1,all)
> fit2$genes[alls,]
Gave me 712 genes that are outside the venn diagram

But how can I get the genes specific for either contrast? (so in this
case 43 genes on one hand and 58 genes on another)

I also found from Jim this code using affycoretools functions but it
doesn't give me the right numbers and I can't understand why:

> alls <- affycoretools:::makeIndices(results[,2:3])
> alls.genes <- lapply(alls, function(x) row.names(results[,2:3])[x])
> alls.genes
[[1]] gives 56 genes
[[2]] gives 71 genes
[[3]] gives 669 genes
Shouldn't I get [[1]] 43; [[2]] 58 and [[3]] 682? 

I will be grateful for any help or advice on this matter.
Best wishes
Celine

Here is my R session info
> sessionInfo()
R version 2.8.0 (2008-10-20) 
i386-pc-mingw32 

locale:
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United
Kingdom.1252;LC_MONETARY=English_United
Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252

attached base packages:
[1] splines   tools     stats     graphics  grDevices utils     datasets
methods   base     

other attached packages:
 [1] affycoretools_1.14.0 annaffy_1.14.0       KEGG.db_2.2.5
gcrma_2.14.1         matchprobes_1.14.0  
 [6] biomaRt_1.16.0       GOstats_2.8.0        Category_2.8.0
genefilter_1.22.0    survival_2.34-1     
[11] RBGL_1.18.0          annotate_1.20.0      xtable_1.5-4
GO.db_2.2.5          RSQLite_0.7-1       
[16] DBI_0.2-4            AnnotationDbi_1.4.0  graph_1.20.0
affy_1.20.0          limma_2.16.2        
[21] Biobase_2.2.0       

loaded via a namespace (and not attached):
[1] affyio_1.10.1        cluster_1.11.11      GSEABase_1.4.0
preprocessCore_1.4.0 RCurl_0.91-0        
[6] XML_1.94-0.1


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE.



More information about the Bioconductor mailing list