[BioC] Venn Diagram

Simon Noël simon.noel.2 at ulaval.ca
Thu Jul 2 23:27:56 CEST 2009


I spoke to some colleagues, and it is possible to split my data into 2 groups of
5 lists, so a 5-set diagram would be possible. But how do I create one? Do I need
to draw it with a pencil, or is there a way to do it in R or with another program?

Quoting Thomas Girke <thomas.girke at ucr.edu>, 02.07.2009:

> To get an impression of how "pretty and confusingly complex" Venn diagrams
> with more than 5 sets look, one can take a look at this page
> from combinatorics.org:
> http://www.combinatorics.org/Surveys/ds5/VennSymmEJC.html.
>
> Also, here is a small collection of methods/ideas for analyzing intersect
> relationships among large numbers of sample sets:
>
> http://faculty.ucr.edu/~tgirke/Documents/R_BioCond/R_BioCondManual.html#R_graphics_overlapper
> These approaches are much more scalable than Venn comparisons, but lack
> their logical 'not in' relations. The function for computing 'All
> Possible Intersects' is, utility-wise, the closest alternative to Venn
> diagrams.
>
> Thomas
>
>
> On Wed, Jul 01, 2009 at 09:41:25PM -0700, hpages at fhcrc.org wrote:
> > Oops, this is wrong, sorry! See a modified version of
> > makeVennTable() below that hopefully does the right thing.
> >
> > Quoting Hervé Pagès <hpages at fhcrc.org>:
> >
> > >Hi Simon,
> > >
> > >Simon Noël wrote:
> > >>Hello everyone.
> > >>
> > >>I have ten lists of between 4 and 3000 genes and I would like to put them
> > >>all together in a Venn diagram.
> > >>
> > >>I have tried to load the library ABarray and to use doVennDiagram, but
> > >>it can only use 3 lists.
> > >>
> > >>Does anyone know a way to put all ten of my lists in the same Venn
> > >>diagram?
> > >
> > >A Venn diagram is a 2-D drawing of all the possible intersections
> > >between 2 or 3 sets where each set is represented by a simple 2-D
> > >shape (typically a circle). In the case of 3 sets, the resulting
> > >diagram partitions the 2-D plane into 8 regions.
> > >Some people have tried (with more or less success) to put 4 sets on
> > >the diagram, but then they need to use more complicated shapes and
> > >the resulting diagram is no longer as easy to read. With 10 sets,
> > >you would end up with 1024 (2^10) regions in your drawing and you
> > >would need to use extremely complicated shapes for each set,
> > >making it really hard to read! Maybe in that case it's easier
> > >to generate the table below.
> > >
> > >## Let's say your genes are in 'set1', 'set2', etc... Put all the
> > >## sets in a big list:
> > >
> > >mysets <- list(set1, set2, ..., set10)
> > >
> > >makeVennTable <- function(sets)
> > >{
> > >  mkAllLogicalVect <- function(length)
> > >  {
> > >    if (length == 0L)
> > >      return(logical(0))
> > >    ans0 <- mkAllLogicalVect(length - 1L)
> > >    ans1 <- cbind(TRUE, ans0)
> > >    ans2 <- cbind(FALSE, ans0)
> > >    rbind(ans1, ans2)
> > >  }
> > >  lm <- mkAllLogicalVect(length(sets))
> > >  subsets <- apply(lm, MARGIN=1,
> > >               function(ii)
> > >               {
> > >                 s <- sets[ii]
> > >                 if (length(s) == 0)
> > >                   return("")
> > >                 paste(sort(unique(unlist(s))), collapse=",")
> > >               })
> > >  data.frame(lm, subsets)
> > >}
> > >
> > >Then call makeVennTable() on 'mysets'. For example, with 5 small sets:
> > >
> > >  > mysets <- list(c(1,5,12,4,9,29),
> > >                  c(4,11,3,18),
> > >                  c(22,4,12,19,8),
> > >                  c(7,12,4,5,3),
> > >                  c(25,24,4,2))
> > >
> > >  > makeVennTable(mysets)
> > >        X1    X2    X3    X4    X5                                 subsets
> > >  1   TRUE  TRUE  TRUE  TRUE  TRUE 1,2,3,4,5,7,8,9,11,12,18,19,22,24,25,29
> > >  2   TRUE  TRUE  TRUE  TRUE FALSE         1,3,4,5,7,8,9,11,12,18,19,22,29
> > >  3   TRUE  TRUE  TRUE FALSE  TRUE   1,2,3,4,5,8,9,11,12,18,19,22,24,25,29
> > >  4   TRUE  TRUE  TRUE FALSE FALSE           1,3,4,5,8,9,11,12,18,19,22,29
> > >  5   TRUE  TRUE FALSE  TRUE  TRUE         1,2,3,4,5,7,9,11,12,18,24,25,29
> > >  6   TRUE  TRUE FALSE  TRUE FALSE                 1,3,4,5,7,9,11,12,18,29
> > >  7   TRUE  TRUE FALSE FALSE  TRUE           1,2,3,4,5,9,11,12,18,24,25,29
> > >  8   TRUE  TRUE FALSE FALSE FALSE                   1,3,4,5,9,11,12,18,29
> > >  9   TRUE FALSE  TRUE  TRUE  TRUE       1,2,3,4,5,7,8,9,12,19,22,24,25,29
> > >  10  TRUE FALSE  TRUE  TRUE FALSE               1,3,4,5,7,8,9,12,19,22,29
> > >  11  TRUE FALSE  TRUE FALSE  TRUE           1,2,4,5,8,9,12,19,22,24,25,29
> > >  12  TRUE FALSE  TRUE FALSE FALSE                   1,4,5,8,9,12,19,22,29
> > >  13  TRUE FALSE FALSE  TRUE  TRUE               1,2,3,4,5,7,9,12,24,25,29
> > >  14  TRUE FALSE FALSE  TRUE FALSE                       1,3,4,5,7,9,12,29
> > >  15  TRUE FALSE FALSE FALSE  TRUE                   1,2,4,5,9,12,24,25,29
> > >  16  TRUE FALSE FALSE FALSE FALSE                           1,4,5,9,12,29
> > >  17 FALSE  TRUE  TRUE  TRUE  TRUE        2,3,4,5,7,8,11,12,18,19,22,24,25
> > >  18 FALSE  TRUE  TRUE  TRUE FALSE                3,4,5,7,8,11,12,18,19,22
> > >  19 FALSE  TRUE  TRUE FALSE  TRUE            2,3,4,8,11,12,18,19,22,24,25
> > >  20 FALSE  TRUE  TRUE FALSE FALSE                    3,4,8,11,12,18,19,22
> > >  21 FALSE  TRUE FALSE  TRUE  TRUE                2,3,4,5,7,11,12,18,24,25
> > >  22 FALSE  TRUE FALSE  TRUE FALSE                        3,4,5,7,11,12,18
> > >  23 FALSE  TRUE FALSE FALSE  TRUE                       2,3,4,11,18,24,25
> > >  24 FALSE  TRUE FALSE FALSE FALSE                               3,4,11,18
> > >  25 FALSE FALSE  TRUE  TRUE  TRUE              2,3,4,5,7,8,12,19,22,24,25
> > >  26 FALSE FALSE  TRUE  TRUE FALSE                      3,4,5,7,8,12,19,22
> > >  27 FALSE FALSE  TRUE FALSE  TRUE                    2,4,8,12,19,22,24,25
> > >  28 FALSE FALSE  TRUE FALSE FALSE                            4,8,12,19,22
> > >  29 FALSE FALSE FALSE  TRUE  TRUE                      2,3,4,5,7,12,24,25
> > >  30 FALSE FALSE FALSE  TRUE FALSE                              3,4,5,7,12
> > >  31 FALSE FALSE FALSE FALSE  TRUE                               2,4,24,25
> > >  32 FALSE FALSE FALSE FALSE FALSE
> >
> > The above table is clearly not what is expected, because the subsets
> > in the last column do not form a partition of the initial set of genes
> > (some ids appear in several rows).
> > Try this instead:
> >
> > makeVennTable <- function(sets)
> > {
> >    ## Logical matrix with one row per TRUE/FALSE combination
> >    ## (2^length rows) and one column per set.
> >    mkAllLogicalVect <- function(length)
> >    {
> >      if (length == 0L)
> >        return(logical(0))
> >      ans0 <- mkAllLogicalVect(length - 1L)
> >      ans1 <- cbind(TRUE, ans0)
> >      ans2 <- cbind(FALSE, ans0)
> >      rbind(ans1, ans2)
> >    }
> >    ## Intersection of any number of sets.
> >    minter.int <- function(...)
> >    {
> >      args <- list(...)
> >      if (length(args) == 0)
> >        return(integer(0))
> >      if (length(args) == 1)
> >        return(args[[1]])
> >      intersect(args[[1]], do.call(minter.int, args[-1]))
> >    }
> >    ## Union of any number of sets.
> >    munion.int <- function(...)
> >    {
> >      unique(unlist(list(...)))
> >    }
> >    lm <- mkAllLogicalVect(length(sets))
> >    ## For each row: ids found in all the selected sets (TRUE columns)
> >    ## and in none of the other sets (FALSE columns).
> >    parts <- apply(lm, MARGIN=1,
> >                 function(ii)
> >                 {
> >                   s1 <- do.call(minter.int, sets[ii])
> >                   s2 <- do.call(munion.int, sets[!ii])
> >                   part <- setdiff(s1, s2)
> >                   if (length(part) == 0)
> >                     return("")
> >                   paste(sort(part), collapse=",")
> >                 })
> >    data.frame(lm, parts)
> > }
> >
> > Then:
> >
> > >makeVennTable(mysets)
> >       X1    X2    X3    X4    X5   parts
> > 1   TRUE  TRUE  TRUE  TRUE  TRUE       4
> > 2   TRUE  TRUE  TRUE  TRUE FALSE
> > 3   TRUE  TRUE  TRUE FALSE  TRUE
> > 4   TRUE  TRUE  TRUE FALSE FALSE
> > 5   TRUE  TRUE FALSE  TRUE  TRUE
> > 6   TRUE  TRUE FALSE  TRUE FALSE
> > 7   TRUE  TRUE FALSE FALSE  TRUE
> > 8   TRUE  TRUE FALSE FALSE FALSE
> > 9   TRUE FALSE  TRUE  TRUE  TRUE
> > 10  TRUE FALSE  TRUE  TRUE FALSE      12
> > 11  TRUE FALSE  TRUE FALSE  TRUE
> > 12  TRUE FALSE  TRUE FALSE FALSE
> > 13  TRUE FALSE FALSE  TRUE  TRUE
> > 14  TRUE FALSE FALSE  TRUE FALSE       5
> > 15  TRUE FALSE FALSE FALSE  TRUE
> > 16  TRUE FALSE FALSE FALSE FALSE  1,9,29
> > 17 FALSE  TRUE  TRUE  TRUE  TRUE
> > 18 FALSE  TRUE  TRUE  TRUE FALSE
> > 19 FALSE  TRUE  TRUE FALSE  TRUE
> > 20 FALSE  TRUE  TRUE FALSE FALSE
> > 21 FALSE  TRUE FALSE  TRUE  TRUE
> > 22 FALSE  TRUE FALSE  TRUE FALSE       3
> > 23 FALSE  TRUE FALSE FALSE  TRUE
> > 24 FALSE  TRUE FALSE FALSE FALSE   11,18
> > 25 FALSE FALSE  TRUE  TRUE  TRUE
> > 26 FALSE FALSE  TRUE  TRUE FALSE
> > 27 FALSE FALSE  TRUE FALSE  TRUE
> > 28 FALSE FALSE  TRUE FALSE FALSE 8,19,22
> > 29 FALSE FALSE FALSE  TRUE  TRUE
> > 30 FALSE FALSE FALSE  TRUE FALSE       7
> > 31 FALSE FALSE FALSE FALSE  TRUE 2,24,25
> > 32 FALSE FALSE FALSE FALSE FALSE
> >
> > H.
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
> > http://news.gmane.org/gmane.science.biology.informatics.conductor
> >
>
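
For ten lists, the table quoted above would have 1024 rows, most of them
probably empty. A possible alternative, sketched below, is to build only the
non-empty regions by grouping each id by the sets it belongs to. This is a
minimal sketch, not code from the thread: the function name vennParts() and
the 0/1 "signature" string are hypothetical names chosen for illustration, and
it assumes 'sets' is a non-empty list holding at least one id.

```r
## Hypothetical helper (not from the thread): returns one row per
## NON-EMPTY Venn region instead of all 2^n combinations.
vennParts <- function(sets)
{
  ids <- sort(unique(unlist(sets)))
  ## Membership matrix: one row per id, one column per set.
  member <- matrix(FALSE, nrow=length(ids), ncol=length(sets))
  for (j in seq_along(sets))
    member[ , j] <- ids %in% sets[[j]]
  ## A signature such as "10110" means: in sets 1, 3 and 4 only.
  signature <- apply(member, 1,
                     function(r) paste(as.integer(r), collapse=""))
  ## Ids sharing a signature fall in the same region of the diagram.
  parts <- split(ids, signature)
  data.frame(signature=names(parts),
             n=lengths(parts),
             parts=vapply(parts, paste, character(1), collapse=","),
             row.names=NULL, stringsAsFactors=FALSE)
}

## Same 5 small sets as in the quoted example.
mysets <- list(c(1,5,12,4,9,29),
               c(4,11,3,18),
               c(22,4,12,19,8),
               c(7,12,4,5,3),
               c(25,24,4,2))
res <- vennParts(mysets)
print(res)
```

Each id ends up in exactly one row, so the rows should match the non-empty
rows of the corrected table, with "1"/"0" in the signature standing for
TRUE/FALSE.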


Simon Noël
VP Externe CADEUL
Association des étudiants et étudiantes en Biochimie, Bio-
informatique et Microbiologie de l'Université Laval
CdeC


