[BioC] package or code to quantify the significance of the venn overlap between 2 or 3 lists of genes
whuber at embl.de
Wed Mar 17 14:48:52 CET 2010
I don't think what you need here is necessarily a package - the required
computations, where they are possible at all, take one or a few lines of
R using standard functions, e.g. phyper in the "stats" package.
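
As a minimal sketch of what such a few lines might look like for a single
pair of lists: the counts below (N, n1, n2, k) are made-up placeholders,
not numbers from this thread, and the calculation assumes that one list is
a simple random draw from the chip background, independent of the other.
Whether that assumption fits the data is exactly the question to settle
first.

```r
# Hypothetical counts, for illustration only (not from the thread).
N  <- 20000   # probes on the chip (the background "universe")
n1 <- 500     # size of gene list A
n2 <- 300     # size of gene list B
k  <- 25      # observed overlap between A and B

# P(overlap >= k) if list B were n2 probes drawn at random from the
# background: the upper tail of the hypergeometric distribution.
# Arguments of phyper(q, m, n, k): q = overlap, m = probes in A,
# n = probes not in A, k = number of probes drawn (size of B).
p_value <- phyper(k - 1, n1, N - n1, n2, lower.tail = FALSE)

# The same quantity via a one-sided Fisher's exact test on the 2x2 table
# (columns: in A / not in A; rows: in B / not in B).
tab <- matrix(c(k, n1 - k, n2 - k, N - n1 - n2 + k), nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value
```

Note the "k - 1": with lower.tail = FALSE, phyper returns P(X > q), so
subtracting one gives the probability of an overlap of at least k.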
Perhaps the more important thing to do is to precisely define the
questions you want to be asking. For this, discussion with a local
statistician might be helpful. Once you have that, the answer will
probably be fairly obvious from a basic textbook on combinatorics
(probability theory on discrete variables).
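
To illustrate how the answer depends on the question: if the question were
"what overlap would independent random draws of the same sizes produce?",
the resampling idea in the quoted message below could be sketched roughly
as follows. All counts and the helper name chance_overlap are hypothetical,
and the sketch assumes the lists are independent, so it does not address
lists derived from the same tissue.

```r
set.seed(1)                  # make the simulation reproducible

# Hypothetical counts, for illustration only (not from the thread).
N        <- 20000            # background probes on the chip
sizes    <- c(500, 300, 400) # sizes of three short-lists
observed <- 5                # observed three-way overlap (made up)
B        <- 10000            # number of random draws

# One random draw: sample each list independently from the background
# and count the probes common to all of them.
chance_overlap <- function(N, sizes) {
  lists <- lapply(sizes, function(s) sample.int(N, s))
  length(Reduce(intersect, lists))
}

null_overlaps <- replicate(B, chance_overlap(N, sizes))

expected <- mean(null_overlaps)               # simulated chance overlap
z <- (observed - expected) / sd(null_overlaps) # distance in SD units
# Empirical one-sided p-value; the +1 keeps it from being exactly zero.
p_emp <- (sum(null_overlaps >= observed) + 1) / (B + 1)

# Under the same independence assumption the expectation is also
# available in closed form: N * prod(sizes / N).
expected_exact <- prod(sizes) / N^(length(sizes) - 1)
```

The same function works for a pair of lists by passing two sizes. When the
chance overlap is this small, the null distribution is strongly skewed, so
the empirical p-value is probably more informative than the z-score.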
Karl Brand wrote 17/03/10 12:26:
> Dear BioCers,
> I've got six lists of genes, and I'm focused on the overlaps between
> them. What I'm searching for is a package or code to quantify the
> significance of the overlap between a pair of gene lists, and also
> among three gene lists. Six might be interesting, but is not necessary.
> Specifically, what overlap would be expected by chance, and how many
> standard deviations is my actual overlap from the estimated chance
> overlap?
> Whilst some of my lists are independent, others are not, being derived
> from tissues of the same origin. I understand this would exclude tests
> such as Fisher's exact test, which assume independence.
> By randomly selecting the same numbers of chip-background probes and
> short-listed probes of interest and checking the overlap, repeated say
> 10,000 times, I think I could obtain the estimates I'm looking for in a
> 'statistically acceptable' manner.
> Does anyone know of a package or code written for this purpose? I failed
> to find anything in Bioconductor or in the BioC lists. As simple as
> coding it no doubt is, my lack of R knowledge would make doing it with a
> calculator the faster option :)
> Look forward to any recommendations or suggestions with appreciation,