[BioC] Gene enrichment question

Alex Gutteridge alexg at ruggedtextile.com
Wed Aug 15 17:02:16 CEST 2012

On 15.08.2012 14:51, Aliaksei Holik wrote:
> Dear listers,
> Apologies if my question is not strictly related to Bioconductor,
> though one never knows, maybe there's a package that does what I need.
> I am analysing a list of differentially expressed genes from an
> Illumina microarray. In particular I'm trying to compare the list of
> differentially expressed genes to an existing list of genes
> preferentially expressed in the stem cell population (stem cell
> signature). When I do so, 10% of DE genes belong to the stem cell
> signature. What I'd like to do now is to find out, how likely that
> would happen by chance, i.e. put a p value on it.
> At the moment I know:
> There're 17119 unique genes in my dataset.
> Of them 86 are differentially expressed.
> The stem cell signature contains 510 genes.
> It is combined from several platforms, which makes it hard to
> establish the total number of unique genes, but it's at least 20819
> (the size of the largest platform).
> There are 9 overlapping genes between DE genes and the stem cell
> signature.
> So I wonder:
> 1) If there's an accepted way to calculate a p value using these
> data. For instance could I run something like a chi squared test?
> E.g. stem cell specific genes represent 510/20819=2.4% of total
> dataset. So expected number of the stem cell genes in my DE genes
> would be 86x2.4%=2. So my chi squared test would be based on 9
> observed vs 2 expected.

Hypergeometric test?

> phyper(9-1,86,17119-86,510,lower.tail=FALSE)
[1] 0.001035456
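
Spelled out with named counts, the call above might look like the sketch 
below. The variable names are illustrative (they are not from the original 
post), and it assumes all 510 signature genes are present among the 17,119 
genes in the dataset. It also shows that a one-sided Fisher's exact test on 
the corresponding 2x2 table should give the same P value.

```r
# Counts taken from the original post
n_universe  <- 17119  # unique genes in the dataset
n_de        <- 86     # differentially expressed (DE) genes
n_signature <- 510    # stem cell signature genes (assumed all in the dataset)
n_overlap   <- 9      # DE genes that are also in the signature

# Expected overlap if the signature were a random draw from the universe
expected <- n_de * n_signature / n_universe

# P(overlap >= 9): upper tail of the hypergeometric distribution.
# n_overlap - 1 is used because lower.tail = FALSE returns P(X > q).
p_hyper <- phyper(n_overlap - 1, n_de, n_universe - n_de, n_signature,
                  lower.tail = FALSE)

# The same question posed as a one-sided Fisher's exact test
tab <- matrix(c(n_overlap,               n_de - n_overlap,
                n_signature - n_overlap,
                n_universe - n_de - n_signature + n_overlap),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("DE", "not DE"),
                              c("signature", "not signature")))
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

print(c(expected = expected, p_hyper = p_hyper, p_fisher = p_fisher))
```

With these counts the expected overlap is about 2.6 genes, against 9 
observed.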

For the total number of genes I used your lower estimate (17,119 rather 
than 20,819) to be conservative. To be completely correct I think you would 
need to remove any of the 510 genes that are not in your 17,119 gene 
dataset. That will only make the P value smaller though (as they cannot be 
called DE if they are not in your dataset) and it is already 'significant' 
by most journals' standards.
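
If the gene identifiers themselves are available, trimming the signature to 
genes present in the dataset could be sketched roughly as follows. The 
function overlap_p and the toy ID vectors are hypothetical, made up purely 
for illustration, and the sketch assumes all three lists use the same 
identifier type.

```r
# Hypothetical helper: each argument is a character vector of gene IDs
# in a shared identifier space (an assumption; the post gives only counts).
overlap_p <- function(universe, de, signature) {
  universe  <- unique(universe)
  de        <- intersect(unique(de), universe)
  # Drop signature genes that are not in the dataset: they can never be DE
  signature <- intersect(unique(signature), universe)
  overlap   <- length(intersect(de, signature))
  # P(overlap >= observed) under the hypergeometric distribution
  phyper(overlap - 1, length(de), length(universe) - length(de),
         length(signature), lower.tail = FALSE)
}

# Toy data, invented for illustration only
universe  <- paste0("g", 1:1000)
de        <- paste0("g", 1:50)
# 60 signature genes in the dataset plus 20 IDs that are absent from it
signature <- c(paste0("g", 41:100), paste0("x", 1:20))

p_toy <- overlap_p(universe, de, signature)
print(p_toy)
```

Leaving the 20 absent IDs in the signature count would inflate the number 
of draws and so give a larger P value for the same overlap.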

Alex Gutteridge
