[BioC] simple functional analysis of gene lists

Sean Davis seandavi at gmail.com
Wed Mar 31 19:05:35 CEST 2010

On Wed, Mar 31, 2010 at 8:29 AM, Rainer Tischler <rainer_t62 at yahoo.de> wrote:
> Dear all,
> I have a script that generates a large number of gene sets (vectors of gene names) and would like to apply a functional analysis (e.g. how many transcription factors occur in each gene set? how many kinases? etc.). I can convert the gene names into different formats using biomaRt. However, the only functional analysis tools I have found in R apply an enrichment analysis on either GO or KEGG gene sets. Is there a package that allows me to answer simpler questions, e.g. just counting the number of transcription factors in a gene set by connecting to a public database?

Hi, Rainer.

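It will depend a bit on the category, but for transcription factors,
for example, I think it suffices to get the genes that are annotated
with GO:0003700 (transcription factor activity) or with any of its
descendant terms. The org.Hs.egGO2ALLEGS map used below already
includes genes annotated to descendant terms.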


> library(org.Hs.eg.db)
> egs = keys(org.Hs.egGO)
> length(egs)
[1] 45469
> randGenes = sample(egs,100)
> tfGenes = get('GO:0003700',org.Hs.egGO2ALLEGS)
> intersect(tfGenes,randGenes)

> sessionInfo()
R version 2.11.0 Under development (unstable) (2009-11-03 r50304)

[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

other attached packages:
[1] org.Hs.eg.db_2.4.0  RSQLite_0.8-4       DBI_0.2-5
[4] AnnotationDbi_1.9.6 Biobase_2.7.5

loaded via a namespace (and not attached):
[1] tools_2.11.0
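
Since the original question was about many gene sets rather than one, the intersect() step above could be wrapped in a small function and applied to a list. This is only a rough sketch: countInSets is a hypothetical helper name, the Entrez Gene IDs are made up so that it runs without any annotation package, and it assumes the gene sets have already been converted to Entrez Gene IDs (e.g. with biomaRt).

```r
# Hypothetical helper (not part of any package): for each gene set, count
# how many distinct members also occur in a reference vector of gene IDs.
countInSets <- function(geneSets, reference) {
  vapply(geneSets,
         function(gs) length(intersect(gs, reference)),
         integer(1))
}

# In a real session the reference vector would come from the annotation
# package, as in the transcript above:
#   library(org.Hs.eg.db)
#   tfGenes <- get('GO:0003700', org.Hs.egGO2ALLEGS)
# Made-up IDs are used here instead (assumption, for illustration only).
# An ID is deliberately repeated, since such a vector may list a gene more
# than once; intersect() drops duplicates, so each gene is counted once.
tfGenes <- c("1", "2", "3", "4", "4")

# Toy gene sets; names and contents are invented.
geneSets <- list(setA = c("1", "2", "10"),
                 setB = c("20", "30"),
                 setC = c("3", "4", "4", "5"))

tfCounts <- countInSets(geneSets, tfGenes)
print(tfCounts)
```

The result is a named integer vector with one count per gene set. The same helper should work for other categories by swapping the reference vector, for example the genes under GO:0016301 (kinase activity).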
