[BioC] GO term enrichment analysis over whole genome, copy number aberration investigation

Marc Carlson mcarlson at fhcrc.org
Fri Sep 5 17:35:53 CEST 2008


Hi Nathan,

Bioconductor does have packages that provide annotations for an entire 
organism, keyed on Entrez Gene IDs rather than on Affymetrix (or other 
platform) IDs.  These are called the organism packages.  They are named 
in a format like this example:  "org.Hs.eg.db", which is the "organism" 
package for "Homo sapiens" based on "Entrez Gene" IDs.  If you think 
this will help you, please try it out.
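
As a rough sketch of how the organism package could be used here (an 
illustration under assumptions, not a tested recipe for your data): it 
assumes org.Hs.eg.db is installed and that your AnnotationDbi provides 
the select() interface.  The vector stGained below is a made-up stand-in 
for your HUGO symbols, and "NOT_A_REAL_SYMBOL" is a deliberately invalid 
entry that shows how unmapped symbols surface as NA.

```r
library(AnnotationDbi)
library(org.Hs.eg.db)

# Hypothetical stand-in for the vector of HUGO symbols in gained regions
stGained <- c("TP53", "BRCA1", "NOT_A_REAL_SYMBOL")

# Map symbols to Entrez Gene IDs; symbols without a match come back as NA
# (select() stops with an error if none of the keys are valid)
mapped <- AnnotationDbi::select(org.Hs.eg.db,
                                keys    = stGained,
                                keytype = "SYMBOL",
                                columns = "ENTREZID")

# Symbols that could not be mapped, e.g. aliases or withdrawn symbols
unmapped <- mapped$SYMBOL[is.na(mapped$ENTREZID)]

# Unique Entrez Gene IDs for the symbols that did map
entrez_ids <- unique(mapped$ENTREZID[!is.na(mapped$ENTREZID)])

# One possible gene universe: every Entrez Gene ID with at least one
# GO annotation in the organism package
universe <- mappedkeys(org.Hs.egGO)

# Keep only the selected genes that are part of that universe
selected <- intersect(entrez_ids, universe)
```

Listing the unmapped symbols this way makes it easy to see which of your 
identifiers drop out, and restricting the universe to GO-annotated genes 
is one common choice rather than the only defensible one.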

  Marc




Nathan Harmston wrote:
> Hi,
>
> I currently have a list of HUGO gene ids which relate to genes in
> areas of gain over a whole chromosome, and would like to perform GO
> enrichment analysis on them. So I have 2 problems:
>
> 1. Currently I have been defining my gene universe based on Affymetrix
> arrays, however now I am working over the whole genome. gene_universe
> = getBM(c("entrezgene"), mart = ensembl) ......however this leaves me
> with a gene_universe of 20275 gene IDs (is this right?)
> 2. Moving from my HUGO identifiers to Entrez Gene IDs? I can do this
> using biomaRt
> test = getBM(c("entrezgene"), filters = "hgnc_symbol", values =
> stGained, mart = ensembl)
>
> However, the result is not the same length as my list of HUGO gene
> identifiers (in my case 30 are missing). Why is this? Is this just
> some weird annotation bug that can't be fixed, or is it the way I'm
> doing it? Does Bioconductor have the GO information for all genes
> in the genome and not just those in the annotation files for the
> Affymetrix arrays?
>
> Finally.....what are the statistical implications of performing GO
> enrichment (I'm using a conditional test) over a whole genome? Would it
> be better to run the gene set enrichment analysis on each chromosome
> (I don't think so)? I'm trying to find evidence that genes relating to
> certain functions are gained over the whole chromosome (cancer study).
> I've run a test one and have found some things which make sense.
>
> Many thanks in advance,
>
> Nathan


