[BioC] How to plot gene on their chromosome?
Hervé Pagès
hpages at fhcrc.org
Thu May 21 19:46:44 CEST 2009
Hi Simon,
It's not a good idea to start a new thread by replying to a different
thread you started previously: it then shows up under the previous thread
even if you changed the subject.
more below...
Simon Noël wrote:
> Hello everyone. I have a question. I have a gene list in an .xls file like
>
> probeID Symbol
> 1030431 ACSL1
> 4610431 ACTG2
> 4810575 ADAMTSL2
> 1510750 ADH1C
> 4060519 ADORA1
> 5720523 ADRA2A
> 2810482 AHNAK
> 1260270 AIM2
> 4180768 ALAS2
> ... ...
>
> I want to plot all of those genes on their chromosomes. How can I do this?
So first you need to map each gene to its chromosome location.
You can use one of the org.*.eg.db annotation packages for
this (pick the one for your organism):
http://bioconductor.org/packages/release/data/annotation/
Use the SYMBOL2EG map to map your gene symbols to their corresponding
Entrez IDs, then the CHRLOC map to map your Entrez IDs to their chromosome
locations.
Example:
library(org.Hs.eg.db)
# Gene symbols taken from the list above
mysymbols <- c("ACSL1", "ACTG2", "ADAMTSL2", "ADH1C",
               "ADORA1", "ADRA2A", "AHNAK", "AIM2", "ALAS2")
# Gene symbol -> Entrez Gene ID
myEgIDs <- unlist(mget(mysymbols, org.Hs.egSYMBOL2EG))
# Entrez Gene ID -> chromosome location(s)
mylocs <- unname(unlist(mget(myEgIDs, org.Hs.egCHRLOC)))
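Note that unname() drops the names of the location vectors, and those
names are the chromosomes. If you want to plot, you will probably want to
keep them. Here is a minimal sketch of one way to do that in base R. It
assumes the list returned by mget(myEgIDs, org.Hs.egCHRLOC) holds one named
integer vector per Entrez ID (names are chromosomes, a negative value means
the antisense strand). To keep it runnable without the annotation package,
'toy_locs' is a made-up stand-in for that list; its IDs, chromosomes and
positions are invented.

```r
# Toy stand-in for mget(myEgIDs, org.Hs.egCHRLOC).
# All IDs, chromosomes and positions below are invented for illustration.
toy_locs <- list(
  "101" = c("4" = -1500L),
  "102" = c("2" = -7400L, "2" = -7350L),  # one gene, two locations
  "103" = c("X" = 5500L)
)

# Flatten once, without names, so the signs can be inspected.
raw_starts <- unlist(toy_locs, use.names = FALSE)

# One row per (gene, location) pair, keeping the chromosome.
loc_table <- data.frame(
  gene_id    = rep(names(toy_locs), lengths(toy_locs)),
  chromosome = unlist(lapply(toy_locs, names), use.names = FALSE),
  start      = abs(raw_starts),
  strand     = ifelse(raw_starts < 0, "-", "+"),
  stringsAsFactors = FALSE
)

print(loc_table)
```

With the real data you would replace 'toy_locs' by the result of the mget()
call; a table like this (chromosome, start, strand) is the kind of input
most plotting approaches need.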
One thing to be aware of is that those mappings are not necessarily
one-to-one, e.g. the same symbol can be associated with different genes:
> flat <- toTable(org.Hs.egSYMBOL2EG)
> names(flat)
[1] "gene_id" "symbol"
> any(duplicated(flat$gene_id))
[1] FALSE
> any(duplicated(flat$symbol))
[1] TRUE
The same thing happens with the org.Hs.egCHRLOC map (I'm not sure
why we have this though, maybe others on the list can explain).
Anyway, this explains why 'mylocs' can have more elements than 'mysymbols'.
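To see which of your own symbols are affected, you can count the gene IDs
per symbol in the flattened table. A minimal sketch, where 'flat_toy' is a
made-up stand-in for toTable(org.Hs.egSYMBOL2EG) (same two columns, invented
IDs and symbols) so that it runs without the annotation package:

```r
# Toy stand-in for toTable(org.Hs.egSYMBOL2EG); IDs and symbols are invented.
flat_toy <- data.frame(
  gene_id = c("101", "102", "103", "104"),
  symbol  = c("GENEA", "GENEB", "GENEB", "GENEC"),
  stringsAsFactors = FALSE
)

# Hypothetical query list.
my_symbols <- c("GENEA", "GENEB")

# Rows of the map that match the query: one row per (symbol, gene) pair,
# so there can be more rows than query symbols.
hits <- flat_toy[flat_toy$symbol %in% my_symbols, ]

# Symbols associated with more than one gene ID.
ids_per_symbol <- table(hits$symbol)
ambiguous <- names(ids_per_symbol)[ids_per_symbol > 1]

print(hits)
print(ambiguous)
```

Here two query symbols give three rows in 'hits', which is the same effect
that makes 'mylocs' longer than 'mysymbols'.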
Cheers,
H.
>
> Simon Noël
> VP Externe CADEUL
> Association des étudiants et étudiantes en Biochimie,
> Bio-informatique et Microbiologie de l'Université Laval
> CdeC
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319