[BioC] How to get gene information
Saroj K Mohapatra
saroj at vt.edu
Fri May 29 05:00:19 CEST 2009
Hi, I might have misunderstood your question the first time. Is it that
you have a list of gene ids and you need to find their start and end
locations on the chromosome? If so, I show an example below.
I have a list with three genes:
> glist
[1] "CRIPAK" "CAND2" "STK25"
I get the Entrez Gene IDs (this needs the org.Hs.eg.db annotation package):
> library(org.Hs.eg.db)
> eglist=as.character(unlist(mget(glist, revmap(org.Hs.egSYMBOL))))
> eglist
[1] "285464" "23066" "10494"
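Since the original question mentions gene symbols kept in a text file, a minimal sketch of that first step might look like the following. The file layout (one symbol per line) is an assumption, and a temporary file stands in for the real one. The lookup is only attempted if org.Hs.eg.db is installed; ifnotfound=NA is used so that an unrecognized symbol gives NA rather than an error.

```r
# Hypothetical input: one gene symbol per line (a temp file stands in for the real file).
gene_file <- tempfile(fileext = ".txt")
writeLines(c("CRIPAK", " CAND2", "STK25", "", "STK25"), gene_file)

# Read the symbols, trim stray whitespace, drop blank lines and duplicates.
symbols <- trimws(readLines(gene_file))
symbols <- unique(symbols[nzchar(symbols)])

# Map symbols to Entrez Gene IDs only if the annotation package is available.
if (requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  library(org.Hs.eg.db)
  eg <- mget(symbols, revmap(org.Hs.egSYMBOL), ifnotfound = NA)
  print(eg)
}
```

A symbol may map to more than one Entrez Gene ID, so it is probably worth inspecting the list before flattening it with unlist().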
I find out which chromosomes these belong to:
> mget(eglist, org.Hs.egCHR)
$`285464`
[1] "4"
$`23066`
[1] "3"
$`10494`
[1] "2"
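Find the start positions (each value is named by its chromosome; a negative
value means the gene is on the antisense strand):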
> mget(eglist, org.Hs.egCHRLOC)
$`285464`
      4
1375339
$`23066`
       3
12813170
$`10494`
         2
-242083104
And the end positions:
> mget(eglist, org.Hs.egCHRLOCEND)
$`285464`
      4
1379782
$`23066`
       3
12851301
$`10494`
         2
-242096707
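To use these results further, it may help to collect them in one table. The sketch below is only an illustration: the values are copied by hand from the output above instead of being queried, and the column names are my own choice. It relies on the org.Hs.egCHRLOC convention that a leading minus sign marks the antisense strand.

```r
# Values copied from the output above, keyed by Entrez Gene ID.
chrom  <- c("285464" = "4", "23066" = "3", "10494" = "2")
starts <- c("285464" = 1375339, "23066" = 12813170, "10494" = -242083104)
ends   <- c("285464" = 1379782, "23066" = 12851301, "10494" = -242096707)

s <- unname(starts)
e <- unname(ends)

# The sign carries the strand; the coordinates themselves are the absolute values.
loc <- data.frame(
  entrez = names(starts),
  chr    = unname(chrom),
  strand = ifelse(s < 0, "-", "+"),
  start  = abs(s),
  end    = abs(e),
  stringsAsFactors = FALSE
)
print(loc)
```

With the real annotation objects, the same vectors could be built from the mget() results, keeping in mind that a gene can have more than one recorded location.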
Is this what you are looking for?
Best,
Saroj
Kay Jaja wrote:
> I have a list of 80 genes in a txt file and I am looking to use a database, for example NCBI, to get information on each of these genes. I need to get the start and end base-pair positions for each gene listed in my file. Any idea how to get started or what to use?
>
> Your help is greatly appreciated