[BioC] How to get gene information

Saroj K Mohapatra saroj at vt.edu
Fri May 29 05:00:19 CEST 2009


Hi, I might have misunderstood your question the first time. Is it that 
you have a list of gene ids and you need to find their start and end 
locations on the chromosome? If so, I show an example below.
I load the org.Hs.eg.db annotation package and start with a list of three genes:
 > library(org.Hs.eg.db)
 > glist
[1] "CRIPAK" "CAND2"  "STK25"
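
If the gene symbols are in a text file, as in the original question, a list like glist could be built roughly as follows. This is only a sketch: it assumes one symbol per line, and the temporary file here is a stand-in for the real file, whose name and layout I do not know.

```r
# Hypothetical input: a plain-text file with one gene symbol per line.
# A temporary file stands in for the real one here.
gene_file <- tempfile(fileext = ".txt")
writeLines(c("CRIPAK", "CAND2 ", "", "STK25"), gene_file)

# Read the symbols, trim stray whitespace, and drop blank lines and duplicates.
glist <- trimws(readLines(gene_file))
glist <- unique(glist[nzchar(glist)])
glist
```

With a real file, replace gene_file with its path and skip the writeLines() step.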

I get the entrez gene ids:
 > eglist=as.character(unlist(mget(glist, revmap(org.Hs.egSYMBOL))))
 > eglist
[1] "285464" "23066"  "10494"

I find out which chromosomes these belong to:
 > mget(eglist, org.Hs.egCHR)
$`285464`
[1] "4"

$`23066`
[1] "3"

$`10494`
[1] "2"

I find the start positions:
 > mget(eglist, org.Hs.egCHRLOC)
$`285464`
      4
1375339

$`23066`
       3
12813170

$`10494`
         2
-242083104

And the end positions:
 > mget(eglist, org.Hs.egCHRLOCEND)
$`285464`
      4
1379782

$`23066`
       3
12851301

$`10494`
         2
-242096707
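
Note that the values for 10494 (STK25) are negative. In the CHRLOC and CHRLOCEND maps a leading minus sign indicates that the gene lies on the antisense strand, so the coordinate itself is the absolute value. A minimal sketch of collecting the results above into one table follows; the numbers are copied by hand from the output shown, and the column names are my own choice.

```r
# Values copied from the CHRLOC / CHRLOCEND output above.
loc <- data.frame(
  symbol = c("CRIPAK", "CAND2", "STK25"),
  entrez = c("285464", "23066", "10494"),
  chr    = c("4", "3", "2"),
  start  = c(1375339, 12813170, -242083104),
  end    = c(1379782, 12851301, -242096707),
  stringsAsFactors = FALSE
)

# A negative coordinate marks the antisense (minus) strand.
loc$strand <- ifelse(loc$start < 0, "-", "+")

# Keep the positions themselves as positive numbers.
loc$start <- abs(loc$start)
loc$end   <- abs(loc$end)
loc
```

For the full list of 80 genes the same columns could be filled from the mget() results instead of being typed in, keeping in mind that some genes may map to more than one location.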

Is this what you are looking for?

Best,

Saroj


Kay Jaja wrote:
> I have a list of 80 genes in a txt file and I would like to use a database, for example NCBI, to get information on each of these genes. I need to get the start and end base-pair positions for each gene listed in my file. Any idea how to get started or what to use?
>  
> Your help is greatly appreciated