[BioC] How to get gene information
    Saroj K Mohapatra 
    saroj at vt.edu
    Fri May 29 05:00:19 CEST 2009

Hi, I might have misunderstood your question the first time. Is it that 
you have a list of gene ids and you need to find their start and end 
locations on the chromosome? If so, I show an example below.
I load the human annotation package and start with a list of three gene symbols:
 > library(org.Hs.eg.db)
 > glist
[1] "CRIPAK" "CAND2"  "STK25"
I map the symbols to their Entrez Gene IDs:
 > eglist=as.character(unlist(mget(glist, revmap(org.Hs.egSYMBOL))))
 > eglist
[1] "285464" "23066"  "10494"
I find out which chromosomes these belong to:
 > mget(eglist, org.Hs.egCHR)
$`285464`
[1] "4"
$`23066`
[1] "3"
$`10494`
[1] "2"
Find the start positions (a negative value means the gene is on the antisense strand):
 > mget(eglist, org.Hs.egCHRLOC)
$`285464`
      4
1375339
$`23066`
       3
12813170
$`10494`
         2
-242083104
And the end positions:
 > mget(eglist, org.Hs.egCHRLOCEND)
$`285464`
      4
1379782
$`23066`
       3
12851301
$`10494`
         2
-242096707
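To put these results in one table, here is a minimal sketch in base R. It hard-codes the values printed above rather than querying org.Hs.eg.db again, and the name gene_table is only illustrative. It records the strand from the sign of the coordinate and then stores the positions as positive numbers:

```r
# Values copied from the mget() output above, keyed by Entrez Gene ID.
chr   <- c("285464" = "4", "23066" = "3", "10494" = "2")
start <- c("285464" = 1375339, "23066" = 12813170, "10494" = -242083104)
end   <- c("285464" = 1379782, "23066" = 12851301, "10494" = -242096707)

# A negative coordinate marks the antisense (minus) strand.
strand <- ifelse(start < 0, "-", "+")

# gene_table is a hypothetical name used only for this illustration.
gene_table <- data.frame(
  entrez = names(start),
  symbol = c("CRIPAK", "CAND2", "STK25"),
  chr    = unname(chr),
  strand = unname(strand),
  start  = abs(unname(start)),   # drop the sign once the strand is recorded
  end    = abs(unname(end)),
  stringsAsFactors = FALSE
)
print(gene_table)
```

For a longer list, the same vectors could presumably be built from the mget() results instead of being typed in by hand.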
Is this what you are looking for?

Best,
Saroj

Kay Jaja wrote:
> I have a list of 80 genes in a txt file and I am looking to use a database, for example NCBI, to get information on each of these genes. I need to get the start and end base pair positions for each gene listed in my file. Any idea how to get started or what to use?
>  
> Your help is greatly appreciated