[BioC] converting gene name to affy probe

Martin Morgan mtmorgan at fhcrc.org
Tue May 6 05:28:29 CEST 2008


Hi Ruppert --

The annotation package hgu133a.db contains a 'map' from probe id to
symbol (I guess this is what you mean by 'gene name'):

> library(hgu133a.db)
> ls(2)
 [1] "hgu133a"              "hgu133aACCNUM"        "hgu133aALIAS2PROBE"  
 [4] "hgu133aCHR"           "hgu133aCHRLENGTHS"    "hgu133aCHRLOC"       
 [7] "hgu133a_dbconn"       "hgu133a_dbfile"       "hgu133a_dbInfo"      
[10] "hgu133a_dbschema"     "hgu133aENSEMBL"       "hgu133aENSEMBL2PROBE"
[13] "hgu133aENTREZID"      "hgu133aENZYME"        "hgu133aENZYME2PROBE" 
[16] "hgu133aGENENAME"      "hgu133aGO"            "hgu133aGO2ALLPROBES" 
[19] "hgu133aGO2PROBE"      "hgu133aMAP"           "hgu133aMAPCOUNTS"    
[22] "hgu133aOMIM"          "hgu133aORGANISM"      "hgu133aPATH"         
[25] "hgu133aPATH2PROBE"    "hgu133aPFAM"          "hgu133aPMID"         
[28] "hgu133aPMID2PROBE"    "hgu133aPROSITE"       "hgu133aREFSEQ"       
[31] "hgu133aSYMBOL"        "hgu133aUNIGENE"

You'd like to reverse the map so that it goes from symbol to probe id

> rmap = revmap(hgu133aSYMBOL)

and then look up all your symbols

> syms = c("NAT1", "TCF3")
> mget(syms, rmap)
$NAT1
[1] "214440_at"

$TCF3
 [1] "209151_x_at" "209152_s_at" "209153_s_at" "210776_x_at" "213730_x_at"
 [6] "213731_s_at" "213732_at"   "213809_x_at" "213811_x_at" "215260_s_at"
[11] "216645_at"   "216647_at"  

Not sure where you want to go from here, though?
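
If a flat table of symbol / probe pairs is the goal, one option is to unroll
the list with base R's stack(). This is only a sketch: the list below is a
hand-typed stand-in for the mget() result (TCF3 is truncated to three probes),
"NOSUCHGENE" is a made-up symbol with no match, and the output file name is
illustrative. As far as I know, mget() on these maps accepts ifnotfound = NA,
which yields NA for unmatched symbols instead of an error.

```r
# Stand-in for: probes <- mget(syms, rmap, ifnotfound = NA)
# Probe ids for NAT1 and TCF3 are copied from the output above (TCF3 truncated);
# "NOSUCHGENE" is a hypothetical symbol with no probe on the array.
probes <- list(
  NAT1 = "214440_at",
  TCF3 = c("209151_x_at", "209152_s_at", "209153_s_at"),
  NOSUCHGENE = NA_character_
)

# stack() repeats each list name once per element: one row per symbol/probe pair
tbl <- stack(probes)
names(tbl) <- c("probe", "symbol")
tbl$symbol <- as.character(tbl$symbol)

# Drop symbols that had no match, and put the symbol column first
tbl <- tbl[!is.na(tbl$probe), c("symbol", "probe")]
rownames(tbl) <- NULL
print(tbl)

# Optionally write a tab-delimited file (file name is an assumption)
# write.table(tbl, "symbol2probe.txt", sep = "\t", quote = FALSE,
#             row.names = FALSE)
```

A symbol with several probes simply occupies several rows, so nothing special
is needed for the one-to-many case.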

Martin

Ruppert Valentino <ruppert7 at hotmail.com> writes:

> Hello, I am trying to convert a file of gene names to corresponding
> affy probe names. I managed to write a script that puts the genes in
> an array then I use the feat = getFeature(symbol = gensym, type =
> "affy_hg_u133a", mart = mart) in biomaRt however I seem to hit a snag
> when there is more than one probe for a gene name. Does anyone know of an
> existing script that can do this? thanks Ruppert

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793


