[BioC] converting gene name to affy probe
Martin Morgan
mtmorgan at fhcrc.org
Tue May 6 05:28:29 CEST 2008
Hi Ruppert --
The annotation package hgu133a.db contains a 'map' from probe id to
gene symbol (I assume this is what you mean by 'gene name'):
> library(hgu133a.db)
> ls("package:hgu133a.db")
[1] "hgu133a" "hgu133aACCNUM" "hgu133aALIAS2PROBE"
[4] "hgu133aCHR" "hgu133aCHRLENGTHS" "hgu133aCHRLOC"
[7] "hgu133a_dbconn" "hgu133a_dbfile" "hgu133a_dbInfo"
[10] "hgu133a_dbschema" "hgu133aENSEMBL" "hgu133aENSEMBL2PROBE"
[13] "hgu133aENTREZID" "hgu133aENZYME" "hgu133aENZYME2PROBE"
[16] "hgu133aGENENAME" "hgu133aGO" "hgu133aGO2ALLPROBES"
[19] "hgu133aGO2PROBE" "hgu133aMAP" "hgu133aMAPCOUNTS"
[22] "hgu133aOMIM" "hgu133aORGANISM" "hgu133aPATH"
[25] "hgu133aPATH2PROBE" "hgu133aPFAM" "hgu133aPMID"
[28] "hgu133aPMID2PROBE" "hgu133aPROSITE" "hgu133aREFSEQ"
[31] "hgu133aSYMBOL" "hgu133aUNIGENE"
You'd like to reverse the map so it goes from symbol to probe id:
> rmap = revmap(hgu133aSYMBOL)
and then look up all your symbols:
> syms = c("NAT1", "TCF3")
> mget(syms, rmap)
$NAT1
[1] "214440_at"
$TCF3
[1] "209151_x_at" "209152_s_at" "209153_s_at" "210776_x_at" "213730_x_at"
[6] "213731_s_at" "213732_at" "213809_x_at" "213811_x_at" "215260_s_at"
[11] "216645_at" "216647_at"
Not sure where you want to go from here, though?
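One possibility, if a flat table is what you need: since a symbol such as TCF3 maps to several probes, the list can be unrolled into one row per symbol/probe pair. This is only a minimal sketch in base R. The list `res` is typed in by hand from the output above so the example runs without the annotation package, and "NOSUCHGENE" is a made-up symbol added to show what an unmatched entry looks like when mget() is called with ifnotfound = NA.

```r
# Probe ids copied from the mget() output above. In a live session this
# list would instead come from: res <- mget(syms, rmap, ifnotfound = NA)
# "NOSUCHGENE" is a hypothetical symbol standing in for an unmatched name.
res <- list(
  NAT1 = "214440_at",
  TCF3 = c("209151_x_at", "209152_s_at", "209153_s_at", "210776_x_at",
           "213730_x_at", "213731_s_at", "213732_at", "213809_x_at",
           "213811_x_at", "215260_s_at", "216645_at", "216647_at"),
  NOSUCHGENE = NA_character_
)

# One row per (symbol, probe) pair; a symbol with several probes is repeated.
tbl <- data.frame(
  symbol = rep(names(res), lengths(res)),
  probe  = unlist(res, use.names = FALSE),
  stringsAsFactors = FALSE
)

# Symbols that matched no probe on the array
unmatched <- tbl$symbol[is.na(tbl$probe)]

# Keep only the real matches for downstream use
hits <- tbl[!is.na(tbl$probe), ]
```

From there `hits` can be written out with write.table() or merged with expression data by probe id, and `unmatched` lists the names worth checking for aliases or typos.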
Martin
Ruppert Valentino <ruppert7 at hotmail.com> writes:
> Hello, I am trying to convert a file of gene names to the corresponding
> affy probe names. I wrote a script that puts the genes in an array and
> then calls feat = getFeature(symbol = gensym, type = "affy_hg_u133a",
> mart = mart) in biomaRt; however, I hit a snag when there is more than
> one probe for a gene name. Does anyone know of an existing script that
> can do this? Thanks, Ruppert
--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M2 B169
Phone: (206) 667-2793