[BioC] probe to entrezID mapping with aafLocusLink

James W. MacDonald jmacdon at med.umich.edu
Mon Dec 7 19:58:41 CET 2009


Hi Merja,

Merja Heinaniemi wrote:
> Hi!
> 
> I was mapping probe IDs from 133plus2 arrays to Entrez IDs using aafLocusLink, first some months ago with an earlier version of the package, and now with the current annaffy and hgu133plus2 packages. When I compared the results, some probes were no longer mapped with the new package version, e.g. POU5F1. The gene does have probes on the array; they all just happen to be x_at probes. So I thought maybe all those less specific probes lack Entrez mappings, but another gene with an x_at probe does have a matching Entrez ID. So why is e.g. POU5F1 missing one? I include below the R code that can be used to reproduce my problem (even the first part, if any hgu133plus2 arrays are read in); sessionInfo is given at the end.
> 
> And more importantly, how do I get such probes mapped to an entrezID using Bioconductor? I was assuming the hgu133plus2 package contains all manufacturer annotations so I should find a match, or am I wrong?

As you note, this symbol is no longer mapped to 208286_x_at in the 
current hgu133plus2.db package. I don't know why; netaffx still claims 
this mapping. Perhaps Marc Carlson can shed some light.

You could map the Affy IDs to Entrez Gene using biomaRt as well, and 
that mapping still exists:

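 > library(biomaRt)
 > ## assuming a mart object for the human Ensembl dataset, e.g.
 > mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
 > getBM("entrezgene","affy_hg_u133_plus_2", "208286_x_at", mart)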
   entrezgene
1       5460
2       5462
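
Note that this probeset comes back with two Entrez Gene IDs, so anything 
downstream has to cope with one-to-many mappings. Here is a rough sketch 
of one way to collapse them. The data frame is built by hand as a 
stand-in for getBM() output (it assumes you also request 
affy_hg_u133_plus_2 as an attribute so the probeset ID comes back as a 
column), and the names bm, collapsed and multi are just for illustration:

```r
# Hand-built stand-in for getBM() output with the probeset ID requested
# as an attribute; no mart connection is needed to run this.
bm <- data.frame(
  affy_hg_u133_plus_2 = c("208286_x_at", "208286_x_at", "215600_x_at"),
  entrezgene = c(5460L, 5462L, 285231L),
  stringsAsFactors = FALSE
)

# One entry per probeset, with multiple Entrez Gene IDs pasted together
collapsed <- vapply(
  split(bm$entrezgene, bm$affy_hg_u133_plus_2),
  function(ids) paste(unique(ids), collapse = ";"),
  character(1)
)

# Probesets that map to more than one gene
counts <- table(bm$affy_hg_u133_plus_2)
multi <- names(counts)[counts > 1]
```

Whether you paste the IDs together, keep the first, or drop ambiguous 
probesets altogether depends on what the table is for.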

I assume you are using aafLocusLink() because you are creating HTML or 
text tables for your output. Or perhaps you don't know that you can 
simply do:

 > mget(c("208286_x_at","215600_x_at"), hgu133plus2ENTREZID)
$`208286_x_at`
[1] NA

$`215600_x_at`
[1] "285231"

to do the mapping?
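
If you go that route, the result is a plain list, so picking out the 
probesets that lost their mapping is easy. A rough sketch, using a 
hand-built list that mirrors the output above so it runs without 
hgu133plus2.db (the names ids, entrez and unmapped are just for 
illustration):

```r
# Hand-built stand-in for the list that mget() returned above
ids <- list("208286_x_at" = NA, "215600_x_at" = "285231")

# Flatten to a named character vector; this is fine here because each
# probeset in the example has at most one Entrez Gene ID
entrez <- unlist(ids)

# Probesets without a mapping, i.e. candidates for a biomaRt lookup
unmapped <- names(entrez)[is.na(entrez)]
```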

<self promotion>

If you are trying to create tables and would like to do the mappings via 
biomaRt, you could use either limma2biomaRt() or probes2tableBM() in the 
affycoretools package, which will output HTML or text tables with links 
to various databases, like you get with annaffy (but without the sweet 
css candy that colors the expression values according to the expression 
level).

</self promotion>

Best,

Jim


> 
> thanks in advance!
> 
> Merja
> 
> 
> 
> ##R commands:
> 
> #affybatch=read.affybatch(filenames=Filenames)
> #eset=rma(affybatch)
> #grep("208286_x_at",featureNames(eset))
> #[1] 17711
> 
> library(annaffy)
> library(hgu133plus2.db)
> probeID1="208286_x_at" ##this is POU5F1 entrezID 5460
> probeID2="215600_x_at"  ##this is FBXW12 entrezID 285231
> entrezID1=aafLocusLink(probeID1, "hgu133plus2.db")
> entrezID1
> #integer()
> entrezID2=aafLocusLink(probeID2, "hgu133plus2.db")
> entrezID2
> #[1] 285231
> 
> x <- hgu133plus2ENTREZID
> ## Get the probe identifiers that are mapped to an ENTREZ Gene ID
> mapped_probes <- mappedkeys(x)
> ## Convert to a list
> xx <- as.list(x[mapped_probes])
> xx[xx=="5460"]
> #list()
> xx[xx=="285231"]
> #$`1564138_at`
> #[1] "285231"
> 
> #$`215600_x_at`
> #[1] "285231"
> 
> sessionInfo()
> #R version 2.10.0 (2009-10-26)
> #i386-apple-darwin9.8.0
> 
> #locale:
> #[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
> 
> #attached base packages:
> #[1] stats     graphics  grDevices utils     datasets  methods   base
> 
> #other attached packages:
> # [1] hgu133plus2cdf_2.5.0 hgu133plus2.db_2.3.5 org.Hs.eg.db_2.3.6   annaffy_1.18.0       KEGG.db_2.3.5        GO.db_2.3.5
> # [7] RSQLite_0.7-3        DBI_0.2-4            AnnotationDbi_1.8.1  affy_1.24.2          Biobase_2.6.0
> 
> #loaded via a namespace (and not attached):
> #[1] affyio_1.14.0        preprocessCore_1.8.0 tools_2.10.0
> 

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826


