[BioC] probe to entrezID mapping with aafLocusLink
James W. MacDonald
jmacdon at med.umich.edu
Mon Dec 7 19:58:41 CET 2009
Hi Merja,
Merja Heinaniemi wrote:
> Hi!
>
> I was mapping probe IDs from 133plus2 arrays to Entrez IDs using aafLocusLink, first some months ago with an earlier version of the package and now with the current annaffy and hgu133plus2 packages. Comparing the results, I found that some probes no longer get mapped with the new package version, e.g. those for POU5F1. The gene does have probes on the array; they all just happen to be x_at probes. I thought that perhaps all of these less specific probes lack Entrez mappings, but another gene with an x_at probe does have a matching Entrez ID. So why is POU5F1, for example, missing one? Below I include R code that reproduces the problem (including the first part, if any hgu133plus2 arrays are read in); sessionInfo is given at the end.
>
> And more importantly, how do I get such probes mapped to an entrezID using Bioconductor? I was assuming the hgu133plus2 package contains all manufacturer annotations so I should find a match, or am I wrong?
As you note, this symbol is no longer mapped to 208286_x_at in the
current hgu133plus2.db package. I don't know why; netaffx still claims
this mapping. Perhaps Marc Carlson can shed some light.
You could map the Affy IDs to Entrez Gene using biomaRt as well, and
that mapping still exists:
> getBM("entrezgene", "affy_hg_u133_plus_2", "208286_x_at", mart)
  entrezgene
1       5460
2       5462
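The call above needs a `mart` object to exist first. A minimal sketch of one way to set that up, assuming the Ensembl mart and the human dataset name `hsapiens_gene_ensembl`. The attribute name `entrezgene` is the one used above and may be named differently in other Ensembl releases (listAttributes(mart) shows what is available). The wrapper `affy_to_entrez` is a hypothetical name used only for illustration, not a biomaRt function:

```r
# Hypothetical helper (not part of biomaRt): look up Entrez Gene IDs for
# Affymetrix HG-U133 Plus 2 probe IDs via Ensembl BioMart.
affy_to_entrez <- function(probe_ids, attribute = "entrezgene") {
  if (!requireNamespace("biomaRt", quietly = TRUE)) {
    stop("the biomaRt package is required")
  }
  # Assumption: Ensembl mart, human gene dataset. Needs network access.
  mart <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
  # Return the probe ID alongside the Entrez ID so that multiple hits per
  # probe can be told apart.
  biomaRt::getBM(attributes = c("affy_hg_u133_plus_2", attribute),
                 filters    = "affy_hg_u133_plus_2",
                 values     = probe_ids,
                 mart       = mart)
}

# Usage (not run here, since it queries a remote server):
# affy_to_entrez("208286_x_at")
```

Requesting the probe ID as an attribute as well is a design choice: a probe such as 208286_x_at can come back with more than one Entrez Gene ID, and the extra column keeps each row attributable to its probe.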
I assume you are using aafLocusLink() because you are creating HTML or
text tables for your output. Or perhaps you don't know that you can
simply do:
> mget(c("208286_x_at","215600_x_at"), hgu133plus2ENTREZID)
$`208286_x_at`
[1] NA
$`215600_x_at`
[1] "285231"
to do the mapping?
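If you want the package mapping where it exists and the biomaRt mapping only for probes that come back NA, the two results can be merged in base R. A rough sketch with the values from this thread hard-coded. The helper name `fill_from_biomart` is made up for illustration, and the data frame assumes the probe ID was requested from biomaRt as an attribute alongside `entrezgene`:

```r
# Package mapping as returned by mget() above (values copied from this thread).
db_map <- list("208286_x_at" = NA_character_,
               "215600_x_at" = "285231")

# biomaRt result for the unmapped probe (values copied from the getBM() output;
# the probe ID column is an assumption, see the note above).
bm <- data.frame(affy_hg_u133_plus_2 = c("208286_x_at", "208286_x_at"),
                 entrezgene = c(5460, 5462),
                 stringsAsFactors = FALSE)

# Hypothetical helper: keep the package mapping where it exists, otherwise
# use whatever biomaRt returned for that probe (possibly several IDs).
fill_from_biomart <- function(db_map, bm) {
  bm_by_probe <- split(as.character(bm$entrezgene), bm$affy_hg_u133_plus_2)
  lapply(setNames(names(db_map), names(db_map)), function(p) {
    ids <- db_map[[p]]
    if (all(is.na(ids)) && !is.null(bm_by_probe[[p]])) bm_by_probe[[p]] else ids
  })
}

combined <- fill_from_biomart(db_map, bm)
```

Note that a probe can end up with more than one Entrez Gene ID this way, so anything downstream has to cope with vectors of length greater than one.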
<self promotion>
If you are trying to create tables and would like to do the mappings via
biomaRt, you could use either limma2biomaRt() or probes2tableBM() in the
affycoretools package, which will output HTML or text tables with links
to various databases, like you get with annaffy (but without the sweet
css candy that colors the expression values according to the expression
level).
</self promotion>
Best,
Jim
>
> thanks in advance!
>
> Merja
>
>
>
> ##R commands:
>
> #affybatch=read.affybatch(filenames=Filenames)
> #eset=rma(affybatch)
> #grep("208286_x_at",featureNames(eset))
> #[1] 17711
>
> library(annaffy)
> library(hgu133plus2.db)
> probeID1="208286_x_at" ##this is POU5F1 entrezID 5460
> probeID2="215600_x_at" ##this is FBXW12 entrezID 285231
> entrezID1=aafLocusLink(probeID1, "hgu133plus2.db")
> entrezID1
> #integer()
> entrezID2=aafLocusLink(probeID2, "hgu133plus2.db")
> entrezID2
> #[1] 285231
>
> x <- hgu133plus2ENTREZID
> ## Get the probe identifiers that are mapped to an ENTREZ Gene ID
> mapped_probes <- mappedkeys(x)
> ## Convert to a list
> xx <- as.list(x[mapped_probes])
> xx[xx=="5460"]
> #list()
> xx[xx=="285231"]
> #$`1564138_at`
> #[1] "285231"
>
> #$`215600_x_at`
> #[1] "285231"
>
>> sessionInfo()
> #R version 2.10.0 (2009-10-26)
> #i386-apple-darwin9.8.0
>
> #locale:
> #[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> #attached base packages:
> #[1] stats graphics grDevices utils datasets methods base
>
> #other attached packages:
> # [1] hgu133plus2cdf_2.5.0 hgu133plus2.db_2.3.5 org.Hs.eg.db_2.3.6 annaffy_1.18.0 KEGG.db_2.3.5 GO.db_2.3.5
> # [7] RSQLite_0.7-3 DBI_0.2-4 AnnotationDbi_1.8.1 affy_1.24.2 Biobase_2.6.0
>
> #loaded via a namespace (and not attached):
> #[1] affyio_1.14.0 preprocessCore_1.8.0 tools_2.10.0
>
--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826