[BioC] Unambiguously mapping of affy IDs to gene symbols using hgu133plus2.db
James W. MacDonald
jmacdon at med.umich.edu
Fri Oct 1 15:24:56 CEST 2010
Hi Christian,
On 10/1/2010 6:10 AM, Christian Ruckert wrote:
> Hi,
> I am mapping Affymetrix probeset IDs to gene symbols using the
> hgu133plus2.db package.
>
> As the following example illustrates, each of the 40686 mapped probesets
> maps to exactly one gene symbol.
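Yes, this was a design change made (maybe) two releases ago. By default,
only unambiguous mappings are exposed, so a probeset that maps to more
than one gene symbol shows up as unmapped (NA).
This behavior can be changed with the toggleProbes() function, shown
here with hgu95av2SYMBOL: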
> table(nhit(hgu95av2SYMBOL))
    0     1
  901 11724
> table(nhit(toggleProbes(hgu95av2SYMBOL, "all")))
    0     1     2     3     4     5     6     7
  493 11724   297    53    22     4    10     4
    8     9    10    11    12    14    20    21
    4     2     2     1     1     1     2     4
   22
    1
> table(nhit(toggleProbes(hgu95av2SYMBOL, "multiple")))
    0     2     3     4     5     6     7     8
12217   297    53    22     4    10     4     4
    9    10    11    12    14    20    21    22
    2     2     1     1     1     2     4     1
See ?toggleProbes for more information.
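
Once the multi-mapped probesets are exposed, you will probably want them in a flat table. The following is only a rough sketch: collapse_symbols() is a hypothetical helper (not part of AnnotationDbi), the " /// " separator is an arbitrary choice, and the toy entries other than 203074_at are made up. The last part assumes hgu133plus2.db is installed and is skipped otherwise.

```r
# Summarise a probeset -> symbol list, such as the one returned by
# as.list() on a SYMBOL map. collapse_symbols() is a hypothetical helper.
collapse_symbols <- function(sym_list, sep = " /// ") {
  # Drop NA entries so that unmapped probesets count as zero hits
  cleaned <- lapply(sym_list, function(s) s[!is.na(s)])
  data.frame(
    probe_id  = names(cleaned),
    n_symbols = vapply(cleaned, length, integer(1), USE.NAMES = FALSE),
    symbols   = vapply(cleaned, paste, character(1), collapse = sep,
                       USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

# Toy input: the 203074_at symbols are the ones quoted in this thread;
# probe_a and probe_b are invented for illustration.
toy <- list(
  "203074_at" = c("ANXA8", "ANXA8L1", "ANXA8L2"),
  "probe_a"   = "GENE1",
  "probe_b"   = NA_character_
)
res <- collapse_symbols(toy)
print(res)

# With the annotation package available, the real list could be built
# from the map with all probesets exposed.
if (requireNamespace("hgu133plus2.db", quietly = TRUE)) {
  library(hgu133plus2.db)
  all_syms <- as.list(toggleProbes(hgu133plus2SYMBOL, "all"))
  print(collapse_symbols(all_syms["203074_at"]))
}
```

Keeping n_symbols as its own column makes it easy to filter the ambiguous probesets later (for example, res[res$n_symbols > 1, ]) instead of silently picking one symbol.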
Best,
Jim
>
> > library("hgu133plus2.db")
> > x <- hgu133plus2SYMBOL
> > Llength(x)
> [1] 54675
> > count.mappedkeys(x)
> [1] 40686
>
> > head(nhit(x))
> 1007_s_at 1053_at 117_at 121_at 1255_g_at 1294_at
> 1 1 1 1 1 1
>
> > table(nhit(x))
>
> 0 1
> 13989 40686
>
>
> Am I correct that a gene symbol annotation is only included in the
> package if it is unambiguous?
>
> For example
> > x[["203074_at"]]
> [1] NA
>
> But netaffx and biomart return:
> ANXA8, ANXA8L1, ANXA8L2
>
> If doing a mapping between protein and gene expression arrays based on
> gene symbols, can results be improved using biomart instead of the
> annotation packages?
>
> Christian
>
>
> > sessionInfo()
> R version 2.11.0 (2010-04-22)
> x86_64-pc-linux-gnu
>
> locale:
> [1] C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] hgu133plus2.db_2.4.1 org.Hs.eg.db_2.4.1 RSQLite_0.9-1
> [4] DBI_0.2-5 AnnotationDbi_1.10.1 Biobase_2.8.0
>
> loaded via a namespace (and not attached):
> [1] tools_2.11.0
>
--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826