[BioC] Help me understand org.Hs.eg.db
Christof Winter
winter at biotec.tu-dresden.de
Sat Apr 4 13:56:56 CEST 2009
Daren Tan wrote on 04.04.2009 06:06:
> I am using two approaches to get the Entrez ID to gene symbol mappings,
> as well as the gene symbol to Entrez ID mappings. toTable gives the same
> number of mappings in both directions, but mget doesn't. Which approach
> should I trust, and why?
>
>> dim(toTable(org.Hs.egSYMBOL2EG))
> [1] 39824 2
>> dim(toTable(org.Hs.egSYMBOL))
> [1] 39824 2
>
>> length(mget(mappedRkeys(org.Hs.egSYMBOL2EG), org.Hs.egSYMBOL2EG))
> [1] 39800
>> length(mget(mappedLkeys(org.Hs.egSYMBOL), org.Hs.egSYMBOL))
> [1] 39824
Dear Daren:
The two counts differ because mget() returns one list element per key,
whereas toTable() returns one row per symbol/ID pair. Some gene symbols
have more than one Entrez Gene ID mapped to them:
> x = mget(mappedRkeys(org.Hs.egSYMBOL2EG), org.Hs.egSYMBOL2EG)
> sum(listLen(x) > 1)
[1] 24
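To make the counting difference concrete, here is a minimal sketch in base R. It uses a toy list as a stand-in for the symbol-to-ID map; the symbols and IDs are invented for illustration and are not taken from org.Hs.eg.db, and lengths() is used in place of listLen() so that no Bioconductor package is needed.

```r
# Toy stand-in for org.Hs.egSYMBOL2EG: gene symbol -> Entrez Gene ID(s).
# Symbols and IDs are made up for illustration only.
sym2eg <- list(GENEA = "101", GENEB = c("202", "203"), GENEC = "304")

# One list element per symbol, analogous to length(mget(...)).
n_keys <- length(sym2eg)

# One count per symbol/ID pair, analogous to nrow(toTable(...)).
n_pairs <- sum(lengths(sym2eg))

# Symbols that map to more than one ID.
n_multi <- sum(lengths(sym2eg) > 1)

print(c(keys = n_keys, pairs = n_pairs, multi = n_multi))
```

In this toy case the pair count exceeds the key count by one, because one symbol carries two IDs.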
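These 24 symbols account for the difference of 24 (39824 - 39800),
assuming each of them maps to exactly two IDs. If you need a strict
one-to-one count, you could look up those Entrez Gene IDs at NCBI and
decide in each case how to count them: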
> x[listLen(x) > 1]
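For the manual check, it may help to flatten the ambiguous entries into a table with one row per symbol/ID pair. The following is only a sketch: the list below is a hypothetical stand-in for the result of x[listLen(x) > 1], and the column names gene_id and symbol are my own choice.

```r
# Hypothetical stand-in for x[listLen(x) > 1]; symbols and IDs are made up.
ambiguous <- list(GENEB = c("202", "203"), GENED = c("405", "406"))

# stack() turns a named list of vectors into a two-column data frame,
# giving one row per symbol/ID pair.
review <- stack(ambiguous)
names(review) <- c("gene_id", "symbol")

# Store both columns as plain character vectors for easier handling.
review$gene_id <- as.character(review$gene_id)
review$symbol <- as.character(review$symbol)

print(review)
```

Each row can then be checked against NCBI individually.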
HTH,
Christof
--
Christof Winter
Bioinformatics Group
Biotechnologisches Zentrum
Technische Universität Dresden
Tatzberg 47-51
01307 Dresden
Germany