[BioC] problem with TXNAME -> SYMBOL mapping in Homo.sapiens library
Marc Carlson
mcarlson at fhcrc.org
Fri Sep 27 23:53:36 CEST 2013
Hi Aleksandra,
That's a good question!
So first of all you may want to know that the newer packages don't even
have a name for that transcript. It has been dropped from the latest
Transcriptomes coming out of UCSC.
But it's still a great question, so allow me to also answer about what
is happening in this older data that you are using. In these older
packages, there was a transcript name from UCSC, but it was *not*
associated with any GENE IDs. Thus it is a valid key, because it can be
mapped to "some" values inside the transcriptome, but it is not mappable
to anything "outside" of the Transcriptome. You almost had enough
information to see this for yourself with the select queries that you ran.
So for example if you did the following select:
  select(Homo.sapiens, cols=c("GENEID","TXSTART"), keys="uc021wml.1",
         keytype="TXNAME")
You will get:
      TXNAME GENEID  TXSTART
1 uc021wml.1   <NA> 22385572
This tells you that while there *is* transcript-level information for
this name ("TXCHROM" etc. will also work), there is no GENEID
associated with it. Unfortunately, without a gene ID there is no way
to look up the gene SYMBOL or any other data that is stored at the
gene level, because the gene ID is the link between the two.
So the short answer is that there is no gene symbol for this transcript
name because we don't have any way to know what gene it belongs to.
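
To make the reasoning concrete, here is a minimal sketch in plain R of
the two-step lookup (transcript name to gene ID, then gene ID to
symbol). It is only an illustration and not the actual package
internals: the two data frames are toy stand-ins, the second transcript
name and the helper tx_symbols() are hypothetical, and only the first
row comes from the select() output above.

```r
# Toy stand-in for the transcript-level table (what a select() keyed on
# TXNAME returns). Row 1 is from the output above; row 2 is hypothetical.
tx2gene <- data.frame(
  TXNAME = c("uc021wml.1", "tx_hypothetical.1"),
  GENEID = c(NA, "1"),
  stringsAsFactors = FALSE
)

# Toy stand-in for the gene-level table (Entrez Gene ID 1 is A1BG).
gene2symbol <- data.frame(
  GENEID = "1",
  SYMBOL = "A1BG",
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep every transcript and fill SYMBOL with NA
# wherever there is no gene ID to carry the lookup across.
tx_symbols <- function(tx2gene, gene2symbol) {
  has_gene <- !is.na(tx2gene$GENEID)
  out <- tx2gene
  out$SYMBOL <- NA_character_
  idx <- match(tx2gene$GENEID[has_gene], gene2symbol$GENEID)
  out$SYMBOL[has_gene] <- gene2symbol$SYMBOL[idx]
  out
}

result <- tx_symbols(tx2gene, gene2symbol)
print(result)
```

With the real packages, the first table would come from a select() on
Homo.sapiens with keytype="TXNAME" and GENEID among the requested
columns, and the second from a select() on org.Hs.eg.db keyed on
ENTREZID. Any transcript whose GENEID is NA simply has nothing to join
on, which is why no SYMBOL can come back for it.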
Hope this helps,
Marc
On 09/27/2013 02:28 AM, Aleksandra Pfeifer [guest] wrote:
> Hello,
> I have a problem with the mapping from TXNAME to the gene SYMBOL. For most transcripts it works fine, but for some it doesn't:
>
>> library(Homo.sapiens)
>> select(Homo.sapiens, cols="SYMBOL", keys= "uc021wml.1", keytype="TXNAME")
> Error in .testIfKeysAreOfProposedKeytype(x, keys, keytype) :
> None of the keys entered are valid keys for the keytype specified.
>
> The traceback is as follows:
>> traceback()
> 10: stop("None of the keys entered are valid keys for the keytype specified.")
> 9: .testIfKeysAreOfProposedKeytype(x, keys, keytype)
> 8: .select(x, keys, cols, keytype, jointype = jointype)
> 7: .local(x, keys, cols, keytype, ...)
> 6: select(.makeReal(nodeName), keys = fromKeys, cols = needCols[[nodeName]],
> keytype = toKey)
> 5: select(.makeReal(nodeName), keys = fromKeys, cols = needCols[[nodeName]],
> keytype = toKey)
> 4: .getSelects(x, keytype, keys, needCols, visitNodes)
> 3: .select(x, keys, cols, keytype, ...)
> 2: select(Homo.sapiens, cols = "SYMBOL", keys = "uc021wml.1", keytype = "TXNAME")
> 1: select(Homo.sapiens, cols = "SYMBOL", keys = "uc021wml.1", keytype = "TXNAME")
>
>
> However, when I check whether the problematic TXNAME is present in the Homo.sapiens database, it turns out that it is there. I can also find some other information about this transcript:
>> "uc021wml.1" %in% keys(Homo.sapiens, keytype="TXNAME")
> [1] TRUE
>> select(Homo.sapiens, cols="TXSTART", keys= "uc021wml.1", keytype="TXNAME")
> TXNAME TXSTART
> 1 uc021wml.1 22385572
>
> Is there a way to solve this problem? I would appreciate your help.
>
> Best regards,
> Aleksandra Pfeifer
>
>
>
> -- output of sessionInfo():
>
>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] C
>
> attached base packages:
> [1] parallel stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] Homo.sapiens_1.1.1
> [2] TxDb.Hsapiens.UCSC.hg19.knownGene_2.9.2
> [3] org.Hs.eg.db_2.9.0
> [4] GO.db_2.9.0
> [5] RSQLite_0.11.4
> [6] DBI_0.2-7
> [7] OrganismDbi_1.2.0
> [8] GenomicFeatures_1.12.4
> [9] GenomicRanges_1.12.5
> [10] IRanges_1.18.4
> [11] AnnotationDbi_1.22.6
> [12] Biobase_2.20.1
> [13] BiocGenerics_0.6.0
>
> loaded via a namespace (and not attached):
> [1] BSgenome_1.28.0 Biostrings_2.28.0 RBGL_1.36.2 RCurl_1.95-4.1
> [5] Rsamtools_1.12.4 XML_3.98-1.1 biomaRt_2.16.0 bitops_1.0-6
> [9] graph_1.38.3 rtracklayer_1.20.4 stats4_3.0.1 tools_3.0.1
> [13] zlibbioc_1.6.0
>
> --
> Sent via the guest posting facility at bioconductor.org.
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor