[Bioc-devel] A bug in TxDb.Hsapiens.UCSC.hg38.knownGene?
zhao shilin
zhaoshilin at gmail.com
Mon Oct 17 05:35:44 CEST 2016
Dear BioC team,
I think I found something incorrect in TxDb.Hsapiens.UCSC.hg38.knownGene.
I reported it at https://support.bioconductor.org/p/88232/ but did not get a
reply. I believe it is a bug, so I decided to report it by email as well.
I am using the development version of TxDb.Hsapiens.UCSC.hg38.knownGene,
because the release version was built in 2015 and differs considerably from
the UCSC website. Here is the R code that reproduces the problem:
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
library(GenomicRanges)
geneDb <- TxDb.Hsapiens.UCSC.hg38.knownGene
# Gene-level range for Entrez gene ID 875 (CBS)
allGeneRange <- genes(geneDb)
allGeneRange["875"]
# Transcripts grouped by gene
txs <- transcriptsBy(geneDb, by = "gene")
txs["875"]
The CBS gene (txs["875"]) has 25 transcripts, which fall in two
regions: chr21 [6444869, 6467509] and chr21 [43075107, 43076288]
1. The CBS gene ("875") should be located only in chr21 [43075107, 43076288].
The region chr21 [6444869, 6467509] belongs to the CBSL gene ("102724560").
However, CBSL is not in the database, and its transcripts are recorded under CBS.
2. The gene range of CBS (allGeneRange["875"]) is chr21 [6444869,
43076943], which covers the entire stretch between 6444869 and 43076943. This
is not correct, because these are two separate regions.
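As a rough illustration of point 2, here is a minimal sketch on toy data. The
object `tx` is a hypothetical stand-in for txs[["875"]] that holds just one
range per region listed above (not the real 25 transcripts), so the end
coordinate of the span differs from the one reported by genes(). It shows that
range() yields one span covering everything in between, while reduce() keeps
the two regions apart:

```r
library(GenomicRanges)

# Toy stand-in for txs[["875"]] (assumption: one range per reported region)
tx <- GRanges(
  seqnames = "chr21",
  ranges = IRanges(start = c(6444869, 43075107),
                   end   = c(6467509, 43076288))
)

# range() collapses all ranges on the same chromosome and strand into one span
span <- range(tx)

# reduce() merges only overlapping or adjacent ranges,
# so the two distant regions stay separate
clusters <- reduce(tx)

length(span)      # number of spans
length(clusters)  # number of separate regions
```

Running reduce() on the real txs[["875"]] should, under the same assumptions,
show whether the transcripts form two distant clusters.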
Thanks!
Shilin