[BioC] TXNAME mapping
James W. MacDonald
jmacdon at uw.edu
Sat Jun 22 04:36:54 CEST 2013
Hi Murli,
I think you will need to show a small example script that gives this
result. I see only one region that corresponds to that TXNAME:
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
> library(Homo.sapiens)
> x <- exonsBy(TxDb.Hsapiens.UCSC.hg19.knownGene, by="tx", use.names=TRUE)
> x["uc003ytw.3"]
GRangesList of length 1:
$uc003ytw.3
GRanges with 48 ranges and 3 metadata columns:
       seqnames                 ranges strand   |   exon_id   exon_name
          <Rle>              <IRanges>  <Rle>   | <integer> <character>
   [1]     chr8 [133879205, 133879312]      +   |    116041        <NA>
   [2]     chr8 [133880360, 133880468]      +   |    116042        <NA>
   [3]     chr8 [133881974, 133882071]      +   |    116043        <NA>
   [4]     chr8 [133883593, 133883796]      +   |    116044        <NA>
   [5]     chr8 [133885307, 133885466]      +   |    116045        <NA>
   ...      ...                    ...    ... ...       ...         ...
  [44]     chr8 [134125666, 134125847]      +   |    116085        <NA>
  [45]     chr8 [134128853, 134128960]      +   |    116086        <NA>
  [46]     chr8 [134144056, 134144190]      +   |    116087        <NA>
  [47]     chr8 [134145714, 134145904]      +   |    116088        <NA>
  [48]     chr8 [134146920, 134147143]      +   |    116089        <NA>
       exon_rank
       <integer>
   [1]         1
   [2]         2
   [3]         3
   [4]         4
   [5]         5
   ...       ...
  [44]        44
  [45]        45
  [46]        46
  [47]        47
  [48]        48
> select(Homo.sapiens, "uc003ytw.3", c("TXID","GENEID","CHR",
"CHRLOC","CHRLOCEND"), "TXNAME")
      TXNAME GENEID  TXID CHR    CHRLOC CHRLOCCHR CHRLOCEND
1 uc003ytw.3   7038 32071   8 133879205         8 134147143
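
If it helps in putting together that example script, one thing to check first is whether a single TXNAME really appears with more than one location in the merged table. Below is a minimal base R sketch of such a check. Note that `mrg` is a hypothetical stand-in built from three rows of your output (it is not your mrg.data), and the choice of columns is my assumption:

```r
# Hypothetical stand-in for the merged table, using three rows quoted in the post
mrg <- data.frame(
  TXNAME   = c("uc003ytw.3", "uc002qnd.3", "uc003ytw.3"),
  seqnames = c("chr8", "chr8", "chr12"),
  start    = c(13515402L, 14339379L, 71612488L),
  end      = c(13515702L, 14339679L, 71612788L),
  stringsAsFactors = FALSE
)

# Keep one row per distinct (TXNAME, location) combination
loc <- unique(mrg[, c("TXNAME", "seqnames", "start", "end")])

# Count distinct locations per transcript name
n_loc <- table(loc$TXNAME)

# Transcript names that show up at more than one location
multi <- names(n_loc)[n_loc > 1]
print(multi)
```

If a name shows up here, comparing those rows with the exonsBy() and select() results above should show whether the extra locations come from the annotation packages or from the merge step.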
Best,
Jim
On 6/21/2013 10:16 PM, Murli [guest] wrote:
> Hi,
>
> I am annotating my reads using TxDb.Hsapiens.UCSC.hg19.knownGene and org.Hs.eg.db. I can get everything to work and merge the data, but when I reviewed the output I saw that the same TXNAME is mapped to different locations (see part of the output below). TXNAME uc003ytw.3 is associated with both chr8 13515402 13515702 301 and chr12 71612488 71612788 301. I thought it should be unique; I would appreciate it if you could correct me if I am missing something in my understanding of TXNAME.
>
> Thanks ../Murli
>
>
>
>
>> mrg.data[1000:1100,]
>       TXID GENEID     TXNAME seqnames    start      end width strand
> 1000 32071   7038 uc003ytw.3     chr8 13515402 13515702   301      *
> 1001 68728  63934 uc002qnd.3     chr8 14339379 14339679   301      *
> 1002 68729  63934 uc002qne.3     chr8 14339379 14339679   301      *
> 1003 68730  63934 uc010etm.3     chr8 14339379 14339679   301      *
> 1004 32071   7038 uc003ytw.3     chr8 14339379 14339679   301      *
> 1005 68728  63934 uc002qnd.3    chr12 71612488 71612788   301      *
> 1006 68729  63934 uc002qne.3    chr12 71612488 71612788   301      *
> 1007 68730  63934 uc010etm.3    chr12 71612488 71612788   301      *
> 1008 32071   7038 uc003ytw.3    chr12 71612488 71612788   301      *
> 1009 68728  63934 uc002qnd.3    chr14 24809972 24810272   301      *
> 1010 68729  63934 uc002qne.3    chr14 24809972 24810272   301      *
>
>
>
>
> -- output of sessionInfo():
>
>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-redhat-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=C LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] Homo.sapiens_1.1.1
> [2] GO.db_2.9.0
> [3] OrganismDbi_1.2.0
> [4] org.Hs.eg.db_2.9.0
> [5] RSQLite_0.11.4
> [6] DBI_0.2-7
> [7] VariantAnnotation_1.6.6
> [8] Rsamtools_1.12.3
> [9] BSgenome.Hsapiens.UCSC.hg19_1.3.19
> [10] BSgenome_1.28.0
> [11] Biostrings_2.28.0
> [12] TxDb.Hsapiens.UCSC.hg19.knownGene_2.9.2
> [13] GenomicFeatures_1.12.2
> [14] AnnotationDbi_1.22.6
> [15] Biobase_2.20.0
> [16] GenomicRanges_1.12.4
> [17] IRanges_1.18.1
> [18] BiocGenerics_0.6.0
>
> loaded via a namespace (and not attached):
> [1] biomaRt_2.16.0 bitops_1.0-5 graph_1.38.2 RBGL_1.36.2
> [5] RCurl_1.95-4.1 rtracklayer_1.20.2 stats4_3.0.1 tools_3.0.1
> [9] XML_3.98-1.1 zlibbioc_1.6.0
>
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099