[Bioc-devel] makeTranscriptDbFrom... AnnotationHub
Michael Love
michaelisaiahlove at gmail.com
Tue Jul 8 21:11:09 CEST 2014
The recent TranscriptDb thread reminded me of a question: are there
plans (or am I missing the function) to easily get a TranscriptDb out
of the AnnotationHub objects? It would be great to have a preprocessed
Ensembl txdb like we have for UCSC.
> library(AnnotationHub)
> ah <- AnnotationHub()
> gr <- ah$ensembl.release.73.gtf.homo_sapiens.Homo_sapiens.GRCh37.73.gtf_0.0.1.RData
> gr
GRanges with 2268089 ranges and 12 metadata columns:
            seqnames         ranges strand   |                 source
               <Rle>      <IRanges>  <Rle>   |               <factor>
        [1]        1 [11869, 12227]      +   |   processed_transcript
        [2]        1 [12613, 12721]      +   |   processed_transcript
        [3]        1 [13221, 14409]      +   |   processed_transcript
        [4]        1 [11872, 12227]      +   | unprocessed_pseudogene
        [5]        1 [12613, 12721]      +   | unprocessed_pseudogene
        ...      ...            ...    ... ...                    ...
  [2268085]       MT [14747, 15887]      +   |         protein_coding
  [2268086]       MT [14747, 15887]      +   |         protein_coding
  [2268087]       MT [14747, 14749]      +   |         protein_coding
  [2268088]       MT [15888, 15953]      +   |                Mt_tRNA
  [2268089]       MT [15956, 16023]      -   |                Mt_tRNA
                   type     score     phase         gene_id   transcript_id
               <factor> <numeric> <integer>     <character>     <character>
        [1]        exon      <NA>      <NA> ENSG00000223972 ENST00000456328
        [2]        exon      <NA>      <NA> ENSG00000223972 ENST00000456328
        [3]        exon      <NA>      <NA> ENSG00000223972 ENST00000456328
        [4]        exon      <NA>      <NA> ENSG00000223972 ENST00000515242
        [5]        exon      <NA>      <NA> ENSG00000223972 ENST00000515242
        ...         ...       ...       ...             ...             ...
  [2268085]        exon      <NA>      <NA> ENSG00000198727 ENST00000361789
  [2268086]         CDS      <NA>         0 ENSG00000198727 ENST00000361789
  [2268087] start_codon      <NA>         0 ENSG00000198727 ENST00000361789
  [2268088]        exon      <NA>      <NA> ENSG00000210195 ENST00000387460
  [2268089]        exon      <NA>      <NA> ENSG00000210196 ENST00000387461
            exon_number   gene_name   gene_biotype transcript_name
              <numeric> <character>    <character>     <character>
        [1]           1     DDX11L1     pseudogene     DDX11L1-002
        [2]           2     DDX11L1     pseudogene     DDX11L1-002
        [3]           3     DDX11L1     pseudogene     DDX11L1-002
        [4]           1     DDX11L1     pseudogene     DDX11L1-201
        [5]           2     DDX11L1     pseudogene     DDX11L1-201
        ...         ...         ...            ...             ...
  [2268085]           1      MT-CYB protein_coding      MT-CYB-201
  [2268086]           1      MT-CYB protein_coding      MT-CYB-201
  [2268087]           1      MT-CYB protein_coding      MT-CYB-201
  [2268088]           1       MT-TT        Mt_tRNA       MT-TT-201
  [2268089]           1       MT-TP        Mt_tRNA       MT-TP-201
                    exon_id      protein_id
                <character>     <character>
        [1] ENSE00002234944            <NA>
        [2] ENSE00003582793            <NA>
        [3] ENSE00002312635            <NA>
        [4] ENSE00002234632            <NA>
        [5] ENSE00003608237            <NA>
        ...             ...             ...
  [2268085] ENSE00001436074            <NA>
  [2268086]            <NA> ENSP00000354554
  [2268087]            <NA>            <NA>
  [2268088] ENSE00001544475            <NA>
  [2268089] ENSE00001544473            <NA>
  ---
  seqlengths:
      1    2 ...   MT
     NA   NA ...   NA