[Bioc-devel] making txdb, and propagating metadata from AnnotationHub to GenomicFeatures
Michael Love
michaelisaiahlove at gmail.com
Wed Jun 17 09:35:51 CEST 2015
Background:
With previous approaches that I would recommend to users for building
txdb along the way of making count tables, it was desirable that the
GTF release information would *automatically* be passed into the
metadata of the rowRanges of the SummarizedExperiment.
for example, in parathyroidSE, using makeTxDbFromBiomart:
...
BioMart database version : chr "ENSEMBL GENES 72 (SANGER UK)"
BioMart dataset : chr "hsapiens_gene_ensembl"
BioMart dataset description : chr "Homo sapiens genes (GRCh37.p11)"
or, alternatively, using makeTxDbFromGFF, it would be present in the
name of the GTF file:
...
Data source: chr "Homo_sapiens.GRCh37.75.gtf"
Question:
I'm now interested in switching over to using GTF files available
through AnnotationHub, and wondering how we can maintain this
automatic propagation of metadata.
here's some example code:
library(GenomicFeatures)
library(AnnotationHub)
ah <- AnnotationHub()
z <- query(ah, c("Ensembl","gtf","Caenorhabditis elegans","release-80"))
stopifnot(length(z) == 1)
z$title
[1] "Caenorhabditis_elegans.WBcel235.80.gtf"
gtf <- ah[[names(z)]]
metadata(gtf)
$AnnotationHubName
[1] "AH47045"
txdb <- makeTxDbFromGRanges(gtf)
metadata(txdb)
name
1 Db type
2 Supporting package
3 Genome
4 transcript_nrow
5 exon_nrow
6 cds_nrow
7 Db created by
8 Creation time
9 GenomicFeatures version at creation time
10 RSQLite version at creation time
11 DBSCHEMAVERSION
value
1 TxDb
2 GenomicFeatures
3 WBcel235
4 57834
5 173506
6 131562
7 GenomicFeatures package from Bioconductor
8 2015-06-17 09:31:10 +0200 (Wed, 17 Jun 2015)
9 1.21.13
10 1.0.0
11 1.1
sessionInfo()
R Under development (unstable) (2015-04-29 r68278)
Platform: x86_64-apple-darwin12.5.0 (64-bit)
Running under: OS X 10.10.3 (Yosemite)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats4 parallel stats graphics grDevices datasets utils
[8] methods base
other attached packages:
[1] AnnotationHub_2.1.26 GenomicFeatures_1.21.13 AnnotationDbi_1.31.16
[4] Biobase_2.29.1 GenomicRanges_1.21.15 GenomeInfoDb_1.5.7
[7] IRanges_2.3.11 S4Vectors_0.7.5 BiocGenerics_0.15.2
[10] testthat_0.10.0 knitr_1.10 BiocInstaller_1.19.6
loaded via a namespace (and not attached):
[1] Rcpp_0.11.6 compiler_3.3.0
[3] futile.logger_1.4.1 XVector_0.9.1
[5] bitops_1.0-6 futile.options_1.0.0
[7] tools_3.3.0 zlibbioc_1.15.0
[9] biomaRt_2.25.1 digest_0.6.8
[11] RSQLite_1.0.0 memoise_0.2.1
[13] shiny_0.12.1 DBI_0.3.1
[15] rtracklayer_1.29.10 httr_0.6.1
[17] stringr_1.0.0 Biostrings_2.37.2
[19] R6_2.0.1 XML_3.98-1.2
[21] BiocParallel_1.3.26 lambda.r_1.1.7
[23] magrittr_1.5 Rsamtools_1.21.8
[25] htmltools_0.2.6 GenomicAlignments_1.5.9
[27] SummarizedExperiment_0.1.5 xtable_1.7-4
[29] mime_0.3 interactiveDisplayBase_1.7.0
[31] httpuv_1.3.2 stringi_0.4-1
[33] RCurl_1.95-4.6 crayon_1.3.0
More information about the Bioc-devel
mailing list