[BioC] retrieving external Gene IDs from TranscriptDB Object
Stefanie Tauber
stefanie.tauber at univie.ac.at
Mon May 6 11:25:09 CEST 2013
Dear List,
I have created a TranscriptDb for yeast as follows:
library(GenomicFeatures)
library(biomaRt)
## create yeast DB
myDB <- makeTranscriptDbFromBiomart(biomart = "ensembl",
                                    dataset = "scerevisiae_gene_ensembl",
                                    circ_seqs = c(DEFAULT_CIRC_SEQS, "Mito"))
myDBx <- cdsBy(myDB, by = "tx", use.names = TRUE)
Now I would like to retrieve the external gene IDs for these transcripts.
Is this the most generic way?
# select mart and dataset
mymart <- useMart("ENSEMBL_MART_ENSEMBL",
                  dataset = "scerevisiae_gene_ensembl",
                  host = "www.ensembl.org")
# just a selection of transcripts
sel <- names(myDBx)[5:6]
getBM(attributes = c("ensembl_transcript_id", "external_gene_id"),
      filters = "ensembl_transcript_id",
      values = sel,
      mart = mymart)
And when creating a TranscriptDb from UCSC:
myDB1 <- makeTranscriptDbFromUCSC(genome = "hg19", tablename = "knownGene")
myDBx1 <- cdsBy(myDB1, by = "tx", use.names = TRUE)
What would be the most generic way here to retrieve the external gene IDs
for each transcript ID?
Best,
Stefanie
> sessionInfo()
R Under development (unstable) (2013-05-02 r62711)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] biomaRt_2.16.0 GenomicFeatures_1.12.1 AnnotationDbi_1.22.3
[4] Biobase_2.20.0 GenomicRanges_1.12.2 IRanges_1.18.0
[7] BiocGenerics_0.6.0
loaded via a namespace (and not attached):
[1] Biostrings_2.28.0  bitops_1.0-5       BSgenome_1.28.0    DBI_0.2-6
[5] RCurl_1.95-4.1     Rsamtools_1.12.2   RSQLite_0.11.3     rtracklayer_1.20.1
[9] stats4_3.1.0       tools_3.1.0        XML_3.96-1.1       zlibbioc_1.6.0