[BioC] Error using Homo.sapiens AnnotationDbi package with GenomicFeatures
Marc Carlson
mcarlson at fhcrc.org
Fri Nov 9 01:44:23 CET 2012
Hi Chris,
If you load the Homo.sapiens package, you will see it load the
TxDb.Hsapiens.UCSC.hg19.knownGene package for you as a dependency. So
you don't need to call makeTranscriptDbFromUCSC(), at least not for the
track you were going for, because that was already loaded via the
TxDb.Hsapiens.UCSC.hg19.knownGene package. To get the promoter regions,
you really only need to call promoters like this:
library(Homo.sapiens)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
proms <- promoters(txdb, upstream=2000, downstream=200)
## check the defaults in case you don't like them!
proms
## Once you have the promoters, you can look up the tx_names for these
## like this.
k <- proms$tx_name
## And then you can use select to retrieve the matching gene IDs.
## In the case of Homo.sapiens, the gene IDs actually *are* Entrez Gene
## IDs (because that is what the knownGene track is using as a gene ID).
res <- select(Homo.sapiens, keys=k, cols=c("GENEID","TXNAME"),
keytype="TXNAME")
head(res)
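If you then want the gene IDs sitting next to the promoter ranges themselves, something like the following might work. This is only a rough sketch, not a tested recipe: attach_gene_ids() is a helper name I made up for illustration, the toy gene IDs are placeholders rather than real annotations, and the select() call uses columns=, which is what newer AnnotationDbi releases call the cols= argument shown above (keep cols= on the versions in your sessionInfo).

```r
## Hypothetical helper: look up a gene ID for each transcript name.
## `res` is assumed to be a data.frame with TXNAME and GENEID columns,
## such as the one returned by select() above.
attach_gene_ids <- function(tx_names, res) {
  ## match() keeps the first hit per transcript; unmatched names give NA
  res$GENEID[match(tx_names, res$TXNAME)]
}

## Toy check with placeholder gene IDs (made up, not real annotations)
toy_res <- data.frame(TXNAME = c("uc001aaa.3", "uc010nxq.1", "uc001aal.1"),
                      GENEID = c("g1", "g1", NA),
                      stringsAsFactors = FALSE)
toy_ids <- attach_gene_ids(c("uc001aal.1", "uc001aaa.3", "uc999zzz.1"),
                           toy_res)

## The same idea on the real objects, only if the packages are installed
if (requireNamespace("Homo.sapiens", quietly = TRUE)) {
  library(Homo.sapiens)
  txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
  proms <- promoters(txdb, upstream = 2000, downstream = 200)
  ## `columns` is the newer name of the `cols` argument
  res <- select(Homo.sapiens, keys = proms$tx_name,
                columns = c("GENEID", "TXNAME"), keytype = "TXNAME")
  ## store the gene ID as an extra metadata column on the promoters
  mcols(proms)$gene_id <- attach_gene_ids(proms$tx_name, res)
}
```

Transcripts with no gene assigned simply get NA in the new column, so you can filter those out afterwards if you only care about annotated genes.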
Marc
On 11/08/2012 10:54 AM, Chris Whelan wrote:
> Hi,
>
> I'm having trouble using the AnnotationDbi package and was wondering
> if someone could tell me what I'm doing wrong. I'm trying to use
> GenomicFeatures to find promoter regions and then use AnnotationDbi to
> look up the Entrez Gene IDs for those transcripts, but getting an
> error. If I'm going about this all wrong let me know; I find it a
> little difficult to follow the thread of the documentation of the
> various feature/annotation packages. At the very least the error
> message that I'm getting seems like it might be a little friendlier?
>
> Thanks!
>
> Chris
>
> Bioconductor version 2.11 (BiocInstaller 1.8.3), ?biocLite for help
>> library(GenomicFeatures)
> Loading required package: BiocGenerics
>
> Attaching package: 'BiocGenerics'
>
> The following object(s) are masked from 'package:stats':
>
> xtabs
>
> The following object(s) are masked from 'package:base':
>
> Filter, Find, Map, Position, Reduce, anyDuplicated, cbind,
> colnames, duplicated, eval, get, intersect, lapply, mapply, mget,
> order, paste, pmax, pmax.int, pmin, pmin.int, rbind, rep.int,
> rownames, sapply, setdiff, table, tapply, union, unique
>
> Loading required package: IRanges
> Loading required package: GenomicRanges
> Loading required package: AnnotationDbi
> Loading required package: Biobase
> Welcome to Bioconductor
>
> Vignettes contain introductory material; view with
> 'browseVignettes()'. To cite Bioconductor, see
> 'citation("Biobase")', and for packages 'citation("pkgname")'.
>
>> library(Homo.sapiens)
> Loading required package: OrganismDbi
> Loading required package: GO.db
> Loading required package: DBI
>
> Loading required package: org.Hs.eg.db
>
> Loading required package: TxDb.Hsapiens.UCSC.hg19.knownGene
>> hg19UCSCGenes<- makeTranscriptDbFromUCSC(genome = "hg19", tablename = "knownGene")
> Download the knownGene table ... OK
> Download the knownToLocusLink table ... OK
> Extract the 'transcripts' data frame ... OK
> Extract the 'splicings' data frame ... OK
> Download and preprocess the 'chrominfo' data frame ... OK
> Prepare the 'metadata' data frame ... metadata: OK
>> k<- elementMetadata(head(promoters(hg19UCSCGenes)))[,"tx_name"]
> Warning messages:
> 1: In `start<-`(`*tmp*`, value = c(9874, 9874, 9874, 67091, 319084, :
> trimmed start values to be positive
> 2: In `end<-`(`*tmp*`, value = c(12073, 12073, 12073, 69290, 321283, :
> trimmed end values to be <= seqlengths
>> k
> [1] "uc001aaa.3" "uc010nxq.1" "uc010nxr.1" "uc001aal.1" "uc001aaq.2"
> [6] "uc001aar.2"
>> head(keys(Homo.sapiens, keytype="TXNAME"))
> [1] "uc001aaa.3" "uc010nxq.1" "uc010nxr.1" "uc001aal.1" "uc001aaq.2"
> [6] "uc001aar.2"
>> select(Homo.sapiens, keys=k, keytype="TXNAME", cols=c("TXNAME", "ENTREZID")
> + )
> Error in if (nrow(res) > 0L) { : argument is of length zero
>> sessionInfo()
> R version 2.15.1 (2012-06-22)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
> [1] C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] Homo.sapiens_1.0.0
> [2] TxDb.Hsapiens.UCSC.hg19.knownGene_2.8.0
> [3] org.Hs.eg.db_2.8.0
> [4] GO.db_2.8.0
> [5] RSQLite_0.11.2
> [6] DBI_0.2-5
> [7] OrganismDbi_1.0.0
> [8] GenomicFeatures_1.10.0
> [9] AnnotationDbi_1.20.2
> [10] Biobase_2.18.0
> [11] GenomicRanges_1.10.4
> [12] IRanges_1.16.4
> [13] BiocGenerics_0.4.0
> [14] BiocInstaller_1.8.3
>
> loaded via a namespace (and not attached):
> [1] BSgenome_1.26.1 Biostrings_2.26.2 RBGL_1.34.0 RCurl_1.95-3
> [5] Rsamtools_1.10.1 XML_3.95-0.1 biomaRt_2.14.0 bitops_1.0-4.2
> [9] graph_1.36.0 parallel_2.15.1 rtracklayer_1.18.0 stats4_2.15.1
> [13] tools_2.15.1 zlibbioc_1.4.0
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor