[BioC] Error using Homo.sapiens AnnotationDbi package with GenomicFeatures

Marc Carlson mcarlson at fhcrc.org
Fri Nov 9 01:44:23 CET 2012


Hi Chris,

When you load the Homo.sapiens package, it also loads the 
TxDb.Hsapiens.UCSC.hg19.knownGene package as a dependency.  So 
you don't need to call makeTranscriptDbFromUCSC(), at least not for the 
knownGene track you were after, because that track is already available 
through the TxDb.Hsapiens.UCSC.hg19.knownGene package.  To get the 
promoter regions, you only need to call promoters() like this:

library(Homo.sapiens)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
## check the defaults in case you don't like them!
proms <- promoters(txdb, upstream=2000, downstream=200)
proms
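
In case it helps to see what promoters() does with upstream and
downstream, here is a small base-R sketch of the arithmetic as I
understand it: the window runs from TSS - upstream to TSS + downstream - 1
on the plus strand, and is mirrored on the minus strand.  The helper
promoter_window() is hypothetical (it is not part of GenomicFeatures), and
the coordinates are only illustrative.  The first row comes out as 9874 to
12073, which agrees with the first start and end values in the warnings
in your output.

```r
## Hypothetical helper, not part of GenomicFeatures: computes a promoter
## window from a transcript's start, end and strand using plain arithmetic.
promoter_window <- function(tx_start, tx_end, strand,
                            upstream = 2000, downstream = 200) {
  on_plus <- strand == "+"
  ## The TSS is the transcript start on "+" and the transcript end on "-".
  tss <- ifelse(on_plus, tx_start, tx_end)
  data.frame(
    start = ifelse(on_plus, tss - upstream, tss - downstream + 1),
    end   = ifelse(on_plus, tss + downstream - 1, tss + upstream)
  )
}

## One plus-strand transcript starting at 11874 and one made-up
## minus-strand transcript ending at 9000.
promoter_window(tx_start = c(11874, 5000),
                tx_end   = c(14409, 9000),
                strand   = c("+", "-"))
```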

## Once you have the promoters, you can look up their tx_names
## like this.
k <- proms$tx_name

## And then you can use select to retrieve the matching gene IDs
## In the case of Homo.sapiens, the gene IDs actually *are* Entrez Gene
## IDs (because that is what the knownGene track uses as a gene ID).
res <- select(Homo.sapiens, keys=k, cols=c("GENEID","TXNAME"), 
keytype="TXNAME")
head(res)
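
If you then want to carry the gene IDs back to the promoters, one option
is to line the rows of res up with k using match().  The sketch below uses
toy stand-ins for k and res (the gene IDs are made up) so that it runs
without any annotation packages; with the real objects only the last two
assignments would be needed.

```r
## Toy stand-ins for proms$tx_name and for the data.frame returned by
## select().  The GENEID values are made up for illustration.
k <- c("uc001aaa.3", "uc010nxq.1", "uc010nxr.1", "uc001aal.1")
res <- data.frame(
  TXNAME = c("uc001aal.1", "uc001aaa.3", "uc010nxq.1", "uc010nxr.1"),
  GENEID = c("2002", "1001", "1001", NA),
  stringsAsFactors = FALSE
)

## For each key, find its first matching row in res and take the gene ID
## from that row (NA if the key is absent or has no gene ID).
gene_id <- res$GENEID[match(k, res$TXNAME)]

## Group the transcript names by gene; split() drops the NA group.
tx_by_gene <- split(k, gene_id)
print(tx_by_gene)
```

Because match() only returns the first hit, this assumes each transcript
name has a single row in res.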



   Marc




On 11/08/2012 10:54 AM, Chris Whelan wrote:
> Hi,
>
> I'm having trouble using the AnnotationDbi package and was wondering
> if someone could tell me what I'm doing wrong. I'm trying to use
> GenomicFeatures to find promoter regions and then use AnnotationDbi to
> look up the Entrez Gene IDs for those transcripts, but getting an
> error. If I'm going about this all wrong let me know; I find it a
> little difficult to follow the thread of the documentation of the
> various feature/annotation packages. At the very least the error
> message that I'm getting seems like it might be a little friendlier?
>
> Thanks!
>
> Chris
>
> Bioconductor version 2.11 (BiocInstaller 1.8.3), ?biocLite for help
>> library(GenomicFeatures)
> Loading required package: BiocGenerics
>
> Attaching package: 'BiocGenerics'
>
> The following object(s) are masked from 'package:stats':
>
>      xtabs
>
> The following object(s) are masked from 'package:base':
>
>      Filter, Find, Map, Position, Reduce, anyDuplicated, cbind,
>      colnames, duplicated, eval, get, intersect, lapply, mapply, mget,
>      order, paste, pmax, pmax.int, pmin, pmin.int, rbind, rep.int,
>      rownames, sapply, setdiff, table, tapply, union, unique
>
> Loading required package: IRanges
> Loading required package: GenomicRanges
> Loading required package: AnnotationDbi
> Loading required package: Biobase
> Welcome to Bioconductor
>
>      Vignettes contain introductory material; view with
>      'browseVignettes()'. To cite Bioconductor, see
>      'citation("Biobase")', and for packages 'citation("pkgname")'.
>
>> library(Homo.sapiens)
> Loading required package: OrganismDbi
> Loading required package: GO.db
> Loading required package: DBI
>
> Loading required package: org.Hs.eg.db
>
> Loading required package: TxDb.Hsapiens.UCSC.hg19.knownGene
>> hg19UCSCGenes<- makeTranscriptDbFromUCSC(genome = "hg19", tablename = "knownGene")
> Download the knownGene table ... OK
> Download the knownToLocusLink table ... OK
> Extract the 'transcripts' data frame ... OK
> Extract the 'splicings' data frame ... OK
> Download and preprocess the 'chrominfo' data frame ... OK
> Prepare the 'metadata' data frame ... metadata: OK
>> k<- elementMetadata(head(promoters(hg19UCSCGenes)))[,"tx_name"]
> Warning messages:
> 1: In `start<-`(`*tmp*`, value = c(9874, 9874, 9874, 67091, 319084,  :
>    trimmed start values to be positive
> 2: In `end<-`(`*tmp*`, value = c(12073, 12073, 12073, 69290, 321283,  :
>    trimmed end values to be <= seqlengths
>> k
> [1] "uc001aaa.3" "uc010nxq.1" "uc010nxr.1" "uc001aal.1" "uc001aaq.2"
> [6] "uc001aar.2"
>> head(keys(Homo.sapiens, keytype="TXNAME"))
> [1] "uc001aaa.3" "uc010nxq.1" "uc010nxr.1" "uc001aal.1" "uc001aaq.2"
> [6] "uc001aar.2"
>> select(Homo.sapiens, keys=k, keytype="TXNAME", cols=c("TXNAME", "ENTREZID")
> + )
> Error in if (nrow(res) > 0L) { : argument is of length zero
>> sessionInfo()
> R version 2.15.1 (2012-06-22)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
> [1] C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
>   [1] Homo.sapiens_1.0.0
>   [2] TxDb.Hsapiens.UCSC.hg19.knownGene_2.8.0
>   [3] org.Hs.eg.db_2.8.0
>   [4] GO.db_2.8.0
>   [5] RSQLite_0.11.2
>   [6] DBI_0.2-5
>   [7] OrganismDbi_1.0.0
>   [8] GenomicFeatures_1.10.0
>   [9] AnnotationDbi_1.20.2
> [10] Biobase_2.18.0
> [11] GenomicRanges_1.10.4
> [12] IRanges_1.16.4
> [13] BiocGenerics_0.4.0
> [14] BiocInstaller_1.8.3
>
> loaded via a namespace (and not attached):
>   [1] BSgenome_1.26.1    Biostrings_2.26.2  RBGL_1.34.0        RCurl_1.95-3
>   [5] Rsamtools_1.10.1   XML_3.95-0.1       biomaRt_2.14.0     bitops_1.0-4.2
>   [9] graph_1.36.0       parallel_2.15.1    rtracklayer_1.18.0 stats4_2.15.1
> [13] tools_2.15.1       zlibbioc_1.4.0
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor


