[Bioc-devel] AnnotationDbi and select function

Marc Carlson mcarlson at fhcrc.org
Wed Mar 12 20:18:25 CET 2014


Thanks Nicolas!  That's a good bug, and I will work on a fix.  James's 
workaround functions because it has one fewer database to query, which 
is also why it is faster.  When you say GENEID you mean the IDs used in 
the associated txdb database, so the keys have to be checked against 
that DB (and anything related to it extracted) and then merged with the 
symbol information by joining on the foreign key between the two DBs.  
That's actually much more complex than extracting the same data from 
just the org package, even though the end result (in this case) is the 
same.  The bug is probably happening in the associated merge step.
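
To make the difference between the two routes concrete, here is a minimal 
base-R sketch.  It is only an illustration: the two data frames are toy 
stand-ins for the txdb and org databases, the variable names are made up, 
and none of this is the actual OrganismDbi code.  The last part just shows 
how two-index subsetting of an object that is not two-dimensional produces 
the "incorrect number of dimensions" message in general; where exactly 
that happens in the real merge step is still to be found.

```r
# Toy stand-ins for the two databases (hypothetical tables, not package internals).
txdb_tbl <- data.frame(GENEID = c("1", "10", "100"), stringsAsFactors = FALSE)
org_tbl  <- data.frame(ENTREZID = c("1", "10", "100"),
                       SYMBOL   = c("A1BG", "NAT2", "ADA"),
                       stringsAsFactors = FALSE)
keys <- c("1", "100")

# Two-database route: check the keys against the txdb-like table first,
# then join with the org-like table on the shared (foreign) key.
hits   <- txdb_tbl[txdb_tbl$GENEID %in% keys, , drop = FALSE]
merged <- merge(hits, org_tbl, by.x = "GENEID", by.y = "ENTREZID")

# One-database route: look the keys up directly in the org-like table.
direct <- org_tbl[org_tbl$ENTREZID %in% keys, c("ENTREZID", "SYMBOL"), drop = FALSE]

# Both routes give the same symbols for these keys.
stopifnot(identical(merged$SYMBOL, direct$SYMBOL))

# If an intermediate result is not two-dimensional (here a plain vector),
# subsetting it with [rows, cols] raises an error instead of returning columns.
not_a_table <- c("A1BG", "ADA")
msg <- tryCatch(not_a_table[, "SYMBOL", drop = FALSE],
                error = function(e) conditionMessage(e))
print(msg)
```

The second route skips both the extra lookup and the join, which matches 
why using ENTREZID as the keytype is faster and avoids the failing step.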

  Marc



On 03/12/2014 10:06 AM, James W. MacDonald wrote:
> Hi Nicolas,
>
> On 3/12/2014 12:39 PM, Servant Nicolas wrote:
>> Dear all,
>>
>> I get an error when using the select function from the AnnotationDbi 
>> package.
>> I am trying to convert some gene IDs into symbols, but for some reason 
>> it fails.
>>
>>
>> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
>> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
>> isActiveSeq(txdb)[seqlevels(txdb)] <- FALSE
>> isActiveSeq(txdb)[c("chr16","chr1")] <- TRUE
>> geneGR <- exonsBy(txdb, "gene")
>> library(Homo.sapiens)
>> symbol <- select(Homo.sapiens, keys = names(geneGR), keytype = 
>> "GENEID", columns = "SYMBOL")
>> Error in head(select(Homo.sapiens, keys = names(geneGR)[1:1001], 
>> keytype = "GENEID",  :
>>    error in evaluating the argument 'x' in selecting a method for 
>> function 'head': Error in res[, .reverseColAbbreviations(x, cnames), 
>> drop = FALSE] :
>>
>>> length(geneGR)
>> [1] 3269
>> ## The first 1K work
>>> symbol <- select(Homo.sapiens, keys = names(geneGR)[1:1000], keytype 
>>> = "GENEID", columns = "SYMBOL")
>> ## The 1K+1 does not !
>>> symbol <- select(Homo.sapiens, keys = names(geneGR)[1:1001], keytype 
>>> = "GENEID", columns = "SYMBOL")
>> Error in res[, .reverseColAbbreviations(x, cnames), drop = FALSE] :
>>    incorrect number of dimensions
>>
>> It looks like I cannot convert more than 1K elements. Is there a 
>> reason for that?
>> Thank you very much
>> Nicolas
>
> Not sure what 'GENEID' is in this context - it appears to be Entrez 
> Gene. But anyway, if you use "ENTREZID" instead, it works fine:
>
> > symbol <- select(Homo.sapiens, names(geneGR), "SYMBOL", "ENTREZID")
> > symbol <- select(Homo.sapiens, names(geneGR), "GENEID", "ENTREZID")
> Error in res[, .reverseColAbbreviations(x, cnames), drop = FALSE] :
>   incorrect number of dimensions
> > symbol <- select(Homo.sapiens, names(geneGR)[1:1000], "GENEID", 
> "ENTREZID")
> > symbol <- select(Homo.sapiens, names(geneGR)[1:1001], "GENEID", 
> "ENTREZID")
> Error in res[, .reverseColAbbreviations(x, cnames), drop = FALSE] :
>   incorrect number of dimensions
>
> Best,
>
> Jim
>
>
>
>>
>>> sessionInfo()
>> R Under development (unstable) (2014-03-05 r65125)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>>   [1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C
>>   [3] LC_TIME=fr_FR.UTF-8        LC_COLLATE=fr_FR.UTF-8
>>   [5] LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8
>>   [7] LC_PAPER=fr_FR.UTF-8       LC_NAME=C
>>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] parallel  stats     graphics  grDevices utils     datasets methods
>> [8] base
>>
>> other attached packages:
>>   [1] Homo.sapiens_1.1.2
>>   [2] org.Hs.eg.db_2.10.1
>>   [3] GO.db_2.10.1
>>   [4] RSQLite_0.11.4
>>   [5] DBI_0.2-7
>>   [6] OrganismDbi_1.5.3
>>   [7] XVector_0.3.7
>>   [8] TxDb.Hsapiens.UCSC.hg19.knownGene_2.10.1
>>   [9] GenomicFeatures_1.15.9
>> [10] AnnotationDbi_1.25.14
>> [11] GenomeInfoDb_0.99.17
>> [12] Biobase_2.23.6
>> [13] GenomicRanges_1.15.32
>> [14] IRanges_1.21.32
>> [15] BiocGenerics_0.9.3
>> [16] RColorBrewer_1.0-5
>> [17] reshape2_1.2.2
>> [18] reshape_0.8.4
>> [19] plyr_1.8.1
>> [20] ggplot2_0.9.3.1
>> [21] Matrix_1.1-2-2
>>
>> loaded via a namespace (and not attached):
>>   [1] BatchJobs_1.2             BBmisc_1.5
>>   [3] BiocParallel_0.5.16       biomaRt_2.19.3
>>   [5] Biostrings_2.31.14        bitops_1.0-6
>>   [7] brew_1.0-6                BSgenome_1.31.12
>>   [9] codetools_0.2-8           colorspace_1.2-4
>> [11] dichromat_2.0-0           digest_0.6.4
>> [13] fail_1.2                  foreach_1.4.1
>> [15] GenomicAlignments_0.99.29 graph_1.41.3
>> [17] grid_3.1.0                gtable_0.1.2
>> [19] iterators_1.0.6           labeling_0.2
>> [21] lattice_0.20-27           MASS_7.3-29
>> [23] munsell_0.4.2             proto_0.3-10
>> [25] RBGL_1.39.2               Rcpp_0.11.0
>> [27] RCurl_1.95-4.1            Rsamtools_1.15.32
>> [29] rtracklayer_1.23.15       scales_0.2.3
>> [31] sendmailR_1.1-2           stats4_3.1.0
>> [33] stringr_0.6.2             tools_3.1.0
>> [35] XML_3.98-1.1              zlibbioc_1.9.0
>>
>


