[Bioc-devel] RE : AnnotationDbi and select function

Marc Carlson mcarlson at fhcrc.org
Wed Mar 12 22:53:39 CET 2014


I just checked a fix in for this bug to GenomicFeatures (which happens 
to be where the problem was).  It should percolate out to the build 
system soon.

  Marc
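
The quoted explanation further down describes the GENEID lookup as a
two-database operation: keys are checked against the txdb, then joined to
the org package data on a shared identifier. A minimal base R sketch of
that shape follows. It assumes toy data frames (`txdb_genes`,
`org_symbols`) and made-up symbols; these names are hypothetical and are
not the internals of select() or of either package.

```r
# Toy stand-ins (hypothetical data, not the real package contents).
# txdb_genes plays the role of the txdb table keyed by GENEID;
# org_symbols plays the role of the org package table keyed by ENTREZID.
txdb_genes <- data.frame(
  GENEID = c("1", "10", "100"),
  TXNAME = c("tx_a", "tx_b", "tx_c"),
  stringsAsFactors = FALSE
)
org_symbols <- data.frame(
  ENTREZID = c("1", "10", "100"),
  SYMBOL   = c("geneA", "geneB", "geneC"),
  stringsAsFactors = FALSE
)

keys <- c("1", "100")

# Two-database route: look the keys up in the first table, then join
# to the second table on the shared identifier (the foreign key).
hits <- txdb_genes[txdb_genes$GENEID %in% keys, "GENEID", drop = FALSE]
two_step <- merge(hits, org_symbols, by.x = "GENEID", by.y = "ENTREZID")

# One-database route: read the same information from the second table alone.
one_step <- org_symbols[org_symbols$ENTREZID %in% keys, c("ENTREZID", "SYMBOL")]

print(two_step)
print(one_step)
```

Both routes return the same symbols here, but the first one needs an extra
lookup and a merge, which is the step where the thread suspects the bug.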


On 03/12/2014 02:19 PM, Servant Nicolas wrote:
> Hi guys,
>
> Thanks for your feedback.
> Indeed, I used GENEID because that is the name used in the txdb database.
>
>> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
>> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
>> columns(txdb)
>   [1] "CDSID"      "CDSNAME"    "CDSCHROM"   "CDSSTRAND"  "CDSSTART"
>   [6] "CDSEND"     "EXONID"     "EXONNAME"   "EXONCHROM"  "EXONSTRAND"
> [11] "EXONSTART"  "EXONEND"    "GENEID"     "TXID"       "EXONRANK"
> [16] "TXNAME"     "TXCHROM"    "TXSTRAND"   "TXSTART"    "TXEND"
>
> I will switch to ENTREZID, which is much faster!
> I'm glad it could help.
> Nicolas
>
> ________________________________________
> From: bioc-devel-bounces at r-project.org [bioc-devel-bounces at r-project.org] on behalf of Marc Carlson [mcarlson at fhcrc.org]
> Sent: Wednesday, 12 March 2014 20:18
> To: bioc-devel at r-project.org
> Subject: Re: [Bioc-devel] AnnotationDbi and select function
>
> Thanks Nicolas!  That's a good bug.  I will work on a fix.  James's
> workaround functions because it has one fewer database to query, which
> is also why it is faster.  When you say GENEID, you mean the IDs used in
> the associated txdb database.  Those have to be checked against that DB
> (and anything related to them extracted) and then merged with the symbol
> information by joining on the foreign key that links the two DBs.  That
> is much more complex than extracting the same data from the org package
> alone, even though the end result (in this case) is the same.  The bug
> is probably happening in the associated merge step.
>
>    Marc
>
>
>
> On 03/12/2014 10:06 AM, James W. MacDonald wrote:
>> Hi Nicolas,
>>
>> On 3/12/2014 12:39 PM, Servant Nicolas wrote:
>>> Dear all,
>>>
>>> I get an error when using the select function from the AnnotationDbi
>>> package.
>>> I am trying to convert some gene IDs into symbols, but for some
>>> reason it fails.
>>>
>>>
>>> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
>>> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
>>> isActiveSeq(txdb)[seqlevels(txdb)] <- FALSE
>>> isActiveSeq(txdb)[c("chr16","chr1")] <- TRUE
>>> geneGR <- exonsBy(txdb, "gene")
>>> library(Homo.sapiens)
>>> symbol <- select(Homo.sapiens, keys = names(geneGR), keytype =
>>> "GENEID", columns = "SYMBOL")
>>> Error in head(select(Homo.sapiens, keys = names(geneGR)[1:1001],
>>> keytype = "GENEID",  :
>>>     error in evaluating the argument 'x' in selecting a
>>> method for function 'head': Error in res[,
>>> .reverseColAbbreviations(x, cnames), drop = FALSE] :
>>>
>>>> length(geneGR)
>>> [1] 3269
>>> ## The first 1K work
>>>> symbol <- select(Homo.sapiens, keys = names(geneGR)[1:1000], keytype
>>>> = "GENEID", columns = "SYMBOL")
>>> ## The 1K+1 does not !
>>>> symbol <- select(Homo.sapiens, keys = names(geneGR)[1:1001], keytype
>>>> = "GENEID", columns = "SYMBOL")
>>> Error in res[, .reverseColAbbreviations(x, cnames), drop = FALSE] :
>>>     incorrect number of dimensions
>>>
>>> It looks like I cannot convert more than 1K elements. Is there a
>>> reason for that?
>>> Thank you very much
>>> Nicolas
>> Not sure what 'GENEID' is in this context - it appears to be Entrez
>> Gene. But anyway, if you use "ENTREZID" instead, it works fine:
>>
>>> symbol <- select(Homo.sapiens, names(geneGR), "SYMBOL", "ENTREZID")
>>> symbol <- select(Homo.sapiens, names(geneGR), "GENEID", "ENTREZID")
>> Error in res[, .reverseColAbbreviations(x, cnames), drop = FALSE] :
>>    incorrect number of dimensions
>>> symbol <- select(Homo.sapiens, names(geneGR)[1:1000], "GENEID", "ENTREZID")
>>> symbol <- select(Homo.sapiens, names(geneGR)[1:1001], "GENEID", "ENTREZID")
>> Error in res[, .reverseColAbbreviations(x, cnames), drop = FALSE] :
>>    incorrect number of dimensions
>>
>> Best,
>>
>> Jim
>>
>>
>>
>>>> sessionInfo()
>>> R Under development (unstable) (2014-03-05 r65125)
>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>
>>> locale:
>>>    [1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C
>>>    [3] LC_TIME=fr_FR.UTF-8        LC_COLLATE=fr_FR.UTF-8
>>>    [5] LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8
>>>    [7] LC_PAPER=fr_FR.UTF-8       LC_NAME=C
>>>    [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] parallel  stats     graphics  grDevices utils     datasets methods
>>> [8] base
>>>
>>> other attached packages:
>>>    [1] Homo.sapiens_1.1.2
>>>    [2] org.Hs.eg.db_2.10.1
>>>    [3] GO.db_2.10.1
>>>    [4] RSQLite_0.11.4
>>>    [5] DBI_0.2-7
>>>    [6] OrganismDbi_1.5.3
>>>    [7] XVector_0.3.7
>>>    [8] TxDb.Hsapiens.UCSC.hg19.knownGene_2.10.1
>>>    [9] GenomicFeatures_1.15.9
>>> [10] AnnotationDbi_1.25.14
>>> [11] GenomeInfoDb_0.99.17
>>> [12] Biobase_2.23.6
>>> [13] GenomicRanges_1.15.32
>>> [14] IRanges_1.21.32
>>> [15] BiocGenerics_0.9.3
>>> [16] RColorBrewer_1.0-5
>>> [17] reshape2_1.2.2
>>> [18] reshape_0.8.4
>>> [19] plyr_1.8.1
>>> [20] ggplot2_0.9.3.1
>>> [21] Matrix_1.1-2-2
>>>
>>> loaded via a namespace (and not attached):
>>>    [1] BatchJobs_1.2             BBmisc_1.5
>>>    [3] BiocParallel_0.5.16       biomaRt_2.19.3
>>>    [5] Biostrings_2.31.14        bitops_1.0-6
>>>    [7] brew_1.0-6                BSgenome_1.31.12
>>>    [9] codetools_0.2-8           colorspace_1.2-4
>>> [11] dichromat_2.0-0           digest_0.6.4
>>> [13] fail_1.2                  foreach_1.4.1
>>> [15] GenomicAlignments_0.99.29 graph_1.41.3
>>> [17] grid_3.1.0                gtable_0.1.2
>>> [19] iterators_1.0.6           labeling_0.2
>>> [21] lattice_0.20-27           MASS_7.3-29
>>> [23] munsell_0.4.2             proto_0.3-10
>>> [25] RBGL_1.39.2               Rcpp_0.11.0
>>> [27] RCurl_1.95-4.1            Rsamtools_1.15.32
>>> [29] rtracklayer_1.23.15       scales_0.2.3
>>> [31] sendmailR_1.1-2           stats4_3.1.0
>>> [33] stringr_0.6.2             tools_3.1.0
>>> [35] XML_3.98-1.1              zlibbioc_1.9.0
>>>
>>> _______________________________________________
>>> Bioc-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
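
The message "incorrect number of dimensions" reported in the thread is what
base R raises when `[rows, columns]` indexing is applied to an object that
has no second dimension. The thread does not say which object inside
select() ended up in that state, so the following is only a sketch of the
general mechanism, using hypothetical objects `res_ok` and `res_bad`.

```r
# A two-dimensional object accepts [rows, columns] indexing.
res_ok <- data.frame(
  GENEID = c("1", "10"),
  SYMBOL = c("geneA", "geneB"),
  stringsAsFactors = FALSE
)
sub_ok <- res_ok[, "SYMBOL", drop = FALSE]
print(sub_ok)

# A plain vector has no dim attribute, so the same style of indexing fails.
# tryCatch() captures the error message instead of stopping the script.
res_bad <- c("1", "10")
msg <- tryCatch(
  res_bad[, 1, drop = FALSE],
  error = function(e) conditionMessage(e)
)
print(msg)
```

If an intermediate result collapses from a table to a vector (or to
something else without two dimensions), a later `res[, cols, drop = FALSE]`
fails in exactly this way.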


