[BioC] curious behavior from VariantAnnotation 1.5.36
Valerie Obenchain
vobencha at fhcrc.org
Thu Feb 14 20:39:54 CET 2013
Hi Tim,
Thanks for reporting this. There was a typo in the method called for
3UTR's in the AllVariants() code. Now fixed in 1.5.38 and 1.4.9.
library(Homo.sapiens)
library(VariantAnnotation)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
site <- GRanges(c('chr1'), IRanges(start=36521401, width=1))
EIF2C3 <- unlist(mget('EIF2C3', org.Hs.egSYMBOL2EG))
all <- locateVariants(site, txdb, AllVariants())
three <- locateVariants(site, txdb, ThreeUTRVariants())
> identical(all, three)
[1] TRUE
Valerie
On 02/13/2013 10:19 PM, Tim Triche, Jr. wrote:
> Somehow I omitted a line... this should go immediately after
> library(VariantAnnotation), i.e.
>
>
> library(Homo.sapiens)
> library(VariantAnnotation)
>
> ## this following line is the one that went missing
> EIF2C3 <- unlist(mget('EIF2C3', org.Hs.egSYMBOL2EG))
>
> EIF2C3.site <- GRanges(c('chr1'), IRanges(start=36521401, width=1))
> EIF2C3.site %over%
> transcriptsBy(TxDb.Hsapiens.UCSC.hg19.knownGene)[[EIF2C3]]
>
> ## ...the rest is as before
>
>
> On Wed, Feb 13, 2013 at 5:09 PM, Tim Triche, Jr. <tim.triche at gmail.com>wrote:
>
>> I was reannotating some discriminating loci today and found this odd case
>> -- a site is in the 3' UTR of two transcripts for a gene, and
>> threeUTRVariants() finds it, but AllVariants() does not.
>>
>> Code to reproduce the result at an example locus:
>>
>> packageVersion('VariantAnnotation')
>> ## [1] '1.5.36'
>> packageVersion('Homo.sapiens')
>> ## [1] '1.0.0'
>>
>> ## test case for bioc-list:
>> library(Homo.sapiens)
>> library(VariantAnnotation)
>> EIF2C3.site <- GRanges(c('chr1'), IRanges(start=36521401, width=1))
>> EIF2C3.site %over%
>> transcriptsBy(TxDb.Hsapiens.UCSC.hg19.knownGene)[[EIF2C3]]
>> ## TRUE
>>
>> allVars <- locateVariants(EIF2C3.site,
>> TxDb.Hsapiens.UCSC.hg19.knownGene,
>> AllVariants())
>> length(allVars)
>> ## 0
>>
>> threePrimeVars <- locateVariants(EIF2C3.site,
>> TxDb.Hsapiens.UCSC.hg19.knownGene,
>> ThreeUTRVariants())
>> length(threePrimeVars)
>> ## 2
>>
>> Why aren't the 3' UTR matches included in the AllVariants() results?
>> That doesn't seem like the expected behavior, at least not to me.
>>
>>
>>
>> R> sessionInfo()
>> R Under development (unstable) (2013-02-13 r61937)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
>> [7] LC_PAPER=C LC_NAME=C
>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] parallel stats graphics grDevices datasets utils methods
>> [8] base
>>
>> other attached packages:
>> [1] FDb.InfiniumMethylation.hg19_1.0.6
>> [2] BSgenome.Hsapiens.UCSC.hg19_1.3.19
>> [3] BSgenome_1.27.1
>> [4] Homo.sapiens_1.0.0
>> [5] TxDb.Hsapiens.UCSC.hg19.knownGene_2.8.0
>> [6] org.Hs.eg.db_2.8.0
>> [7] GO.db_2.8.0
>> [8] RSQLite_0.11.2
>> [9] DBI_0.2-5
>> [10] OrganismDbi_1.1.13
>> [11] VariantAnnotation_1.5.36
>> [12] Rsamtools_1.11.16
>> [13] Biostrings_2.27.11
>> [14] GenomicFeatures_1.11.8
>> [15] AnnotationDbi_1.21.10
>> [16] Biobase_2.19.2
>> [17] chromophobe_0.50
>> [18] pheatmap_0.7.4
>> [19] ggplot2_0.9.3
>> [20] reshape2_1.2.2
>> [21] GenomicRanges_1.11.29
>> [22] IRanges_1.17.31
>> [23] BiocGenerics_0.5.6
>> [24] BiocInstaller_1.9.6
>> [25] gtools_2.7.0
>> [26] devtools_1.1
>>
>> loaded via a namespace (and not attached):
>> [1] biomaRt_2.15.0 bitops_1.0-5 colorspace_1.2-1
>> dichromat_2.0-0
>> [5] digest_0.6.2 evaluate_0.4.3 graph_1.37.5 grid_3.0.0
>>
>> [9] gtable_0.1.2 httr_0.2 labeling_0.1
>> lattice_0.20-13
>> [13] MASS_7.3-23 Matrix_1.0-10 memoise_0.1 munsell_0.4
>>
>> [17] plyr_1.8 proto_0.3-10 RBGL_1.35.0
>> RColorBrewer_1.0-5
>> [21] RCurl_1.95-3 rtracklayer_1.19.9 scales_0.2.3 stats4_3.0.0
>>
>> [25] stringr_0.6.2 tcltk_3.0.0 tools_3.0.0 whisker_0.1
>>
>> [29] XML_3.95-0.1 zlibbioc_1.5.0
>>
>>
>> --
>> *A model is a lie that helps you see the truth.*
>> *
>> *
>> Howard Skipper<http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf>
>>
>
>
>
More information about the Bioconductor
mailing list