[Bioc-devel] BamTallyParam argument 'which'
Thomas Sandmann
sandmann.thomas at gene.com
Fri Feb 20 23:00:37 CET 2015
Hi Michael,
I noticed that when the tallyVariants function receives a 'which' argument
(via BamTallyParam) that contains overlapping or duplicated regions,
duplicated rows are returned.
(See below for an example.)
It took me a little while to understand where I was picking up the duplicates.
Would it be useful to 'reduce' the 'which' GRanges/RangesList object by
default, e.g. before tallying variants, to make sure each base is only
tallied once?
Best,
Thomas
library(VariantTools)
## 'which' is a set of non-overlapping regions
tally.param <- TallyVariantsParam(
  gmapR::TP53Genome(),
  high_base_quality = 23L,
  which = gmapR::TP53Which()
)
bams <- LungCancerLines::LungCancerBamFiles()
raw.variants <- tallyVariants(bams$H1993, tally.param)
any(duplicated( raw.variants )) ## FALSE
## 'which' is a set of duplicated regions
tally.param <- TallyVariantsParam(
  gmapR::TP53Genome(),
  high_base_quality = 23L,
  which = c(
    gmapR::TP53Which(),
    gmapR::TP53Which()
  )
)
raw.variants <- tallyVariants(bams$H1993, tally.param)
any(duplicated( raw.variants )) ## TRUE
sort(raw.variants)[1:4]
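For illustration, here is a minimal sketch of what reducing 'which' before tallying might look like. It uses only GenomicRanges; the sequence name and coordinates are made up for the example and are not taken from TP53Which(). Whether tallyVariants should do this internally is the open question above.

```r
library(GenomicRanges)

## Hypothetical 'which': one duplicated region and one overlapping region
## (seqname and coordinates are assumptions for illustration only)
which <- GRanges("TP53", IRanges(start = c(1, 1, 50), end = c(100, 100, 150)))

## reduce() merges duplicated and overlapping ranges, so each base
## is covered by exactly one range
which.reduced <- reduce(which)
length(which.reduced)  ## 1
```

Passing something like reduce(c(gmapR::TP53Which(), gmapR::TP53Which())) as 'which' should then presumably avoid the duplicated rows, although I have not checked how this interacts with strand or metadata columns.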
### sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats4 stats graphics grDevices
[6] utils datasets methods base
other attached packages:
[1] VariantTools_1.8.0 VariantAnnotation_1.12.9
[3] Rsamtools_1.18.2 Biostrings_2.34.1
[5] XVector_0.6.0 GenomicRanges_1.18.4
[7] GenomeInfoDb_1.2.4 IRanges_2.0.1
[9] S4Vectors_0.4.0 BiocGenerics_0.12.1
[11] BiocInstaller_1.16.1 roxygen2_4.1.0
[13] devtools_1.7.0
loaded via a namespace (and not attached):
[1] AnnotationDbi_1.28.1 base64enc_0.1-2
[3] BatchJobs_1.5 BBmisc_1.9
[5] Biobase_2.26.0 BiocParallel_1.0.3
[7] biomaRt_2.22.0 bitops_1.0-6
[9] brew_1.0-6 BSgenome_1.34.1
[11] checkmate_1.5.1 codetools_0.2-10
[13] DBI_0.3.1 digest_0.6.8
[15] fail_1.2 foreach_1.4.2
[17] GenomicAlignments_1.2.1 GenomicFeatures_1.18.3
[19] gmapR_1.8.0 grid_3.1.2
[21] iterators_1.0.7 lattice_0.20-29
[23] LungCancerLines_0.3.1 Matrix_1.1-5
[25] Rcpp_0.11.4 RCurl_1.95-4.5
[27] RSQLite_1.0.0 rtracklayer_1.26.2
[29] sendmailR_1.2-1 stringr_0.6.2
[31] tools_3.1.2 XML_3.98-1.1
[33] zlibbioc_1.12.0