[Bioc-devel] A quick check for matching seqnames/order needed for Views on RleList?
Sean Davis
seandavi at gmail.com
Fri Feb 20 13:53:19 CET 2015
I am calculating coverage metrics for a BAM file over the CDS regions. When I
form the RangesList, compute coverage(), and then call Views(), the views are
paired with the coverage by position: the ranges from the RangesList are
applied without checking that the ordering or seqlevels of the RleList and the
RangesList match. In this case, the result is that Views from chr1 are applied
to the coverage for chrM, for example. Would it make sense to have the Views
method check the ordering and seqlevels (and perhaps even do the reordering,
if necessary)?
Example code showing the problem (not fully reproducible--sorry).
Thanks,
Sean
> cdsreg
GRanges object with 237533 ranges and 1 metadata column:
seqnames ranges strand | cds_id
<Rle> <IRanges> <Rle> | <integer>
[1] chr1 [ 12190, 12227] + | 1
[2] chr1 [ 12595, 12721] + | 2
[3] chr1 [ 13403, 13639] + | 3
[4] chr1 [ 69091, 70008] + | 4
[5] chr1 [324343, 324345] + | 5
... ... ... ... ... ...
[237529] chrY [26959330, 26959332] - | 227333
[237530] chrY [27184245, 27184263] - | 227334
[237531] chrY [27184956, 27185061] - | 227335
[237532] chrY [27187916, 27188033] - | 227336
[237533] chrY [27190093, 27190170] - | 227337
-------
seqinfo: 93 sequences (1 circular) from hg19 genome
> cdsrl = as(cdsreg,'RangesList')
> names(cdsrl)
[1] "chr1" "chr2" "chr3" "chr4"
[5] "chr5" "chr6" "chr7" "chr8"
[9] "chr9" "chr10" "chr11" "chr12"
[13] "chr13" "chr14" "chr15" "chr16"
[17] "chr17" "chr18" "chr19" "chr20"
[21] "chr21" "chr22" "chrX" "chrY"
[25] "chrM" "chr1_gl000191_random" "chr1_gl000192_random" "chr4_ctg9_hap1"
[29] "chr4_gl000193_random" "chr4_gl000194_random" "chr6_apd_hap1" "chr6_cox_hap2"
[33] "chr6_dbb_hap3" "chr6_mann_hap4" "chr6_mcf_hap5" "chr6_qbl_hap6"
[37] "chr6_ssto_hap7" "chr7_gl000195_random" "chr8_gl000196_random" "chr8_gl000197_random"
[41] "chr9_gl000198_random" "chr9_gl000199_random" "chr9_gl000200_random" "chr9_gl000201_random"
[45] "chr11_gl000202_random" "chr17_ctg5_hap1" "chr17_gl000203_random" "chr17_gl000204_random"
[49] "chr17_gl000205_random" "chr17_gl000206_random" "chr18_gl000207_random" "chr19_gl000208_random"
[53] "chr19_gl000209_random" "chr21_gl000210_random" "chrUn_gl000211" "chrUn_gl000212"
[57] "chrUn_gl000213" "chrUn_gl000214" "chrUn_gl000215" "chrUn_gl000216"
[61] "chrUn_gl000217" "chrUn_gl000218" "chrUn_gl000219" "chrUn_gl000220"
[65] "chrUn_gl000221" "chrUn_gl000222" "chrUn_gl000223" "chrUn_gl000224"
[69] "chrUn_gl000225" "chrUn_gl000226" "chrUn_gl000227" "chrUn_gl000228"
[73] "chrUn_gl000229" "chrUn_gl000230" "chrUn_gl000231" "chrUn_gl000232"
[77] "chrUn_gl000233" "chrUn_gl000234" "chrUn_gl000235" "chrUn_gl000236"
[81] "chrUn_gl000237" "chrUn_gl000238" "chrUn_gl000239" "chrUn_gl000240"
[85] "chrUn_gl000241" "chrUn_gl000242" "chrUn_gl000243" "chrUn_gl000244"
[89] "chrUn_gl000245" "chrUn_gl000246" "chrUn_gl000247" "chrUn_gl000248"
[93] "chrUn_gl000249"
> names(cov)
[1] "chrM" "chr1" "chr2" "chr3"
[5] "chr4" "chr5" "chr6" "chr7"
[9] "chr8" "chr9" "chr10" "chr11"
[13] "chr12" "chr13" "chr14" "chr15"
[17] "chr16" "chr17" "chr18" "chr19"
[21] "chr20" "chr21" "chr22" "chrX"
[25] "chrY" "chr1_gl000191_random" "chr1_gl000192_random" "chr4_ctg9_hap1"
[29] "chr4_gl000193_random" "chr4_gl000194_random" "chr6_apd_hap1" "chr6_cox_hap2"
[33] "chr6_dbb_hap3" "chr6_mann_hap4" "chr6_mcf_hap5" "chr6_qbl_hap6"
[37] "chr6_ssto_hap7" "chr7_gl000195_random" "chr8_gl000196_random" "chr8_gl000197_random"
[41] "chr9_gl000198_random" "chr9_gl000199_random" "chr9_gl000200_random" "chr9_gl000201_random"
[45] "chr11_gl000202_random" "chr17_ctg5_hap1" "chr17_gl000203_random" "chr17_gl000204_random"
[49] "chr17_gl000205_random" "chr17_gl000206_random" "chr18_gl000207_random" "chr19_gl000208_random"
[53] "chr19_gl000209_random" "chr21_gl000210_random" "chrUn_gl000211" "chrUn_gl000212"
[57] "chrUn_gl000213" "chrUn_gl000214" "chrUn_gl000215" "chrUn_gl000216"
[61] "chrUn_gl000217" "chrUn_gl000218" "chrUn_gl000219" "chrUn_gl000220"
[65] "chrUn_gl000221" "chrUn_gl000222" "chrUn_gl000223" "chrUn_gl000224"
[69] "chrUn_gl000225" "chrUn_gl000226" "chrUn_gl000227" "chrUn_gl000228"
[73] "chrUn_gl000229" "chrUn_gl000230" "chrUn_gl000231" "chrUn_gl000232"
[77] "chrUn_gl000233" "chrUn_gl000234" "chrUn_gl000235" "chrUn_gl000236"
[81] "chrUn_gl000237" "chrUn_gl000238" "chrUn_gl000239" "chrUn_gl000240"
[85] "chrUn_gl000241" "chrUn_gl000242" "chrUn_gl000243" "chrUn_gl000244"
[89] "chrUn_gl000245" "chrUn_gl000246" "chrUn_gl000247" "chrUn_gl000248"
[93] "chrUn_gl000249"
> covView = Views(cov,cdsrl)
> covView[[1]]
Views on a 16571-length Rle subject
views:
start end width
[1] 12190 12227 38 [1367 1357 1363 1358 1347 1375 1381 1379 1381 1387 1385 1377 1382 1368 1363 ...]
[2] 12595 12721 127 [1410 1416 1414 1421 1430 1430 1428 1432 1428 1419 1421 1418 1426 1427 1439 ...]
[3] 13403 13639 237 [1476 1468 1460 1461 1465 1455 1448 1448 1442 1448 1460 1460 1458 1435 1440 ...]
[4] 69091 70008 918 [ ]
[5] 324343 324345 3 [ ]
[6] 324439 325605 1167 [ ]
[7] 324515 324686 172 [ ]
[8] 324719 325124 406 [ ]
[9] 325383 325605 223 [ ]
... ... ... ... ...
[23550] 249149924 249150145 222 [ ]
[23551] 249150487 249150533 47 [ ]
[23552] 249150487 249150621 135 [ ]
[23553] 249150713 249150761 49 [ ]
[23554] 249151433 249151696 264 [ ]
[23555] 249152027 249152058 32 [ ]
[23556] 249152330 249152508 179 [ ]
[23557] 249152330 249152520 191 [ ]
[23558] 249152711 249152713 3 [ ]
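
A minimal sketch of the check-and-reorder step asked about above, written in base R so that it runs without the BAM file. The helper name `align_by_names` is hypothetical (it is not part of IRanges), and the plain named lists `cov_toy` and `cdsrl_toy` are invented stand-ins for the RleList and RangesList in this post. It assumes both objects carry unique names and support subsetting by name.

```r
# Hypothetical helper, not an IRanges function: return `subject` reordered
# so that its names line up with names(ranges). Fails loudly when the two
# objects do not carry the same set of unique names.
align_by_names <- function(subject, ranges) {
  sn <- names(subject)
  rn <- names(ranges)
  if (is.null(sn) || is.null(rn))
    stop("both objects must have names")
  if (anyDuplicated(sn) || anyDuplicated(rn))
    stop("names must be unique within each object")
  if (!setequal(sn, rn))
    stop("the two objects do not have the same set of names")
  subject[rn]  # reorder by name, not by position
}

# Toy stand-ins: coverage listed chrM first, ranges listed chr1 first,
# mirroring the mismatch shown in names(cov) and names(cdsrl).
cov_toy   <- list(chrM = 1:5, chr1 = 11:20)
cdsrl_toy <- list(chr1 = c(2, 4), chrM = c(1, 2))

aligned <- align_by_names(cov_toy, cdsrl_toy)
names(aligned)
# [1] "chr1" "chrM"
```

With the real objects, the equivalent call would presumably be `Views(align_by_names(cov, cdsrl), cdsrl)`, so that each set of ranges is paired with the coverage of the same sequence rather than with whatever sits at the same position.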
> sessionInfo()
R Under development (unstable) (2014-11-18 r66997)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] TxDb.Hsapiens.UCSC.hg19.knownGene_3.0.0 GenomicFeatures_1.19.15
[3] AnnotationDbi_1.29.17 Biobase_2.27.1
[5] GenomicAlignments_1.3.27 VariantAnnotation_1.13.24
[7] Rsamtools_1.19.26 Biostrings_2.35.7
[9] XVector_0.7.3 GenomicRanges_1.19.35
[11] GenomeInfoDb_1.3.12 IRanges_2.1.35
[13] S4Vectors_0.5.17 BiocGenerics_0.13.4
[15] roxygen2_4.1.0 BiocInstaller_1.17.5
loaded via a namespace (and not attached):
[1] BBmisc_1.8 BSgenome_1.35.16 BatchJobs_1.5 BiocParallel_1.1.12 DBI_0.3.1
[6] RCurl_1.95-4.5 RSQLite_1.0.0 Rcpp_0.11.4 XML_3.98-1.1 base64enc_0.1-2
[11] biomaRt_2.23.5 bitops_1.0-6 brew_1.0-6 checkmate_1.5.1 codetools_0.2-10
[16] digest_0.6.8 fail_1.2 foreach_1.4.2 iterators_1.0.7 rtracklayer_1.27.7
[21] sendmailR_1.2-1 stringr_0.6.2 tools_3.2.0 zlibbioc_1.13.0