[Bioc-devel] A quick check for matching seqnames/order needed for Views on RleList?

Sean Davis seandavi at gmail.com
Fri Feb 20 14:14:58 CET 2015


On Fri, Feb 20, 2015 at 8:10 AM, Malcolm Perry <mgperry32 at gmail.com> wrote:

> Hi Sean,
>
> The idiom I've most often seen for getting round this problem is to first
> find matching chromosomes in the RleList, and subset the RangesList based
> on this order:
>
> chrs = intersect(names(rle_list), as.character(seqlevels(windows)))
> myViews = Views(rle_list[chrs], as(windows, "RangesList")[chrs])
>

Thanks, Malcolm.

This works just fine and fixes the problem.  However, I noticed that
creating the Views() without doing what you suggest above does not result
in a warning or error and results in an object that "works" but is
nonsensical.  I simply want to protect myself from making the same mistake
twice by some checking inside the Views() method.

Sean



>
> Hope this helps,
>
> Malcolm
>
> On Fri, Feb 20, 2015 at 12:53 PM, Sean Davis <seandavi at gmail.com> wrote:
>
>> I am calculating coverage metrics of a BAM file on the CDS regions.  When
>> I
>> form the RangesList and do coverage(), the resulting coverage vector
>> applies the views to the regions from the RangesList without checking on
>> matching ordering or seqlevels of the RleList and the RangesList.  This
>> results, in this case, in Views from chr1 being applied to the coverage
>> for
>> chrM, for example.  Would it make sense to have the views method check the
>> ordering and seqlevels (and perhaps even do the reordering, if necessary)?
>>
>> Example code showing the problem (not fully reproducible--sorry).
>>
>> Thanks,
>> Sean
>>
>>
>> > cdsreg
>> GRanges object with 237533 ranges and 1 metadata column:
>>            seqnames               ranges strand   |    cds_id
>>               <Rle>            <IRanges>  <Rle>   | <integer>
>>        [1]     chr1     [ 12190,  12227]      +   |         1
>>        [2]     chr1     [ 12595,  12721]      +   |         2
>>        [3]     chr1     [ 13403,  13639]      +   |         3
>>        [4]     chr1     [ 69091,  70008]      +   |         4
>>        [5]     chr1     [324343, 324345]      +   |         5
>>        ...      ...                  ...    ... ...       ...
>>   [237529]     chrY [26959330, 26959332]      -   |    227333
>>   [237530]     chrY [27184245, 27184263]      -   |    227334
>>   [237531]     chrY [27184956, 27185061]      -   |    227335
>>   [237532]     chrY [27187916, 27188033]      -   |    227336
>>   [237533]     chrY [27190093, 27190170]      -   |    227337
>>   -------
>>   seqinfo: 93 sequences (1 circular) from hg19 genome
>> > cdsrl = as(cdsreg,'RangesList')
>> > names(cdsrl)
>>  [1] "chr1"                  "chr2"                  "chr3"
>>  "chr4"
>>  [5] "chr5"                  "chr6"                  "chr7"
>>  "chr8"
>>  [9] "chr9"                  "chr10"                 "chr11"
>>   "chr12"
>> [13] "chr13"                 "chr14"                 "chr15"
>>   "chr16"
>> [17] "chr17"                 "chr18"                 "chr19"
>>   "chr20"
>> [21] "chr21"                 "chr22"                 "chrX"
>>  "chrY"
>> [25] "chrM"                  "chr1_gl000191_random"
>> "chr1_gl000192_random"
>>  "chr4_ctg9_hap1"
>> [29] "chr4_gl000193_random"  "chr4_gl000194_random"  "chr6_apd_hap1"
>>   "chr6_cox_hap2"
>> [33] "chr6_dbb_hap3"         "chr6_mann_hap4"        "chr6_mcf_hap5"
>>   "chr6_qbl_hap6"
>> [37] "chr6_ssto_hap7"        "chr7_gl000195_random"
>> "chr8_gl000196_random"
>>  "chr8_gl000197_random"
>> [41] "chr9_gl000198_random"  "chr9_gl000199_random"
>> "chr9_gl000200_random"
>>  "chr9_gl000201_random"
>> [45] "chr11_gl000202_random" "chr17_ctg5_hap1"
>> "chr17_gl000203_random" "chr17_gl000204_random"
>> [49] "chr17_gl000205_random" "chr17_gl000206_random"
>> "chr18_gl000207_random" "chr19_gl000208_random"
>> [53] "chr19_gl000209_random" "chr21_gl000210_random" "chrUn_gl000211"
>>  "chrUn_gl000212"
>> [57] "chrUn_gl000213"        "chrUn_gl000214"        "chrUn_gl000215"
>>  "chrUn_gl000216"
>> [61] "chrUn_gl000217"        "chrUn_gl000218"        "chrUn_gl000219"
>>  "chrUn_gl000220"
>> [65] "chrUn_gl000221"        "chrUn_gl000222"        "chrUn_gl000223"
>>  "chrUn_gl000224"
>> [69] "chrUn_gl000225"        "chrUn_gl000226"        "chrUn_gl000227"
>>  "chrUn_gl000228"
>> [73] "chrUn_gl000229"        "chrUn_gl000230"        "chrUn_gl000231"
>>  "chrUn_gl000232"
>> [77] "chrUn_gl000233"        "chrUn_gl000234"        "chrUn_gl000235"
>>  "chrUn_gl000236"
>> [81] "chrUn_gl000237"        "chrUn_gl000238"        "chrUn_gl000239"
>>  "chrUn_gl000240"
>> [85] "chrUn_gl000241"        "chrUn_gl000242"        "chrUn_gl000243"
>>  "chrUn_gl000244"
>> [89] "chrUn_gl000245"        "chrUn_gl000246"        "chrUn_gl000247"
>>  "chrUn_gl000248"
>> [93] "chrUn_gl000249"
>> > names(cov)
>>  [1] "chrM"                  "chr1"                  "chr2"
>>  "chr3"
>>  [5] "chr4"                  "chr5"                  "chr6"
>>  "chr7"
>>  [9] "chr8"                  "chr9"                  "chr10"
>>   "chr11"
>> [13] "chr12"                 "chr13"                 "chr14"
>>   "chr15"
>> [17] "chr16"                 "chr17"                 "chr18"
>>   "chr19"
>> [21] "chr20"                 "chr21"                 "chr22"
>>   "chrX"
>> [25] "chrY"                  "chr1_gl000191_random"
>> "chr1_gl000192_random"
>>  "chr4_ctg9_hap1"
>> [29] "chr4_gl000193_random"  "chr4_gl000194_random"  "chr6_apd_hap1"
>>   "chr6_cox_hap2"
>> [33] "chr6_dbb_hap3"         "chr6_mann_hap4"        "chr6_mcf_hap5"
>>   "chr6_qbl_hap6"
>> [37] "chr6_ssto_hap7"        "chr7_gl000195_random"
>> "chr8_gl000196_random"
>>  "chr8_gl000197_random"
>> [41] "chr9_gl000198_random"  "chr9_gl000199_random"
>> "chr9_gl000200_random"
>>  "chr9_gl000201_random"
>> [45] "chr11_gl000202_random" "chr17_ctg5_hap1"
>> "chr17_gl000203_random" "chr17_gl000204_random"
>> [49] "chr17_gl000205_random" "chr17_gl000206_random"
>> "chr18_gl000207_random" "chr19_gl000208_random"
>> [53] "chr19_gl000209_random" "chr21_gl000210_random" "chrUn_gl000211"
>>  "chrUn_gl000212"
>> [57] "chrUn_gl000213"        "chrUn_gl000214"        "chrUn_gl000215"
>>  "chrUn_gl000216"
>> [61] "chrUn_gl000217"        "chrUn_gl000218"        "chrUn_gl000219"
>>  "chrUn_gl000220"
>> [65] "chrUn_gl000221"        "chrUn_gl000222"        "chrUn_gl000223"
>>  "chrUn_gl000224"
>> [69] "chrUn_gl000225"        "chrUn_gl000226"        "chrUn_gl000227"
>>  "chrUn_gl000228"
>> [73] "chrUn_gl000229"        "chrUn_gl000230"        "chrUn_gl000231"
>>  "chrUn_gl000232"
>> [77] "chrUn_gl000233"        "chrUn_gl000234"        "chrUn_gl000235"
>>  "chrUn_gl000236"
>> [81] "chrUn_gl000237"        "chrUn_gl000238"        "chrUn_gl000239"
>>  "chrUn_gl000240"
>> [85] "chrUn_gl000241"        "chrUn_gl000242"        "chrUn_gl000243"
>>  "chrUn_gl000244"
>> [89] "chrUn_gl000245"        "chrUn_gl000246"        "chrUn_gl000247"
>>  "chrUn_gl000248"
>> [93] "chrUn_gl000249"
>> > covView = Views(cov,cdsrl)
>> > covView[[1]]
>> Views on a 16571-length Rle subject
>>
>> views:
>>             start       end width
>>     [1]     12190     12227    38 [1367 1357 1363 1358 1347 1375 1381 1379
>> 1381 1387 1385 1377 1382 1368 1363 ...]
>>     [2]     12595     12721   127 [1410 1416 1414 1421 1430 1430 1428 1432
>> 1428 1419 1421 1418 1426 1427 1439 ...]
>>     [3]     13403     13639   237 [1476 1468 1460 1461 1465 1455 1448 1448
>> 1442 1448 1460 1460 1458 1435 1440 ...]
>>     [4]     69091     70008   918 [ ]
>>     [5]    324343    324345     3 [ ]
>>     [6]    324439    325605  1167 [ ]
>>     [7]    324515    324686   172 [ ]
>>     [8]    324719    325124   406 [ ]
>>     [9]    325383    325605   223 [ ]
>>     ...       ...       ...   ... ...
>> [23550] 249149924 249150145   222 [ ]
>> [23551] 249150487 249150533    47 [ ]
>> [23552] 249150487 249150621   135 [ ]
>> [23553] 249150713 249150761    49 [ ]
>> [23554] 249151433 249151696   264 [ ]
>> [23555] 249152027 249152058    32 [ ]
>> [23556] 249152330 249152508   179 [ ]
>> [23557] 249152330 249152520   191 [ ]
>> [23558] 249152711 249152713     3 [ ]
>>
>> > sessionInfo()R Under development (unstable) (2014-11-18 r66997)
>>
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>> [1] C
>>
>> attached base packages:
>> [1] stats4    parallel  stats     graphics  grDevices utils
>> datasets  methods   base
>>
>> other attached packages:
>>  [1] TxDb.Hsapiens.UCSC.hg19.knownGene_3.0.0 GenomicFeatures_1.19.15
>>  [3] AnnotationDbi_1.29.17                   Biobase_2.27.1
>>  [5] GenomicAlignments_1.3.27                VariantAnnotation_1.13.24
>>  [7] Rsamtools_1.19.26                       Biostrings_2.35.7
>>  [9] XVector_0.7.3                           GenomicRanges_1.19.35
>> [11] GenomeInfoDb_1.3.12                     IRanges_2.1.35
>> [13] S4Vectors_0.5.17                        BiocGenerics_0.13.4
>> [15] roxygen2_4.1.0                          BiocInstaller_1.17.5
>>
>> loaded via a namespace (and not attached):
>>  [1] BBmisc_1.8          BSgenome_1.35.16    BatchJobs_1.5
>> BiocParallel_1.1.12 DBI_0.3.1
>>  [6] RCurl_1.95-4.5      RSQLite_1.0.0       Rcpp_0.11.4
>> XML_3.98-1.1        base64enc_0.1-2
>> [11] biomaRt_2.23.5      bitops_1.0-6        brew_1.0-6
>> checkmate_1.5.1     codetools_0.2-10
>> [16] digest_0.6.8        fail_1.2            foreach_1.4.2
>> iterators_1.0.7     rtracklayer_1.27.7
>> [21] sendmailR_1.2-1     stringr_0.6.2       tools_3.2.0
>>  zlibbioc_1.13.0
>>
>>         [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>
>

	[[alternative HTML version deleted]]



More information about the Bioc-devel mailing list