[BioC] Lifting over a bam file with Rsamtools and GenomicRanges
rubi [guest]
guest at bioconductor.org
Sun Sep 22 20:37:46 CEST 2013
Hi,
I have a bam file that was a aligned to a non-reference genome and I have the coordinate mapping between the non-reference and the reference genomes, either in a chains format or in the following block like format:
#CHR REF NONREF
chr1 1 1
chr1 3003641 0
chr1 3003645 3003641
chr1 3003650 0
chr1 3003654 3003646
chr1 3006791 0
chr1 3006793 3006783
chr1 3006835 0
chr1 3006836 3006825
chr1 0 3007262
chr1 3007273 0
chr1 3007275 3007263
chr1 0 3008478
chr1 3008490 3008481
Where the lines containing 0's indicate deletions (either in the ref or in the nonref).
I would like to lift-over my bam from the non-reference to the reference which basically means lifting over the "pos" and "cigar" fields. Using the functions of GenomicRanges it is straight forward to do it for the "pos" field but it gets complicated for the "cigar" field. Any idea how can this be done efficiently?
-- output of sessionInfo():
R version 2.15.2 (2012-10-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.8.10 Rsamtools_1.10.2 Biostrings_2.26.3 GenomicRanges_1.10.7 IRanges_1.16.6 BiocGenerics_0.4.0
loaded via a namespace (and not attached):
[1] bitops_1.0-5 parallel_2.15.2 stats4_2.15.2 tools_2.15.2 zlibbioc_1.4.0
--
Sent via the guest posting facility at bioconductor.org.
More information about the Bioconductor
mailing list