[Bioc-devel] Problem in asBam from Rsamtools
rcaloger
raffaele.calogero at gmail.com
Sat Jun 1 17:04:09 CEST 2013
Hi,
I am using the devel version of Bioconductor as part of the development
of my package chimera.
While testing a new function in chimera that uses the Rsubread package, I
ran into a problem converting a SAM file generated by Rsubread into a BAM
file.
I used the function asBam from Rsamtools and got the following error:
In doTryCatch(return(expr), name, parentenv, handler) :
Parse error at line 14667325: sequence and quality are inconsistent
asBam runs successfully if I use only the SAM file up to line 14667324.
I get the above error, however, if I use a SAM file that ends at line 14667325.
The line that causes the problem is the following:
HWI-ST169:273:D0YW6ACXX:2:1201:4070:162856 141 * 0 0 *
* 0 0
AAAAAAGGGTTGAATTATTTTCACTTGCCCACGTAGTTTATGAATGTGGGAAATAGCTTCAAAGACAGATTAAATGATTTGCCCAAGGCCACAGAAAAGAG
@@@FFFFFHABHHJGGBFIGIFHGIJHGJGJIFBGHDBG9BDAFIIDHIIGCHCHI<GACC@ADHHHE;7?@DEFED>@;ACCC>ABB;AAD<BC>
77 * 0 0 * * 0 0
CATGGATGAGGAGAATGAGGATTTTGCGCCGGCTGCTCAGAAGATACCGTGAATCTAAGAAGATCGATCGCCACATGTATCACAGCCTGTACCTGAAGGGG
@@@DD?BADHF<D<ACG>FFE;BBF@B?@C@F:(?1.=)))883)8=7@(65??EEBDEC37;;>???=BB@<BBCCACBDDCC:?BCBC:@#########
Does anybody have an idea of what is wrong with this line?
Is there any way to validate the SAM file before running asBam, to detect
and filter out lines that might cause problems in the conversion to
BAM?
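The kind of pre-check asked about here could be sketched roughly as follows. This is a standalone Python script, not something provided by Rsamtools or Rsubread, and the function name find_bad_sam_lines is made up for illustration. It assumes a tab-delimited SAM file and reports only two problems: alignment lines with fewer than the 11 mandatory fields, and lines where SEQ and QUAL differ in length (the condition named in the error message). It does not replicate every check asBam performs.

```python
import io


def find_bad_sam_lines(handle):
    """Return (line_number, reason) pairs for alignment lines that look malformed.

    Hypothetical helper: checks only the field count and SEQ/QUAL length.
    """
    problems = []
    for lineno, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        # Header lines start with '@'; read names may not, so this is safe.
        if not line or line.startswith("@"):
            continue
        fields = line.split("\t")
        if len(fields) < 11:
            problems.append((lineno, "fewer than 11 mandatory fields"))
            continue
        seq, qual = fields[9], fields[10]
        # '*' means the sequence or quality string was not stored.
        if seq != "*" and qual != "*" and len(seq) != len(qual):
            problems.append(
                (lineno, f"SEQ length {len(seq)} != QUAL length {len(qual)}")
            )
    return problems


# Small in-memory demo: line 3 has 5 bases but only 3 quality characters.
demo = io.StringIO(
    "@HD\tVN:1.0\n"
    "r1\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n"
    "r2\t141\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIII\n"
)
print(find_bad_sam_lines(demo))
```

For a real file, the same function can be called on an open file handle, and the reported line numbers used to drop or inspect the offending records before running asBam.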
Cheers
Raf
########
sessionInfo()
R version 3.0.0 (2013-04-03)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] Rsamtools_1.13.16 Biostrings_2.29.3 GenomicRanges_1.13.15
[4] XVector_0.1.0 IRanges_1.19.8 BiocGenerics_0.7.2
loaded via a namespace (and not attached):
[1] bitops_1.0-5 stats4_3.0.0 zlibbioc_1.7.0
--
----------------------------------------
Prof. Raffaele A. Calogero
Bioinformatics and Genomics Unit
MBC Centro di Biotecnologie Molecolari
Via Nizza 52, Torino 10126
tel. ++39 0116706457
Fax ++39 0112366457
Mobile ++39 3333827080
email: raffaele.calogero at unito.it
raffaele[dot]calogero[at]gmail[dot]com
www: http://www.bioinformatica.unito.it