[BioC] Strange error using the \"narrow\" function for a ShortReadQ object
Tomas Bjorklund [guest]
guest at bioconductor.org
Tue Apr 29 19:14:58 CEST 2014
I have an issue where I try to do something rather simple but run into a strange issue.
The application is rather simple. I have a "Barcode" sequence in my amplicon sequences using Illumina MiSeq, that follows a certain pattern.
What I want to achieve is to identify the correct reads containing a barcode and then crop the reads to cover only the barcode and then save it back to a FASTQ file retaining quality and ID information of each barcode.
Everything works well in this code except for the narrow command.
Here is my code:
rawReads <- readFastq("ChAT12-13_GGACTCCT-TATCCTCT_L001_R1_001_50k.fastq")
rawReads <- rawReads[nFilter()(rawReads)] #Removes reads containing N's
rawReads <- rawReads[width(rawReads) > 55] #drops reads shorter than 55 nt
refBC <- DNAString("TCAGCVHDBVHDBVHDBVHDBVHDBAATTA")
hits <- vcountPattern(refBC, sread(rawReads), max.mismatch=0,fixed=FALSE, with.indels=FALSE)
selected <- rawReads[as.logical(hits)]
trimList <- vmatchPattern(refBC, sread(selected), max.mismatch=0,fixed=FALSE)
selected <- narrow(selected, start = unlist(startIndex(trimList)), end = unlist(endIndex(trimList)))
>From the last row I get the message:
> selected <- narrow(selected, start = unlist(startIndex(trimList)), end = unlist(endIndex(trimList)))
Error in solveUserSEW(width(x), start = start, end = end, width = width) :
'start', 'end' or 'width' is longer than 'refwidths'
I cannot figure out how this is possible. The âendIndex" should never be possible to be larger then the length of the sequence that is analyzed in the âvmatchPattern", correct? Am I missing something fundamental?
-- output of sessionInfo():
R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] ShortRead_1.22.0 GenomicAlignments_1.0.0 BSgenome_1.32.0 Rsamtools_1.16.0 GenomicRanges_1.16.2
[6] GenomeInfoDb_1.0.2 Biostrings_2.32.0 XVector_0.4.0 IRanges_1.22.3 BiocParallel_0.6.0
[11] BiocGenerics_0.10.0
loaded via a namespace (and not attached):
[1] BatchJobs_1.2 BBmisc_1.6 Biobase_2.24.0 bitops_1.0-6 brew_1.0-6 codetools_0.2-8
[7] DBI_0.2-7 digest_0.6.4 fail_1.2 foreach_1.4.2 grid_3.1.0 hwriter_1.3
[13] iterators_1.0.7 lattice_0.20-29 latticeExtra_0.6-26 plyr_1.8.1 RColorBrewer_1.0-5 Rcpp_0.11.1
[19] RSQLite_0.11.4 sendmailR_1.1-2 stats4_3.1.0 stringr_0.6.2 tools_3.1.0 zlibbioc_1.10.0
--
Sent via the guest posting facility at bioconductor.org.
More information about the Bioconductor
mailing list