[Bioc-sig-seq] trimLRPatterns; matching pattern from pos -1 gives solveUserSEW error

Ludo Pagie lpagie at xs4all.nl
Wed Feb 16 14:12:25 CET 2011


Hi all,

I'm trimming reads using an Rpattern which in some cases extends beyond 
the left end of the read. If the pattern starts at exactly position -1, 
trimLRPatterns throws an error. If the pattern starts further 'upstream', 
it returns a zero-length sequence, as expected:

library(ShortRead)
# create a pattern
fragment <- paste(sample(c('A','C','G','T'),10,replace=TRUE),collapse='')
# create some reads based on the pattern; for different reads the
# pattern extends either on the left, the right, or both sides
reads <- substring(fragment,1:4,7:10)

# trim all reads; all reads should match the pattern fully and be
# trimmed from start to end
trimLRPatterns(Rpattern=fragment, subject=DNAStringSet(reads), 
max.Rmismatch=0.5, ranges=TRUE, Rfixed=FALSE)

# IRanges of length 4
#     start end width
# [1]     1   0     0
# [2]     0  -1     0
# [3]    -1  -2     0
# [4]    -2  -3     0
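
For reference, the read construction above can be checked in base R alone. This is only a minimal sketch: the fixed fragment value and the names read_start and read_end are made up for illustration, since the fragment in the original code is sampled at random.

```r
# Hypothetical fixed fragment (the code above samples one at random)
fragment <- "ACGTACGTAC"

# Reads are the substrings [1,7], [2,8], [3,9] and [4,10] of the fragment
read_start <- 1:4
read_end <- 7:10
reads <- substring(fragment, read_start, read_end)

print(reads)         # the four reads
print(nchar(reads))  # every read is 7 bases long
```

All four reads have the same width, and they start at fragment positions 1 to 4, which lines up with the reported range starts 1, 0, -1 and -2.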

# if I want the reads to be trimmed right away (ranges=FALSE), reads 1, 3
# and 4 are trimmed fine, but an error is thrown for the second read
trimLRPatterns(Rpattern=fragment, subject=DNAStringSet(reads[c(1,3,4)]), 
max.Rmismatch=0.5, ranges=FALSE)
#   A DNAStringSet instance of length 3
#     width seq
# [1]     0
# [2]     0
# [3]     0

trimLRPatterns(Rpattern=fragment, subject=DNAStringSet(reads[2]), 
max.Rmismatch=0.5, ranges=FALSE)

# Error in solveUserSEW(width(x), start = start, end = end, width =
# width) :
#   solving row 1: 'allow.nonnarrowing' is FALSE and the supplied start
# (0) is < 1


 > sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: i686-pc-linux-gnu (32-bit)

locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] ShortRead_1.8.2     Rsamtools_1.2.1     lattice_0.19-13
[4] Biostrings_2.18.2   GenomicRanges_1.2.2 IRanges_1.8.7
[7] multtest_2.6.0      Biobase_2.10.0

loaded via a namespace (and not attached):
[1] grid_2.12.0     hwriter_1.3     MASS_7.3-9      splines_2.12.0
[5] survival_2.36-2 tools_2.12.0



Where is this error coming from?

Ludo


