[Bioc-devel] Feasibility of Parallel Extraction of Matches with extractAllMatches

Dario Strbenac dstr7320 at uni.sydney.edu.au
Wed Nov 16 11:00:07 CET 2016

Good day,

I'd like to request that extractAllMatches work when subject is an XStringSet. The function could check that subject and mindex have the same length and then process them in parallel, pairing each sequence with its own set of matches. Currently, the following example isn't immediately possible.

words <- BStringSet(c("xxGOATzz", "xxMOATzz", "xxNOTEzz"))
matches <- vmatchPattern("GOAT", words, max.mismatch = 1)
similarWords <- extractAllMatches(words, matches) # Not possible.
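
To make the request concrete, here is a rough sketch of the behaviour I have in mind. The name extractAllMatchesParallel is made up for illustration and is not part of Biostrings; the sketch assumes that [[ on an MIndex returns the IRanges for one element and that Views accepts those ranges for a single string.

```r
library(Biostrings)

words <- BStringSet(c("xxGOATzz", "xxMOATzz", "xxNOTEzz"))
matches <- vmatchPattern("GOAT", words, max.mismatch = 1)

# Hypothetical helper (not a Biostrings function): pairs the i-th
# sequence with the i-th set of match ranges.
extractAllMatchesParallel <- function(subject, mindex) {
  # The two objects must line up element by element.
  stopifnot(length(subject) == length(mindex))
  lapply(seq_along(subject), function(i) {
    # mindex[[i]] is assumed to give the match ranges for sequence i.
    as.character(Views(subject[[i]], mindex[[i]]))
  })
}

similarWords <- extractAllMatchesParallel(words, matches)
```

This returns a plain list of character vectors, one per sequence, so a sequence with no matches gets an empty vector. A real implementation would presumably return an XStringSetList or similar instead.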

Could that be implemented for the next release of Biostrings? Or perhaps extractAllMatches could be deprecated, since it duplicates the functionality of substr?

> substr(words, start(matches), end(matches))
[1] "GOAT" "MOAT" NA 

Also, subsetting an MIndex object with [ does not behave as expected: the result keeps the original length.

> class(matches)
[1] "ByPos_MIndex"
> length(matches)
[1] 3
> length(matches[1])
[1] 3
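
For comparison, a small sketch of pulling out one element with [[ instead, which I understand returns the match ranges for a single sequence as an IRanges object. This is only a workaround under that assumption, not a substitute for [ returning a shorter MIndex.

```r
library(Biostrings)

words <- BStringSet(c("xxGOATzz", "xxMOATzz", "xxNOTEzz"))
matches <- vmatchPattern("GOAT", words, max.mismatch = 1)

# Double-bracket extraction: the ranges matched in the first sequence only.
firstMatches <- matches[[1]]
class(firstMatches)
start(firstMatches)
end(firstMatches)
```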

Dario Strbenac
University of Sydney
Camperdown NSW 2050
