[R] matching a sequence in a vector?
Petr Savicky
savicky at cs.cas.cz
Wed Feb 15 09:15:31 CET 2012
On Wed, Feb 15, 2012 at 02:17:35PM +1000, Redding, Matthew wrote:
> Hi All,
>
>
> I've been trawling through the documentation and listserv archives on this topic -- but
> as yet have not found a solution. I'm sure this is pretty simple with R, but I cannot work out how without
> resorting to ugly nested loops.
>
> As far as I can tell, grep, match, and %in% are not the correct tools.
>
> Question:
> given these vectors --
> patrn <- c(1,2,3,4)
> exmpl <- c(3,3,4,2,3,1,2,3,4,8,8,23,1,2,3,4,4,34,4,3,2,1,1,2,3,4)
>
> how do I get the desired answer by finding the occurrences of the pattern and returning the starting indices:
> 6, 13, 23
Hi.
A more efficient version of the previous suggestion
is as follows.
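m <- length(patrn)
n <- length(exmpl)
# every position where a full-length match could start
candidate <- seq_len(n - m + 1)
for (i in seq_len(m)) {
    # keep only the candidates whose i-th element matches the pattern
    candidate <- candidate[patrn[i] == exmpl[candidate + i - 1]]
}
candidate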
[1] 6 13 23
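In this solution, the set of candidate start indices can only shrink:
iteration i keeps just those candidates whose i-th element equals
patrn[i]. If the prefixes of the searched pattern are rare in exmpl,
the set of candidates is reduced within a few iterations and the
remaining iterations become faster, because each one compares only
the surviving candidates.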
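To make the shrinking visible, the loop can be wrapped in a small function. This is only a sketch: the name find_pattern and its trace argument are made up for illustration and are not part of base R. It also assumes that an empty pattern, or one longer than the data, should return no matches, and that NA comparisons should count as mismatches.

```r
# find_pattern() and its 'trace' argument are hypothetical names used only
# for this illustration.
find_pattern <- function(patrn, exmpl, trace = FALSE) {
    m <- length(patrn)
    n <- length(exmpl)
    # assumption: an empty or too-long pattern yields no matches
    if (m == 0L || m > n) return(integer(0))
    candidate <- seq_len(n - m + 1L)
    for (i in seq_len(m)) {
        # which() drops NA comparisons, so they count as mismatches
        candidate <- candidate[which(patrn[i] == exmpl[candidate + i - 1L])]
        if (trace) cat("after element", i, ":", length(candidate), "candidates\n")
        # nothing left to test, so stop early
        if (length(candidate) == 0L) break
    }
    candidate
}

patrn <- c(1, 2, 3, 4)
exmpl <- c(3,3,4,2,3,1,2,3,4,8,8,23,1,2,3,4,4,34,4,3,2,1,1,2,3,4)
find_pattern(patrn, exmpl, trace = TRUE)
```

On the vectors from the question, the candidate set goes from 23 start positions to 4 after the first pattern element and to 3 after the second, and the function returns 6 13 23.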
Hope this helps.
Petr Savicky.