[R] Count matches of a sequence in a vector?
William Dunlap
wdunlap at tibco.com
Wed Apr 21 23:19:25 CEST 2010
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Jeff Brown
> Sent: Wednesday, April 21, 2010 8:08 AM
> To: r-help at r-project.org
> Subject: Re: [R] Count matches of a sequence in a vector?
>
>
> This sort of calculation can't be vectorized; you'll have to iterate
> through the sequence, e.g. with a "for" loop. I don't know if a routine
> has already been written.
It can be partially vectorized:
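   f2 <- function (v, p) {
       retval <- TRUE
       # zero-based offsets of every position where p could start in v
       i <- seq_len(length(v) - length(p) + 1L) - 1L
       # loop over the short pattern, comparing all start positions at once
       for (j in seq_along(p)) {
           retval <- retval & v[i + j] == p[j]
       }
       retval   # TRUE wherever a copy of p starts in v
   }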
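To get the count asked for in the subject line, sum the logical vector that f2 returns; which() gives the start positions. A small sketch on a toy vector (the toy data and the name `hits` are illustrative, not from the original thread):

```r
# f2 as defined above, repeated so the example runs on its own
f2 <- function(v, p) {
    retval <- TRUE
    i <- seq_len(length(v) - length(p) + 1L) - 1L
    for (j in seq_along(p)) {
        retval <- retval & v[i + j] == p[j]
    }
    retval
}

# toy data (illustrative only): the pattern 2,3,4 starts at positions 2 and 5
v <- c(1, 2, 3, 4, 2, 3, 4, 5, 2, 3)
p <- 2:4

hits <- f2(v, p)   # one logical per possible start position
which(hits)        # start positions of the matches: 2 5
sum(hits)          # number of matches: 2
```

Note that overlapping matches are each counted, since every start position is tested independently.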
E.g., for the following data

   set.seed(1)
   v <- sample(1:10, size=1e6, replace=TRUE)
   p <- 2:4

compare using zoo::rollapply (which loops over the long v)

   f1 <- function(v, p) rollapply(zoo(v), length(p), function(x) all(x == p))

and f2 (which loops over the short p). I get
   > library(zoo)
   > system.time(r1 <- f1(v,p))
      user  system elapsed
     13.17    0.06   13.25
   > system.time(r2 <- f2(v,p))
      user  system elapsed
      0.12    0.00    0.12
   > identical(which(r1), which(r2))
   [1] TRUE
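The short version of f2 stops with an error when p is longer than v (seq_len is given a negative length) and returns NA for start positions whose window contains an NA. If that matters for your data, a guarded counting wrapper along these lines may help. This is only a sketch: the name `count_matches`, and the choices to return 0 for an empty or too-long pattern and to treat NA windows as non-matches, are assumptions of mine rather than anything from the thread.

```r
# Hypothetical wrapper (not from the original post): counts matches of p in v
# and guards the edge cases described above.
count_matches <- function(v, p) {
    n <- length(v) - length(p) + 1L   # number of possible start positions
    # Assumption: an empty pattern, or one longer than v, has zero matches.
    if (length(p) == 0L || n < 1L) return(0L)
    hit <- rep(TRUE, n)
    i <- seq_len(n) - 1L
    # Same idea as f2: loop over the short pattern, compare all starts at once.
    for (j in seq_along(p)) {
        hit <- hit & v[i + j] == p[j]
    }
    # Assumption: start positions that come out as NA are not counted.
    sum(hit, na.rm = TRUE)
}

count_matches(c(1, 2, 3, 4, 2, 3, 4), 2:4)   # 2
count_matches(1:2, 2:4)                      # 0, pattern longer than vector
count_matches(c(2, NA, 4, 2, 3, 4), 2:4)     # 1, the window with NA is skipped
```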
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com