[R] use sliding window to count substrings found in large string

Immanuel mane.desk at googlemail.com
Wed Jul 7 19:45:39 CEST 2010


Hey,
that saved my day.
Now I can watch the football semi-final.
Thanks!
> Turn them into factors with the appropriate levels before counting
> them with table:
>
> # generate an input string n long
> set.seed(123)
> n <- 300
> lets_1 <- paste(sample(letters[1:5], n, replace = TRUE), collapse = "")
> lets_2 <- paste(sample(letters[1:5], n, replace = TRUE), collapse = "")
>
> # get rolling k-length sequences and count
> k <- 3
> s1 <- substring(lets_1, 1:(n-k+1), k:n)
> s2 <- substring(lets_2, 1:(n-k+1), k:n)
> levs <- sort(union(s1, s2))  # union() already drops duplicates
> table(factor(s1, levs))
> table(factor(s2, levs))
>
>
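For the archive, here is a minimal sketch of the same idea on two tiny strings, so the zero counts are easy to check by hand. The helper name kmers() and the example strings are made up for illustration and are not part of the quoted reply; the technique (factor with shared levels, then table) is the one quoted above.

```r
# Hypothetical helper: all overlapping k-length substrings of one string x
kmers <- function(x, k) {
  n <- nchar(x)
  if (n < k) return(character(0))
  substring(x, 1:(n - k + 1), k:n)
}

# Made-up example inputs, short enough to verify by eye
x <- "abcabc"
y <- "aaaa"
k <- 3

s1 <- kmers(x, k)  # "abc" "bca" "cab" "abc"
s2 <- kmers(y, k)  # "aaa" "aaa"

# Shared levels: every substring seen in either string
levs <- sort(union(s1, s2))

# factor(..., levs) makes table() report absent substrings as 0,
# so both count vectors have the same length and order
counts <- rbind(
  x = as.vector(table(factor(s1, levs))),
  y = as.vector(table(factor(s2, levs)))
)
colnames(counts) <- levs
print(counts)
```

Because both rows use the same levels, the counts line up column by column and can be compared or subtracted directly.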