[R] vectorized sub, gsub, grep, etc.
john
jjthaden at flash.net
Thu Oct 9 06:38:01 CEST 2008
Hello Christos,
To my surprise, the mapply() version was actually a little slower than my original sapply()-based sub2():
# Example
X <- c("ab", "cd", "ef")
patt <- c("b", "cd", "a")
repl <- c("B", "CD", "A")
sub2 <- function(pattern, replacement, x) {
    len <- length(x)
    if (length(pattern) == 1)
        pattern <- rep(pattern, len)
    if (length(replacement) == 1)
        replacement <- rep(replacement, len)
    FUN <- function(i, ...) {
        sub(pattern[i], replacement[i], x[i], fixed = TRUE)
    }
    idx <- 1:length(x)
    sapply(idx, FUN)
}
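For comparison, here is a slightly tidier variant of the same idea, offered only as a sketch. The name sub3 is made up for illustration. It assumes pattern and replacement should be recycled to the length of x with rep_len() (sub2 only expands length-1 inputs), and it uses seq_len() and vapply() so that an empty x returns character(0) instead of tripping over 1:length(x).

```r
# sub3 is a hypothetical name; it is not part of base R.
sub3 <- function(pattern, replacement, x) {
  n <- length(x)
  # Recycle pattern and replacement to the length of x (an assumption;
  # sub2 only expands inputs of length 1).
  pattern     <- rep_len(pattern, n)
  replacement <- rep_len(replacement, n)
  # vapply() guarantees a character vector, even when x is empty.
  vapply(seq_len(n),
         function(i) sub(pattern[i], replacement[i], x[i], fixed = TRUE),
         character(1))
}

X    <- c("ab", "cd", "ef")
patt <- c("b", "cd", "a")
repl <- c("B", "CD", "A")
res3 <- sub3(patt, repl, X)
res3
# [1] "aB" "CD" "ef"

# A single pattern and replacement are recycled across all of x.
res1 <- sub3("a", "_", c("ab", "ba", "cc"))

# Zero-length input comes back as a zero-length character vector.
res0 <- sub3("a", "_", character(0))
```

This is still one call to sub() per element, so I would not expect it to be faster; it is only a bit safer at the edges.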
system.time( for(i in 1:10000) sub2(patt, repl, X) )
   user  system elapsed
   1.18    0.07    1.26
system.time( for(i in 1:10000) mapply(function(p, r, x) sub(p, r, x, fixed = TRUE), p=patt, r=repl, x=X) )
   user  system elapsed
   1.42    0.05    1.47
So much for avoiding loops.
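In fairness, neither sapply() nor mapply() vectorizes sub() itself; each still calls sub() once per element. With three-element vectors, the timings above may mostly reflect per-call overhead. A rough sketch for checking this on longer input follows. The names X_big, patt_big, repl_big, sub_loop and sub_mapply are made up for illustration, and I quote no timings because they will vary by machine and R version.

```r
# Hypothetical larger input: the same three cases repeated 10,000 times.
X_big    <- rep(c("ab", "cd", "ef"), 10000)
patt_big <- rep(c("b", "cd", "a"), 10000)
repl_big <- rep(c("B", "CD", "A"), 10000)

# Explicit index loop writing into a preallocated result.
sub_loop <- function(pattern, replacement, x) {
  out <- character(length(x))
  for (i in seq_along(x))
    out[i] <- sub(pattern[i], replacement[i], x[i], fixed = TRUE)
  out
}

# mapply() calling sub() directly, with no anonymous wrapper and no names.
sub_mapply <- function(pattern, replacement, x) {
  mapply(sub, pattern, replacement, x,
         MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE)
}

res_loop   <- sub_loop(patt_big, repl_big, X_big)
res_mapply <- sub_mapply(patt_big, repl_big, X_big)

# Both approaches should agree before their timings are compared.
stopifnot(identical(res_loop, res_mapply))

system.time(sub_loop(patt_big, repl_big, X_big))
system.time(sub_mapply(patt_big, repl_big, X_big))
```

Each function is called once on a long vector here, rather than 10,000 times on a short one, so the fixed cost of setting up each call is paid only once.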
John Thaden
======= At 2008-10-07, 14:58:10 Christos wrote: =======
>John,
>Try the following:
>
> mapply(function(p, r, x) sub(p, r, x, fixed = TRUE), p=patt, r=repl, x=X)
>    b   cd    a
> "aB" "CD" "ef"
>
>-Christos
>> -----My Original Message-----
>> R pattern-matching and replacement functions are
>> vectorized: they can operate on vectors of targets.
>> However, they can only use one pattern and replacement.
>> Here is code to apply a different pattern and replacement for
>> every target. My question: can it be done better?
>>
>> sub2 <- function(pattern, replacement, x) {
>>     len <- length(x)
>>     if (length(pattern) == 1)
>>         pattern <- rep(pattern, len)
>>     if (length(replacement) == 1)
>>         replacement <- rep(replacement, len)
>>     FUN <- function(i, ...) {
>>         sub(pattern[i], replacement[i], x[i], fixed = TRUE)
>>     }
>>     idx <- 1:length(x)
>>     sapply(idx, FUN)
>> }
>>
>> #Example
>> X <- c("ab", "cd", "ef")
>> patt <- c("b", "cd", "a")
>> repl <- c("B", "CD", "A")
>> sub2(patt, repl, X)
>>
>> -John