[R] updating elements of a vector sequentially - is there a faster way?

Berend Hasselman bhh at xs4all.nl
Fri Aug 24 10:05:31 CEST 2012


On 24-08-2012, at 06:49, Gopi Goteti wrote:

> I would like to know whether there is a faster way to do the below
> operation (updating vec1).
> 
> My objective is to update the elements of a vector (vec1), where a
> particular element i is dependent on the previous one. I need to do this on
> vectors that are 1 million or longer and need to repeat that process
> several hundred times. The for loop works but is slow. If there is a faster
> way, please let me know.
> 
> probs <- c(.1, .3, .2, .4, .7, .9, .3, .4, .5, .6)
> p10 <- 0.6
> p00 <- 0.4
> vec1 <- rep(0, 10)
> for (i in 2:10) {
>  vec1[i] <- ifelse(vec1[i-1] == 0,
>                    ifelse(probs[i]<p10, 0, 1),
>                    ifelse(probs[i]<p00, 0, 1))
> }


ifelse is designed for vectors; for a single condition like the one in your loop you should use if () ... else ... instead.
You can also precompute ifelse(probs<p10, 0, 1) and ifelse(probs<p00, 0, 1) for the whole probs vector, since these expressions do not depend on vec1.
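
As a small illustration of the two points above (a sketch using the toy values from the question; the names i and x are only for this example), a single ifelse call can classify the whole vector, while a scalar test is better served by plain if/else:

```r
# Toy inputs taken from the original question
probs <- c(.1, .3, .2, .4, .7, .9, .3, .4, .5, .6)
p10 <- 0.6

# ifelse() is vectorized: one call classifies every element of probs
val.p10 <- ifelse(probs < p10, 0, 1)

# For one TRUE/FALSE condition, plain if/else does the same job
# without building intermediate vectors
i <- 5          # illustrative index, not from the original post
x <- if (probs[i] < p10) 0 else 1

# The scalar result agrees with the precomputed vector at position i
stopifnot(x == val.p10[i])
```

Inside a loop of a million iterations, the lookup val.p10[i] replaces a full ifelse call per step, which is where most of the saving is expected to come from.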

Here is some testing code where your code is in function f1 and an alternative is in function f2, which uses precomputed values and no ifelse.
I also use the package compiler to get as much speedup as possible.

The code:

N <- 100000 # must be a multiple of 10

probs <- rep(c(.1, .3, .2, .4, .7, .9, .3, .4, .5, .6), N/10)
p10 <- 0.6
p00 <- 0.4
vec1 <- rep(0, N)

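# precompute both candidate values for every position (independent of vec1)
val.p10 <- ifelse(probs<p10, 0, 1)  # used when the previous element is 0
val.p00 <- ifelse(probs<p00, 0, 1)  # used when the previous element is not 0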

f1 <- function(vec1) {
    N <- length(vec1)
    for (i in 2:N) {
     vec1[i] <- ifelse(vec1[i-1] == 0,
                       ifelse(probs[i]<p10, 0, 1),
                       ifelse(probs[i]<p00, 0, 1))
    }
    vec1
}

f2 <- function(vec1) {
    N <- length(vec1)
    for (i in 2:N) {
         vec1[i] <- if(vec1[i-1] == 0) val.p10[i] else val.p00[i]
    }
    vec1
}

library(compiler)  # provides cmpfun

f1.c <- cmpfun(f1)
f2.c <- cmpfun(f2)

vec1 <- f1(vec1)
vec2 <- f2(vec1)
vec3 <- f1.c(vec1)
vec4 <- f2.c(vec1)
identical(vec1,vec2)
identical(vec1,vec3)
identical(vec1,vec4)

system.time(vec1 <- f1(vec1))[3]
system.time(vec2 <- f2(vec1))[3]
system.time(vec3 <- f1.c(vec1))[3]
system.time(vec4 <- f2.c(vec1))[3]

Output is:

> identical(vec1,vec2)
[1] TRUE
> identical(vec1,vec3)
[1] TRUE
> identical(vec1,vec4)
[1] TRUE

> system.time(vec1 <- f1(vec1))[3]
elapsed 
  2.922 
> system.time(vec2 <- f2(vec1))[3]
elapsed 
  0.403 
> system.time(vec3 <- f1.c(vec1))[3]
elapsed 
    2.4 
> system.time(vec4 <- f2.c(vec1))[3]
elapsed 
  0.084 

A simple loop with precomputed values and no ifelse is about 7 times faster than your original code (0.403 s versus 2.922 s).
Compiling f2 with the compiler package gains even more speedup: 0.084 s, about 35 times faster than the original f1.
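
One further option, which is not part of the timings above, is to drop the loop altogether. The following is only a sketch and rests on assumptions that should be checked for real data: p00 <= p10, probs has no NA values, and the first element is fixed at 0. Under those assumptions, wherever the two precomputed values agree the result is forced, and everywhere else the previous value is simply carried forward. The names forced, vals, last.forced, vec.fast and vec.loop are made up for this example.

```r
# Same toy inputs as in the original question
probs <- c(.1, .3, .2, .4, .7, .9, .3, .4, .5, .6)
p10 <- 0.6
p00 <- 0.4
stopifnot(p00 <= p10)  # the shortcut below relies on this ordering

val.p10 <- as.numeric(probs >= p10)  # value used when the previous element is 0
val.p00 <- as.numeric(probs >= p00)  # value used when the previous element is 1

# Where both candidates agree, the result does not depend on the previous
# element; elsewhere (val.p10 is 0, val.p00 is 1) the previous value is kept.
forced <- val.p10 == val.p00
forced[1] <- TRUE   # element 1 is fixed at 0, as in vec1 <- rep(0, N)
vals <- val.p10
vals[1] <- 0

# Index of the most recent forced position at or before each i
last.forced <- cummax(ifelse(forced, seq_along(forced), 0L))
vec.fast <- vals[last.forced]

# Reference result from the explicit loop, used only to check the shortcut
vec.loop <- numeric(length(probs))
for (i in seq_along(probs)[-1]) {
    vec.loop[i] <- if (vec.loop[i - 1] == 0) val.p10[i] else val.p00[i]
}
stopifnot(identical(vec.fast, vec.loop))
```

If p00 were larger than p10, the non-forced positions would flip the previous value instead of keeping it, and this carry-forward trick would no longer apply; the loop in f2 would remain the safe choice.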

Berend



