[R] updating elements of a vector sequentially - is there a faster way?

Fri Aug 24 10:34:14 CEST 2012

On Thu, Aug 23, 2012 at 09:49:33PM -0700, Gopi Goteti wrote:
> I would like to know whether there is a faster way to do the below
> operation (updating vec1).
> 
> My objective is to update the elements of a vector (vec1), where a
> particular element i is dependent on the previous one. I need to do this on
> vectors that are 1 million or longer and need to repeat that process
> several hundred times. The for loop works but is slow. If there is a faster
> way, please let me know.
> 
> probs <- c(.1, .3, .2, .4, .7, .9, .3, .4, .5, .6)
> p10 <- 0.6
> p00 <- 0.4
> vec1 <- rep(0, 10)
> for (i in 2:10) {
>   vec1[i] <- ifelse(vec1[i-1] == 0,
>                     ifelse(probs[i]<p10, 0, 1),
>                     ifelse(probs[i]<p00, 0, 1))
> }

Hi.

If p10 is always greater than p00, then try the following.

  probs <- c(.1, .3, .2, .4, .7, .9, .3, .4, .5, .6)
  p10 <- 0.6
  p00 <- 0.4

  # original code  
  vec1 <- rep(0, 10)
  for (i in 2:10) {
    vec1[i] <- ifelse(vec1[i-1] == 0,
                      ifelse(probs[i]<p10, 0, 1),
                      ifelse(probs[i]<p00, 0, 1))
  }

  # modification 
  a10 <- ifelse(probs<p10, 0, 1)
  a00 <- ifelse(probs<p00, 0, 1)
  vec2 <- ifelse(a10 == a00, a10, NA)
  vec2[1] <- 0
  n <- length(vec2)
  while (any(is.na(vec2))) {
      shift <- c(NA, vec2[-n])
      vec2 <- ifelse(is.na(vec2), shift, vec2)
  }

  all(vec1 == vec2)

  [1] TRUE
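
To see what the while loop has to fill in, it may help to look at
vec2 for the example data before any copying is done. The sketch
below only repeats the first lines of the modification; the names
vec2_init and max_run are introduced here for illustration.

```r
probs <- c(.1, .3, .2, .4, .7, .9, .3, .4, .5, .6)
p10 <- 0.6
p00 <- 0.4

a10 <- ifelse(probs < p10, 0, 1)
a00 <- ifelse(probs < p00, 0, 1)

# positions where the two thresholds disagree are left as NA
vec2_init <- ifelse(a10 == a00, a10, NA)
vec2_init[1] <- 0
print(vec2_init)
# values: 0 0 0 NA 1 1 0 NA NA 1

# each pass of the while loop fills one more element of every run
# of NA, so the number of passes is the length of the longest run
r <- rle(is.na(vec2_init))
max_run <- max(c(0, r$lengths[r$values]))
print(max_run)
# value: 2
```

Here positions 4, 8 and 9 are the only ones that depend on the
previous element, so the while loop needs two passes.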

If p10 > p00, then a10 <= a00 elementwise. In this situation, the
recurrence satisfies the following. If a10[i] == a00[i], then vec1[i]
is the common value and does not depend on vec1[i-1]. If a10[i] <
a00[i], then vec1[i] is equal to vec1[i-1]. The suggested code creates
an initial vec2, which contains NA at the positions that depend on
the previous value. Then it iterates, copying each value to the next
position if that position is NA. The number of iterations of the
while loop equals the length of the longest run of consecutive NA in
the initial vec2. If there are many runs of consecutive NA, but all
of them are short, the code can be more efficient than a loop over
all elements.
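
If long runs of NA are a concern, the copying step can possibly be
done in a single vectorized pass instead of a while loop. The
following is only a sketch under the same assumption p10 > p00;
fill_forward and loop_version are hypothetical helper names, not
part of the code above. It uses cummax() to find, for each position,
the index of the last position whose value is already known.

```r
# Hypothetical helper: same idea as the modification above, but each
# NA is replaced by the most recent non-NA value in one step.
fill_forward <- function(probs, p10, p00) {
  stopifnot(p10 > p00)           # the shortcut relies on this ordering
  a10 <- ifelse(probs < p10, 0, 1)
  a00 <- ifelse(probs < p00, 0, 1)
  v <- ifelse(a10 == a00, a10, NA)
  v[1] <- 0                      # starting value, as in the original loop
  # index of the last non-NA position at or before each position
  last_known <- cummax(ifelse(is.na(v), 0L, seq_along(v)))
  v[last_known]
}

# Reference: the original sequential recurrence, with scalar if/else
loop_version <- function(probs, p10, p00) {
  n <- length(probs)
  v <- rep(0, n)
  for (i in 2:n) {
    thr <- if (v[i - 1] == 0) p10 else p00
    v[i] <- if (probs[i] < thr) 0 else 1
  }
  v
}

set.seed(1)
probs <- runif(1000)
res_fast <- fill_forward(probs, 0.6, 0.4)
res_loop <- loop_version(probs, 0.6, 0.4)
all(res_fast == res_loop)
```

Since vec2[1] is set to 0, last_known is never 0, so the final
indexing is always valid. Whether this is faster in practice on
vectors of length 1 million would have to be measured.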

Hope this helps.

Petr Savicky.