[R] counting run lengths

Mon Oct 27 11:14:54 CET 2008

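In that case, try the following, which returns the length of the last
run of zeros in each column (and 0 for columns that do not end in a zero):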

Atr <- cbind(rep(1:0, each = 4), 1, c(1, rep(0, 7)), 1)
Atr <- rbind(c(0, 1, 0, 1), Atr)

apply(Atr, 2, function (x) {
     rr <- rle(x)
     # length of the final run if it is a run of zeros, otherwise 0
     if (tail(rr$values, 1) == 0) tail(rr$lengths, 1) else 0
})
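
If calling rle() for every column is still too slow, a sketch of an
alternative that avoids it: the trailing run of zeros in a column ends at
the last row, so its length is the number of rows minus the row index of
the last non-zero entry. The name last_nonzero is only illustrative, and
the sketch assumes Atr contains only 0s and 1s with no missing values.

```r
Atr <- cbind(rep(1:0, each = 4), 1, c(1, rep(0, 7)), 1)
Atr <- rbind(c(0, 1, 0, 1), Atr)

# row(Atr) * (Atr != 0) holds the row index where an entry is non-zero
# and 0 elsewhere, so the column maximum is the row of the last non-zero
# entry (0 if the whole column is zero)
last_nonzero <- apply(row(Atr) * (Atr != 0), 2, max)

# everything after the last non-zero entry is the trailing run of zeros
unSpells <- nrow(Atr) - last_nonzero
unSpells
# [1] 4 0 7 0
```

Columns that end in a 1 get 0 here, as in the rle() version above.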

I hope this is what you're looking for.
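
Since the simulation advances one period at a time, another option might
be to keep a running counter instead of rescanning the whole matrix at
every step. This is only a sketch: spell and state are hypothetical names,
and the loop over the rows of Atr stands in for your simulation loop.

```r
Atr <- cbind(rep(1:0, each = 4), 1, c(1, rep(0, 7)), 1)
Atr <- rbind(c(0, 1, 0, 1), Atr)

# one counter per agent, starting at 0 (hypothetical name)
spell <- integer(ncol(Atr))

for (tt in seq_len(nrow(Atr))) {
  state <- Atr[tt, ]  # stands in for the states simulated in period tt
  # unemployed agents extend their spell by one period,
  # employed agents are reset to 0
  spell <- (spell + 1L) * (state == 0)
}

spell
# [1] 4 0 7 0
```

Each period then costs one vectorized update over the agents, whatever
the number of periods simulated so far.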

Best,
Dimitris

Mario Lavezzi wrote:
> Hi Dimitris, thank you very much.
> Actually, I did not specify the following: I want to consider only
> the "most recent" sequence of zeros, that is the last part of the time
> series.
> 
> That is, if I have:
> 
>    [,1] [,2] [,3] [,4]
> [1,]    0    1    0    1
> [2,]    1    1    1    1
> [3,]    1    1    0    1
> [4,]    1    1    0    1
> [5,]    1    1    0    1
> [6,]    0    1    0    1
> [7,]    0    1    0    1
> [8,]    0    1    0    1
> [9,]    0    1    0    1
> 
> I want to store the values (4 0 7 0) in unSpells. That is the values
> that represent the "last" sequence of zeros (that is I want to ignore
> other possible zeros like the ones I have inserted above).
> 
> thanks!
> Mario
> 
> 
> 
> Dimitris Rizopoulos wrote:
>> It's not totally clear to me what exactly you need in this case,
>> but have a look at the following:
>>
>> Atr <- cbind(rep(1:0, each = 4), 1, c(1, rep(0, 7)), 1)
>> unSpells <- colSums(Atr == 0)
>> unSpells[unSpells == 0] <- 1
>> unSpells
>>
>>
>> I hope it helps.
>>
>> Best,
>> Dimitris
>>
>>
>> Mario Lavezzi wrote:
>>> Hello,
>>> I have the following problem.
>>>
>>> I am running simulations on possible states of a set of agents 
>>> (1=employed, 0=unemployed).
>>>
>>> I store these simulated time series in a matrix like the following, 
>>> where rows indicate time periods and columns indicate agents (4 
>>> agents and 8 periods in this case):
>>>
>>> Atr=[
>>> 1    1    1    1
>>> 1    1    0    1
>>> 1    1    0    1
>>> 1    1    0    1
>>> 0    1    0    1
>>> 0    1    0    1
>>> 0    1    0    1
>>> 0    1    0    1]
>>>
>>> At this point, I need to update a vector ("unSpells") which contains 
>>> the lengths of unemployment spells, and is initialized with ones. 
>>> In practice, in the case shown I need to store the value "4" at 
>>> position 1 of unSpells and "7" at position 3 of unSpells (that is, I 
>>> care only about those who, in the last row, are zeros).
>>>
>>> I am doing this in the following way (tt+1 indicates the time period 
>>> reached by the simulation, n the number of agents):
>>>
>>> unSpells = matrix(1,nrow=1,ncol=n)
>>> ppp=apply(Atr[1:(tt+1),],2,rle)
>>> for(i in (1:n)[Atr[tt+1,]==0]){
>>>     unSpells[i]=tail(ppp[[i]]$lengths,1)
>>> }
>>>
>>> It works, but the for (i in ...) loop slows down the simulation a lot.
>>>
>>> Any suggestion on how to avoid this loop? (or in general, to speed up 
>>> this part of the simulation)
>>>
>>> Thanks!!
>>> Mario
>>>
>>
> 

-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014