[R] Generating a count variable

David Winsemius dwinsemius at comcast.net
Mon Jun 1 20:23:55 CEST 2009


On Jun 1, 2009, at 1:14 PM, Joseph Magagnoli wrote:

> Dear All,
> I am practicing data manipulation and I would like to generate a count
> variable.  My data looks like this:
>
>
> Country       MID
>   1              NA
>   1                0
>   1                0
>   1                1
>   1                0
>   2                0
>   2                1
>   2                0
>   2                0
>   2                0
>
> I would like to generate a variable that counts the periods of zeros in
> the MID variable for each country, for example:
> Country       MID        Count
>   1              NA                     # ya' gotta put something there
>   1                0            1
>   1                0            2
>   1                1            0
>   1                0            1
>   2                0            1
>   2                1            0
>   2                0            1
>   2                0            2
>   2                0            3
> I am used to doing my data manipulation in Stata, but I want to try to
> learn to do it in R.

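The rle function is generally useful for such problems. It returns the
length of each run of identical values, so numbering the rows within each
run of the pasted Country.MID key and then subtracting MID gives the
count. Having created a data.frame, dd, with those elements: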

  rledd <- rle(paste(dd$Country, dd$MID, sep = "."))

  as.vector(unlist(sapply(rledd$lengths, FUN = function(x) seq(1, x)))) - dd$MID
  [1] NA  1  2  0  1  1  0  1  2  3
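
To see why this works, it may help to look at what rle() actually returns. The following is only a sketch: the data.frame() call is a reconstruction from the posted table rather than part of the original message, and the names key and pos are made up for illustration. sequence() is a base R shortcut for the unlist/sapply/seq idiom.

```r
# Rebuild the example data (reconstructed from the posted table)
dd <- data.frame(
  Country = rep(1:2, each = 5),
  MID     = c(NA, 0, 0, 1, 0, 0, 1, 0, 0, 0)
)

# One key per row; a new run starts whenever Country or MID changes
key   <- paste(dd$Country, dd$MID, sep = ".")
rledd <- rle(key)

rledd$values    # the key for each run, e.g. "1.NA", "1.0", "1.1", ...
rledd$lengths   # how many consecutive rows share that key

# Position of each row within its run
pos <- sequence(rledd$lengths)

# Subtracting MID turns the 1 on a MID == 1 row into 0 and propagates NA
pos - dd$MID
```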

 > dd$Count <- as.vector(unlist(sapply(rledd$lengths,
 +                                     FUN = function(x) seq(1, x)))) - dd$MID
 > dd
    Country MID Count
1        1  NA    NA
2        1   0     1
3        1   0     2
4        1   1     0
5        1   0     1
6        2   0     1
7        2   1     0
8        2   0     1
9        2   0     2
10       2   0     3
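
One caveat: subtracting MID gives 0 on the MID == 1 rows only because each run of 1s in this data has length one. With two consecutive 1s the second row would come out as 2 - 1 = 1. If that can happen, a variant along the following lines might be safer. This is a sketch under the assumption that the count should restart after any value that is not 0; the helper count_zero_spell and the column Count2 are hypothetical names, not from the original post.

```r
# Rebuild the example data (reconstructed from the posted table)
dd <- data.frame(
  Country = rep(1:2, each = 5),
  MID     = c(NA, 0, 0, 1, 0, 0, 1, 0, 0, 0)
)

# Count consecutive zeros within one country's MID vector,
# restarting after every non-zero or missing value
count_zero_spell <- function(mid) {
  is_zero <- !is.na(mid) & mid == 0
  # spell id increases each time a non-zero (or NA) row is seen
  spell <- cumsum(!is_zero)
  # running count of zeros inside each spell
  out <- ave(as.numeric(is_zero), spell, FUN = cumsum)
  out[is.na(mid)] <- NA   # keep NA where MID itself is missing
  out
}

# Apply the helper separately to each country
dd$Count2 <- ave(dd$MID, dd$Country, FUN = count_zero_spell)
dd
```

On the posted data this should reproduce the Count column shown above; the two approaches differ only when MID has runs of 1s longer than one row.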

-- 
David Winsemius, MD
Heritage Laboratories
West Hartford, CT



