[R] Count number of consecutive zeros by group
Hervé Pagès
hpages at fhcrc.org
Thu Oct 31 19:54:55 CET 2013
Hi Carlos,
With Bioconductor, this can simply be done with:
library(IRanges)
ID <- Rle(1:3, c(3,2,4))
x <- Rle(c(1,0,0,0,0,1,1,0,1))
groups <- split(x, ID)
idx <- groups == 0
Then:
> max(runLength(idx)[runValue(idx)])
1 2 3
2 2 1
This should be fast even with hundreds of thousands of groups (it should
take less than 10 seconds).
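If you would rather stay in base R and get a data frame back, something along these lines might also work. This is only a sketch: max_zero_run and the max_zeros column name are made up for illustration, and it assumes x has no missing values. I have not timed it on large data.

```r
# Example data from the original question
df <- data.frame(ID = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
                 x  = c(1, 0, 0, 0, 0, 1, 1, 0, 1))

# Hypothetical helper: length of the longest run of zeros in a vector.
# Assumes v contains no NA values.
max_zero_run <- function(v) {
  r <- rle(v == 0)
  # r$values is TRUE for runs of zeros; the extra 0L makes max() return 0
  # for a group that contains no zeros at all
  max(r$lengths[r$values], 0L)
}

# Apply the helper to x within each ID, then reshape into a data frame
res <- tapply(df$x, df$ID, max_zero_run)
result <- data.frame(ID = as.numeric(names(res)),
                     max_zeros = as.vector(res))
print(result)
```

On the example data this gives 2, 2 and 1 for IDs 1, 2 and 3, the same values as the output above.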
HTH,
H.
On 10/31/2013 04:20 AM, Carlos Nasher wrote:
> Dear R-helpers,
>
> I need to count the maximum number of consecutive zero values of a variable
> in a dataframe by different groups. My dataframe looks like this:
>
> ID <- c(1,1,1,2,2,3,3,3,3)
> x <- c(1,0,0,0,0,1,1,0,1)
> df <- data.frame(ID=ID,x=x)
> rm(ID,x)
>
> So I want to get the max number of consecutive zeros of variable x for each
> ID. I found rle() to be helpful for this task; so I did:
>
> FUN <- function(x) {
> rles <- rle(x == 0)
> }
> consec <- lapply(split(df[,2],df[,1]), FUN)
>
> consec is now a list containing one rle object for each ID. Each rle
> object has $lengths: int as the counts for every consecutive number and
> $values: logi indicating if the consecutive numbers are zero or not.
>
> Unfortunately I'm not very experienced with lists. Could you help me
> extract the max number of consecutive zeros for each ID and return the
> result as a dataframe containing ID and max number of consecutive zeros?
>
> Different approaches are also welcome. Since the real dataframe is quite
> large, a fast solution is appreciated.
>
> Best regards,
> Carlos
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319