[R] Help me replace a for loop with an "apply" function

jim holtman jholtman at gmail.com
Thu Oct 1 18:27:15 CEST 2009


Will this work:

> x <- read.table(textConnection("   day         user_id
+ 2008/11/01    2001
+ 2008/11/01    2002
+ 2008/11/01    2003
+ 2008/11/01    2004
+ 2008/11/01    2005
+ 2008/11/02    2001
+ 2008/11/02    2005
+ 2008/11/03    2001
+ 2008/11/03    2003
+ 2008/11/03    2004
+ 2008/11/03    2005
+ 2008/11/04    2001
+ 2008/11/04    2003
+ 2008/11/04    2004
+ 2008/11/04    2005"), header=TRUE)
> closeAllConnections()
> # convert to Date
> x$day <- as.Date(x$day, format="%Y/%m/%d")
> # split by user and then look for contiguous days
> contig <- sapply(split(x$day, x$user_id), function(.days){
+     .diff <- cumsum(c(TRUE, diff(.days) != 1))
+     max(table(.diff))
+ })
> contig
2001 2002 2003 2004 2005
   4    1    2    2    4
>
>
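
To see why the cumsum() step above finds streaks, it may help to trace it for a single user. The sketch below is illustrative only: it uses the three dates of user 2003 from the sample data, and the variable names (days, gaps, new_streak, streak_id) are made up for this walkthrough. It assumes each user's dates are already sorted and contain no duplicates; if that is not guaranteed, something like sort(unique(.days)) before diff() would probably be needed.

```r
# Illustrative walkthrough for one user (user 2003 in the sample data).
# Assumption: dates are sorted ascending with no duplicates.
days <- as.Date(c("2008/11/01", "2008/11/03", "2008/11/04"),
                format = "%Y/%m/%d")

# Gap in days between consecutive visits
gaps <- as.numeric(diff(days))      # 2 1

# TRUE marks the first day of a new streak (the first day always starts one)
new_streak <- c(TRUE, gaps != 1)    # TRUE TRUE FALSE

# A running count of streak starts gives every day a streak id
streak_id <- cumsum(new_streak)     # 1 2 2

# The most frequent id corresponds to the longest streak
longest <- max(table(streak_id))    # 2
```

Because the work is done on dates rather than on a user-by-day table, a day on which nobody played still counts as a gap.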


On Thu, Oct 1, 2009 at 11:29 AM, gd047 <gd047 at mineknowledge.com> wrote:
>
> ...if that is possible
>
> My task is to find the longest streak of continuous days a user participated
> in a game.
>
> Instead of writing an SQL function, I chose to use R's rle function to
> get the longest streaks and then update my db table with the results.
>
> The (attached) dataframe is something like this:
>
>    day         user_id
> 2008/11/01    2001
> 2008/11/01    2002
> 2008/11/01    2003
> 2008/11/01    2004
> 2008/11/01    2005
> 2008/11/02    2001
> 2008/11/02    2005
> 2008/11/03    2001
> 2008/11/03    2003
> 2008/11/03    2004
> 2008/11/03    2005
> 2008/11/04    2001
> 2008/11/04    2003
> 2008/11/04    2004
> 2008/11/04    2005
>
>
>
> --- R code follows
> ------------------------------------------------------
>
>
> # turn it to a contingency table
> my_table <- table(user_id, day)
>
> # get the streaks
> rle_table <- apply(my_table,1,rle)
>
> # verify the longest streak of "1"s for user 2001
> # as.vector(tapply(rle_table$'2001'$lengths, rle_table$'2001'$values, max)["1"])
>
> # loop to get the results
> # initiate results matrix
> res<-matrix(nrow=dim(my_table)[1], ncol=2)
>
> for (i in 1:dim(my_table)[1]) {
>   string <- paste("as.vector(tapply(rle_table$'", rownames(my_table)[i],
>                   "'$lengths, rle_table$'", rownames(my_table)[i],
>                   "'$values, max)['1'])", sep="")
>   res[i,] <- c(as.integer(rownames(my_table)[i]), eval(parse(text=string)))
> }
>
>
> ----------------------------------------------------
> --- end of R code
>
> Unfortunately this for loop takes too long and I'm wondering if there is a
> way to produce the res matrix using a function from the "apply" family.
>
> Thank you in advance
>
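
If the table/rle approach from the quoted message is preferred, the loop with eval(parse()) can likely be replaced by a single apply() call over the rows. This is only a sketch: longest_run is a hypothetical helper name, the small data frame is made up to keep the example self-contained, and it assumes the day strings sort chronologically (true for the YYYY/MM/DD format). Note that a day on which nobody played has no column in the table, so streaks spanning such a day would be overcounted.

```r
# Made-up sample in the same shape as the quoted data
x <- data.frame(
  day     = c("2008/11/01", "2008/11/02", "2008/11/03", "2008/11/04",
              "2008/11/01",
              "2008/11/01", "2008/11/03", "2008/11/04"),
  user_id = c(2001, 2001, 2001, 2001, 2002, 2003, 2003, 2003)
)

# Contingency table: one row per user, one column per day
my_table <- table(x$user_id, x$day)

# Hypothetical helper: longest run of days with at least one visit
longest_run <- function(present) {
  r <- rle(unname(present) > 0)
  runs <- r$lengths[r$values]       # keep only the runs of TRUE
  if (length(runs)) max(runs) else 0L
}

# One call per row; no string building or eval(parse()) needed
streaks <- apply(my_table, 1, longest_run)

# Same two-column layout as the res matrix in the quoted loop
res <- cbind(user_id = as.integer(names(streaks)), streak = streaks)
```

Here res has one row per user with streaks 4, 1 and 2 for users 2001, 2002 and 2003.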



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



