[R] Using cumsum with 'group by' ?
arun
smartpink111 at yahoo.com
Fri Nov 23 15:59:57 CET 2012
Hi,
If that is the case, this should work:
dat1<-read.table(text="
id, x, date
1, 5, 2012-06-05 12:01
1, 10, 2012-06-05 12:02
1, 45, 2012-06-05 12:03
2, 5, 2012-06-05 12:01
2, 3, 2012-06-05 12:03
2, 2, 2012-06-05 12:05
3, 5, 2012-06-05 12:03
3, 5, 2012-06-05 12:04
3, 8, 2012-06-05 12:05
1, 5, 2012-06-08 13:01
1, 9, 2012-06-08 13:02
1, 3, 2012-06-08 13:03
2, 0, 2012-06-08 13:15
2, 1, 2012-06-08 13:18
2, 8, 2012-06-08 13:20
2, 4, 2012-06-08 13:21
3, 6, 2012-06-08 13:15
3, 2, 2012-06-08 13:16
3, 7, 2012-06-08 13:17
3, 2, 2012-06-08 13:18
",sep=",",header=TRUE,stringsAsFactors=FALSE)
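# Keep only the calendar date (the time of day is dropped)
dat1$date<-as.Date(dat1$date,format="%Y-%m-%d %H:%M")
# Sort by id, then by date
dat2<-dat1[order(dat1[,1],dat1[,3]),]
# Running total of x within each id/date combination
dat2$Cumsum<-ave(dat2$x,list(dat2$id,dat2$date),FUN=cumsum)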
head(dat2)
#   id  x       date Cumsum
#1   1  5 2012-06-05      5
#2   1 10 2012-06-05     15
#3   1 45 2012-06-05     60
#10  1  5 2012-06-08      5
#11  1  9 2012-06-08     14
#12  1  3 2012-06-08     17
#or
with(dat2,aggregate(x,by=list(id=id,date=date),cumsum))
#  id       date            x
#1  1 2012-06-05    5, 15, 60
#2  2 2012-06-05     5, 8, 10
#3  3 2012-06-05    5, 10, 18
#4  1 2012-06-08    5, 14, 17
#5  2 2012-06-08  0, 1, 9, 13
#6  3 2012-06-08 6, 8, 15, 17
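One caveat with the above: as.Date() throws away the time of day, so the order within a day then depends on the rows already being in time order. If they might not be, something along these lines could be used instead. This is only a sketch: the column names stamp and day are made up for illustration, the data is a shuffled subset of the rows above, and UTC is assumed for the time zone.

```r
# Hypothetical example data: a shuffled subset of the rows in the thread
dat3 <- read.table(text="
id, x, date
1, 5, 2012-06-05 12:01
1, 10, 2012-06-05 12:02
2, 3, 2012-06-05 12:03
2, 5, 2012-06-05 12:01
1, 9, 2012-06-08 13:02
1, 5, 2012-06-08 13:01
", sep=",", header=TRUE, stringsAsFactors=FALSE, strip.white=TRUE)

# Keep the full timestamp for sorting ('stamp' is an illustrative name)
dat3$stamp <- as.POSIXct(dat3$date, format="%Y-%m-%d %H:%M", tz="UTC")
# Derive the calendar day used for grouping ('day' is an illustrative name)
dat3$day <- as.Date(dat3$stamp, tz="UTC")

# Sort by id, then by full timestamp, so rows within a day are in time order
dat3 <- dat3[order(dat3$id, dat3$stamp), ]

# Running total of x within each id/day combination
dat3$Cumsum <- ave(dat3$x, dat3$id, dat3$day, FUN=cumsum)
dat3$Cumsum
# [1]  5 15  5 14  5  8
```

The grouping is the same as before (id and day); only the sort key changes.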
A.K.
----- Original Message -----
From: TheRealJimShady <james.david.smith at gmail.com>
To: r-help at r-project.org
Cc:
Sent: Friday, November 23, 2012 6:04 AM
Subject: Re: [R] Using cumsum with 'group by' ?
Hi Arun & everyone,
Thank you very much for your helpful suggestions. I've been working
through them, but have realised that my data is a little more
complicated than I said and that the solutions you've kindly provided
don't work. The problem is that there is more than one day of data for
each person. It looks like this:
id x date
1 5 2012-06-05 12:01
1 10 2012-06-05 12:02
1 45 2012-06-05 12:03
2 5 2012-06-05 12:01
2 3 2012-06-05 12:03
2 2 2012-06-05 12:05
3 5 2012-06-05 12:03
3 5 2012-06-05 12:04
3 8 2012-06-05 12:05
1 5 2012-06-08 13:01
1 9 2012-06-08 13:02
1 3 2012-06-08 13:03
2 0 2012-06-08 13:15
2 1 2012-06-08 13:18
2 8 2012-06-08 13:20
2 4 2012-06-08 13:21
3 6 2012-06-08 13:15
3 2 2012-06-08 13:16
3 7 2012-06-08 13:17
3 2 2012-06-08 13:18
So what I need to do is something like this (in pseudo code anyway):
- Order the data by the id field and then the date field
- add a new variable called cumsum
- calculate this variable as the cumulative value of x, but grouping
by the id and date (just the date, not date and time).
Thank you
James
On 23 November 2012 03:54, arun kirshna [via R]
<ml-node+s789695n4650505h81 at n4.nabble.com> wrote:
> Hi,
> No problem.
> One more method if you wanted to try:
> library(data.table)
> dat2<-data.table(dat1)
> dat2[,list(x,time,Cumsum=cumsum(x)),list(id)]
> #     id  x  time Cumsum
> #  1:  1  5 12:01      5
> #  2:  1 14 12:02     19
> #  3:  1  6 12:03     25
> #  4:  1  3 12:04     28
> #  5:  2 98 12:01     98
> #  6:  2 23 12:02    121
> #  7:  2  1 12:03    122
> #  8:  2  4 12:04    126
> #  9:  3  5 12:01      5
> # 10:  3 65 12:02     70
> # 11:  3 23 12:03     93
> # 12:  3 23 12:04    116
>
>
> A.K.
>
>
>
> ----- Original Message -----
> From: TheRealJimShady <[hidden email]>
> To: [hidden email]
> Cc:
> Sent: Thursday, November 22, 2012 12:27 PM
> Subject: Re: [R] Using cumsum with 'group by' ?
>
> Thank you very much, I will try these tomorrow morning.
>
> On 22 November 2012 17:25, arun kirshna [via R]
> <[hidden email]> wrote:
>
>> Hi,
>> You can do this in many ways:
>> dat1<-read.table(text="
>> id time x
>> 1 12:01 5
>> 1 12:02 14
>> 1 12:03 6
>> 1 12:04 3
>> 2 12:01 98
>> 2 12:02 23
>> 2 12:03 1
>> 2 12:04 4
>> 3 12:01 5
>> 3 12:02 65
>> 3 12:03 23
>> 3 12:04 23
>> ",sep="",header=TRUE,stringsAsFactors=FALSE)
>> dat1$Cumsum<-ave(dat1$x,dat1$id,FUN=cumsum)
>> #or
>> unlist(tapply(dat1$x,dat1$id,FUN=cumsum),use.names=FALSE)
>> # [1] 5 19 25 28 98 121 122 126 5 70 93 116
>> #or
>> library(plyr)
>> ddply(dat1,.(id),function(x) cumsum(x[3]))[,2]
>> # [1] 5 19 25 28 98 121 122 126 5 70 93 116
>> head(dat1)
>> #  id  time  x Cumsum
>> #1  1 12:01  5      5
>> #2  1 12:02 14     19
>> #3  1 12:03  6     25
>> #4  1 12:04  3     28
>> #5  2 12:01 98     98
>> #6  2 12:02 23    121
>> A.K.
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Using-cumsum-with-group-by-tp4650457p4650461.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
View this message in context: http://r.789695.n4.nabble.com/Using-cumsum-with-group-by-tp4650457p4650538.html
Sent from the R help mailing list archive at Nabble.com.