[R] Add new calculated column to data frame
arun
smartpink111 at yahoo.com
Thu Aug 29 21:13:49 CEST 2013
Hi,
You could try this:
dat1<- read.table(text="
id module event time time_on_task
1 sys login 1373502892 80
2 task add 1373502892 80
3 task add 1373502972 23
4 sys login 1373502892 80
5 list delete 1373502995 901
6 list view 1373503896 100
7 task add 1373503996 NA
",sep="",header=TRUE,stringsAsFactors=FALSE)
# Combine module and event into one key, then map each key to a letter.
dat1$Categ <- as.character(factor(
  with(dat1, paste(module, event, sep = "_")),
  levels = c("task_add", "sys_login", "list_delete", "list_view"),
  labels = LETTERS[1:4]))
dat1
#  id module  event       time time_on_task Categ
#1  1    sys  login 1373502892           80     B
#2  2   task    add 1373502892           80     A
#3  3   task    add 1373502972           23     A
#4  4    sys  login 1373502892           80     B
#5  5   list delete 1373502995          901     C
#6  6   list   view 1373503896          100     D
#7  7   task    add 1373503996           NA     A
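If the list of module/event combinations grows, a named lookup vector may be easier to maintain than matching levels= and labels= by position. The following is only a minimal sketch of that alternative, not part of the reply above: the names categ_map, key and Categ2 are made up for illustration, and the category letters are the same assumed mapping as in the factor() call.

```r
# Rebuild the example data from this thread.
dat1 <- read.table(text = "
id module event time time_on_task
1 sys login 1373502892 80
2 task add 1373502892 80
3 task add 1373502972 23
4 sys login 1373502892 80
5 list delete 1373502995 901
6 list view 1373503896 100
7 task add 1373503996 NA
", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical lookup: names are "module_event" keys, values are categories.
categ_map <- c(task_add = "A", sys_login = "B",
               list_delete = "C", list_view = "D")

# Build the key for each row, then index the lookup vector by name.
key <- paste(dat1$module, dat1$event, sep = "_")
dat1$Categ2 <- unname(categ_map[key])

# A key that is not in the lookup yields NA instead of an error.
missing_categ <- unname(categ_map["task_delete"])
```

Indexing a named vector by a character vector returns one element per key, so any combination that was not anticipated shows up as NA in the new column and can be checked with is.na().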
A.K.
________________________________
From: srecko joksimovic <sreckojoksimovic at gmail.com>
To: arun <smartpink111 at yahoo.com>
Cc: R help <R-help at r-project.org>
Sent: Thursday, August 29, 2013 2:34 PM
Subject: Re: [R] Add new calculated column to data frame
Hi Arun,
There is one more question... You explained to me how to use split(dat1,cumsum(dat1$action=="login")) in one of my previous questions, and that works great.
Now, if I have something like this:
id module event time time_on_task
1 sys login 1373502892 80
2 task add 1373502892 80
3 task add 1373502972 23
4 sys login 1373502892 80
5 list delete 1373502995 901
6 list view 1373503896 100
7 task add 1373503996 NA
I know how to split at each "login" occurrence, and I know how to add a new column with time differences. But how do I add a new column "category" that is calculated from the columns "module" and "event"? For example, if module=task and event=add, then category=A...
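For reference, the two steps mentioned above (splitting at each login and computing time differences) can be combined so that the differences do not run across session boundaries. This is a rough sketch under stated assumptions: a session is assumed to start at every row where module is "sys" and event is "login", and the column names session and time_on_task2 are hypothetical, chosen so that the existing time_on_task column is left untouched.

```r
# Rebuild the example data from this thread.
dat1 <- read.table(text = "
id module event time time_on_task
1 sys login 1373502892 80
2 task add 1373502892 80
3 task add 1373502972 23
4 sys login 1373502892 80
5 list delete 1373502995 901
6 list view 1373503896 100
7 task add 1373503996 NA
", header = TRUE, stringsAsFactors = FALSE)

# Flag the rows that open a session; cumsum() turns the TRUE/FALSE
# flags into a running session number.
dat1$session <- cumsum(dat1$module == "sys" & dat1$event == "login")

# Time until the next event, computed separately within each session.
# The last event of a session has no successor, so it gets NA.
dat1$time_on_task2 <- ave(dat1$time, dat1$session,
                          FUN = function(x) c(diff(x), NA))

# One data frame per session, analogous to split(dat1, cumsum(...)).
sessions <- split(dat1, dat1$session)
```

Keeping the session number as a column means the category column can be added to dat1 once, before or after the split, rather than to each piece separately.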
Srecko
On Thu, Aug 29, 2013 at 11:22 AM, arun <smartpink111 at yahoo.com> wrote:
>Hi Srecko,
>No problem.
>Regards,
>Arun
>
>________________________________
>From: srecko joksimovic <sreckojoksimovic at gmail.com>
>To: arun <smartpink111 at yahoo.com>
>Sent: Thursday, August 29, 2013 2:22 PM
>
>Subject: Re: [R] Add new calculated column to data frame
>
>
>
>Sorry... I should figure it out...
>
>thanks so much!
>Srecko
>
>
>
>On Thu, Aug 29, 2013 at 11:21 AM, arun <smartpink111 at yahoo.com> wrote:
>
>Hi,
>>The one you showed is:
>>
>>dat1$time_on_task<- c(diff(dat1$time),NA)
>>
>> dat1
>>#  id  event       time time_on_task
>>#1  1    add 1373502892           80
>>#2  2    add 1373502972           23
>>#3  3 delete 1373502995          901
>>#4  4   view 1373503896          100
>>#5  5    add 1373503996           NA
>>
>>
>>
>>
>>________________________________
>>From: srecko joksimovic <sreckojoksimovic at gmail.com>
>>
>>To: arun <smartpink111 at yahoo.com>
>>Cc: R help <r-help at r-project.org>
>>Sent: Thursday, August 29, 2013 2:15 PM
>>Subject: Re: [R] Add new calculated column to data frame
>>
>>
>>
>>
>>Thanks Arun,
>>
>>this is great. However, it should be just a little bit different:
>>
>>#  id  event       time time_on_task
>>#1  1    add 1373502892           80
>>#2  2    add 1373502972           23
>>#3  3 delete 1373502995          901
>>#4  4   view 1373503896          100
>>#5  5    add 1373503996           NA
>>
>>
>>When I calculate the difference, I need to know how long each activity lasted. For the first activity, that is id2 - id1...
>>
>>
>>
>>On Thu, Aug 29, 2013 at 11:03 AM, arun <smartpink111 at yahoo.com> wrote:
>>
>>
>>>
>>>Hi,
>>>Try:
>>>dat1<- read.table(text="
>>>id event time
>>>
>>>1 add 1373502892
>>>2 add 1373502972
>>>3 delete 1373502995
>>>4 view 1373503896
>>>5 add 1373503996
>>>",sep="",header=TRUE,stringsAsFactors=FALSE)
>>> dat1$time_on_task<- c(NA,diff(dat1$time))
>>> dat1
>>>#  id  event       time time_on_task
>>>#1  1    add 1373502892           NA
>>>#2  2    add 1373502972           80
>>>#3  3 delete 1373502995           23
>>>#4  4   view 1373503896          901
>>>#5  5    add 1373503996          100
>>>
>>>#Not sure whether this depends on the values of "event" or not..
>>>A.K.
>>>
>>>
>>>
>>>
>>>
>>>
>>>----- Original Message -----
>>>From: srecko joksimovic <sreckojoksimovic at gmail.com>
>>>To: R help <R-help at r-project.org>
>>>Cc:
>>>Sent: Thursday, August 29, 2013 1:52 PM
>>>Subject: [R] Add new calculated column to data frame
>>>
>>>Hi,
>>>
>>>I have the following data set:
>>>id event time (in sec)
>>>1 add 1373502892
>>>2 add 1373502972
>>>3 delete 1373502995
>>>4 view 1373503896
>>>5 add 1373503996
>>>...
>>>
>>>I'd like to add a new column "time on task", which is the time elapsed between two
>>>consecutive events (id2 - id1...). What would be the best approach to do that?
>>>
>>>Thanks,
>>>Srecko
>>>
>>>
>>>______________________________________________
>>>R-help at r-project.org mailing list
>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>