[R] Incremental
Rui Barradas
ruipbarradas at sapo.pt
Fri Oct 14 21:16:46 CEST 2016
Hello,
You have to convert y1 to class "Date" first, then do date arithmetic.
The complete code would be
dat<-read.table(text=" y1, flag
24-01-2016,S
24-02-2016,R
24-03-2016,X
24-04-2016,H
24-01-2016,S
24-11-2016,R
24-10-2016,R
24-02-2016,X
24-01-2016,H
24-11-2016,S
24-02-2016,R
24-10-2016,X
24-03-2016,H
24-04-2016,S
",sep=",",header=TRUE)
str(dat) # See what we have, y1 is a factor
dat$y1 <- as.Date(dat$y1, format = "%d-%m-%Y")
str(dat) # now y1 is a Date
dat$x1 <- cumsum(dat$flag == "S")
dat$z2 <- unlist(tapply(dat$y1, dat$x1, function(y) y - y[1]))
dat
Instead of y - y[1] you can also use ?difftime.
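For illustration, here is a minimal sketch of the difftime variant. The small data frame is made up for this example (it is not the full data set above), and choosing units = "days" is an assumption about what you want. It makes the unit explicit rather than leaving it to the default.

```r
# Made-up illustrative data, not the poster's full data set
dat <- data.frame(
  y1   = c("24-01-2016", "24-02-2016", "24-03-2016", "24-01-2016", "24-11-2016"),
  flag = c("S", "R", "X", "S", "R"),
  stringsAsFactors = FALSE
)

# Convert to class "Date" before doing any date arithmetic
dat$y1 <- as.Date(dat$y1, format = "%d-%m-%Y")

# Group counter: increases by one at every "S"
dat$x1 <- cumsum(dat$flag == "S")

# Days elapsed since the first date of each group;
# as.numeric() drops the "difftime" class so z2 is a plain number
dat$z2 <- unlist(tapply(dat$y1, dat$x1, function(y) {
  as.numeric(difftime(y, y[1], units = "days"))
}), use.names = FALSE)

dat$z2  # 0 31 60 0 305
```

Note that this relies on the rows of each group being contiguous, which is the case when x1 comes from cumsum as above.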
Rui Barradas
On 14-10-2016 20:06, Val wrote:
> Thank you Rui,
>
> It worked!
>
> What if the first variable is in date format, like the following?
> dat<-read.table(text=" y1, flag
> 24-01-2016,S
> 24-02-2016,R
> 24-03-2016,X
> 24-04-2016,H
> 24-01-2016,S
> 24-11-2016,R
> 24-10-2016,R
> 24-02-2016,X
> 24-01-2016,H
> 24-11-2016,S
> 24-02-2016,R
> 24-10-2016,X
> 24-03-2016,H
> 24-04-2016,S
> ",sep=",",header=TRUE)
> dat
> dat$x1 <- cumsum(dat$flag == "S")
> dat$z2 <- unlist(tapply(dat$y1, dat$x1, function(y) y - y[1]))
>
> Error message:
> In Ops.factor(y, y[1]) : ‘-’ not meaningful for factors
>
>
>
> On Thu, Oct 13, 2016 at 5:30 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>> Hello,
>>
>> You must run the code to create x1 first, part 1), then part 2).
>> I've tested with your data and all went well, the result is the following.
>>
>> > dput(dat)
>> structure(list(y1 = c(39958L, 40058L, 40105L, 40294L, 40332L,
>> 40471L, 40493L, 40533L, 40718L, 40771L, 40829L, 40892L, 41056L,
>> 41110L, 41160L, 41222L, 41250L, 41289L, 41324L, 41355L, 41415L,
>> 41562L, 41562L, 41586L), flag = structure(c(3L, 2L, 4L, 1L, 3L,
>> 2L, 2L, 4L, 1L, 3L, 2L, 4L, 1L, 3L, 2L, 2L, 2L, 2L, 4L, 2L, 4L,
>> 4L, 1L, 3L), .Label = c("H", "R", "S", "X"), class = "factor"),
>> x1 = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,
>> 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L), z2 = c(0L, 100L,
>> 147L, 336L, 0L, 139L, 161L, 201L, 386L, 0L, 58L, 121L, 285L,
>> 0L, 50L, 112L, 140L, 179L, 214L, 245L, 305L, 452L, 452L,
>> 0L)), .Names = c("y1", "flag", "x1", "z2"), row.names = c(NA,
>> -24L), class = "data.frame")
>>
>>
>> Rui Barradas
>>
>>
>> On 12-10-2016 21:53, Val wrote:
>>>
>>> Rui,
>>> Thank You!
>>>
>>> the second one gave me NULL.
>>> dat$z2 <- unlist(tapply(dat$y1, dat$x1, function(y) y - y[1]))
>>>
>>> dat$z2
>>> NULL
>>>
>>>
>>>
>>> On Wed, Oct 12, 2016 at 3:34 PM, Rui Barradas <ruipbarradas at sapo.pt>
>>> wrote:
>>>>
>>>> Hello,
>>>>
>>>> Seems simple:
>>>>
>>>>
>>>> # 1)
>>>> dat$x1 <- cumsum(dat$flag == "S")
>>>>
>>>> # 2)
>>>> dat$z2 <- unlist(tapply(dat$y1, dat$x1, function(y) y - y[1]))
>>>>
>>>> Hope this helps,
>>>>
>>>> Rui Barradas
>>>>
>>>>
>>>> On 12-10-2016 21:15, Val wrote:
>>>>>
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I have a data set like
>>>>> dat<-read.table(text=" y1, flag
>>>>> 39958,S
>>>>> 40058,R
>>>>> 40105,X
>>>>> 40294,H
>>>>> 40332,S
>>>>> 40471,R
>>>>> 40493,R
>>>>> 40533,X
>>>>> 40718,H
>>>>> 40771,S
>>>>> 40829,R
>>>>> 40892,X
>>>>> 41056,H
>>>>> 41110,S
>>>>> 41160,R
>>>>> 41222,R
>>>>> 41250,R
>>>>> 41289,R
>>>>> 41324,X
>>>>> 41355,R
>>>>> 41415,X
>>>>> 41562,X
>>>>> 41562,H
>>>>> 41586,S
>>>>> ",sep=",",header=TRUE)
>>>>>
>>>>> First, sort the data by y1.
>>>>> Then I want to create two columns.
>>>>> 1. The first new column (x1): if flag is "S" then x1=1, and the
>>>>> following rows are assigned 1 as well. When we reach the next
>>>>> "S", x1=2 and the subsequent rows are assigned 2, and so on.
>>>>>
>>>>> 2. The second new column (z2): within each x1 group, the difference
>>>>> between each y1 value and the first y1 of that group.
>>>>>
>>>>> Example for the first few rows
>>>>> y1, flag, x1, z2
>>>>> 39958, S, 1, 0     z2 = 39958 - 39958
>>>>> 40058, R, 1, 100   z2 = 40058 - 39958
>>>>> 40105, X, 1, 147   z2 = 40105 - 39958
>>>>> 40294, H, 1, 336   z2 = 40294 - 39958
>>>>> 40332, S, 2, 0     z2 = 40332 - 40332
>>>>> etc
>>>>>
>>>>> Here is the complete output for the sample data
>>>>> 39958,S,1,0
>>>>> 40058,R,1,100
>>>>> 40105,X,1,147
>>>>> 40294,H,1,336
>>>>> 40332,S,2,0
>>>>> 40471,R,2,139
>>>>> 40493,R,2,161
>>>>> 40533,X,2,201
>>>>> 40718,H,2,386
>>>>> 40771,S,3,0
>>>>> 40829,R,3,58
>>>>> 40892,X,3,121
>>>>> 41056,H,3,285
>>>>> 41110,S,4,0
>>>>> 41160,R,4,50
>>>>> 41222,R,4,112
>>>>> 41250,R,4,140
>>>>> 41289,R,4,179
>>>>> 41324,X,4,214
>>>>> 41355,R,4,245
>>>>> 41415,X,4,305
>>>>> 41562,X,4,452
>>>>> 41562,H,4,452
>>>>> 41586,S,5,0
>>>>>
>>>>> Val
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>
>>