[R] move date-values from one line to several lines
Marc Schwartz
marc_schwartz at me.com
Tue Dec 2 17:14:42 CET 2014
> On Dec 2, 2014, at 9:29 AM, Matthias Weber <Matthias.Weber at fntsoftware.com> wrote:
>
> Hello together,
>
> i have a data.frame with date-values. What I want is a data.frame with a several lines for each date.
>
> My current data.frame looks like this one:
>
> ID FROM TO REASON
> 1 2015-02-27 2015-02-28 Holiday
> 1 2015-03-15 2015-03-20 Illness
> 2 2015-05-20 2015-02-23 Holiday
> 2 2015-06-01 2015-06-03 Holiday
> 2 2015-07-01 2015-07-01 Illness
>
> The result looks like this one:
>
> ID DATE REASON
> 1 2015-02-27 Holiday
> 1 2015-02-28 Holiday
> 1 2015-03-15 Illness
> 1 2015-03-16 Illness
> 1 2015-03-17 Illness
> 1 2015-03-18 Illness
> 1 2015-03-19 Illness
> 1 2015-03-20 Illness
> 2 2015-05-20 Holiday
> 2 2015-05-21 Holiday
> 2 2015-05-22 Holiday
> 2 2015-05-23 Holiday
> 2 2015-06-01 Holiday
> 2 2015-06-02 Holiday
> 2 2015-06-02 Holiday
> 2 2015-07-01 Illness
>
> Maybe anyone can help me, how I can do this.
>
> Thank you.
>
> Best regards.
>
> Mat
A quick and dirty approach.
First, note that in your source data frame, the TO value in the third row is incorrect. I changed it here:
> DF
ID FROM TO REASON
1 1 2015-02-27 2015-02-28 Holiday
2 1 2015-03-15 2015-03-20 Illness
3 2 2015-05-20 2015-05-23 Holiday
4 2 2015-06-01 2015-06-03 Holiday
5 2 2015-07-01 2015-07-01 Illness
With that in place, you can use R's recycling of values to create multiple data frame rows from the date sequences and the single ID and REASON entries:
i <- 1
> data.frame(ID = DF$ID[i], DATE = seq(DF$FROM[i], DF$TO[i], by = "day"), REASON = DF$REASON[i])
ID DATE REASON
1 1 2015-02-27 Holiday
2 1 2015-02-28 Holiday
So just put that into an lapply() based loop, which returns a list:
> DF.TMP <- lapply(seq(nrow(DF)),
function(i) data.frame(ID = DF$ID[i],
DATE = seq(DF$FROM[i], DF$TO[i], by = "day"),
REASON = DF$REASON[i]))
> DF.TMP
[[1]]
ID DATE REASON
1 1 2015-02-27 Holiday
2 1 2015-02-28 Holiday
[[2]]
ID DATE REASON
1 1 2015-03-15 Illness
2 1 2015-03-16 Illness
3 1 2015-03-17 Illness
4 1 2015-03-18 Illness
5 1 2015-03-19 Illness
6 1 2015-03-20 Illness
[[3]]
ID DATE REASON
1 2 2015-05-20 Holiday
2 2 2015-05-21 Holiday
3 2 2015-05-22 Holiday
4 2 2015-05-23 Holiday
[[4]]
ID DATE REASON
1 2 2015-06-01 Holiday
2 2 2015-06-02 Holiday
3 2 2015-06-03 Holiday
[[5]]
ID DATE REASON
1 2 2015-07-01 Illness
Then use do.call() on the result:
> do.call(rbind, DF.TMP)
ID DATE REASON
1 1 2015-02-27 Holiday
2 1 2015-02-28 Holiday
3 1 2015-03-15 Illness
4 1 2015-03-16 Illness
5 1 2015-03-17 Illness
6 1 2015-03-18 Illness
7 1 2015-03-19 Illness
8 1 2015-03-20 Illness
9 2 2015-05-20 Holiday
10 2 2015-05-21 Holiday
11 2 2015-05-22 Holiday
12 2 2015-05-23 Holiday
13 2 2015-06-01 Holiday
14 2 2015-06-02 Holiday
15 2 2015-06-03 Holiday
16 2 2015-07-01 Illness
See ?seq.Date for the critical step.
Regards,
Marc Schwartz
More information about the R-help
mailing list