[R] Keep only first date from consecutive dates
Frank S.
f_j_rod at hotmail.com
Wed Dec 9 10:38:19 CET 2015
Many thanks to: William Dunlap, Dennis Murphy and David Winsemius for your quick and efficient answers!!
Best regards,
Frank S.
> Subject: Re: [R] Keep only first date from consecutive dates
> From: dwinsemius at comcast.net
> Date: Fri, 4 Dec 2015 16:34:38 -0800
> CC: f_j_rod at hotmail.com; r-help at r-project.org
> To: wdunlap at tibco.com
>
>
> > On Dec 4, 2015, at 1:10 PM, William Dunlap <wdunlap at tibco.com> wrote:
> >
> > With a data.frame sorted by id, with ties broken by date, as in
> > your example, you can select rows that are either the start
> > of a new id group or the start of a run of consecutive dates with:
> >
> >> w <- c(TRUE, diff(uci$date)>1) | c(TRUE, diff(uci$id)!=0)
> >> which(w)
> > [1] 1 4 5 7
> >> uci[w,]
> > id date value
> > 1 1 2005-10-28 1
> > 4 1 2005-11-07 3
> > 5 1 2007-03-19 1
> > 7 2 2004-06-02 2
> >
> > I'll leave it to you to translate that R syntax into data.table syntax -
> > it just involves comparing the current row with the previous row.
> >
> > Bill Dunlap
> > TIBCO Software
> > wdunlap tibco.com
> >
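One possible data.table translation of the expression above, offered as a sketch rather than as code from the thread. It rebuilds the "uci" table from the original question and assumes the rows are sorted by id and then by date; the name first_rows is introduced here only for illustration.

```r
library(data.table)

# Example data from the original question.
uci <- data.table(
  id    = c(rep(1, 6), 2),
  date  = as.Date(c("2005-10-28", "2005-10-29", "2005-10-30", "2005-11-07",
                    "2007-03-19", "2007-03-20", "2004-06-02")),
  value = c(1, 2, 1, 3, 1, 2, 2)
)

# The comparison with the previous row only makes sense on sorted data.
setorder(uci, id, date)

# Inside [.data.table the columns can be named directly, so the same
# logical test goes in the i position: keep a row when it starts a new
# id or when it does not directly follow the previous date.
first_rows <- uci[c(TRUE, diff(date) > 1) | c(TRUE, diff(id) != 0)]
print(first_rows)
```

This keeps rows 1, 4, 5 and 7, the same rows selected by which(w) in the base R version.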
> >
> > On Fri, Dec 4, 2015 at 12:53 PM, Frank S. <f_j_rod at hotmail.com> wrote:
> >> Dear R users,
> >>
> >> I usually work with the data.table package, but I'm sure that my question can also be answered with a plain R data frame.
> >> Working with grouped data (by "id"), I wonder if it is possible to keep in an R data.frame (or data.table):
> >> a) Only the first row of each group of rows (from the same "id") that have consecutive dates.
> >> b) All the rows which do not belong to the above groups.
> >>
> >> As an example, I have the "uci" data.table:
> >>
> >> library(data.table)
> >> uci <- data.table(id = c(rep(1, 6), 2),
> >>   date = as.Date(c("2005-10-28", "2005-10-29", "2005-10-30", "2005-11-07", "2007-03-19", "2007-03-20", "2004-06-02")),
> >>   value = c(1, 2, 1, 3, 1, 2, 2))
> >>
> >> id date value
> >> 1 2005-10-28 1
> >> 1 2005-10-29 2
> >> 1 2005-10-30 1
> >> 1 2005-11-07 3
> >> 1 2007-03-19 1
> >> 1 2007-03-20 2
> >> 2 2004-06-02 2
> >>
> >> And the desired output would be:
> >>
> >> id date value
> >> 1 2005-10-28 1
> >> 1 2005-11-07 3
> >> 1 2007-03-19 1
> >> 2 2004-06-02 2
>
> The syntax of `[.data.table` is a bit odd: you can refer to columns by their bare names inside the brackets. I never trust my intuition with it, though.
>
> Selection is usually done with a logical vector in the i position. The diff function does work in the i position, provided you prepend a starting value so that the result has one element per row.
>
> > uci[ c(0,diff(date))!=1, ]
> id date value
> 1: 1 2005-10-28 1
> 2: 1 2005-11-07 3
> 3: 1 2007-03-19 1
> 4: 2 2004-06-02 2
>
> The other cases are handled with the complementary expression:
>
> > uci[c(0,diff(date)) == 1, ]
> id date value
> 1: 1 2005-10-29 2
> 2: 1 2005-10-30 1
> 3: 1 2007-03-20 2
>
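Note that the two expressions above compare each row with the previous one regardless of "id". That works for the "uci" example, but it could drop the first row of an id whose first date happens to fall one day after the last date of the preceding id. A minimal sketch of a grouped variant follows; the table dt and its dates are hypothetical and not taken from the thread.

```r
library(data.table)

# Hypothetical data (not from the thread): id 2 starts on the day
# after the last date of id 1.
dt <- data.table(
  id   = c(1, 1, 2, 2),
  date = as.Date(c("2005-10-28", "2005-10-29", "2005-10-30", "2005-10-31"))
)

# Ungrouped test: the first row of id 2 looks like a continuation
# of id 1 and is dropped.
ungrouped <- dt[c(0, diff(date)) != 1]

# Grouped test: diff() is computed separately within each id, so the
# first row of every id is always kept.
grouped <- dt[, .SD[c(0, diff(date)) != 1], by = id]

print(ungrouped)
print(grouped)
```

Here the ungrouped version keeps one row, while the grouped version keeps the first row of each id.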
>
> >>
> >> # From the following link, I have tried:
> >> http://stackoverflow.com/questions/32308636/r-how-to-sum-values-from-rows-only-if-the-key-value-is-the-same-and-also-if-the
> >>
> >> setDT(uci)[ ,list(date=date[1L], value = value[1L]), by = .(ind=rleid(date), id)][, ind:=NULL][]
> >>
> >> But I get the same data frame, and I do not know the reason.
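A likely reason, sketched here under the assumption that "uci" is built exactly as in the question: rleid() starts a new run whenever the value changes, and every date in "uci" is distinct, so each row ends up in its own group and nothing is collapsed. One way to get run ids for consecutive dates is to start a new run whenever the gap to the previous date within the same id is not exactly one day. The column name run is introduced only for this illustration.

```r
library(data.table)

uci <- data.table(
  id    = c(rep(1, 6), 2),
  date  = as.Date(c("2005-10-28", "2005-10-29", "2005-10-30", "2005-11-07",
                    "2007-03-19", "2007-03-20", "2004-06-02")),
  value = c(1, 2, 1, 3, 1, 2, 2)
)

# All dates differ, so rleid() assigns a separate run id to every row.
rleid(uci$date)  # 1 2 3 4 5 6 7

# Start a new run at the first row of each id and wherever the gap
# to the previous date is not exactly one day.
uci[, run := cumsum(c(TRUE, diff(date) != 1)), by = id]

# Keep the first date and value of each (id, run) group.
result <- uci[, .(date = date[1L], value = value[1L]), by = .(id, run)][, run := NULL][]
print(result)
```

With this grouping the result has the four rows listed as the desired output.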
> >>
> >> Thank you very much for any help!!
> >>
> >> Frank S.
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA
>