[R] how to find "first" or "last" record after sort in R

Bert Gunter bgunter.4567 at gmail.com
Thu Sep 9 21:35:12 CEST 2021


Sorry, that should be

> id <- c(1,2,2,2,3,4,5,5)
> last.index <- cumsum(rle(id)$lengths)
> last.index
[1] 1 4 5 6 8

of course.
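
To connect this back to the original question, here is a minimal sketch of how the index could be used to blank out date3. The data frame name `df` is my own assumption; the values are copied from the sample in the question, and the approach relies on the rows already being sorted by ID.

```r
# Sample data from the question; `df` is a hypothetical name
df <- data.frame(
  ID    = c(1, 2, 3, 3, 4, 5),
  date1 = as.Date(c("2015-10-08", "2016-01-16", "2016-08-01",
                    "2017-01-10", "2016-01-19", "2016-03-01")),
  date2 = as.Date(c("2015-12-17", NA, NA, NA, "2016-02-24", "2016-03-10")),
  date3 = as.Date(c("2015-07-23", "2015-10-08", "2017-01-10",
                    "2016-01-16", "2016-08-01", "2016-01-19"))
)

# rle() gives the length of each run of identical IDs;
# the cumulative sum of those lengths is the row number where each run ends
last.index <- cumsum(rle(df$ID)$lengths)

# Set date3 to missing on the last row of each ID only
df$date3[last.index] <- NA
```

If the data were not sorted by ID, something like !duplicated(df$ID, fromLast = TRUE) would probably be a safer way to flag the last occurrence of each ID, since rle() only looks at consecutive runs.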

Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Thu, Sep 9, 2021 at 12:20 PM Bert Gunter <bgunter.4567 using gmail.com> wrote:
>
> Many ways to do this, of course, but if I understand correctly, ?rle
> may be the simplest, because you already have the data sorted by ID.
>
> The following little example should give you the idea. It gets the
> index of the last row in each id, which you can then use to assign
> NA's or whatever:
>
> > id <- c(1,2,2,2,3,4,5,5)
> > last.index <- cumsum(rle(test)$lengths)
> > last.index
> [1] 1 4 5 6 8
>
> Cheers,
> Bert
>
> On Thu, Sep 9, 2021 at 12:00 PM Kai Yang via R-help
> <r-help using r-project.org> wrote:
> >
> > Hello List,
> > Please look at the sample data frame below:
> >
> > ID   date1        date2        date3
> > 1    2015-10-08   2015-12-17   2015-07-23
> > 2    2016-01-16   NA           2015-10-08
> > 3    2016-08-01   NA           2017-01-10
> > 3    2017-01-10   NA           2016-01-16
> > 4    2016-01-19   2016-02-24   2016-08-01
> > 5    2016-03-01   2016-03-10   2016-01-19
> >
> > This data frame is sorted by ID and date1. I need to set the column date3 to missing for the "last" record of each ID. In the sample data, IDs 1, 2, 4 and 5 have only one row each, so that row counts as both the first and the last record, and its date3 can be set to missing. ID 3 has two rows. Since the data are sorted by ID and date1, the row with ID=3 and date1=2017-01-10 is the last record, and I need to set date3=NA for that row only.
> >
> > My question is: how can I identify the "last" record of each ID and set it to NA in the date3 column?
> > Thank you,
> > Kai