[R] select a subset

Stavros Macrakis macrakis at alum.mit.edu
Tue Nov 25 18:13:50 CET 2008


How about something like:

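# For each id, pick the first row with censor == 1 if there is one;
# otherwise pick the row with the largest time.
censor_choose <- function(fr)
 do.call(rbind,                      # stack the chosen rows
  lapply( split( fr, fr$id),         # one sub-frame per id
   function(sub)
    sub[which.max( if (max(sub$censor))
                      sub$censor
                      else sub$time)
        ,] ) )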

Using your data (with times 1, 2, 3, ... in place of 10, 20, 30, ...),

itc <-
  data.frame(id=    c(1,1,1,2,2,2,2,3,3,3),
             time=  c(1,2,3,1,2,3,4,1,2,3),
             censor=c(0,0,0,0,1,0,0,0,0,1))

we get

censor_choose(itc) =>

  id time censor
1  1    3      0
2  2    2      1
3  3    3      1
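
The if/else inside which.max can look odd at first. Here is a minimal sketch of why it works, using made-up vectors (the names censor, time, key and pick are only for illustration and are not taken from your data). which.max returns the position of the first maximum, so on a 0/1 vector that contains a 1 it finds the first censored row; on an all-zero vector it would just return 1, which is why the rule falls back to time in that case.

```r
# Illustrative vectors for a single group (not from the original data).
censor <- c(0, 1, 0, 1)
time   <- c(1, 2, 3, 4)

first_censored <- which.max(censor)      # position of the first 1
latest         <- which.max(time)        # position of the largest time
all_zero       <- which.max(c(0, 0, 0))  # ties: the first position wins

# The same rule as in censor_choose, written out for one group:
# use censor as the key if any row is censored, otherwise use time.
key  <- if (max(censor)) censor else time
pick <- which.max(key)
```

Note that if a group has several censored rows, this rule picks the first of them.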

Modularizing the above a bit better, I get:

choose_row_from_groups <-
 function(frame,grouping,filter)
  do.call(rbind,
   lapply( split( frame, grouping),
           function(sub) sub[filter(sub),]))

choose_row_from_groups (
   itc,
   itc$id,
   function(sub)
     which.max( if (max(sub$censor))
                  sub$censor
                else sub$time ))
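
The point of splitting things up this way is that the row-picking rule becomes a plain argument. As a rough sketch, here is the same helper reused with a different, purely hypothetical filter (the earliest row per id); the definition and data are repeated only so the snippet runs on its own.

```r
# Definition repeated from above so this snippet is self-contained.
choose_row_from_groups <-
 function(frame, grouping, filter)
  do.call(rbind,
   lapply( split( frame, grouping),
           function(sub) sub[filter(sub),]))

itc <-
  data.frame(id=    c(1,1,1,2,2,2,2,3,3,3),
             time=  c(1,2,3,1,2,3,4,1,2,3),
             censor=c(0,0,0,0,1,0,0,0,0,1))

# Hypothetical alternative rule: the row with the smallest time in each group.
earliest <- choose_row_from_groups(itc, itc$id,
                                   function(sub) which.min(sub$time))
```

Any function that takes a sub-frame and returns a row index can be passed as the filter.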

But there must be some more standard R way to do choose_row_from_groups?
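
One candidate, offered as a sketch rather than a definitive answer, is to sort and then keep the first row per id with order and duplicated, both in base R. The names sorted and picked are only illustrative. This assumes at most one censored row per id; with several, it would keep the censored row with the largest time rather than the first one, so it is not an exact drop-in replacement.

```r
# Same example data as above, repeated so the snippet runs on its own.
itc <-
  data.frame(id=    c(1,1,1,2,2,2,2,3,3,3),
             time=  c(1,2,3,1,2,3,4,1,2,3),
             censor=c(0,0,0,0,1,0,0,0,0,1))

# Within each id, put censored rows first and later times before earlier ones.
sorted <- itc[order(itc$id, -itc$censor, -itc$time), ]

# Keep only the first row of each id in that ordering.
picked <- sorted[!duplicated(sorted$id), ]
```

Unlike the split/lapply version, the result keeps the original row names of the selected rows.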

         -s






On Mon, Nov 24, 2008 at 4:15 AM, gallon li <gallon.li at gmail.com> wrote:
> I have the complete data like
>
> id time censor
> 1 10 0
> 1 20 0
> 1 30 0
> 2 10 0
> 2 20 1
> 2 30 0
> 2 40 0
> 3 10 0
> 3 20 0
> 3 30 1
> ....
>
> For id 1, I want to select the last row, since every censor indicator is 0; for
> id 2, I want to select the row where censor == 1; for id 3, I also want to
> select the row where censor == 1. So if any row has censor == 1, I want
> to select that row; otherwise I want to select the last observation for that id.
> Is there a quick way to do this?


