[R] select a subset
Stavros Macrakis
macrakis at alum.mit.edu
Tue Nov 25 18:13:50 CET 2008
How about something like:
  censor_choose <- function(fr)
    do.call(rbind,
            lapply(split(fr, fr$id),
                   function(sub)
                     # First row with censor == 1 if the group has one,
                     # otherwise the row with the largest time.
                     sub[which.max(if (max(sub$censor))
                                     sub$censor
                                   else sub$time),
                         ]))
Using data shaped like yours (with times 1, 2, 3, ... rather than 10, 20, 30, ...),

  itc <-
    data.frame(id     = c(1,1,1,2,2,2,2,3,3,3),
               time   = c(1,2,3,1,2,3,4,1,2,3),
               censor = c(0,0,0,0,1,0,0,0,0,1))
we get

  censor_choose(itc) =>

      id time censor
    1  1    3      0
    2  2    2      1
    3  3    3      1
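
The selection rests on which.max returning the position of the first maximum. A small check of that behaviour, as a sketch only; the vector names are made up for illustration, and the last case assumes rows are sorted by time within each id, as in the example data:

```r
# which.max() returns the position of the FIRST maximum.
censored   <- c(0, 1, 0, 0)  # like id 2: one censored observation
uncensored <- c(0, 0, 0)     # like id 1: no censored observation
times      <- c(1, 2, 3)

first_censored <- which.max(censored)    # position of the first 1
all_zero_pick  <- which.max(uncensored)  # all ties, so position 1: hence the fallback to time
last_obs       <- which.max(times)       # position of the largest time

print(c(first_censored, all_zero_pick, last_obs))
```

This is why the function switches to sub$time when no row is censored: on an all-zero censor column, which.max would pick the first row, not the last.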
Modularizing the above a bit better, I get:
  choose_row_from_groups <-
    function(frame, grouping, filter)
      do.call(rbind,
              lapply(split(frame, grouping),
                     function(sub) sub[filter(sub), ]))
  choose_row_from_groups(
    itc,
    itc$id,
    function(sub)
      which.max(if (max(sub$censor))
                  sub$censor
                else sub$time))
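
To show what the separation buys, here is a sketch that reuses the same helper with a different filter. The "earliest observation per id" rule is a hypothetical example, not something asked for in the original question; the helper and data are repeated so the block runs on its own:

```r
# Helper repeated from above so this block is self-contained.
choose_row_from_groups <- function(frame, grouping, filter)
  do.call(rbind,
          lapply(split(frame, grouping),
                 function(sub) sub[filter(sub), ]))

itc <- data.frame(id     = c(1,1,1,2,2,2,2,3,3,3),
                  time   = c(1,2,3,1,2,3,4,1,2,3),
                  censor = c(0,0,0,0,1,0,0,0,0,1))

# Hypothetical alternative rule: the row with the smallest time in each group.
earliest <- choose_row_from_groups(itc, itc$id,
                                   function(sub) which.min(sub$time))
print(earliest)
```

Only the filter function changes; the split, select, and rbind steps stay the same.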
But there must be some more standard R way to do choose_row_from_groups?
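
One candidate, offered as a sketch and not as a definitive answer: base R's by() does the split-and-apply step on a data frame directly, so the helper reduces to do.call(rbind, by(...)). The name pick is made up for this illustration, and the data are repeated so the block runs on its own:

```r
itc <- data.frame(id     = c(1,1,1,2,2,2,2,3,3,3),
                  time   = c(1,2,3,1,2,3,4,1,2,3),
                  censor = c(0,0,0,0,1,0,0,0,0,1))

# Same rule as censor_choose: first censored row if any, else the largest time.
pick <- function(sub)
  sub[which.max(if (max(sub$censor)) sub$censor else sub$time), ]

# by() splits the rows of itc by id and applies pick() to each piece;
# rbind then stacks the one-row results into a single data frame.
chosen <- do.call(rbind, by(itc, itc$id, pick))
print(chosen)
```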
-s
On Mon, Nov 24, 2008 at 4:15 AM, gallon li <gallon.li at gmail.com> wrote:
> I have complete data like this:
>
> id time censor
> 1 10 0
> 1 20 0
> 1 30 0
> 2 10 0
> 2 20 1
> 2 30 0
> 2 40 0
> 3 10 0
> 3 20 0
> 3 30 1
> ....
>
> For id 1, I want to select the last row, since every censor indicator is 0; for
> id 2, I want to select the row where censor == 1; for id 3, I also want to
> select the row where censor == 1. So if an id has a row with censor == 1, I want
> to select that row; otherwise I want to select the last observation for that id.
> Is there a quick way to do this?