[R] select observations from longitudinal data
Wacek Kusnierczyk
Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Sun Mar 29 14:35:02 CEST 2009
Peter Dalgaard wrote:
>
>>
>> times = 3:4
>> do.call(rbind, by(data, data$id, function(data)
>>     with(data, {
>>         rows = (time == times[which(times %in% time)[1]])
>>         if (is.na(rows[1])) data.frame(id=id, time=NA, x=NA) else
>>         data[rows,] })))
>>
>> # id time x
>> # 1 1 3 23
>> # 2 2 3 13
>> # 3 3 3 15
>> # 4 4 3 27
>>
>> is this what you wanted?
>
> There's also the straightforward answer:
>
> > sapply(split(data,data$id), function(d) { r <- d$x[d$time==3]
> + if(!length(r)) r <- d$x[d$time==4]
> + if(!length(r)) r <- NA
> + r})
> 1 2 3 4
> 23 13 15 27
>
> or, just to check the case where time==3 is actually missing:
>
> > sapply(split(data[-c(6,13),],data$id[-c(6,13)]), function(d) {
> + r <- d$x[d$time==3]
> + if(!length(r)) r <- d$x[d$time==4]
> + if(!length(r)) r <- NA
> + r})
> 1 2 3 4
> 23 14 15 NA
indeed, and although the output is not a data frame and does not report
the time actually used, it should be easy to add this if needed. your
solution is more efficient, and if the output is sufficient, it might be
preferable.
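
for what it's worth, here is a minimal sketch of how the time actually
used could be reported next to x, with a data frame as the result. the
data below is made up for illustration (the original data frame is not
reproduced in this message), and pick is just a hypothetical helper name:

```r
# made-up example data, not the data from the original question
data <- data.frame(
  id   = c(1, 1, 1, 2, 2, 3, 3, 4),
  time = c(2, 3, 4, 2, 4, 3, 4, 1),
  x    = c(20, 23, 25, 12, 14, 15, 16, 27)
)

times <- 3:4  # acceptable time points, in order of preference

# hypothetical helper: return one row per subject, with the time used
pick <- function(d) {
  # first preferred time that occurs for this subject (NA if none does)
  tm <- times[times %in% d$time][1]
  if (is.na(tm)) {
    data.frame(id = d$id[1], time = NA_real_, x = NA_real_)
  } else {
    d[d$time == tm, c("id", "time", "x")][1, ]
  }
}

result <- do.call(rbind, lapply(split(data, data$id), pick))
rownames(result) <- NULL
print(result)
```

with this made-up data, id 2 has no record at time 3 and falls back to
time 4, and id 4 has neither, so it gets an NA row.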
vQ