[R] the first and last observation for each subject
Stavros Macrakis
macrakis at alum.mit.edu
Fri Jan 2 19:16:27 CET 2009
I think there's a pretty simple solution here, though probably not the
most efficient:
    t(sapply(split(a, a$ID),
             function(q) with(q, c(ID = unique(ID), x = unique(x), y = max(y) - min(y)))))
Using 'unique' instead of min or [[1]] has the advantage that if x is
in fact not time-invariant, this gives an error rather than silently
ignoring the inconsistency.
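A minimal runnable sketch of the idiom above, assuming the data frame is called a and holds only the nine rows quoted from the question (the names res, b and failed are mine, purely for illustration). It also shows one way the 'unique' check can surface: when x varies within a subject, the per-subject results no longer have equal lengths, sapply returns a list instead of a matrix, and t() fails.

```r
# Data copied from the quoted question (the nine rows shown there).
a <- data.frame(
  ID   = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
  x    = c(10, 10, 10, 12, 12, 12, 12, 5, 5),
  y    = c(20, 30, 40, 23, 25, 28, 38, 10, 15),
  time = c(0, 1, 2, 0, 1, 2, 3, 0, 2)
)

# One row per subject: ID, x, and the range of y.
res <- t(sapply(split(a, a$ID),
                function(q) with(q, c(ID = unique(ID), x = unique(x), y = max(y) - min(y)))))
print(res)  # y column: 20, 15, 5

# Illustrative inconsistency: make x vary within subject 1.
b <- a
b$x[2] <- 11

# Subject 1 now yields a longer vector than the others, so sapply()
# cannot simplify to a matrix and t() raises an error.
failed <- inherits(
  tryCatch(
    t(sapply(split(b, b$ID),
             function(q) with(q, c(ID = unique(ID), x = unique(x), y = max(y) - min(y))))),
    error = function(e) e),
  "error")
print(failed)
```

Note that the error appears only because the subjects end up with different numbers of distinct x values; this is a side effect of the simplification step, not an explicit check.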
Trying to package up this idiom into a function leads to:
    select <-
      function(df, groupby, selection)
      {
        pf <- parent.frame()
        fields <- substitute(selection)
        t(sapply(split(df, eval(substitute(groupby), df, enclos = pf)),
                 function(q) eval(fields, q, enclos = pf)))
      }
which I admit is rather ugly (and does no error-checking), but it does work:
    > select(a,ID,list(min(ID),unique(x),max(y)-min(y)))
      [,1] [,2] [,3]
    1 1    10   20
    2 2    12   15
    3 3    5    5
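For anyone who wants to try this, here is a self-contained sketch that repeats the definition of select and applies it to the nine rows quoted from the question. The selection expression is a variant of my own, not the one shown above: max(y)-min(y) equals "last minus first" only when y never decreases over time, so this version picks the y values at the largest and smallest time within each subject. The name first_last is hypothetical.

```r
# select() as defined in the post: 'groupby' and 'selection' are captured
# unevaluated and then evaluated inside each per-group data frame.
select <- function(df, groupby, selection) {
  pf <- parent.frame()
  fields <- substitute(selection)
  t(sapply(split(df, eval(substitute(groupby), df, enclos = pf)),
           function(q) eval(fields, q, enclos = pf)))
}

# Data copied from the quoted question (the nine rows shown there).
a <- data.frame(
  ID   = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
  x    = c(10, 10, 10, 12, 12, 12, 12, 5, 5),
  y    = c(20, 30, 40, 23, 25, 28, 38, 10, 15),
  time = c(0, 1, 2, 0, 1, 2, 3, 0, 2)
)

# Difference between y at the latest and at the earliest time per subject.
first_last <- select(a, ID,
                     c(ID = unique(ID), x = unique(x),
                       y = y[which.max(time)] - y[which.min(time)]))
print(first_last)  # y column: 20, 15, 5
```

Using c(...) rather than list(...) in the selection gives a plain numeric matrix with named columns, which is easier to convert with as.data.frame afterwards.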
Perhaps some of the more experienced people on the list could show me
how to write this more cleanly.
-s
On Fri, Jan 2, 2009 at 4:20 AM, gallon li <gallon.li at gmail.com> wrote:
> I have the following data
>
> ID  x   y   time
> 1   10  20  0
> 1   10  30  1
> 1   10  40  2
> 2   12  23  0
> 2   12  25  1
> 2   12  28  2
> 2   12  38  3
> 3   5   10  0
> 3   5   15  2
> .....
>
> x is time invariant, ID is the subject id number, y is changing over time.
>
> I want to find out the difference between the first and last observed y
> value for each subject and get a table like
>
> ID  x   y
> 1   10  20
> 2   12  15
> 3   5   5
> ......
>
> Is there any easy way to generate the data set?
>