[R] Read 2 rows in 1 dataframe for diff - longitudinal data
David Winsemius
dwinsemius at comcast.net
Tue Jun 4 06:37:50 CEST 2013
On Jun 3, 2013, at 7:10 PM, arun wrote:
> Hi,
> Maybe this helps:
> res1<-df1[with(df1,unlist(tapply(var,list(subid),FUN=function(x) c(FALSE,diff(x)!=0)),use.names=FALSE)),]
> res1
> #    subid year var
> # 3     36 2003   3
> # 7     47 2001   3
> # 9     47 2005   1
> # 10    47 2007   3
> #or
> library(plyr)
> subset(ddply(df1,.(subid),mutate,delta=c(FALSE,diff(var)!=0)),delta)[,-4]
> #    subid year var
> # 3     36 2003   3
> # 7     47 2001   3
> # 9     47 2005   1
> # 10    47 2007   3
> A.K.
>
It's pretty simple with logical indexing:
> df1[ c(FALSE, df1$var[-1]!=df1$var[-length(df1$var)]), ]
   subid year var
3     36 2003   3
6     47 1999   1
7     47 2001   3
9     47 2005   1
10    47 2007   3
When I count the number of changes in the value of var, it gives me 5. Not sure why you are both leaving out row 6.
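
One possible reading of the difference, sketched here using only the data posted in the question: row 6 is the first observation for subid 47, so it counts as a change only when each row is compared with the row above it across the whole column. If the comparison is restarted within each subid, which is what the requested output suggests, row 6 drops out. The names `ungrouped` and `grouped` are illustrative and do not come from the thread, and `ave()` is just one base-R way to do the per-subject version.

```r
# Data from the original post
df1 <- data.frame(subid = rep(c(36, 47), each = 5),
                  year  = rep(seq(1999, 2007, 2), 2),
                  var   = c(1, 1, 3, 3, 3, 1, 3, 3, 1, 3))

# Whole-column comparison: each row against the row above it,
# regardless of which subject the rows belong to
ungrouped <- c(FALSE, df1$var[-1] != df1$var[-nrow(df1)])

# Per-subject comparison: the first row of every subid is never a change.
# ave() returns a numeric vector here (var is numeric), hence as.logical()
grouped <- as.logical(ave(df1$var, df1$subid,
                          FUN = function(x) c(FALSE, diff(x) != 0)))

rownames(df1)[ungrouped]             # "3" "6" "7" "9" "10"
rownames(df1)[grouped]               # "3" "7" "9" "10"
rownames(df1)[ungrouped & !grouped]  # "6"
```

Which of the two is wanted depends on whether the first record of a new subject should count as a change.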
--
David.
>
>
> I need to output the rows of a data frame where var changes value.
>
> df1 <- data.frame(subid=rep(c(36,47),each=5),year=rep(seq(1999,2007,2),2),var=c(1,1,3,3,3,1,3,3,1,3))
>    subid year var
> 1     36 1999   1
> 2     36 2001   1
> 3     36 2003   3
> 4     36 2005   3
> 5     36 2007   3
> 6     47 1999   1
> 7     47 2001   3
> 8     47 2003   3
> 9     47 2005   1
> 10    47 2007   3
>
>
> I need:
> 36 2003 3
> 47 2001 3
> 47 2005 1
> 47 2007 3
>
> I am trying to use ddply over subid and use the diff function, but it is not working quite right.
>
>> dd <- ddply(df1,.(subid),summarize,delta=diff(var) != 0)
>> dd
>   subid delta
> 1    36 FALSE
> 2    36  TRUE
> 3    36 FALSE
> 4    36 FALSE
> 5    47  TRUE
> 6    47 FALSE
> 7    47  TRUE
> 8    47  TRUE
>
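
A note on why the output above has four rows per subject instead of five: `diff()` applied to n values returns n - 1 differences, so the result cannot be lined up with the original rows without padding. The small base-R sketch below uses the `var` values for subid 47 from the posted data. The names `x` and `flag` are illustrative only, and prepending `FALSE` is the same padding used in the replies earlier in this thread.

```r
x <- c(1, 3, 3, 1, 3)            # var for subid 47 in the posted data

length(x)                        # 5
length(diff(x))                  # 4: one fewer than length(x)

# Pad with a leading FALSE so there is one flag per original row
flag <- c(FALSE, diff(x) != 0)
length(flag)                     # 5
flag                             # FALSE TRUE FALSE TRUE TRUE
```

With the padding in place, the flag vector has the same length as the group and can be used directly for row selection.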
> I would appreciate any help on this.
> Thank You!
> -ST
>
David Winsemius
Alameda, CA, USA