[R] Read 2 rows in 1 dataframe for diff - longitudinal data
David Winsemius
dwinsemius at comcast.net
Tue Jun 4 17:13:57 CEST 2013
On Jun 3, 2013, at 9:51 PM, arun wrote:
> If it is grouped by "subid" (that would be the difference in the number of changes)
>
> subset(ddply(df1,.(subid),mutate,delta=c(FALSE,var[-1]!=var[-length(var)])),delta)[,-4]
> # subid year var
> #3 36 2003 3
> #7 47 2001 3
> #9 47 2005 1
> #10 47 2007 3
> A.K.
I'm not sure why the first one returns integer values from the ave() call, but the second version works:
> df1[ ave( df1$var, df1$subid, FUN=function(x) c( FALSE, x[-1] != x[-length(x)]) ), ]
subid year var
1 36 1999 1
1.1 36 1999 1
1.2 36 1999 1
1.3 36 1999 1
> ave( df1$var, df1$subid, FUN=function(x) c( FALSE, x[-1] != x[-length(x)]))
[1] 0 0 1 0 0 0 1 0 1 1
Perhaps one of the single item groups sabotaged my simple function.
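A more likely explanation, offered as a sketch rather than a definitive diagnosis: ave() writes the result of FUN back into a copy of its first argument, so the logical values are coerced to the numeric type of df1$var. Indexing rows with 0/1 numbers then drops the zeros and selects row 1 once for every 1. The helper name change_flag below is made up for illustration; df1 is the data frame from the original question.

```r
# Data frame from the original question.
df1 <- data.frame(subid = rep(c(36, 47), each = 5),
                  year  = rep(seq(1999, 2007, 2), 2),
                  var   = c(1, 1, 3, 3, 3, 1, 3, 3, 1, 3))

# Hypothetical helper: TRUE where a value differs from the one before it.
change_flag <- function(x) c(FALSE, x[-1] != x[-length(x)])

# On its own the helper returns a logical vector ...
is.logical(change_flag(df1$var))

# ... but ave() stores the per-group results in a copy of df1$var,
# which is numeric, so the flags come back as 0/1 doubles.
flags <- ave(df1$var, df1$subid, FUN = change_flag)
typeof(flags)

# Numeric indexing: each 0 selects nothing, each 1 selects row 1 again.
wrong <- df1[flags, ]

# Converting back to logical restores the intended row filter.
right <- df1[as.logical(flags), ]
right
```

So the group sizes are probably not the culprit here; both groups have five rows.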
> df1[ as.logical( ave( df1$var, df1$subid, FUN=function(x) c( FALSE, x[-1] != x[-length(x)]) ) ), ]
subid year var
3 36 2003 3
7 47 2001 3
9 47 2005 1
10 47 2007 3
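For comparison, the same rows can be picked without ave() by extending the ungrouped logical index so that a change only counts when the previous row belongs to the same subid. This is a minimal sketch, assuming the rows are already sorted by subid and year as in the posted data; the names n, var_changed and same_subid are made up for illustration.

```r
# Data frame from the original question.
df1 <- data.frame(subid = rep(c(36, 47), each = 5),
                  year  = rep(seq(1999, 2007, 2), 2),
                  var   = c(1, 1, 3, 3, 3, 1, 3, 3, 1, 3))

n <- nrow(df1)

# TRUE where var differs from the value in the previous row.
var_changed <- c(FALSE, df1$var[-1] != df1$var[-n])

# TRUE where the previous row belongs to the same subject.
same_subid <- c(FALSE, df1$subid[-1] == df1$subid[-n])

# Keep only changes that happen within a subject, which excludes
# row 6 (the first observation for subid 47).
res <- df1[var_changed & same_subid, ]
res
```

Because both index vectors stay logical throughout, no as.logical() conversion is needed.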
--
David.
>
>
> ----- Original Message -----
> From: David Winsemius <dwinsemius at comcast.net>
> To: arun <smartpink111 at yahoo.com>
> Cc: R help <r-help at r-project.org>
> Sent: Tuesday, June 4, 2013 12:37 AM
> Subject: Re: [R] Read 2 rows in 1 dataframe for diff - longitudinal data
>
>
> On Jun 3, 2013, at 7:10 PM, arun wrote:
>
>> Hi,
>> Maybe this helps:
>> res1<-df1[with(df1,unlist(tapply(var,list(subid),FUN=function(x) c(FALSE,diff(x)!=0)),use.names=FALSE)),]
>> res1
>> # subid year var
>> #3 36 2003 3
>> #7 47 2001 3
>> #9 47 2005 1
>> #10 47 2007 3
>> #or
>> library(plyr)
>> subset(ddply(df1,.(subid),mutate,delta=c(FALSE,diff(var)!=0)),delta)[,-4]
>> # subid year var
>> #3 36 2003 3
>> #7 47 2001 3
>> #9 47 2005 1
>> #10 47 2007 3
>> A.K.
>>
> It's pretty simple with logical indexing:
>
>> df1[ c(FALSE, df1$var[-1]!=df1$var[-length(df1$var)]), ]
> subid year var
> 3 36 2003 3
> 6 47 1999 1
> 7 47 2001 3
> 9 47 2005 1
> 10 47 2007 3
>
>
> When I count the number of changes in the value of var, it gives me 5. Not sure why you are both leaving out row 6.
>
> --
> David.
>>
>>
>> I need to output a dataframe whenever var changes a value.
>>
>> df1 <- data.frame(subid=rep(c(36,47),each=5),year=rep(seq(1999,2007,2),2),var=c(1,1,3,3,3,1,3,3,1,3))
>> subid year var
>> 1 36 1999 1
>> 2 36 2001 1
>> 3 36 2003 3
>> 4 36 2005 3
>> 5 36 2007 3
>> 6 47 1999 1
>> 7 47 2001 3
>> 8 47 2003 3
>> 9 47 2005 1
>> 10 47 2007 3
>>>
>>
>> I need:
>> 36 2003 3
>> 47 2001 3
>> 47 2005 1
>> 47 2007 3
>>
>> I am trying to use ddply over subid and use the diff function, but it is not working quite right.
>>
>>> dd <- ddply(df1,.(subid),summarize,delta=diff(var) != 0)
>>> dd
>> subid delta
>> 1 36 FALSE
>> 2 36 TRUE
>> 3 36 FALSE
>> 4 36 FALSE
>> 5 47 TRUE
>> 6 47 FALSE
>> 7 47 TRUE
>> 8 47 TRUE
>>
>> I would appreciate any help on this.
>> Thank You!
>> -ST
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA
>
David Winsemius
Alameda, CA, USA