[R] How to remove rows based on frequency of factor and then difference date scores
Abhijit Dasgupta, PhD
aikidasgupta at gmail.com
Tue Aug 24 20:47:23 CEST 2010
The paste-y argument is my usual trick in these situations. I forget
that tapply can take multiple grouping factors in its INDEX argument :)
Abhijit
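
For comparison, here is a minimal sketch of the two approaches discussed in this thread: a single pasted key versus a list of factors passed to tapply. The data frame `df`, its values, and the names `uniqueID`, `n_by_key`, `n_by_pair` and `keep` are hypothetical and are not taken from Chris's data. Note that it computes days since each individual's first record (x - x[1]), which has the opposite sign to the x[1] - x used in the quoted code below.

```r
# Hypothetical toy data (an assumption for illustration, not the thread's data)
df <- data.frame(
  Type = c("A", "A", "B", "B", "B", "A"),
  ID   = c(1, 1, 1, 1, 2, 2),
  Date = as.Date(c("2010-09-16", "2010-09-23", "2010-05-13",
                   "2010-05-20", "2010-06-01", "2010-07-01")),
  stringsAsFactors = FALSE
)

# Approach 1: one pasted key per individual.
# A non-empty separator avoids collisions such as ("A1", 1) vs ("A", 11).
df$uniqueID <- paste(df$Type, df$ID, sep = "_")
n_by_key <- tapply(df$Date, df$uniqueID, length)

# Approach 2: hand both factors to tapply as a list.
# The result is an ID-by-Type matrix of record counts.
n_by_pair <- tapply(df$Date, list(df$ID, df$Type), length)

# Count records per (Type, ID), keep individuals with more than one
# record, then difference each date against that individual's first date.
df$nn <- ave(seq_along(df$Date), df$Type, df$ID, FUN = length)
keep <- df[df$nn > 1, ]
keep <- keep[order(keep$Type, keep$ID, keep$Date), ]
keep$diffdays <- ave(as.numeric(keep$Date), keep$Type, keep$ID,
                     FUN = function(x) x - x[1])
print(keep)
```

Using ave() rather than unlist(tapply(...)) for the last step is a design choice of this sketch: ave() returns values in the original row order, so the result does not depend on how the data frame happens to be sorted.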
On 08/24/2010 02:17 PM, David Winsemius wrote:
>
> On Aug 24, 2010, at 1:59 PM, Abhijit Dasgupta, PhD wrote:
>
>> The only problem with this is that Chris's unique individuals are a
>> combination of Type and ID, as I understand it. So Type=A, ID=1 is a
>> different individual from Type=B,ID=1. So we need to create a unique
>> identifier per person, simplistically by uniqueID=paste(Type, ID,
>> sep=''). Then, using this new identifier, everything follows.
>
> I see your point. I agree that a tapply call should pass both
> factors in its INDEX argument.
>
> > # which() on the kept rows avoids dropping everything when no row has nn <= 1
> > new.df <- txt.df[ which( txt.df$nn > 1), ]
> > # possibly needs to be ordered?
> > new.df <- new.df[ with(new.df, order(Type, ID) ), ]
> > new.df$diffdays <- unlist( tapply(new.df$dt2,
> +     list(new.df$ID, new.df$Type), function(x) x[1] - x) )
> > new.df
>   Type ID       Date Value        dt2 nn diffdays
> 1    A  1 16/09/2020     8 2020-09-16  3        0
> 2    A  1 23/09/2010     9 2010-09-23  3     3646
> 4    B  1  13/5/2010     6 2010-05-13  3        0
>
> But I do not agree that you need, at least in this case, to create a
> paste()-y index. I agree, however, that such a construction can be
> useful in other situations.
>
--
Abhijit Dasgupta, PhD
Director and Principal Statistician
ARAASTAT
Ph: 301.385.3067
E: adasgupta at araastat.com
W: http://www.araastat.com