[R] Selecting obs within groups defined by 2 variables

Naomi Sugie nsugie at princeton.edu
Wed Apr 4 23:48:56 CEST 2012


Hi Peter,
Thanks! This was very helpful and worked perfectly.
Naomi

On Apr 4, 2012, at 4:52 PM, Peter Alspach wrote:

> Tena koe Naomi
>
> There are lots of ways to do this.  Here are a couple (note I've made
> a minor modification to your example):
>
> > naomi
>   C1 C2 C3
> 1  1  x  1
> 2  1  x  2
> 3  1  y  1
> 4  1  y  2
> 5  2  x  1
> 6  2  x  2
> 7  2  x  3
> 8  2  y  1
> 9  2  y  2
>
> > tapply(naomi[,3], naomi[,1:2], function(x) x[length(x)])
>    C2
> C1  x y
>   1 2 2
>   2 3 2
>
> > aggregate(naomi[,3], naomi[,1:2], function(x) x[length(x)])
>   C1 C2 x
> 1  1  x 2
> 2  2  x 3
> 3  1  y 2
> 4  2  y 2
>
> HTH ....
>
> Peter Alspach
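
The tapply() and aggregate() calls above return only the last C3 value for each C1/C2 group. If the whole last row is wanted instead, one possible base-R sketch uses duplicated() with fromLast = TRUE. It assumes the rows are already in the desired order within each group. The data frame is rebuilt by hand from the printed example, and the name last_rows is made up for illustration.

```r
# Rebuild the nine-row example printed above (typed in by hand).
naomi <- data.frame(
  C1 = c(1, 1, 1, 1, 2, 2, 2, 2, 2),
  C2 = c("x", "x", "y", "y", "x", "x", "x", "y", "y"),
  C3 = c(1, 2, 1, 2, 1, 2, 3, 1, 2)
)

# Scanning from the bottom, the first row seen for each (C1, C2) pair is
# the last row of that group, so it is the one not flagged as a duplicate.
is_last <- !duplicated(naomi[c("C1", "C2")], fromLast = TRUE)

# Keep the complete rows, not just the C3 values.
last_rows <- naomi[is_last, ]
print(last_rows)
```

Because the two grouping columns are compared as a pair, no combined key has to be constructed.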
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Naomi Sugie
> Sent: Thursday, 5 April 2012 8:21 a.m.
> To: r-help at r-project.org
> Subject: [R] Selecting obs within groups defined by 2 variables
>
> Hello,
>
> I am relatively new to R, and I am trying to select the last
> observation within a group, where the group is defined by two
> variables.  One of the variables is a date.
>
> In the example below, C3 varies within C2, which varies within C1. I
> need to select the last observation in C3 for 4 groups (C1*C2):  1x,
> 1y, 2x, and 2y.  In my real dataset, C2 is a date (mm/dd/yy).
>
> C1	C2	C3
> 1	x	1
> 1	x	2
> 1	y	1
> 1	y	2
> 2	x	1
> 2	x	2
> 2	y	1
> 2	y	2
>
> I have found code (from UCLA R FAQs and this list's archives) for
> selecting the last observation when a group is defined by ONE variable
> (e.g., C1):
>
> last <-by(mydata, mydata$C1, tail, n=1)
> lastd<-do.call("rbind", as.list(last))
>
> The by function does not seem to allow two variables in the Indices
> argument:
> last <-by(mydata, mydata$C1 mydata$C2, tail, n=1)  # THIS DOESN'T WORK
>
> I tried creating a new variable C1*C2, but I think this is risky since
> it may not be unique depending on my values of C1 and C2 (I have a
> very large dataset)
>
> Thank you for the help,
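
For the by() call in the question, one option that may work is to pass the two grouping variables as a list, since the INDICES argument of by() accepts a list of factors. What follows is a minimal sketch, assuming the eight-row example above is stored in mydata. The two date strings at the end are invented purely to illustrate converting mm/dd/yy text with as.Date().

```r
# The eight-row example from the question (typed in by hand).
mydata <- data.frame(
  C1 = c(1, 1, 1, 1, 2, 2, 2, 2),
  C2 = c("x", "x", "y", "y", "x", "x", "y", "y"),
  C3 = c(1, 2, 1, 2, 1, 2, 1, 2)
)

# Group on both variables by wrapping them in a list.
last <- by(mydata, list(mydata$C1, mydata$C2), tail, n = 1)

# Stack the one-row data frames into a single data frame.
lastd <- do.call(rbind, last)
print(lastd)

# Hypothetical date strings: as text they compare character by character,
# so convert them to Date before relying on their order.
d <- as.Date(c("12/31/11", "01/15/12"), format = "%m/%d/%y")
print(d[1] < d[2])
```

Passing a list also sidesteps the worry about building a combined C1*C2 variable, because each combination of the two factors is treated as its own group.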
