[R] Selecting obs within groups defined by 2 variables

Peter Alspach Peter.Alspach at plantandfood.co.nz
Wed Apr 4 22:52:10 CEST 2012


Tena koe Naomi

There are lots of ways to do this.  Here are a couple (note I've made a minor modification to your example, adding a third observation to the C1 = 2, C2 = x group so that the groups differ in size):

> naomi
  C1 C2 C3
1  1  x  1
2  1  x  2
3  1  y  1
4  1  y  2
5  2  x  1
6  2  x  2
7  2  x  3
8  2  y  1
9  2  y  2

> tapply(naomi[,3], naomi[,1:2], function(x) x[length(x)])
   C2
C1  x y
  1 2 2
  2 3 2

> aggregate(naomi[,3], naomi[,1:2], function(x) x[length(x)])
  C1 C2 x
1  1  x 2
2  2  x 3
3  1  y 2
4  2  y 2
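
If whole rows are wanted rather than just the C3 value, the two lines below are a minimal sketch of other routes, not the only ones. The first assumes by() is given a list of grouping factors as its INDICES argument; the second keeps the rows that are the last occurrence of their C1/C2 combination. The data frame is the modified example shown above, and the names lastd and lastd2 are only illustrative.

```r
# Example data as printed above (including the extra 2/x/3 row)
naomi <- data.frame(
  C1 = c(1, 1, 1, 1, 2, 2, 2, 2, 2),
  C2 = c("x", "x", "y", "y", "x", "x", "x", "y", "y"),
  C3 = c(1, 2, 1, 2, 1, 2, 3, 1, 2)
)

# by() accepts a list of factors, so two grouping variables can be
# supplied together; tail(n = 1) keeps the last row of each group
last  <- by(naomi, list(naomi$C1, naomi$C2), tail, n = 1)
lastd <- do.call("rbind", as.list(last))

# Alternative: flag rows whose C1/C2 combination appears again further
# down, then keep only the rows that are not flagged
lastd2 <- naomi[!duplicated(naomi[, c("C1", "C2")], fromLast = TRUE), ]
```

Both versions rely on the rows already being in the order that defines "last". If the real data are not sorted, they would need ordering first with order(); for a date stored as mm/dd/yy text, as.Date(x, format = "%m/%d/%y") gives a value that sorts chronologically.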

HTH ....

Peter Alspach

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Naomi Sugie
Sent: Thursday, 5 April 2012 8:21 a.m.
To: r-help at r-project.org
Subject: [R] Selecting obs within groups defined by 2 variables

Hello,

I am relatively new to R, and I am trying to select the last  
observation within a group, where the group is defined by two  
variables.  One of the variables is a date.

In the example below, C3 varies within C2, which varies within C1. I  
need to select the last observation in C3 for each of the 4 groups  
(C1*C2):  1x, 1y, 2x, and 2y.  In my real dataset, C2 is a date (mm/dd/yy).

C1	C2	C3
1	x	1
1	x	2
1	y	1
1	y	2
2	x	1
2	x	2
2	y	1
2	y	2

I have found code (from UCLA R FAQs and this list's archives) for  
selecting the last observation when a group is defined by ONE variable  
(e.g., C1):

last <-by(mydata, mydata$C1, tail, n=1)
lastd<-do.call("rbind", as.list(last))

The by function does not seem to allow two variables in the Indices  
argument:
last <-by(mydata, mydata$C1 mydata$C2, tail, n=1)   # THIS DOESN'T WORK

I tried creating a new variable C1*C2, but I think this is risky since  
it may not be unique depending on my values of C1 and C2 (I have a  
very large dataset)

Thank you for the help,




