[R] Select the last two rows by id group

Marc Schwartz marc_schwartz at comcast.net
Tue Mar 20 15:58:32 CET 2007


On Tue, 2007-03-20 at 16:33 +0200, Lauri Nikkinen wrote:
> Hi R-users,
> 
> Following this post http://tolstoy.newcastle.edu.au/R/help/06/06/28965.html ,
> how do I get the last two rows (or six or ten) by id group out of the data
> frame? Here the example gives just the last row.
> 
> Sincere thanks,
> Lauri

A slight modification to Gabor's solution:

> score
  id reading math
1  1      65   80
2  1      70   75
3  1      88   70
4  2      NA   65
5  3      90   65
6  3      NA   70

# Return the last '2' rows
# Note the addition of unlist()

> score[unlist(tapply(rownames(score), score$id, tail,  2)), ]
  id reading math
2  1      70   75
3  1      88   70
4  2      NA   65
5  3      90   65
6  3      NA   70


Note that when tail() returns more than one value for any group,
tapply() returns a list rather than a vector:

> tapply(rownames(score), score$id, tail,  2)
$`1`
[1] "2" "3"

$`2`
[1] "4"

$`3`
[1] "5" "6"


Thus, we need to unlist() the indices to use them in the subsetting
process that Gabor used in his solution.
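
As a rough sketch of the same idea for any number of rows (two, six, ten),
the step above can be wrapped in a small function. The name last_n_by_id
and its arguments are made up for illustration and are not part of Gabor's
solution. This version indexes by row position via seq_len() rather than by
rownames(), and passes simplify = FALSE so that tapply() returns a list
even when n is 1:

```r
# Rebuild the example data frame shown above
score <- data.frame(
  id      = c(1, 1, 1, 2, 3, 3),
  reading = c(65, 70, 88, NA, 90, NA),
  math    = c(80, 75, 70, 65, 65, 70)
)

# Hypothetical helper: return the last 'n' rows of 'df' within each 'id' group
last_n_by_id <- function(df, id, n = 2) {
  # Row positions per group; simplify = FALSE always yields a list
  idx <- tapply(seq_len(nrow(df)), id, tail, n, simplify = FALSE)
  # Flatten the list of positions into a plain vector for subsetting
  df[unlist(idx, use.names = FALSE), ]
}

last2 <- last_n_by_id(score, score$id, n = 2)
last1 <- last_n_by_id(score, score$id, n = 1)
```

Groups with fewer than n rows (id 2 here) simply contribute all of their
rows, since tail() returns whatever is available.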

Another alternative, useful if the row names do not correspond to the
sequential row indices as they do in this example:

> do.call("rbind", lapply(split(score, score$id), tail,  2))
    id reading math
1.2  1      70   75
1.3  1      88   70
2    2      NA   65
3.5  3      90   65
3.6  3      NA   70


This uses split() to create a list of data frames from score, where each
data frame is 'split' by the 'id' column values. tail() is then applied
to each data frame using lapply(), the results of which are then
rbind()ed back to a single data frame.
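
The split()/lapply()/rbind() steps just described can also be sketched as a
reusable function. The name last_n_split is hypothetical, and resetting the
row names at the end is only one possible choice; leave that line out to
keep the "group.row" labels shown in the output above:

```r
# Rebuild the example data frame shown above
score <- data.frame(
  id      = c(1, 1, 1, 2, 3, 3),
  reading = c(65, 70, 88, NA, 90, NA),
  math    = c(80, 75, 70, 65, 65, 70)
)

# Hypothetical helper: split by group, keep the last 'n' rows of each piece,
# then bind the pieces back into one data frame
last_n_split <- function(df, id, n = 2) {
  pieces <- lapply(split(df, id), tail, n)  # one small data frame per group
  out <- do.call("rbind", pieces)           # recombine into one data frame
  rownames(out) <- NULL                     # assumption: plain 1..k row names are wanted
  out
}

last2_split <- last_n_split(score, score$id, n = 2)
```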

HTH,

Marc Schwartz


