[R] How to get the rows corresponding to the maximum of a factor
David Winsemius
dwinsemius at comcast.net
Tue May 31 21:52:21 CEST 2011
On May 31, 2011, at 2:51 PM, James Rome wrote:
> I have a data frame as follows:
> MsgType eotpd fn
> FI 2011-05-13 01:40:00 0
> FF 2011-05-13 01:39:53 0
> TC 2011-05-13 01:39:45 0
> FI 2011-05-14 00:58:46 1
> FF 2011-05-14 00:58:46 1
> FI 2011-05-15 00:48:32 2
> FF 2011-05-15 00:48:21 2
> TC 2011-05-15 00:48:15 2
> FI 2011-05-16 02:00:01 3
> FF 2011-05-16 01:59:46 3
> FI 2011-05-17 02:22:05 4
> FF 2011-05-17 02:21:58 4
> FI 2011-05-18 01:50:35 5
> FF 2011-05-18 01:50:30 5
> FI 2011-05-19 02:05:24 6
> FF 2011-05-19 02:05:20 6
> TC 2011-05-19 02:05:19 6
> FI 2011-05-13 17:04:15 8
> TC 2011-05-13 17:04:04 8
> FI 2011-05-16 17:32:40 9
> FF 2011-05-16 17:32:19 9
> TC 2011-05-16 17:32:06 9
> FI 2011-05-17 18:39:42 10
> FF 2011-05-17 18:39:38 10
> FI 2011-05-18 17:54:55 11
> FF 2011-05-18 17:54:57 11
> TC 2011-05-18 17:54:50 11
> FI 2011-05-19 17:26:01 12
> FF 2011-05-19 17:26:01 12
> TC 2011-05-19 17:25:53 12
> . . .
> As you can see, I do not always have all three MsgTypes for a given fn
> The MsgTypes are an ordered factor: FI < FF < TC.
> What I want to get is a data frame having the maximum MsgType and its
> eotpd for each fn:
Assuming this is in a dataframe, 'rrr' (so named for my annoyance that
you did not use dput to offer the example) with this structure:
> str(rrr)
'data.frame': 30 obs. of 3 variables:
$ V1: Ord.factor w/ 3 levels "FI"<"FF"<"TC": 1 2 3 1 2 1 2 3 1 2 ...
$ V2: POSIXct, format: "2011-05-13 01:40:00" "2011-05-13 01:39:53" ...
$ V3: num 0 0 0 1 1 2 2 2 3 3 ...
Then this seems to fit the description:
idx <- sapply( split(seq_len(nrow(rrr)), rrr$V3),
               function(x) {
                   x[which.max(rrr$V1[x])]})
> rrr[idx, ]
V1 V2 V3
3 TC 2011-05-13 01:39:45 0
5 FF 2011-05-14 00:58:46 1
8 TC 2011-05-15 00:48:15 2
10 FF 2011-05-16 01:59:46 3
12 FF 2011-05-17 02:21:58 4
14 FF 2011-05-18 01:50:30 5
17 TC 2011-05-19 02:05:19 6
19 TC 2011-05-13 17:04:04 8
22 TC 2011-05-16 17:32:06 9
24 FF 2011-05-17 18:39:38 10
27 TC 2011-05-18 17:54:50 11
30 TC 2011-05-19 17:25:53 12
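
For anyone who wants to run this without the original data, here is a minimal sketch of the same approach. It assumes only the first eight rows of the posted example, retyped by hand, with the time zone set to UTC as an arbitrary choice. The column names follow the str() output above. The explicit as.integer() is a cautious addition of mine so that which.max() compares the factor's level codes directly; the name 'res' is just a placeholder.

```r
# Hand-typed subset of the posted data (fn values 0, 1 and 2 only).
rrr <- data.frame(
  V1 = factor(c("FI", "FF", "TC", "FI", "FF", "FI", "FF", "TC"),
              levels = c("FI", "FF", "TC"), ordered = TRUE),
  V2 = as.POSIXct(c("2011-05-13 01:40:00", "2011-05-13 01:39:53",
                    "2011-05-13 01:39:45", "2011-05-14 00:58:46",
                    "2011-05-14 00:58:46", "2011-05-15 00:48:32",
                    "2011-05-15 00:48:21", "2011-05-15 00:48:15"),
                  tz = "UTC"),  # time zone is an assumption
  V3 = c(0, 0, 0, 1, 1, 2, 2, 2)
)

# Split the row numbers by group (V3); within each group keep the row
# number whose ordered-factor level code is largest.
idx <- sapply(split(seq_len(nrow(rrr)), rrr$V3),
              function(x) x[which.max(as.integer(rrr$V1[x]))])

res <- rrr[idx, ]
print(res)
```

Because which.max() returns the first maximum, a group with several rows at the same top level would contribute only the earliest of them.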
--
David.
> MsgType eotpd fn
> TC 2011-05-13 01:39:45 0
> FF 2011-05-14 00:58:46 1
> TC 2011-05-15 00:48:15 2
> FF 2011-05-16 01:59:46 3
> FF 2011-05-17 02:21:58 4
> FF 2011-05-18 01:50:30 5
> TC 2011-05-19 02:05:19 6
> TC 2011-05-13 17:04:04 8
> TC 2011-05-16 17:32:06 9
> FF 2011-05-17 18:39:38 10
> TC 2011-05-18 17:54:50 11
> TC 2011-05-19 17:25:53 12
> . . .
>
> Surely there is a clever way to do this in R?
>
> Thanks for the help,
> Jim
David Winsemius, MD
West Hartford, CT