[R] Extracting only Unique Rows based on only 1 Column
Gabor Grothendieck
ggrothendieck at gmail.com
Sun Jan 17 00:06:09 CET 2010
Try this, where DF is your data frame (both forms keep the first row for each ID):
subset(DF, !duplicated(ID))
or equivalently:
DF[!duplicated(DF$ID), ]
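
As a quick check, here is a minimal sketch that applies the second form to the small example from the question below. The data frame is rebuilt by hand, and the names DF, first_rows and last_rows are placeholders rather than anything from the original post. The fromLast = TRUE variant is an optional extra, in case the last row for each ID is wanted instead of the first.

```r
# Rebuild the example data from the question (DF is a placeholder name)
DF <- data.frame(ID    = c(1, 1, 2, 2, 3, 3),
                 Time1 = c(3, 4, 3, 5, 4, 2),
                 Time2 = c(4, 7, 5, 6, 5, 8))

# duplicated() flags every repeat of an ID after its first appearance,
# so negating it keeps the first row seen for each ID
first_rows <- DF[!duplicated(DF$ID), ]
print(first_rows)

# fromLast = TRUE scans from the bottom, keeping the last row for each ID
last_rows <- DF[!duplicated(DF$ID, fromLast = TRUE), ]
print(last_rows)
```

Only the ID column is passed to duplicated(), so the values in Time1 and Time2 play no part in deciding which rows are kept.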
On Sat, Jan 16, 2010 at 5:04 PM, Bryan M Hangartner
<hangartb at cecs.pdx.edu> wrote:
> To Whomever is Interested,
>
> I have spent several days searching the web, help files, the R wiki and the
> archives of this mailing list for a solution to this problem, but
> nonetheless I apologize in advance if I have missed something obvious.
>
> The problem is this: I have a 5-column data frame with about 4.2 million
> rows, and want to create a new (and hopefully much smaller) data frame that
> contains only the rows which have a unique value in the first column only.
> In other words, I do not care about the uniqueness of the values in the
> other four columns, only the uniqueness of the entries in the first column. The
> "unique" command does not seem to have this option available, at least based
> on what I've read in the help file.
>
> A simplified example matrix (designated as "traveltimes"):
>
> ID Time1 Time2
> 1 3 4
> 1 4 7
> 2 3 5
> 2 5 6
> 3 4 5
> 3 2 8
>
> When I use a command such as
>
> matches <- unique(traveltimes, incomparables = FALSE, fromLast = FALSE)
>
> I will end up with a 6-row matrix, exactly what I already have. What I would
> like to do is to remove the duplicate values in the column labeled "ID" and
> their associated Time1 and Time2 entries. This will give me a 3x3 matrix
> which contains only one instance of each "ID" variable. For the purposes of
> this particular problem, the uniqueness of the Time1 and Time2 columns is not
> relevant.
>
> If this question is not clear enough please let me know. Thank you for your
> time.
>
>
> --
> Bryan Hangartner
> hangartb at cecs.pdx.edu